Convert PDF to PNG using Aspose.PDF for Java: font appears garbled

zhuw · April 30, 2025, 8:53am

The content of the PDF uses the SourceHanSansHWSC-Regular.otf font, and the text in the PNG image is garbled after PDF-to-PNG conversion.
The font package used is the OTF version of Source Han Sans.
When converting Word to HTML, the font renders correctly (although the layout is wrong), which indicates that the server can handle the font normally.
湖北大峪口化主有限责任公司大峪口磷矿160_万吨年地下采矿工程项目智能化矿山建设交流方案20250413.docx (10.9 KB)

Professionalize.Discourse · April 30, 2025, 8:54am

@zhuw

When converting a PDF to PNG using Aspose.PDF for Java, garbled text, particularly with specific fonts such as Source Han Sans, may be caused by font embedding issues. Here are some steps you can take to address this problem:

Ensure Font Availability: Make sure that the Source Han Sans font is properly installed and accessible on the server where the conversion takes place. If the font is not available, Aspose may substitute a default font, leading to garbled text.
Embed Fonts in PDF: If possible, ensure that the PDF document has the fonts embedded. This can help maintain the correct appearance during conversion. You can check this by opening the PDF in a viewer that shows font information.

Use the Correct Conversion Code: Below is a sample code snippet for converting PDF pages to PNG images using Aspose.PDF for Java. This code ensures that the conversion is done at a specified resolution, which can help improve the output quality:

import java.io.IOException;
import com.aspose.pdf.Document;
import com.aspose.pdf.devices.PngDevice;
import com.aspose.pdf.devices.Resolution;

public class ConvertPDFPagesToPNGImages {
    public static void main(String[] args) throws IOException {
        convertAllPDFPagesToPNGImages();
    }

    public static void convertAllPDFPagesToPNGImages() throws IOException {
        // Open the PDF document
        Document pdfDocument = new Document("input.pdf");
        try {
            // Create the Resolution and PngDevice once and reuse them for every page
            Resolution resolution = new Resolution(300);
            PngDevice pngDevice = new PngDevice(resolution);
            // Loop through all the pages of the PDF file (page numbers start at 1)
            for (int pageCount = 1; pageCount <= pdfDocument.getPages().size(); pageCount++) {
                // try-with-resources closes the output stream even if the conversion fails
                try (java.io.OutputStream imageStream =
                        new java.io.FileOutputStream("Converted_Image" + pageCount + ".png")) {
                    // Convert the current page and write the image to the stream
                    pngDevice.process(pdfDocument.getPages().get_Item(pageCount), imageStream);
                }
            }
        } finally {
            // Release the resources held by the document
            pdfDocument.close();
        }
    }
}

Check for Layout Issues: If the layout is incorrect when converting from Word to HTML but correct in other formats, it may indicate that the conversion settings or the document structure needs adjustment. Review the document’s formatting and ensure that it is compatible with the conversion process.
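
As a rough aid for the font availability step above, the following sketch checks whether a matching font file is present on the server at all. It uses only the JDK and no Aspose API. The class name `FontFileCheck`, the method `findFontFiles`, and the directories searched in `main` are assumptions made for illustration. Finding the file only shows that it exists on disk; it does not prove that the conversion library picks it up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Stream;

// Hypothetical helper (not part of Aspose.PDF): looks for font files on disk.
public class FontFileCheck {

    // Returns every .otf/.ttf/.ttc file under the given directories whose
    // file name contains nameFragment (case-insensitive).
    public static List<Path> findFontFiles(List<Path> fontDirs, String nameFragment) throws IOException {
        String needle = nameFragment.toLowerCase(Locale.ROOT);
        List<Path> matches = new ArrayList<>();
        for (Path dir : fontDirs) {
            if (!Files.isDirectory(dir)) {
                continue; // skip directories that do not exist on this machine
            }
            // Walk the directory tree; the stream is closed by try-with-resources
            try (Stream<Path> files = Files.walk(dir)) {
                files.filter(Files::isRegularFile)
                     .filter(p -> {
                         String name = p.getFileName().toString().toLowerCase(Locale.ROOT);
                         return name.contains(needle)
                                 && (name.endsWith(".otf") || name.endsWith(".ttf") || name.endsWith(".ttc"));
                     })
                     .forEach(matches::add);
            }
        }
        return matches;
    }

    public static void main(String[] args) throws IOException {
        // Assumed typical Linux font locations; adjust for the actual server.
        List<Path> dirs = List.of(Paths.get("/usr/share/fonts"), Paths.get("/usr/local/share/fonts"));
        List<Path> found = findFontFiles(dirs, "SourceHanSans");
        if (found.isEmpty()) {
            System.out.println("No matching font file found in " + dirs);
        } else {
            found.forEach(p -> System.out.println("Found: " + p));
        }
    }
}
```

If no file is found, installing the font package in one of the searched directories (or pointing the search at the directory where it was actually installed) is the first thing to verify before looking at the conversion code.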

If the issue persists after trying these steps, consider reaching out to Aspose support with specific details about the PDF file and the environment setup for further assistance.


zhuw · April 30, 2025, 8:56am

Uploading: 湖北大峪口化主有限责任公司大峪口磷矿160_万吨年地下采矿工程项目智能化矿山建设交流方案20250413.pdf…

zhuw · April 30, 2025, 9:06am

9b415c9a14e803a5b44ad9e461a38130.png (301.6 KB)

zhuw · April 30, 2025, 9:08am

image.jpg (84.6 KB)

asad.ali · April 30, 2025, 7:32pm

@zhuw

Are you using Aspose.PDF for Java and have you tried with the latest version of the API? Can you please share this information and confirm if we can test using the same code that has been shared above?

zhuw · May 15, 2025, 7:45am

@asad.ali Yes, I get the same problem when converting with the latest version of the dependency.

You can test with the font package mentioned above and the code below. The code contains some of our business logic; please remove it manually.

		PngDevice png = null;
		PageCollection pageCollection = null;
		Integer imgResolution = sourceFile.getImgResolution();
		int resolution = imgResolution==null ? 100 : imgResolution;
		try {
			png = new PngDevice(new Resolution(resolution));
			pageCollection = pdf.getPages();
		} catch (Throwable throwable) {
			if (throwable instanceof ThreadDeath || throwable instanceof OutOfMemoryError) {
				throw throwable;
			} else {
				throw new Exception("aspose_setting", throwable);
			}
		}
		int picPageCount = 0;
		try {
			picPageCount = pageCollection.size();
		} catch (Throwable throwable) {
			if (throwable instanceof ThreadDeath || throwable instanceof OutOfMemoryError) {
				throw throwable;
			} else {
				throw new Exception("aspose_getPageCount", throwable);
			}
		}
		TicAsposeConvertProgressUtil.setConvertProgress(sourceFile.getTaskId(), picPageCount, 0);


		ConvertResultFiles convertResultFiles = new ConvertResultFiles(picPageCount);
		convertResult.getResultFilesMap().put("png",convertResultFiles);
		String pageKey_pre = "page-";
		String fileKey = null;
		IConvertFile pageConvertFile = null;
		for (int i = 1; i <= picPageCount; i++) {
			fileKey = pageKey_pre+i;
			try {
				log.debug("File-" + sourceFile.getFileName() + "-page " + i + ": image conversion started");
				try {
					pageConvertFile = sourceFile.newConvertFile("png", fileKey, null);
				} catch (Throwable throwable) {
					if (throwable instanceof ThreadDeath || throwable instanceof OutOfMemoryError) {
						throw throwable;
					} else {
						throw new Exception("createConvertFile", throwable);
					}
				}
				Page page = null;
				try {
//					compressImage(png.processToBufferedImage(pageCollection.get_Item(i)), sourceFile, pageConvertFile,
//							true);
					page = pageCollection.get_Item(i);
					BufferedImage bufferedImage = png.processToBufferedImage(page);
					ImageIOUtil.writeImage(bufferedImage, "png", pageConvertFile.getOutputStream());
//					if (i == 1) {
//						generateThumbnail(pageConvertFile, sourceFile);
//					}
				} catch (Throwable throwable) {
					if (throwable instanceof ThreadDeath || throwable instanceof OutOfMemoryError) {
						throw throwable;
					} else {
						throw new Exception("aspose_convert", throwable);
					}
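				} finally {
					// page stays null if get_Item(i) threw; guard so the original error is not masked
					if (page != null) {
						page.close();
					}
				}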
				log.debug("File-" + sourceFile.getFileName() + "-page " + i + ": image conversion finished");
				convertResultFiles.addSuccessResult(pageConvertFile.getFileKey(),pageConvertFile.getPath());
				TicAsposeConvertProgressUtil.setConvertProgress(sourceFile.getTaskId(), picPageCount, i);
			} catch (Throwable throwable) {
				convertResultFiles.addErrorResult(fileKey,throwable.getMessage());
				throw throwable;
			} finally {
				//System.gc();
			}
		}

asad.ali · May 15, 2025, 7:13pm

@zhuw

We are not able to download the PDF file that you attached above. Can you please attach it again so that we can test the scenario in our environment and address it accordingly?

zhuw · May 16, 2025, 6:34am

aspose_test-1-3.pdf (1.6 MB)

asad.ali · May 16, 2025, 6:36pm

@zhuw

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFJAVA-45006

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

zhuw · June 17, 2025, 2:26am

May I ask how I can check the status of this defect, or when it is expected to be fixed? @asad.ali

asad.ali · June 17, 2025, 7:53am

@zhuw

The ticket was logged recently in our issue tracking system and, as per the free support model, it will be investigated and resolved on a first-come, first-served basis. You can check the ticket status at the bottom of this forum thread, and we will also keep you posted on the status of the ticket resolution in this thread. Please be patient and spare us some time.

We are sorry for the inconvenience.