Hello,
We are converting HTML documents to PDF using Java, and most images in those HTML files are SVG. Direct HTML -> PDF conversion produces PDFs in which the images are not searchable, and the converted images are still missing letters in the latest Aspose version. We are therefore considering converting the SVGs to PDF separately, using SvgLoadOptions, and then inserting the resulting PDF pages into the main document.
If we convert HTML -> PDF without images and then append the pages from the SVG -> PDF conversion to the end of that PDF, the document is converted and displayed correctly in all PDF readers, and the fonts used in the SVG and the HTML are embedded in the document:
HtmlLoadOptions htmlLoadOptions = new HtmlLoadOptions(RESOURCE_DIR);
htmlLoadOptions.setInputEncoding(StandardCharsets.UTF_8.name());
htmlLoadOptions.setEmbedFonts(true);
Document documentFromHtml = new Document(inputHtmlFile.getAbsolutePath(), htmlLoadOptions);
documentFromHtml.setEmbedStandardFonts(true);
SvgLoadOptions svgOptions = new SvgLoadOptions();
Document docWithImageFromSvg = new Document(imageSvgPath, svgOptions);
documentFromHtml.getPages().add(docWithImageFromSvg.getPages());
documentFromHtml.save();
But if we insert a page from docWithImageFromSvg somewhere in the middle of the PDF created from HTML, or create a new Document() and add some pages from HTML, then from SVG, and then again from HTML, the opened document shows unreadable characters either on the pages from HTML or on the pages from SVG (once scrolled to the page with the image), and displays a message like “Cannot find or create the font ‘CHHODQ+ArialBold’. Some characters may not display or print correctly.”
This message appears when viewing the document in Adobe Reader; browsers and Foxit Reader manage to display the content, but Adobe Reader is important to our users. Setting the ‘Use local fonts’ preference does not help.
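As a side note on the font name in that message: the ‘CHHODQ+’ prefix has the shape of a PDF subset tag (six uppercase letters followed by a plus sign), which would suggest the viewer is looking for an embedded subset of ArialBold rather than the system font. This is an assumption based on the name alone, not something confirmed for this document. A minimal, self-contained sketch of how such names can be recognized (the class and method names are hypothetical and not part of the Aspose API):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only; it does not use Aspose.PDF.
class SubsetFontName {
    // Subset fonts in a PDF are named "<TAG>+<BaseFont>",
    // where TAG is exactly six uppercase letters.
    private static final Pattern SUBSET_TAG = Pattern.compile("^([A-Z]{6})\\+(.+)$");

    // Returns true if the name carries a subset tag such as "CHHODQ+".
    static boolean isSubsetTagged(String fontName) {
        return SUBSET_TAG.matcher(fontName).matches();
    }

    // Strips the subset tag, if present, and returns the base font name.
    static String baseFontName(String fontName) {
        Matcher m = SUBSET_TAG.matcher(fontName);
        return m.matches() ? m.group(2) : fontName;
    }

    public static void main(String[] args) {
        String reported = "CHHODQ+ArialBold"; // name taken from the Adobe Reader message
        System.out.println(isSubsetTagged(reported)); // prints "true"
        System.out.println(baseFontName(reported));   // prints "ArialBold"
    }
}
```

If the name is indeed a subset tag, the base name "ArialBold" matches the font that FontRepository finds on the system, so the missing piece would be the embedded subset data rather than the installed font.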
The test machine runs Windows, and Arial and ArialBold are installed on the system; FontRepository.findFont("ArialBold") finds the font. Both the HTML and the SVG use the Arial font, with some text in bold. Since the font-embedding options are set to true, I would expect the fonts to be embedded in the document anyway.
A shortened version of how I inserted the SVG page, which resulted in unreadable characters in the SVG image:
Document documentFromHtml = new Document(inputHtmlFile.getAbsolutePath(), htmlLoadOptions);
Document docWithImageFromSvg = new Document(imageSvgPath, svgOptions);
Page pageToInsertWithSvg = docWithImageFromSvg.getPages().get_Item(1);
documentFromHtml.getPages().insert(2, pageToInsertWithSvg);
A shortened version of how I created a new document and mixed pages from HTML and SVG, which resulted in unreadable characters in the HTML content:
Document documentFromHtml = new Document(inputHtmlFile.getAbsolutePath(), htmlLoadOptions);
Document docWithImageFromSvg = new Document(imageSvgPath, svgOptions);
Page pageWithSvg = docWithImageFromSvg.getPages().get_Item(1);
Document mergedDocument = new Document();
mergedDocument.getPages().add(documentFromHtml.getPages().get_Item(1));
mergedDocument.getPages().add(pageWithSvg);
mergedDocument.getPages().add(documentFromHtml.getPages().get_Item(2));
mergedDocument.save();
I tried various combinations of setEmbedFonts and setEmbedStandardFonts, both on the separate documents and on the combined document, but nothing helped. How can we make sure fonts are embedded when merging PDF pages created from different sources? Is there some font setting I am missing?
I added full code samples in the attached zip: svg.zip (548.2 KB)
Arjana Bivainiene