HTML to PDF conversion issue

Hi,

We have been using the licensed “aspose.pdf-11.3.0.jar” for converting HTML into PDF. Recently we found that when we add half-width/full-width/CJK characters to the HTML, Aspose throws the exception below:-
Exception in thread “main” class com.aspose.pdf.internal.ms.System.z10: Value cannot be null.
Parameter name: key
com.aspose.pdf.internal.p554.z2.tryGetValue(Unknown Source)
com.aspose.pdf.internal.p9.z1.m1(Unknown Source)
com.aspose.pdf.internal.p14.z5$z1.m1(Unknown Source)
com.aspose.pdf.internal.p66.z2.m1(Unknown Source)
com.aspose.pdf.internal.p55.z1.m1(Unknown Source)
com.aspose.pdf.internal.p55.z1.m1(Unknown Source)
com.aspose.pdf.internal.p55.z1.m1(Unknown Source)
com.aspose.pdf.z92.m1(Unknown Source)
com.aspose.pdf.TextFragment.m3(Unknown Source)
com.aspose.pdf.TextFragment.(Unknown Source)
com.aspose.pdf.z7.m1(Unknown Source)
com.aspose.pdf.internal.foundation.rendering.z17.accept(Unknown Source)
com.aspose.pdf.internal.foundation.rendering.z14.accept(Unknown Source)
com.aspose.pdf.internal.foundation.rendering.z11.accept(Unknown Source)
com.aspose.pdf.internal.foundation.rendering.z14.accept(Unknown Source)
com.aspose.pdf.internal.foundation.rendering.z11.accept(Unknown Source)
com.aspose.pdf.internal.foundation.rendering.z14.accept(Unknown Source)
com.aspose.pdf.internal.foundation.rendering.z11.accept(Unknown Source)
com.aspose.pdf.internal.foundation.rendering.z14.accept(Unknown Source)
com.aspose.pdf.internal.foundation.rendering.z30.accept(Unknown Source)
com.aspose.pdf.z7.m1(Unknown Source)
com.aspose.pdf.ApsUsingConverter.m1(Unknown Source)
com.aspose.pdf.z35.m1(Unknown Source)
com.aspose.pdf.ADocument.m1(Unknown Source)
com.aspose.pdf.ADocument.(Unknown Source)
com.aspose.pdf.Document.(Unknown Source)
Test.main(Test.java:88)

So we tried replacing aspose.pdf-11.3.0.jar with the latest “aspose.pdf-17.2.0.jar”. After updating the jar we no longer get any exception, but the alignment of the generated PDF is distorted (its width has increased by more than 1 inch) and the generated PDF contains a square block in place of each of those special characters.

Fonts used in html:-
3ds and calibri

Sample half-width/full-width/CJK characters in HTML:-
p q r s t u v w x y z { | } ~ 。 「 」 、 ⦅ ・ ヲ ァ ィ ゥ ⦆

Below is my sample test code:-

String sPDFMessage = new String(Files.readAllBytes(Paths.get("F:/Desktop/test.html")));
// Replacing all required values in above PDFMessage with its actual values …
sPDFMessage = sPDFMessage.replace("", "zeeshan");

License lic = new License();
String LicensePath = "D:/apache-tomee-plus-1.6.0.1/webapps/enovia/WebClient/Aspose.Pdf.lic";
lic.setLicense(new FileInputStream(new File(LicensePath)));
String htmlOptionsPath = "D:/apache-tomee-plus-1.6.0.1/webapps/enovia/PLD/template/";
HtmlLoadOptions htmloptions = new HtmlLoadOptions(htmlOptionsPath);
System.out.println("Creating Document…");
Document doc = new Document(IOUtils.toInputStream(sPDFMessage, StandardCharsets.UTF_8), htmloptions);
doc.save("F:/Desktop/test.pdf");

Please look into this issue. And I would also like to know the default font used by aspose when the specified font is not available (for both normal character and CJK characters)?


Thanks
Zeeshan

Hi Zeeshan,


Thanks for contacting support.

I have tested the scenario with the following code snippet and have managed to convert HTML to PDF successfully; the generated PDF does not contain any square blocks. I have attached the sample input/output files for your reference.

JAVA

String sPDFMessage = new String(Files.readAllBytes(Paths.get("/Users/fahadadeel/Downloads/resources/test.html")));
// Replacing all required values in above PDFMessage with its actual values …
sPDFMessage = sPDFMessage.replace("", "zeeshan");
HtmlLoadOptions htmloptions = new HtmlLoadOptions("/Users/fahadadeel/Downloads/resources/");
InputStream stream = new ByteArrayInputStream(sPDFMessage.getBytes(StandardCharsets.UTF_8));
Document doc = new Document(stream, htmloptions);
doc.save(dataDir + "output_pdf.pdf");
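
Two details of the template handling can be checked without Aspose at all. The sketch below uses only the Java standard library: it reads the HTML with an explicit UTF-8 charset instead of the platform default, and it keeps the string returned by replace(), because Java strings are immutable. The class name TemplateFill, its helper methods and the ${NAME} placeholder are hypothetical and chosen only for illustration; the real placeholder from the template is not shown in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class TemplateFill {

    // Decode the file as UTF-8 explicitly so CJK text does not depend on the OS locale.
    static String readTemplate(Path htmlFile) throws IOException {
        return new String(Files.readAllBytes(htmlFile), StandardCharsets.UTF_8);
    }

    // String.replace returns a new string; the caller has to keep the returned value.
    static String fill(String template, String placeholder, String value) {
        return template.replace(placeholder, value);
    }

    // Encode the filled template as UTF-8 bytes for a stream-based loader.
    static InputStream toUtf8Stream(String html) {
        return new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical template: a placeholder followed by half-width katakana and kanji.
        String sample = "<p>${NAME}: \uFF71\uFF72\uFF73 \u65E5\u672C\u8A9E</p>";
        Path tmp = Files.createTempFile("template", ".html");
        Files.write(tmp, sample.getBytes(StandardCharsets.UTF_8));

        String filled = fill(readTemplate(tmp), "${NAME}", "zeeshan");
        System.out.println("Placeholder replaced: " + !filled.contains("${NAME}"));

        try (InputStream in = toUtf8Stream(filled)) {
            System.out.println("Stream size in bytes: " + in.available());
        }
        Files.delete(tmp);
    }
}
```

The stream returned by toUtf8Stream is the kind of input that is passed to new Document(stream, htmloptions) in the snippet above.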

If you still face any issues, please share your sample html file and the output PDF file generated at your end. It will help us to understand your requirement exactly and address it accordingly.

We are sorry for the inconvenience.

Best Regards,

Hi


Thanks for trying and providing the input and output.

Can you confirm the following points:
1) The version of Aspose you are using. We are using Aspose 11.3.0
2) The OS on which you have tested. I am working on Red Hat Linux 2.6

I have attached my sample code along with html and generated pdf. I have used “aspose.pdf-17.2.0.jar” for testing (And on using aspose.pdf-11.3.0.jar, it throws Exception).

Regards,
Zeeshan

Hi Zeeshan,


Thanks for sharing further details.

I have tested the scenario using Aspose.Pdf for Java 17.2.0 on MAC OS. I will test the scenario using your provided environment and will update you with my findings. Please spare me a little time.

We are sorry for the inconvenience.

Best Regards,

Hi Fawad,

Please take your time.


And please also test the characters below, as even with the latest aspose.pdf-17.2.0.jar I am getting a different exception for them:-

Recently used Characters:-
Monographs (gojūon) Digraphs (yōon)
a i u e o ya yu yo
a [a]
i [i]
u [u͍]
e [e]
o [o]
K
ka [ka]
ki [ki]
ku [ku͍]
ke [ke]
ko [ko] きゃ
kya [kʲa] きゅ
kyu [kʲu͍] きょ
kyo [kʲo]
S
sa [sa]
shi [ɕi]
su [su͍]
se [se]
so [so] しゃ
sha [ɕa] しゅ
shu [ɕu͍] しょ
sho [ɕo]
T
ta [ta]
chi [ t͡ɕi]
tsu [ t͡su͍]
te [te]
to [to] ちゃ
cha [ t͡ɕa] ちゅ
chu [ t͡ɕu͍] ちょ
cho [ t͡ɕo]
N
na [na]
ni [ni]
nu [nu͍]
ne [ne]
no [no] にゃ
nya [nʲa] にゅ
nyu [nʲu͍] にょ
nyo [nʲo]
H
ha [ha]
([w͍a] as particle)
hi [çi]
fu [ɸu͍]
he [he]
([e] as particle)
ho [ho] ひゃ
hya [ça] ひゅ
hyu [çu͍] ひょ
hyo [ço]
M
ma [ma]
mi [mi]
mu [mu͍]
me [me]
mo [mo] みゃ
mya [mʲa] みゅ
myu [mʲu͍] みょ
myo [mʲo]
Y
ya [ja]
yu [ju͍]
yo [jo]
R
ra [ɽa]
ri [ɽi]
ru [ɽu͍]
re [ɽe]
ro [ɽo] りゃ
rya [ɽʲa] りゅ
ryu [ɽʲu͍] りょ
ryo [ɽʲo]
W
wa [w͍a]
i/wi [(w͍)i]
e/we [(w͍)e]
o/wo [(w͍)o] (particle)
*
n
[n] [m] [ŋ] before stop consonants;
[ɴ] [ũ͍] [ĩ] elsewhere
(indicates a geminate consonant)
(reduplicates and
unvoices syllable)
(reduplicates and
voices syllable)
Diacritics (gojūon with (han)dakuten) Digraphs with diacritics (yōon with (han)dakuten)
a i u e o ya yu yo
G
ga [ɡa]
gi [ɡi]
gu [ɡu͍]
ge [ɡe]
go [ɡo] ぎゃ
gya [ɡʲa] ぎゅ
gyu [ɡʲu͍] ぎょ
gyo [ɡʲo]
Z
za [za]
ji [d͡ʑi]
zu [zu͍]
ze [ze]
zo [zo] じゃ
ja [d͡ʑa] じゅ
ju [d͡ʑu͍] じょ
jo [d͡ʑo]
D
da [da]
ji [d͡ʑi]
zu [zu͍]
de [de]
do [do] ぢゃ
ja [d͡ʑa] ぢゅ
ju [d͡ʑu͍] ぢょ
jo [d͡ʑo]
B
ba [ba]
bi [bi]
bu [bu͍]
be [be]
bo [bo] びゃ
bya [bʲa] びゅ
byu [bʲu͍] びょ
byo [bʲo]
P
pa [pa]
pi [pi]
pu [pu͍]
pe [pe]
po [po] ぴゃ
pya [pʲa] ぴゅ
pyu [pʲu͍] ぴょ
pyo [pʲo]
V
vu/u [v(u͍)]
In the middle of words, the g sound (normally [ɡ]) often turns into a velar nasal [ŋ] and less often (although increasing recently) into the voiced velar fricative [ɣ]. An exception to this is numerals; 15 juugo is considered to be one word, but is pronounced as if it was jū and go stacked end to end: [d͡ʑu͍ːɡo].

Additionally, the j sound (normally [d͡ʑ]) can be pronounced [ʑ] in the middle of words. For example, すうじ sūji [su͍ːʑi] ‘number’.

In archaic forms of Japanese, there existed the kwa (くゎ [kʷa]) and gwa (ぐゎ [ɡʷa]) digraphs. In modern Japanese, these phonemes have been phased out of usage and only exist in the extended katakana digraphs for approximating foreign language words.

The singular n is pronounced [n] before t, ch, ts, n, r, z, j and d, [m] before m, b and p, [ŋ] before k and g, [ɴ] at the end of utterances, [ũ͍] before vowels, palatal approximants (y), consonants s, sh, h, f and w, and finally [ĩ] after the vowel i if another vowel, palatal approximant or consonant s, sh, h, f or w follows.

In kanji readings, the diphthongs ou and ei are today usually pronounced [oː] (long o) and [eː] (long e) respectively. For example, とうきょう (lit. toukyou) is pronounced [toːkʲoː] ‘Tokyo’, and せんせい sensei is [seũ͍seː] ‘teacher’. However, とう tou is pronounced [tou͍] ‘to inquire’, because the o and u are considered distinct, u being the infinitive verb ending. Similarly, している shite iru is pronounced [ɕiteiɾu͍] ‘is doing’.

For a more thorough discussion on the sounds of Japanese, please refer to Japanese phonology.

New Exception found on using above characters:-
com.aspose.pdf.internal.ms.System.z106: Specified method is not supported.
at com.aspose.pdf.internal.p68.z3.m2(Unknown Source)
at com.aspose.pdf.internal.p68.z3.(Unknown Source)
at com.aspose.pdf.internal.p81.z2.m1(Unknown Source)
at com.aspose.pdf.internal.p68.z29.m1(Unknown Source)
at com.aspose.pdf.internal.p69.z6.m1(Unknown Source)
at com.aspose.pdf.internal.p70.z1.m1(Unknown Source)
at com.aspose.pdf.internal.p71.z1.m1(Unknown Source)
at com.aspose.pdf.TextSegment.setText(Unknown Source)
at com.aspose.pdf.TextSegment.m1(Unknown Source)
at com.aspose.pdf.TextBuilder.m1(Unknown Source)
at com.aspose.pdf.TextBuilder.appendText(Unknown Source)
at com.aspose.pdf.z9.m1(Unknown Source)
at com.aspose.pdf.internal.foundation.rendering.z17.accept(Unknown Source)
at com.aspose.pdf.internal.foundation.rendering.z14.accept(Unknown Source)
at com.aspose.pdf.internal.foundation.rendering.z11.accept(Unknown Source)
at com.aspose.pdf.internal.foundation.rendering.z14.accept(Unknown Source)
at com.aspose.pdf.internal.foundation.rendering.z11.accept(Unknown Source)
at com.aspose.pdf.internal.foundation.rendering.z14.accept(Unknown Source)
at com.aspose.pdf.internal.foundation.rendering.z11.accept(Unknown Source)
at com.aspose.pdf.internal.foundation.rendering.z14.accept(Unknown Source)
at com.aspose.pdf.internal.foundation.rendering.z30.accept(Unknown Source)
at com.aspose.pdf.z9.m1(Unknown Source)
at com.aspose.pdf.ApsUsingConverter.m1(Unknown Source)
at com.aspose.pdf.z90.m1(Unknown Source)
at com.aspose.pdf.ADocument.m1(Unknown Source)
at com.aspose.pdf.ADocument.(Unknown Source)
at com.aspose.pdf.Document.(Unknown Source)
at com.dassault_systemes.apps.pld.component.PDFUtilities.createPDFFile(PDFUtilities.java:86)
at com.dassault_systemes.apps.pld.component.PLDCommon.generatePDF(PLDCommon.java:1347)
at com.dassault_systemes.apps.pld.component.PLDCommon.generatePLDPDF(PLDCommon.java:1208)
at com.dassault_systemes.apps.pld.component.PLDUtilities$2.run(PLDUtilities.java:2007)
at java.lang.Thread.run(Thread.java:744)

Note: we are testing for all Full-Width, Half-Width, CJK, Chinese, Japanese, Korean, German and French characters.

Thanks
Zeeshan

Hi Zeeshan,

Thanks for your inquiry. As per my understanding, it seems Aspose.Pdf for Java is unable to find fonts on your system. Please note that on non-Windows operating systems Aspose.Pdf for Java looks for fonts in the system default font path or in a local font path specified for a custom font directory.

Please note that most of the PDF documents we convert are created by people using Windows or Mac OS, with fonts that are installed with Microsoft Windows or Microsoft Office. To resolve your issue, you can either install Microsoft fonts on your system or copy the fonts from your Windows OS into your system default font path.

Furthermore, if you want to use custom fonts from a location other than the system default font path, you need to add that folder path to LocalFontPath as follows. You can use the following methods to get the system font folder or to set the font path to your font folders.

  • Document.getLocalFontPath () - shows the system folder in which project will look for fonts.
  • Document.setLocalFontPath (String) - Setting font path to custom folder
// Set font folder path
String path = "/home/zeeshan/fonts/";
// Adding a single font directory
// com.aspose.pdf.Document.addLocalFontPath(path);
// setting the user list for standard font directories
java.util.List list = com.aspose.pdf.Document.getLocalFontPaths();
list.add(path);
com.aspose.pdf.Document.setLocalFontPaths(list);

Please apply the fonts before conversion of HTML to PDF. If you still face any issue, please feel free to contact us.
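
Before registering a folder as a font path, it can help to confirm that the folder really contains font files that the Java process can read. The following is a minimal standard-library sketch and not part of the Aspose API; the class name FontFolderCheck and the choice of extensions (.ttf, .otf, .ttc) are assumptions made for illustration.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class FontFolderCheck {

    // Assumption: only these extensions are treated as font files in this sketch.
    static boolean isFontFile(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        return lower.endsWith(".ttf") || lower.endsWith(".otf") || lower.endsWith(".ttc");
    }

    // Returns the sorted names of font files directly inside dir.
    // The list is empty when dir is missing, unreadable, or has no fonts.
    static List<String> listFontFiles(File dir) {
        List<String> names = new ArrayList<>();
        File[] entries = dir.listFiles(); // null if dir does not exist or cannot be read
        if (entries == null) {
            return names;
        }
        for (File entry : entries) {
            if (entry.isFile() && isFontFile(entry.getName())) {
                names.add(entry.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        // Default to the folder used in the snippet above; pass another path as the first argument.
        File dir = new File(args.length > 0 ? args[0] : "/home/zeeshan/fonts/");
        List<String> fonts = listFontFiles(dir);
        if (fonts.isEmpty()) {
            System.out.println("No readable font files found in " + dir);
        } else {
            System.out.println("Font files in " + dir + ": " + fonts);
        }
    }
}
```

An empty result on the Linux server would suggest that the folder path or its permissions should be checked before looking at the conversion itself.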

We are sorry for the inconvenience caused.

Best Regards,

Hi Fahad,

Thanks for the information.

We need the following fixes from Aspose:-
1. If Aspose.Pdf for Java is unable to find fonts on a system, Aspose should not stop creating the PDF; it should instead show junk in place of those characters in the PDF. (Example: when Word or any other software that needs a font encounters a character the font does not support, it shows junk in place of that character.)
2. After upgrading from aspose.pdf-11.3.0.jar to aspose.pdf-17.2.0.jar, even with plain English characters, the width of our generated PDF increases by about an inch with every additional line in a paragraph (observed about 1 inch per line).

And I have the following queries:-
1. Are we authorized to copy Windows fonts to Linux and use them freely? Are Windows fonts free (open source) to use anywhere?
2. I once copied the “Arial Unicode MS” font from Windows to the Linux ‘font’ directory and it worked for many characters but not for all. Can you please suggest a font that supports all of these characters (Full-Width, Half-Width, CJK, Chinese, Japanese, Korean, German and French) and can be used freely?
3. We have the jar and license for aspose.pdf-11.3.0.jar. On upgrading the jar, do we also need to get a new license?


Regards
Zeeshan

HI Zeeshan.


Thanks for your inquiries.

1. If Aspose.Pdf for Java is unable to find fonts on a system, Aspose should not stop creating the PDF; it should instead show junk in place of those characters in the PDF. (Example: when Word or any other software that needs a font encounters a character the font does not support, it shows junk in place of that character.)


I have logged an enhancement ticket as PDFJAVA-36625 in our issue tracking system. We will look further into the details of this enhancement and keep you posted on the status of its implementation. Please be patient and spare us a little time.


2. After upgrading from aspose.pdf-11.3.0.jar to aspose.pdf-17.2.0.jar, even with plain English characters, the width of our generated PDF increases by about an inch with every additional line in a paragraph (observed about 1 inch per line).



I would appreciate it if you could share your sample HTML string along with the output file. It will help us understand your requirement exactly and address it accordingly.


1. Are we authorized to copy Windows fonts to Linux and use them freely? Are Windows fonts free (open source) to use anywhere?


Yes, using Microsoft fonts on linux is legal. You can read EULA here: https://www.microsoft.com/typography/fontpack/eula.htm The key parts of the license are, "The SOFTWARE PRODUCT is licensed, not sold." and "You may install and use an unlimited number of copies of the SOFTWARE PRODUCT."


2. I once copied the "Arial Unicode MS" font from Windows to the Linux 'font' directory and it worked for many characters but not for all. Can you please suggest a font that supports all of these characters (Full-Width, Half-Width, CJK, Chinese, Japanese, Korean, German and French) and can be used freely?


You can try a font from the list of Unicode fonts: https://en.wikipedia.org/wiki/Unicode_font
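
When comparing candidate fonts, it may help to know first which Unicode blocks the HTML text actually uses, since a font has to cover each of them. The sketch below relies only on the standard Character.UnicodeBlock class; the class name ScriptSurvey and the sample string are hypothetical and are not taken from the attached files.

```java
import java.lang.Character.UnicodeBlock;
import java.util.LinkedHashSet;
import java.util.Set;

public class ScriptSurvey {

    // Collects the Unicode blocks of all non-whitespace code points, in order of first appearance.
    static Set<UnicodeBlock> blocksIn(String text) {
        Set<UnicodeBlock> blocks = new LinkedHashSet<>();
        text.codePoints().forEach(cp -> {
            if (Character.isWhitespace(cp)) {
                return; // whitespace does not need a glyph from a special font
            }
            UnicodeBlock block = UnicodeBlock.of(cp); // null for unassigned code points
            if (block != null) {
                blocks.add(block);
            }
        });
        return blocks;
    }

    public static void main(String[] args) {
        // Hypothetical sample: Latin, German umlaut, half-width katakana, hiragana, kanji, hangul.
        String sample = "Test \u00FC \uFF71\uFF72 \u304D\u3083 \u65E5\u672C \uD55C";
        for (UnicodeBlock block : blocksIn(sample)) {
            System.out.println(block);
        }
    }
}
```

Each block that is printed is a range the chosen font, or a fallback font in the font folder, would need to support.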


3. We have the jar and license for aspose.pdf-11.3.0.jar. On upgrading the jar, do we also need to get a new license?


I have created a separate forum thread for this query using email to post feature and you will be notified on your email address. https://www.aspose.com/community/forums/832610/showthread.aspx#832610 is the link for the forum thread.


If you need further assistance, please feel free to contact us.


We are sorry for the inconvenience.


Best Regards,

Hi Fahad,


Regarding:
2. After upgrading from aspose.pdf-11.3.0.jar to aspose.pdf-17.2.0.jar, even with plain English characters, the width of our generated PDF increases by about an inch with every additional line in a paragraph (observed about 1 inch per line).

I have attached the sample code and both PDFs, generated using aspose.pdf-11.3.0.jar and aspose.pdf-17.2.0.jar respectively.

It was working fine with aspose.pdf-11.3.0.jar.
As per my observation, a nested table might be causing the problem with aspose.pdf-17.2.0.jar. The width of the PDF increases as the number of words in the paragraph increases.


Regards
Zeeshan

HI Zeeshan.


Thanks for sharing further details.

I have tested the scenario with the HTML file you provided and have managed to reproduce the problem: while converting from HTML to PDF, the width of the PDF increases. For the sake of correction, I have logged it as PDFJAVA-36628 in our issue tracking system. We will notify you within this forum thread as soon as the issue is resolved. Please be patient and spare us a little time.

We are sorry for this inconvenience.

Best Regards,

Hi Fahad


Thanks for the reply.
We will be much obliged if both issues, PDFJAVA-36625 and PDFJAVA-36628, are resolved as soon as possible.

Can you please share the Aspose documentation for HTML to PDF conversion in Java, as we are unable to find any documentation for it on the Aspose website.

Thanks
Zeeshan

zeeshan3ds:
Hi Fahad

Thanks for reply.
We will be much obliged if both issues, PDFJAVA-36625 and PDFJAVA-36628, are resolved as soon as possible.
Hi Zeeshan,

Thanks for contacting support.

Issues are resolved on a first-come, first-served basis, and as soon as we have definite updates regarding their resolution, we will update you within this forum thread.
zeeshan3ds:
Can you please share the Aspose documentation for HTML to PDF conversion in Java, as we are unable to find any documentation for it on the Aspose website.
Please visit the following link for required information on Convert PDF to HTML format

Hi Shahbaz,


Thanks for the reply.

We have some concern regarding the fonts that we have used in HTML for PDF generation. We want to make sure whether we need to get license for those fonts or not and if license is needed then what kind of license we need. So can you please clear our following concerns:-

1. Whether those fonts get embedded in the PDF. Suppose I have used font “Code2000.ttf” in HTML so whether generated PDF will have this font embedded into it ?
2. And is it possible to prevent fonts from getting embedded in the generated PDF ? If yes How ?

Hi Zeeshan,


Thanks for your inquiry.

We want to make sure whether we need to get license for those fonts or not

Yes, you need to buy a license if you are using a commercial font rather than a free one.

Whether those fonts get embedded in the PDF

Yes, fonts get embedded while converting HTML to PDF.

And is it possible to prevent fonts from getting embedded in the generated PDF

Yes, you can use the UnembedFonts option to remove all embedded fonts. This decreases the document size, but the document may become unreadable if the correct font is not installed. Please see the following code snippet for reference.

C#

pdfDocument.OptimizeResources(new Document.OptimizationOptions()
{
    UnembedFonts = true
});
dataDir = dataDir + "OptimizeFileSize_out.pdf";
// Save output document
pdfDocument.Save(dataDir);

If you need further assistance, please feel free to contact us.

Best Regards,

Hi Fahad,


Regarding unembedding fonts: the code that you have given is for C#, not Java. So I tried the code below for unembedding fonts, but it didn’t work.
I can still see those fonts embedded in the properties of the generated PDF.

InputStream stream = new ByteArrayInputStream(sPDFMessage.getBytes(StandardCharsets.UTF_8));
Document doc = new Document(stream, htmloptions);
Document.OptimizationOptions optOption = new Document.OptimizationOptions();
optOption.setUnembedFonts(true);
doc.optimizeResources(optOption);


And I have following queries:-
1. What do you mean by “document may become unreadable on removing fonts”. Will the character using those unembedded fonts be lost?
2. The code generating the Document object (Document doc = new Document(stream, htmloptions);) is taking a lot of time. How can we optimize this process?
3. The size of the generated PDF file just containing 5 pages is too big. How can we minimize the file size ?

Hi Zeeshan,


Thanks for sharing further details.

In the above code you are not saving the PDF file after optimizeResources.

What do you mean by “document may become unreadable on removing fonts”

If a PDF document uses some font, say “abc.otf”, and you unembed this font, then the document might become unreadable on any machine where that font is not installed.

2. The code generating the Document object (Document doc = new Document(stream, htmloptions);) is taking a lot of time. How can we optimize this process?
3. The size of the generated PDF file just containing 5 pages is too big. How can we minimize the file size ?

I would appreciate it if you could provide the sample HTML file including its resources. It will help us understand your requirement exactly and address it accordingly.

We are sorry for the inconvenience.

Best Regards,

Hi Fahad,


Thanks for the reply.

In the above code you are not saving the PDF file after optimizeResources.
I have saved the PDF file after optimizeResources. I had not given you my full code in previous query.


For all previous queries, I am attaching the code and sample PDF generated using different fonts (Unifont, Noto Sans CJK SC Regular, Code2000 and calibri) in p and span elements.



Regards
Zeeshan



Hi Zeeshan,


Thanks for sharing further details.

I am looking into it in detail and will get back to you with my findings.

We are sorry for the inconvenience.

Best Regards,

Hi Zeeshan,


I have looked into it further in detail.

Please see the following code snippet; it removes the embedded fonts from the PDF file and results in a smaller PDF file.

JAVA

String sPDFMessage = new String(Files.readAllBytes(Paths.get(dataDir + "test.htm")));
sPDFMessage = sPDFMessage.replaceAll("", "XYZ");
sPDFMessage = sPDFMessage.replaceAll("", "Zeeshan");
sPDFMessage = sPDFMessage.replaceAll("
", "Details for above person");
System.out.println("before: " + sPDFMessage);

System.out.println("After: " + sPDFMessage);
String templatePath = dataDir;
HtmlLoadOptions htmloptions = new HtmlLoadOptions(templatePath);
MarginInfo marginInfo = new MarginInfo(50, 72, 40, 65);
htmloptions.getPageInfo().setMargin(marginInfo);
System.out.println("Creating Document…");

InputStream stream = new ByteArrayInputStream(sPDFMessage.getBytes(StandardCharsets.UTF_8));
Document doc = new Document(stream, htmloptions);

// Mark every font on every page as neither embedded nor subset
for (int counter = 1; counter <= doc.getPages().size(); counter++) {
    Page pdfPage = doc.getPages().get_Item(counter);
    for (Font pageFont : (Iterable<Font>) pdfPage.getResources().getFonts()) {
        if (pageFont.isEmbedded()) {
            pageFont.setEmbedded(false);
        }
        if (pageFont.isSubset()) {
            pageFont.setSubset(false);
        }
    }
}


// Setting Header
ImageStamp headerImg = new ImageStamp(dataDir + "test.jpg");
headerImg.setHeight(64);
headerImg.setWidth(595);
headerImg.setTopMargin(0);
headerImg.setVerticalAlignment(VerticalAlignment.Top);
headerImg.setBackground(true);

TextStamp headerTxt = new TextStamp("Welcome");
headerTxt.setTopMargin(20);
headerTxt.setHorizontalAlignment(HorizontalAlignment.Center);
headerTxt.setVerticalAlignment(VerticalAlignment.Top);
// Set text properties
Font font = null;
try {
    font = FontRepository.findFont("Sans-Serif", true);
} catch (Exception e) {
    System.out.println(e.getMessage());
}
// Fall back to the stamp's current font if the lookup failed
if (font == null) {
    font = headerTxt.getTextState().getFont();
    System.out.println("Font:" + font.getFontName());
}
headerTxt.getTextState().setFont(font);
headerTxt.getTextState().setFontSize(23.0F);
headerTxt.getTextState().setForegroundColor(Color.getWhite());

doc.getPages().get_Item(1).addStamp(headerImg);
doc.getPages().get_Item(1).addStamp(headerTxt);
doc.save(dataDir + "abcTest12.pdf");

System.out.println("Done…");

If you need further assistance, please feel free to contact us.

We are sorry for the inconvenience.

Best Regards,

The issues you have found earlier (filed as PDFJAVA-36628) have been fixed in Aspose.PDF for Java 18.3. This message was posted using BugNotificationTool by @asad.ali