Our web application supports saving a document to both Word and PDF, using Aspose.Words for Java. Word output is fine. In the PDF output, however, the “disc” bullets of an unordered list appear as a rectangle instead of a solid circle. We call Document.save() and pass the PDF save options as the second argument.
I have searched extensively on this issue and understand that it requires the host system to have TrueType fonts that contain the bullet character (in this case U+2022, the default “disc” bullet). The system that generates the PDF needs the font so that it can embed it in the PDF, because the PDF must be viewable standalone without the fonts installed, unlike a Word document, which uses the fonts on the system where it is viewed.
However, the installed fonts look fine as far as I can tell. Our code defaults to Calibri, which is not installed by default, so on macOS Arial Unicode MS is substituted and on our Docker containers running CentOS DejaVuSans is substituted. All systems have these fonts, and I confirmed that they contain a glyph for the U+2022 bullet character. Additionally, I used an IWarningCallback during the save() call to print any font-related warnings. When substitutions happen, it does print them, but I see no other font-related output that looks suspect. I also tried several other fonts, such as Courier New and Arial. That changes the font, but it doesn’t fix the bullets.
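For anyone wanting to repeat the glyph check on each host, here is a minimal sketch using only the JDK. It assumes the family names as the JVM reports them (for example "DejaVu Sans" with a space), and the class and method names are made up for illustration. Note that this shows what java.awt sees, which may not match the set of fonts Aspose.Words discovers on its own, so treat it as a rough cross-check only.

```java
import java.awt.Font;

// Hypothetical helper, not part of Aspose.Words: checks fonts as seen by java.awt.
class BulletGlyphCheck {
    static final int BULLET = 0x2022; // U+2022 BULLET, the default "disc" bullet

    // Formats a code point in the usual "U+XXXX" notation.
    static String label(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    // Reports whether the JVM resolved the requested family and whether
    // the resolved font has a glyph for the given code point.
    static String check(String family, int codePoint) {
        Font font = new Font(family, Font.PLAIN, 12);
        // java.awt.Font falls back to a logical font when the family is
        // missing, so compare the resolved family with the requested one.
        boolean resolved = font.getFamily().equalsIgnoreCase(family);
        return family + ": resolved=" + resolved
                + ", has " + label(codePoint) + "=" + font.canDisplay(codePoint);
    }

    public static void main(String[] args) {
        // Servers and containers usually have no display attached.
        System.setProperty("java.awt.headless", "true");
        // Family names below are assumptions; adjust to what is installed.
        String[] families = {"Calibri", "Arial Unicode MS", "DejaVu Sans", "Arial", "Courier New"};
        for (String family : families) {
            System.out.println(check(family, BULLET));
        }
    }
}
```

If a family prints resolved=false, the glyph result on that line describes the fallback font rather than the one requested.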
If I nest the unordered lists, the bullets below the top level (2+ levels deep) look fine. For example, the hollow circle bullet renders with no problem. Only the top-level bullet, U+2022, has the issue; this is the one you get in HTML with list-style: disc. If I override that and force any other style (such as “square”), it works fine even for the top-level list.
Some systems still work fine while others don’t. We’re not sure what the difference is, but my hypothesis is that the issue is triggered by small differences in how the Docker images are generated: the JVM and Docker versions differ slightly between the images that have the issue and those that don’t, while the installed fonts are identical.
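To narrow down which JVM-level difference matters, one option is to print the same set of properties on a working and a failing host and diff the output. The sketch below uses only the JDK; the choice of properties is a guess at what might be relevant (version, vendor, default encoding, locale), not something known to be the cause, and the class name is hypothetical.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects JVM settings worth diffing between hosts.
class JvmFingerprint {
    static Map<String, String> collect() {
        Map<String, String> info = new LinkedHashMap<>();
        String[] keys = {
            "java.version", "java.vendor", "java.vm.name",
            "os.name", "os.version",
            "file.encoding", "sun.jnu.encoding", "user.language"
        };
        for (String key : keys) {
            // String.valueOf turns a missing property into the text "null".
            info.put(key, String.valueOf(System.getProperty(key)));
        }
        info.put("defaultCharset", Charset.defaultCharset().name());
        info.put("env.LANG", String.valueOf(System.getenv("LANG")));
        return info;
    }

    public static void main(String[] args) {
        // One "key=value" line per entry, in a stable order, so the
        // output from two hosts can be compared with a plain diff.
        collect().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```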
I analyzed the generated PDFs using Acrobat Pro. When it works, I can see the bullets in the PDF are U+2022. When it doesn’t work, I can see the PDF generation that happened on our Docker containers saved it as U+2023, the “triangle bullet”. But, these fonts all render the triangle bullet as a rectangle, which I guess is “correct” as far as rendering them goes, and thus the issue is actually that this bullet somehow got saved as U+2023.
Now, on macOS it is even stranger. Instead of saving the bullet as U+2023, it saves it as these 3 bytes:
EF BF BF
which is the UTF-8 encoding of U+FFFF, a Unicode noncharacter.
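The three code points involved can be double-checked with plain JDK calls. The sketch below decodes the bytes EF BF BF as UTF-8 and prints the Unicode names of the expected and observed bullets; the class and method names are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for inspecting the code points seen in the PDFs.
class BulletCodePoints {
    // Decodes a UTF-8 byte sequence and returns its first code point.
    static int decodeFirstCodePoint(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8).codePointAt(0);
    }

    // Unicode noncharacters: U+FDD0..U+FDEF plus the last two code
    // points of every plane (U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, ...).
    static boolean isNoncharacter(int cp) {
        return (cp >= 0xFDD0 && cp <= 0xFDEF) || (cp & 0xFFFE) == 0xFFFE;
    }

    static String describe(int cp) {
        String name = isNoncharacter(cp) ? "(noncharacter)" : Character.getName(cp);
        return String.format("U+%04X %s", cp, name);
    }

    public static void main(String[] args) {
        byte[] macBytes = {(byte) 0xEF, (byte) 0xBF, (byte) 0xBF};
        System.out.println(describe(0x2022)); // U+2022 BULLET
        System.out.println(describe(0x2023)); // U+2023 TRIANGULAR BULLET
        System.out.println(describe(decodeFirstCodePoint(macBytes))); // U+FFFF (noncharacter)
    }
}
```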
Here’s what the affected part of the PDF looks like:
Here’s Acrobat’s compare feature showing how a PDF that works compares with one that doesn’t.
This small Java program will reproduce the issue on macOS, but not on our Docker containers that have the issue (no idea why). Our real code is of course more complex than this, but I think it is a fair representation of how we are using your API:
Main.7z (594 Bytes)
We can also reproduce this using DocumentBuilder.insertHtml() to add UL/LI elements to a Document. The attached example uses “builder.getListFormat().setList()” instead, to rule out possible problems unique to insertHtml. In that code, if I change ListTemplate.BULLET_DEFAULT to any other bullet, it starts working.
Thanks!