Unable to convert PDF to PowerPoint on Linux operating systems

We have found that a specific PDF we use for testing no longer works on the Ubuntu flavour of Linux.

In October 2022, we started using the test PDF with Aspose PDF 19.3 and AdoptOpenJDK 11.08, and the test passed, producing valid PowerPoint output. I am not sure which version of Ubuntu we were running at that time.

Since then we have updated the version of Ubuntu to 20.01 and the version of the JDK to Eclipse Temurin 11.20 and see the same result. I have also reproduced the issue on Ubuntu 22.04 using the latest Eclipse Temurin. I have also tried with and without the jdk16 classifier, and both show the same problem.

The code I am using to reproduce the issue is trivial:

        // Apply the license before creating any Document instances.
        License license = new License();
        license.setLicense(new ByteArrayInputStream(Files.readAllBytes(Path.of("Aspose.Pdf.lic"))));
        byte[] pdfBytes = Files.readAllBytes(Path.of(args[0]));
        // try-with-resources closes the output stream even if save() throws.
        try (FileOutputStream outputStream = new FileOutputStream(args[1])) {
            Document document = new Document(new ByteArrayInputStream(pdfBytes));
            document.save(outputStream, SaveFormat.Pptx);
        }

And the following command to run it:

java -cp "aspose-pdf-24.1-jdk16.jar" com.test.PPTTestApplication testPDF.pdf testPDF.pptx

I get the following stack trace:

Exception in thread "main" java.lang.NullPointerException
        at com.aspose.pdf.internal.l22h.lb.lI(Unknown Source)
        at com.aspose.pdf.internal.l22h.lc.lI(Unknown Source)
        at com.aspose.pdf.internal.l22h.lj.lI(Unknown Source)
        at com.aspose.pdf.internal.l22l.lt.lI(Unknown Source)
        at com.aspose.pdf.internal.l93t.lj.lf(Unknown Source)
        at com.aspose.pdf.internal.l93t.lj.lI(Unknown Source)
        at com.aspose.pdf.internal.l93t.l0if.lI(Unknown Source)
        at com.aspose.pdf.internal.l93t.lv.lI(Unknown Source)
        at com.aspose.pdf.internal.l93v.lc.lI(Unknown Source)
        at com.aspose.pdf.internal.l93t.lv.lI(Unknown Source)
        at com.aspose.pdf.internal.l93v.lh.lI(Unknown Source)
        at com.aspose.pdf.internal.l93t.lv.lI(Unknown Source)
        at com.aspose.pdf.internal.l93v.ly.lI(Unknown Source)
        at com.aspose.pdf.internal.l93t.lv.lI(Unknown Source)
        at com.aspose.pdf.internal.l93v.le.lI(Unknown Source)
        at com.aspose.pdf.internal.l93t.lv.lI(Unknown Source)
        at com.aspose.pdf.internal.l93v.lf.lI(Unknown Source)
        at com.aspose.pdf.internal.l93t.lk.lI(Unknown Source)
        at com.aspose.pdf.internal.l93t.le.lI(Unknown Source)
        at com.aspose.pdf.l15l.lI(Unknown Source)
        at com.aspose.pdf.l15l.lI(Unknown Source)
        at com.aspose.pdf.ADocument.lj(Unknown Source)
        at com.aspose.pdf.ADocument.lf(Unknown Source)
        at com.aspose.pdf.ADocument.lI(Unknown Source)
        at com.aspose.pdf.Document.lI(Unknown Source)
        at com.aspose.pdf.ADocument.save(Unknown Source)
        at com.aspose.pdf.Document.save(Unknown Source)
        at com.test.PPTTestApplication.main(PPTTestApplication.java:24)

Where line 24 is the call to document.save in the code example above.

This issue does not occur on Windows for the same file, using the same JDK.

This issue does not occur with other PDFs that I have tested with, but without knowing the reason behind the issue, it remains a concern.

I have attached the PDF in question.
testPDF.pdf (70.0 KB)

I feel like I could probably diagnose the issue myself if I were able to de-obfuscate the code or add my own logging to the classes (which I can't, because the JAR is signed). Please let me know if there is anything else I can do to diagnose the issue further.

EDIT:
I see that version 24.2 of the library is now available. The same issue still occurs in that release, with the following stack trace:

Exception in thread "main" java.lang.NullPointerException
        at com.aspose.pdf.internal.l22k.lb.lI(Unknown Source)
        at com.aspose.pdf.internal.l22k.lc.lI(Unknown Source)
        at com.aspose.pdf.internal.l22k.lj.lI(Unknown Source)
        at com.aspose.pdf.internal.l22p.lt.lI(Unknown Source)
        at com.aspose.pdf.internal.l93u.lj.lf(Unknown Source)
        at com.aspose.pdf.internal.l93u.lj.lI(Unknown Source)
        at com.aspose.pdf.internal.l93u.l0if.lI(Unknown Source)
        at com.aspose.pdf.internal.l93u.lv.lI(Unknown Source)
        at com.aspose.pdf.internal.l93j.lc.lI(Unknown Source)
        at com.aspose.pdf.internal.l93u.lv.lI(Unknown Source)
        at com.aspose.pdf.internal.l93j.lh.lI(Unknown Source)
        at com.aspose.pdf.internal.l93u.lv.lI(Unknown Source)
        at com.aspose.pdf.internal.l93j.ly.lI(Unknown Source)
        at com.aspose.pdf.internal.l93u.lv.lI(Unknown Source)
        at com.aspose.pdf.internal.l93j.le.lI(Unknown Source)
        at com.aspose.pdf.internal.l93u.lv.lI(Unknown Source)
        at com.aspose.pdf.internal.l93j.lf.lI(Unknown Source)
        at com.aspose.pdf.internal.l93u.lk.lI(Unknown Source)
        at com.aspose.pdf.internal.l93u.le.lI(Unknown Source)
        at com.aspose.pdf.l15l.lI(Unknown Source)
        at com.aspose.pdf.l15l.lI(Unknown Source)
        at com.aspose.pdf.ADocument.lj(Unknown Source)
        at com.aspose.pdf.ADocument.lf(Unknown Source)
        at com.aspose.pdf.ADocument.lI(Unknown Source)
        at com.aspose.pdf.Document.lI(Unknown Source)
        at com.aspose.pdf.ADocument.save(Unknown Source)
        at com.aspose.pdf.Document.save(Unknown Source)
        at com.test.PPTTestApplication.main(PPTTestApplication.java:24)

@plovell

Have you made sure that all Windows fonts are properly installed and accessible on the Linux system? Please try installing the msttcorefonts package and see if it helps. We will proceed accordingly.
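
One way to check what is actually installed is to list the font files under a font directory from Java. This is only a rough sketch: the class name `FontFileScan` is made up for illustration, and `/usr/share/fonts` is an assumed default location that may differ between distributions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Aspose.PDF.
class FontFileScan {

    // Returns the sorted file names of all .ttf/.otf files found under root,
    // or an empty list if root is not a directory.
    static List<String> listFontFiles(Path root) throws IOException {
        if (!Files.isDirectory(root)) {
            return List.of();
        }
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .map(p -> p.getFileName().toString())
                    .filter(name -> {
                        String lower = name.toLowerCase(Locale.ROOT);
                        return lower.endsWith(".ttf") || lower.endsWith(".otf");
                    })
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed default directory; pass another one as the first argument.
        Path root = Path.of(args.length > 0 ? args[0] : "/usr/share/fonts");
        listFontFiles(root).forEach(System.out::println);
    }
}
```

If the expected Arial, Times New Roman, and similar files do not show up in the output, the fonts are probably not visible to the JVM either.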

Thank you, Asad. Adding the MS fonts did indeed resolve the NullPointerException.

Would it be possible to change the error so that the cause of the problem is more obvious?

Secondly, is font substitution possible with the library, or is it a feature that could be added to the roadmap?
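
On the first point, until the library reports this more clearly, the calling code could wrap the conversion and attach a hint itself. The following is a minimal sketch, not a library feature: `ConversionGuard`, `Conversion`, and `ConversionFailedException` are hypothetical names, and the conversion is passed in as a callback so that nothing here depends on the Aspose API. A NullPointerException can of course have other causes, so the message is worded as a hint only.

```java
// Hypothetical exception type, not part of Aspose.PDF.
class ConversionFailedException extends RuntimeException {
    ConversionFailedException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical callback type so the conversion may throw checked exceptions.
interface Conversion {
    void run() throws Exception;
}

class ConversionGuard {

    static final String FONT_HINT =
            "Conversion failed with a NullPointerException inside the library. "
            + "On Linux this may be caused by missing fonts; "
            + "check that the required fonts are installed.";

    // Runs the conversion and rethrows failures with a more descriptive message.
    static void run(Conversion conversion) {
        try {
            conversion.run();
        } catch (NullPointerException e) {
            // Keep the original exception as the cause so the stack trace is not lost.
            throw new ConversionFailedException(FONT_HINT, e);
        } catch (RuntimeException e) {
            throw e;
        } catch (Exception e) {
            throw new ConversionFailedException("Conversion failed: " + e.getMessage(), e);
        }
    }
}
```

With the reproduction code from earlier in this thread, the call would then look like `ConversionGuard.run(() -> document.save(outputStream, SaveFormat.Pptx));`.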

@plovell

We will definitely look into it. However, on Linux-like and other non-Windows operating systems, such errors usually occur because of missing fonts.

Yes, you can use the code snippet below to add font substitutions for as many fonts as you want before initializing the Document object:

FontRepository.getSubstitutions().add(new SimpleFontSubstitution("GillSans-Bold", "Arial"));

Hi Asad,

I had hoped that I could use font substitution to resolve the issue, but it does not seem to work.

The document contains three embedded fonts as listed by:

document.getFontUtilities().getAllFonts()

These are the fonts:

[Calibri-Light, Calibri, ArialMT]

And this agrees with what Acrobat thinks. All of these fonts are marked as embedded.

I have also added a font substitution for every font in the msttcorefonts package, plus the three fonts used by the document:

    private static final List<String> FONTS_TO_SUBSTITUTE = Arrays.asList(
            "Andale Mono",
            "Arial Black",
            "Arial",
            "Arial Bold",
            "Arial Italic",
            "Arial Bold Italic",
            "Comic Sans MS",
            "Comic Sans MS Bold",
            "Courier New",
            "Courier New Bold",
            "Courier New Italic",
            "Courier New Bold Italic",
            "Georgia",
            "Georgia Bold",
            "Georgia Italic",
            "Georgia Bold Italic",
            "Impact",
            "Times New Roman",
            "Times New Roman Bold",
            "Times New Roman Italic",
            "Times New Roman Bold Italic",
            "Trebuchet",
            "Trebuchet Bold",
            "Trebuchet Italic",
            "Trebuchet Bold Italic",
            "Verdana",
            "Verdana Bold",
            "Verdana Italic",
            "Verdana Bold Italic",
            "Webdings",
            "Calibri",
            "Calibri-Light",
            "ArialMT"
    );
        Font defaultFont = FontRepository.findFont("Liberation Sans");
        for (String font : FONTS_TO_SUBSTITUTE) {
            FontRepository.getSubstitutions().add(new SimpleFontSubstitution(font, defaultFont.getFontName()));
        }

And finally set the following:

        Document document = new Document(new ByteArrayInputStream(pdfBytes));
        document.setAbsentFontTryToSubstitute(true);
        document.save(outputStream, SaveFormat.Pptx);

Without the fonts package installed, I still get the original error. Is there anything else I can do to get font substitution working?

@plovell

Is the above font present on the system? Please make sure that it is in the default font directory. Let us know your findings so that we can log an investigation ticket in our issue management system for further analysis and share the ID with you.

Hi Asad,

Yes, the font does exist on the VM, which I determined using the following code.

This printed out:
Found: Liberation Sans

I also found it on the OS using the command:

fc-list | grep "Liberation Sans"

Which returned a large list of options - Regular, Italic, Bold, etc.
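
For anyone who wants to automate this check, the family names can be pulled out of the `fc-list` output in Java. This is a sketch only: `FcListFamilies` is a hypothetical class, the sample lines are illustrative rather than copied from the VM, and it assumes the default `fc-list` line shape of `file: family[,family]:style=...` without handling fontconfig's escaped characters.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for parsing the default output of fc-list.
class FcListFamilies {

    // Extracts the distinct family names from fc-list output lines.
    static Set<String> parseFamilies(List<String> fcListLines) {
        Set<String> families = new TreeSet<>();
        for (String line : fcListLines) {
            int fileEnd = line.indexOf(": ");
            if (fileEnd < 0) {
                continue; // not in the expected "file: family:style" shape
            }
            String rest = line.substring(fileEnd + 2);
            int familyEnd = rest.indexOf(':');
            String familyField = familyEnd < 0 ? rest : rest.substring(0, familyEnd);
            // A font can list several comma-separated family names.
            for (String family : familyField.split(",")) {
                String trimmed = family.trim();
                if (!trimmed.isEmpty()) {
                    families.add(trimmed);
                }
            }
        }
        return families;
    }

    public static void main(String[] args) {
        // Illustrative sample lines, not real output from the affected system.
        List<String> sample = Arrays.asList(
                "/usr/share/fonts/truetype/liberation/LiberationSans-Regular.ttf: Liberation Sans:style=Regular",
                "/usr/share/fonts/truetype/liberation/LiberationSans-Bold.ttf: Liberation Sans:style=Bold");
        System.out.println(parseFamilies(sample).contains("Liberation Sans"));
    }
}
```

In real use the lines would come from running `fc-list` on the machine in question instead of the hard-coded sample.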

@plovell

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFJAVA-43673

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Thanks for your help, Asad. 🙂