Exception "Thread state Runnable" while converting PDF to HTML using Aspose.PDF (Java)

I’m trying to debug why on one particular machine, Aspose hangs in Document.save() trying to save a PDF to HTML. I was able to get a few thread dumps, but it’s difficult to figure out what it’s doing through the obfuscation.

I’m not asking for a fix, I’d just like to know what Aspose is trying to do here. Is it searching for a font? Rendering a shape? Accessing the network? Any help would be appreciated…

Dump 1:
"dcs-worker-0" #15 daemon prio=4 os_prio=0 cpu=40607.55ms elapsed=1765.05s tid=0x00007f73bc92b000 nid=0x6b5c runnable [0x00007f73e4ed2000]
java.lang.Thread.State: RUNNABLE
at com.aspose.pdf.internal.l16t.lj.lI(Unknown Source)
at com.aspose.pdf.internal.l16if.lI.lf(Unknown Source)
at com.aspose.pdf.internal.l16if.lI.lj(Unknown Source)
at com.aspose.pdf.internal.l16if.lb.lt(Unknown Source)
at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
at com.aspose.pdf.internal.l15n.lI.lj(Unknown Source)
at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
at com.aspose.pdf.internal.l15y.lj.lI(Unknown Source)
at com.aspose.pdf.internal.ld.lh.lf(Unknown Source)
at com.aspose.pdf.l6n.lI(Unknown Source)
at com.aspose.pdf.l6n.lI(Unknown Source)
at com.aspose.pdf.ADocument.lI(Unknown Source)
at com.aspose.pdf.ADocument.save(Unknown Source)
at com.aspose.pdf.Document.save(Unknown Source)
at com.centra.dcs.handlers.PdfHandler.handleConvertToHtml(PdfHandler.java:191)

Dump 2:
"dcs-worker-0" #15 daemon prio=4 os_prio=0 cpu=288943.33ms elapsed=2019.16s tid=0x00007f73bc92b000 nid=0x6b5c runnable [0x00007f73e4ed2000]
java.lang.Thread.State: RUNNABLE
at com.aspose.pdf.internal.l16t.lj.lh(Unknown Source)
at com.aspose.pdf.internal.l16t.lj.lI(Unknown Source)
at com.aspose.pdf.internal.l16t.lj.lu(Unknown Source)
at com.aspose.pdf.internal.l16if.lb.lI(Unknown Source)
at com.aspose.pdf.internal.l16if.lb.ld(Unknown Source)
at com.aspose.pdf.internal.l16if.lb.lt(Unknown Source)
at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
at com.aspose.pdf.internal.l15n.lI.lj(Unknown Source)
at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
at com.aspose.pdf.internal.l15y.lj.lI(Unknown Source)
at com.aspose.pdf.internal.ld.lh.lf(Unknown Source)
at com.aspose.pdf.l6n.lI(Unknown Source)
at com.aspose.pdf.l6n.lI(Unknown Source)
at com.aspose.pdf.ADocument.lI(Unknown Source)
at com.aspose.pdf.ADocument.save(Unknown Source)
at com.aspose.pdf.Document.save(Unknown Source)
at com.centra.dcs.handlers.PdfHandler.handleConvertToHtml(PdfHandler.java:191)
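
A side note on reading the two dump headers above: both carry `cpu=` and `elapsed=` values for the same thread (same tid and nid), so the share of wall-clock time the thread spent on CPU between the dumps can be worked out directly. A minimal sketch of that arithmetic, with the numbers copied from the headers; the class and method names are illustrative only:

```java
import java.util.Locale;

final class DumpCpuShare {
    /** Fraction of wall-clock time the thread spent on CPU between two dumps. */
    static double cpuShare(double cpuMs1, double elapsedSec1,
                           double cpuMs2, double elapsedSec2) {
        double cpuDeltaMs = cpuMs2 - cpuMs1;                        // CPU time burned between dumps
        double wallDeltaMs = (elapsedSec2 - elapsedSec1) * 1000.0;  // wall-clock time between dumps
        return cpuDeltaMs / wallDeltaMs;
    }

    public static void main(String[] args) {
        // cpu= and elapsed= values from Dump 1 and Dump 2
        double share = cpuShare(40607.55, 1765.05, 288943.33, 2019.16);
        System.out.println(String.format(Locale.ROOT,
                "CPU share between dumps: %.1f%%", share * 100));
    }
}
```

With these values the share comes out at roughly 98%. That suggests the thread is busy computing rather than waiting on a lock, disk, or the network, though it does not say which computation it is.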

Environment:
Java: OpenJDK 11.0.2
Aspose: aspose.pdf-19.10.jar
OS: CentOS Linux release 7.6.1810 (Core)

Thanks,
Tom

@thegg,

Could you please share the source file along with sample code so that we can investigate further? Please also try the latest version of Aspose.PDF on your end before sharing the requested information.

IEEE-paper.pdf (4.7 MB)
Reminder: The problem is only reproducible on one machine in our lab. I’m not looking for a fix, I’m just interested in what Aspose is doing when it hangs.

I’ve repro’d my hang with the attached document (IEEE-paper.pdf) with the latest aspose.pdf-20.1.jar using the following code:

    import com.aspose.pdf.Document;
    import com.aspose.pdf.HtmlSaveOptions;

    // Class name is illustrative; the original snippet showed only main().
    public class PdfToHtmlRepro {
        public static void main(String[] args) {
            try {
                // licenseStuffNotIncluded();

                Document pdfDocument = new Document("IEEE-paper.pdf");
                HtmlSaveOptions saveOptions = new HtmlSaveOptions();
                saveOptions.setFixedLayout(true);
                saveOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
                // The hang occurs inside this call
                pdfDocument.save("IEEE-paper.html", saveOptions);
            } catch (Throwable t) {
                t.printStackTrace();
            }
        }
    }

It hangs within pdfDocument.save(). Here are two stack traces from when it hangs:

Trace 1:

       java.lang.Thread.State: RUNNABLE
            at com.aspose.pdf.internal.l16t.lj.lI(Unknown Source)
            at com.aspose.pdf.internal.l16if.lI.lf(Unknown Source)
            at com.aspose.pdf.internal.l16if.lI.lj(Unknown Source)
            at com.aspose.pdf.internal.l16if.lb.lt(Unknown Source)
            at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
            at com.aspose.pdf.internal.l15n.lI.lj(Unknown Source)
            at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
            at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
            at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
            at com.aspose.pdf.internal.l15y.lj.lI(Unknown Source)
            at com.aspose.pdf.internal.ld.lh.lI(Unknown Source)
            at com.aspose.pdf.l7l.lI(Unknown Source)
            at com.aspose.pdf.l7l.lI(Unknown Source)
            at com.aspose.pdf.ADocument.lI(Unknown Source)
            at com.aspose.pdf.ADocument.save(Unknown Source)
            at com.aspose.pdf.Document.save(Unknown Source)

Trace 2:

       java.lang.Thread.State: RUNNABLE
            at com.aspose.pdf.internal.l16t.lj.lI(Unknown Source)
            at com.aspose.pdf.internal.l16t.lj.le(Unknown Source)
            at com.aspose.pdf.internal.l16if.lf.lI(Unknown Source)
            at com.aspose.pdf.internal.l16if.lf.lf(Unknown Source)
            at com.aspose.pdf.internal.l16if.lf.lI(Unknown Source)
            at com.aspose.pdf.internal.l16if.lf.lI(Unknown Source)
            at com.aspose.pdf.internal.l16if.lf.lI(Unknown Source)
            at com.aspose.pdf.internal.l16if.lf.lI(Unknown Source)
            at com.aspose.pdf.internal.l16if.lf.lI(Unknown Source)
            at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
            at com.aspose.pdf.internal.l15n.lI.lj(Unknown Source)
            at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
            at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
            at com.aspose.pdf.internal.l15n.lI.lI(Unknown Source)
            at com.aspose.pdf.internal.l15y.lj.lI(Unknown Source)
            at com.aspose.pdf.internal.ld.lh.lI(Unknown Source)
            at com.aspose.pdf.l7l.lI(Unknown Source)
            at com.aspose.pdf.l7l.lI(Unknown Source)
            at com.aspose.pdf.ADocument.lI(Unknown Source)
            at com.aspose.pdf.ADocument.save(Unknown Source)
            at com.aspose.pdf.Document.save(Unknown Source)
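
For anyone wanting to capture traces like these from inside the program rather than with an external tool such as `jstack`, one option is to run the conversion on a worker thread and print that thread's stack if it has not finished within a timeout. A minimal sketch using only the standard library; `HangSampler`, `sampleIfStuck`, and the thread name are hypothetical, and the sleeping placeholder stands in for the real `pdfDocument.save(...)` call:

```java
final class HangSampler {
    /**
     * Runs the task on a daemon thread and waits up to timeoutMs (must be > 0).
     * Returns the worker's current stack if it is still running, else an empty array.
     */
    static StackTraceElement[] sampleIfStuck(Runnable task, long timeoutMs) {
        Thread worker = new Thread(task, "convert-worker");
        worker.setDaemon(true); // do not keep the JVM alive if the task never returns
        worker.start();
        try {
            worker.join(timeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
        return worker.isAlive() ? worker.getStackTrace() : new StackTraceElement[0];
    }

    public static void main(String[] args) {
        // Placeholder for the long-running conversion call
        Runnable slowTask = () -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // exit quietly when interrupted
            }
        };
        for (StackTraceElement frame : sampleIfStuck(slowTask, 200)) {
            System.out.println("    at " + frame);
        }
    }
}
```

Calling `sampleIfStuck` several times a few seconds apart and comparing the top frames gives the same kind of evidence as the two traces above: frames that stay the same across samples are the likely location of the long-running loop.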

@thegg,

Thanks for contacting support.

I have reviewed your comments and logged an investigation ticket with ID PDFJAVA-39141 in our issue tracking system so that this issue can be investigated and resolved as soon as possible.

Any update on this? As I said, I’m not looking for a fix, I’d just like to get an idea of what it’s doing. For example: .woff file creation?

@thegg,

I would like to inform you that Aspose.PDF is performing the operation specified by the given code. Some issues are document-specific and need to be investigated in detail; only then can we tell you exactly what the cause of this issue is. We request your patience and will share good news with you soon.

@thegg

According to the investigation and provided stack traces we found the following:

  • In Trace 1, the code is creating and analyzing Composite Page Elements that were created by the Html Document Builder Factory.
  • In Trace 2, the code is working on Hierarchy Building and Document Analyzing during the Html Document Builder Factory process.

We could not reproduce the issue in either of the mentioned versions or in the latest version; more heap memory may be required.

We need more information about the system environment where the hang was observed. Please try increasing the heap memory size and, if the issue still persists, let us know your environment details. The previously logged ticket is now closed.
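
As a starting point for gathering the requested environment details, the sketch below prints the JVM version, OS, maximum heap, and processor count from inside the affected JVM, using only standard library calls. The class name `EnvReport` and the output format are illustrative, not something Aspose prescribes. The maximum heap is normally raised with the standard `-Xmx` JVM option (for example `-Xmx2g`, where the value is only an example).

```java
final class EnvReport {
    /** Builds a short key=value report of the running JVM's environment. */
    static String report() {
        Runtime rt = Runtime.getRuntime();
        StringBuilder sb = new StringBuilder();
        sb.append("java.version=").append(System.getProperty("java.version")).append('\n');
        sb.append("java.vendor=").append(System.getProperty("java.vendor")).append('\n');
        sb.append("os.name=").append(System.getProperty("os.name")).append('\n');
        sb.append("os.version=").append(System.getProperty("os.version")).append('\n');
        sb.append("os.arch=").append(System.getProperty("os.arch")).append('\n');
        // maxMemory() reflects the effective heap ceiling (e.g. as set by -Xmx)
        sb.append("maxHeapMiB=").append(rt.maxMemory() / (1024L * 1024L)).append('\n');
        sb.append("availableProcessors=").append(rt.availableProcessors()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Running this on both the affected machine and a machine where the conversion completes makes it easy to spot differences in heap ceiling or JVM build before and after changing `-Xmx`.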