Exceptions with multiple threads

Hello,

When converting a PDF to JPEGs and extracting text at the same time on multiple threads, I get various exceptions.

(Source code and Exceptions follow)

Thanks & Regards

This is the source code:

import com.aspose.pdf.Document;
import com.aspose.pdf.Page;
import com.aspose.pdf.PageCollection;
import com.aspose.pdf.TextFragmentAbsorber;
import com.aspose.pdf.TextFragmentCollection;
import com.aspose.pdf.TextSearchOptions;
import com.aspose.pdf.devices.JpegDevice;
import com.aspose.pdf.devices.Resolution;


public class Test {
    private static String basePath = "C:\\path\\to\\directory\\";

    private static String inputPath = basePath + "TestSearch.pdf";

    private static int counter = 0;

    private static Document document = null;

    public static void main(String[] args) throws Exception {
        while (true) {
            document = new Document(inputPath);
            convertDocument();
            while (counter > 0) {
                Thread.sleep(100);
            }
            document.close();
        }
    }

    private static void convertDocument() {
        runTask(new Runnable() {
            @Override
            public void run() {
                PageCollection pages = document.getPages();

                for (int j = 0; j < 2; j++) {
                    Page page = pages.get_Item(j + 1);

                    TextFragmentAbsorber absorber = new TextFragmentAbsorber(
                            "\\S+", new TextSearchOptions(true));

                    page.accept(absorber);

                    TextFragmentCollection fragmentsOfPage = absorber.getTextFragments();
                }
            }
        });
        for (int i = 0; i < 2; i++) {
            final int pageNumber = i + 1;
            runTask(new Runnable() {
                @Override
                public void run() {
                    Resolution res = new Resolution(150);
                    JpegDevice jpegDevice = new JpegDevice(res, 90);
                    String outputPath = basePath + "page" + pageNumber + ".jpeg";
                    jpegDevice.process(document.getPages().get_Item(pageNumber), outputPath);
                }
            });
        }
    }

    private static void runTask(final Runnable runnable) {
        increaseCounter();
        new Thread(new Runnable() {
            @Override
            public void run() {
                try {
                    runnable.run();
                } catch (Exception e) {
                    e.printStackTrace();
                }
                decreaseCounter();
            }
        }).start();
    }

    private static synchronized void increaseCounter() {
        counter++;
    }

    private static synchronized void decreaseCounter() {
        counter--;
    }
}

These are the exceptions which occur after some iterations:

Exception in thread "Thread-18" Exception in thread "Thread-16" java.lang.ClassCastException: com.aspose.pdf.internal.p595.z27 cannot be cast to com.aspose.pdf.internal.p595.z8
at com.aspose.pdf.internal.p548.z26.m10(Unknown Source)
at com.aspose.pdf.internal.p548.z26.m3(Unknown Source)
at com.aspose.pdf.internal.p548.z26.m2(Unknown Source)
at com.aspose.pdf.internal.p548.z26.m1(Unknown Source)
at com.aspose.pdf.PageCollection.getUnrestricted(Unknown Source)
at com.aspose.pdf.PageCollection.m1(Unknown Source)
at com.aspose.pdf.PageCollection.get_Item(Unknown Source)
at Test$1.run(Test.java:59)

Exception in thread "Thread-6" class com.aspose.pdf.exceptions.CrossTableNotFoundException: Cross reference table or cross refference stream not found
com.aspose.pdf.internal.p548.z26.m3(Unknown Source)
com.aspose.pdf.internal.p548.z26.m2(Unknown Source)
com.aspose.pdf.internal.p548.z26.m1(Unknown Source)
com.aspose.pdf.PageCollection.getUnrestricted(Unknown Source)
com.aspose.pdf.PageCollection.m1(Unknown Source)
com.aspose.pdf.PageCollection.get_Item(Unknown Source)
Test$2.run(Test.java:79)

Exception in thread "Thread-2" class com.aspose.pdf.internal.p348.z72: class com.aspose.pdf.internal.p348.z109: Cannot access a disposed object.
Object name: 'MemoryStream'.
com.aspose.pdf.internal.p363.z32.m1(Unknown Source)
com.aspose.pdf.internal.p363.z32.read(Unknown Source)
com.aspose.pdf.internal.p565.z7.m1(Unknown Source)
com.aspose.pdf.internal.p565.z7.m1(Unknown Source)
com.aspose.pdf.devices.z1.m1(Unknown Source)
com.aspose.pdf.devices.z1.m1(Unknown Source)
com.aspose.pdf.devices.ImageDevice.m1(Unknown Source)
com.aspose.pdf.devices.JpegDevice.processInternal(Unknown Source)
com.aspose.pdf.devices.PageDevice.process(Unknown Source)


java.lang.NullPointerException
at com.aspose.pdf.internal.p589.z11.m2(Unknown Source)
at com.aspose.pdf.internal.p589.z11.m7(Unknown Source)
at com.aspose.pdf.internal.p589.z13.m1(Unknown Source)
at com.aspose.pdf.internal.p589.z13.m1(Unknown Source)
at com.aspose.pdf.internal.p589.z13.m6(Unknown Source)
at com.aspose.pdf.internal.p589.z13.(Unknown Source)
at com.aspose.pdf.internal.p589.z13.(Unknown Source)
at com.aspose.pdf.TextFragmentAbsorber.visit(Unknown Source)
at com.aspose.pdf.Page.accept(Unknown Source)

java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at com.aspose.pdf.internal.p495.z3.m1(Unknown Source)
at com.aspose.pdf.internal.p495.z3.m2(Unknown Source)
at com.aspose.pdf.internal.p363.z28.read(Unknown Source)
at com.aspose.pdf.internal.p363.z1.m8(Unknown Source)
at com.aspose.pdf.internal.p363.z1.m18(Unknown Source)
at com.aspose.pdf.internal.p630.z7.m10(Unknown Source)
at com.aspose.pdf.internal.p630.z6.m10(Unknown Source)
at com.aspose.pdf.internal.p630.z7.m13(Unknown Source)


I have attached the PDF document that I used.

Hi Alexander,

Thanks for your inquiry. I have tested your scenario with the shared code and document using Aspose.Pdf for Java 11.1.0 and was able to reproduce the reported exception. For further investigation, I have logged an issue in our issue tracking system as PDFNEWJAVA-35447 and linked your request to it. We will keep you updated on the issue status via this thread.

We are sorry for the inconvenience caused.

Best Regards,

Hi Alexander,


Thanks for your patience. We have investigated the issue and suggest using separate Document instances for text extraction and image conversion. Please find a sample code snippet attached; it should help you accomplish the task.
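
The general pattern can be sketched roughly as follows. This is only an illustration, not the attached snippet: FakeDocument is a hypothetical stand-in for a document object that must not be shared between threads (it is not the Aspose Document class), and PerTaskDocuments and processPages are made-up names. In the code from the first post, the equivalent change would presumably be to call new Document(inputPath) inside each Runnable and close that instance there, instead of reading the shared static document field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a document object that is not thread-safe.
// It is NOT the Aspose Document class; it only models open/use/close.
final class FakeDocument implements AutoCloseable {
    private final String path;
    private boolean closed = false;

    FakeDocument(String path) {
        this.path = path;
    }

    String describePage(int pageNumber) {
        if (closed) {
            throw new IllegalStateException("document already closed");
        }
        return path + "#page" + pageNumber;
    }

    @Override
    public void close() {
        closed = true;
    }
}

final class PerTaskDocuments {
    // Submits one task per page. Each task opens and closes its own
    // FakeDocument, so no document state is shared between threads.
    static List<String> processPages(final String path, int pageCount) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, pageCount));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 1; i <= pageCount; i++) {
                final int pageNumber = i;
                futures.add(pool.submit(() -> {
                    // A separate instance per task, closed by the task itself.
                    try (FakeDocument doc = new FakeDocument(path)) {
                        return doc.describePage(pageNumber);
                    }
                }));
            }
            // Future.get() blocks until the task finishes, so no polling
            // of a shared counter is needed.
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(processPages("TestSearch.pdf", 2));
    }
}
```

Opening the file once per task costs extra parsing time and memory, so this trades some throughput for isolation between the threads.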

Please feel free to contact us for any further assistance.

Best Regards,