Java virtual threads sometimes result in com.aspose.words.FileCorruptedException

Hi, we've discovered that loading documents on an ExecutorService backed by virtual threads sometimes fails with the
exception: “com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded.”

Gist with the exceptions: https://gist.githubusercontent.com/alexworkorb/81d81003197a974afe008c8ddbf71cfd/raw/c77219ad551ad9bbc07f123f7825790063f91ccb/asp1

Our Aspose.Words for Java version is 25.6. Our Java version is “OpenJDK 64-Bit Server VM Temurin-21.0.2+13 (build 21.0.2+13-LTS,”

Sample code:



  void doAsposeDocsTest() throws Exception {
    ExecutorService executorService =
      Executors.newThreadPerTaskExecutor(Thread.ofVirtual().name("aspose-v-", 0).factory());

    // With the platform-thread executor below, parsing works fine:
    // ExecutorService executorService = Executors.newFixedThreadPool(30);

    // Put a few Word document files in this folder:
    String docsFolder = "/tmp/docs/";
    Set<Path> files = Files.list(Path.of(docsFolder)).collect(Collectors.toSet());
    for (int repeat = 0; repeat < 30; ++repeat) {
      for (Path docPath : files) {
        log.info("doc: {}", docPath.getFileName());
        final byte[] fileBytes = Files.readAllBytes(docPath);
        Runnable nr = new Runnable() {
          @Override
          public void run() {
            try {
              byte[] byteCopy = new byte[fileBytes.length];
              System.arraycopy(fileBytes, 0, byteCopy, 0, fileBytes.length);
              Document adoc = new Document(new ByteArrayInputStream(byteCopy));
              log.info("There are {} pages in the document {}", adoc.getPageCount(), docPath.getFileName());
            } catch (Exception e) {
              log.error("Cannot load doc {}, error", docPath.getFileName(), e);
            }
          }
        };
        executorService.submit(nr);
      }
    }

    executorService.shutdown();
    executorService.awaitTermination(1L, TimeUnit.HOURS);
  }

@alex987654321 Aspose.Words does support multi-threading. The only requirement is that each thread uses its own separate Document instance; a single Document object must not be shared between threads. Since virtual threads allow unbounded concurrency, you should also restrict how many documents are processed simultaneously.
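
For illustration, restricting how many documents are processed simultaneously on virtual threads could look roughly like the sketch below. `BoundedVirtualRunner` is a hypothetical helper name, not part of Aspose.Words, and the task passed to `submit` is a placeholder for the `new Document(...)` call from the sample code. This only caps concurrency; it is an assumption, not something verified here, that doing so has any effect on the FileCorruptedException.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

/** Hypothetical helper: runs tasks on virtual threads, at most maxConcurrent at a time. */
class BoundedVirtualRunner implements AutoCloseable {
    private final ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();
    private final Semaphore permits;

    BoundedVirtualRunner(int maxConcurrent) {
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Each task gets its own virtual thread but waits for a permit before doing any work. */
    <T> Future<T> submit(Callable<T> task) {
        return executor.submit(() -> {
            permits.acquire();          // parks the virtual thread until a slot is free
            try {
                return task.call();     // e.g. load one Document and count its pages
            } finally {
                permits.release();
            }
        });
    }

    /** Waits for all submitted tasks to finish, then shuts the executor down. */
    @Override
    public void close() {
        executor.close();
    }

    public static void main(String[] args) throws Exception {
        try (BoundedVirtualRunner runner = new BoundedVirtualRunner(8)) {
            // Placeholder task; in the sample code this would be the Document load.
            Future<Integer> result = runner.submit(() -> 21 * 2);
            System.out.println("result = " + result.get());
        }
    }
}
```

The semaphore is acquired inside the task rather than before submission, so the submitting loop never blocks and only the waiting virtual threads are parked.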

@vyacheslav.deryushev
Please see the sample code we provided. Each virtual thread uses its own instance of Document.

Also, if we use a regular thread pool executor (not backed by virtual threads), which is shown commented out in our code above:

//    ExecutorService executorService = Executors.newFixedThreadPool(30);

then the test code works fine.

Also, please check the stack trace: https://gist.githubusercontent.com/alexworkorb/81d81003197a974afe008c8ddbf71cfd/raw/c77219ad551ad9bbc07f123f7825790063f91ccb/asp1 The failure with virtual threads is consistently:

com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded.
	at com.aspose.words.FileFormatUtil.zzYqr(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.Document.zzWod(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
...
Caused by: java.lang.NullPointerException: Cannot invoke "com.aspose.words.internal.zzZAL.getLocale()" because the return value of "com.aspose.words.internal.zzY8g.zzXIJ()" is null
	at com.aspose.words.internal.zzWyb.zzRg(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zzWSH.zzWfk(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]


@alex987654321

Please attach sample files with which the problem can be reproduced.

@alexey.maslov
Any Word file will trigger this issue with virtual threads. I have prepared some samples.
samplefiles.zip (22.3 KB)

Again, our Aspose.Words for Java version is 25.6, and our JVM is:

openjdk version "21.0.2" 2024-01-16 LTS
OpenJDK Runtime Environment Temurin-21.0.2+13 (build 21.0.2+13-LTS)
OpenJDK 64-Bit Server VM Temurin-21.0.2+13 (build 21.0.2+13-LTS, mixed mode, sharing)

Sample code:

  void doAsposeDocsTest() throws Exception {
    ExecutorService executorService =
      Executors.newThreadPerTaskExecutor(Thread.ofVirtual().name("aspose-v-", 0).factory());

    // NOTE: with the platform-thread executor below, the code works fine:
    // ExecutorService executorService = Executors.newFixedThreadPool(30);

    Object readLock = new Object();
    // put a few .doc files here:
    String docsFolder = "/tmp/docs/";
    Set<Path> files = Files.list(Path.of(docsFolder)).collect(Collectors.toSet());
    for (int repeat = 0; repeat < 30; ++repeat) {
      for (Path docPath : files) {
        log.info("doc: {}", docPath.getFileName());
        final byte[] fileBytes = Files.readAllBytes(docPath);
        Runnable nr = new Runnable() {
          @Override
          public void run() {
            try {
              byte[] byteCopy = new byte[fileBytes.length];
              System.arraycopy(fileBytes, 0, byteCopy, 0, fileBytes.length);
              Document adoc = null;
              // You can remove this synchronized block; the FileCorruptedException still happens.
              synchronized (readLock) {
                adoc = new Document(new ByteArrayInputStream(byteCopy));
              }
              if (adoc != null) {
                // threadId() replaces getId(), which is deprecated since Java 19
                log.info("Current thread ID: {}", Thread.currentThread().threadId());
                log.info("There are {} pages in the document {}", adoc.getPageCount(), docPath.getFileName());
              }
            } catch (Exception e) {
              log.error("Cannot load doc {}, error", docPath.getFileName(), e);
            }
          }
        };
        executorService.submit(nr);
      }
    }

    executorService.shutdown();
    executorService.awaitTermination(1L, TimeUnit.HOURS);
  }

Sample exception:

Cannot load doc doc4.docx, error

com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded.
	at com.aspose.words.FileFormatUtil.zzYqr(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.Document.zzWod(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.Document.zzi4(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.Document.<init>(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.Document.<init>(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.Document.<init>(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at ai.lynx.staging.commands.TestLimeCommand$1.run(TestLimeCommand.java:2233) ~[classes/:na]
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:572) ~[na:na]
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317) ~[na:na]
	at java.base/java.lang.VirtualThread.run(VirtualThread.java:309) ~[na:na]
Caused by: java.lang.NullPointerException: Cannot invoke "com.aspose.words.internal.zzZAL.getLocale()" because the return value of "com.aspose.words.internal.zzY8g.zzXIJ()" is null
	at com.aspose.words.internal.zzWyb.zzRg(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zzWSH.zzWfk(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zzWSH.zzi4(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zzZWj.zzi4(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zzXgX.zzZy7(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zzXgX.zzi4(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zz7T.zzWod(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zz7T.zzi4(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zzZoM.zzi4(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zzY.zzi4(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.internal.zzY.zzXjs(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.zzWpe.zzi4(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.zzXi4.zzYcP(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.zzZJo.zzYcP(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	at com.aspose.words.Document.zzWod(Unknown Source) ~[aspose-words-25.6-jdk17.jar:25.6.0]
	... 8 common frames omitted
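
Since the same code works with `Executors.newFixedThreadPool(30)`, one possible interim arrangement is to keep the application on virtual threads and hand only the document loading to a small pool of platform threads. The sketch below assumes this is acceptable for the workload; `PlatformOffloader` is a hypothetical helper name, and the task passed to `call` stands in for `new Document(new ByteArrayInputStream(byteCopy))`. It sidesteps the failing code path rather than explaining it, and it has not been confirmed by Aspose.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical helper: lets virtual-thread callers run selected work on platform threads. */
class PlatformOffloader implements AutoCloseable {
    private final ExecutorService platformPool;

    PlatformOffloader(int threads) {
        // A fixed pool of ordinary platform threads, as in the variant reported to work.
        this.platformPool = Executors.newFixedThreadPool(threads);
    }

    /** Runs the task on a platform thread and blocks the caller until it completes. */
    <T> T call(Callable<T> task) throws Exception {
        try {
            return platformPool.submit(task).get();
        } catch (ExecutionException e) {
            // Rethrow the task's own exception so callers see the original failure.
            if (e.getCause() instanceof Exception cause) {
                throw cause;
            }
            throw e;
        }
    }

    @Override
    public void close() {
        platformPool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        try (PlatformOffloader offloader = new PlatformOffloader(4)) {
            Thread caller = Thread.ofVirtual().start(() -> {
                try {
                    // Placeholder task; in the sample code this would load the Document.
                    boolean virtual = offloader.call(() -> Thread.currentThread().isVirtual());
                    System.out.println("task ran on a virtual thread: " + virtual);
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            caller.join();
        }
    }
}
```

Blocking on `get()` is inexpensive for a virtual-thread caller, and the pool size also caps how many documents load at once.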

@alex987654321
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSJAVA-3120

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.