URGENT! Doc Document Conversion Issue

Hi,

We are facing an issue while extracting text from a document. The conversion does not complete, and no error is returned.
Code:

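// Apply the Aspose.Words license from a stream.
License license = new License();
InputStream streamLicense = licenceStream();
license.setLicense(streamLicense);
// Load the document from the input stream with a custom resource loading callback.
LoadOptions opts = new LoadOptions();
opts.setResourceLoadingCallback(new HandleResourceLoadingCallback());
Document doc = new Document(filedat, opts);
streamLicense.close();
filedat.close();
// Get the page count (this call triggers the page layout).
pageCount = doc.getPageCount();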

We then extract the text page by page.

Please treat this as top priority, as it affects our production environment.
Aspose Version - 22.9

PFA…
Sample.zip (195.8 KB)

@rchilli As far as I can see, the text is extracted properly from your document. Here is the code I used for testing:

Document doc = new Document("C:\\Temp\\in.doc");
for (int i=0; i<doc.getPageCount(); i++)
{
    Document page = doc.extractPages(i,1);
    page.save("C:\\Temp\\page_"+i+".txt");
}

Or this, if you need to extract text as a string:

Document doc = new Document("C:\\Temp\\in.doc");
for (int i=0; i<doc.getPageCount(); i++)
{
    Document page = doc.extractPages(i,1);
    System.out.println(page.toString(SaveFormat.TEXT));
    System.out.println("-----------------------------------------");
}

Can you please share the output you are getting, as a text file?

Also, I checked with the online Aspose document conversion app, and it returns a result there.

@rchilli Sure, here is the output produced on my side using the above-mentioned code: out.zip (2.7 KB)

I have used the latest 22.12 version of Aspose.Words for Java for testing.

It might be an issue with the fonts.

Now it is working on one server but not on another server.

We use the following command to install fonts on the server:

yum install -y fontconfig && fc-cache

But it does not work on the other server. Can you provide a solution?
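When font availability is the suspect, it can help to compare what fontconfig actually reports on the working and the failing server. Below is a minimal sketch in plain Java (no Aspose.Words calls) that runs `fc-list : family` and prints the sorted family names, so the two outputs can be diffed. The class name `FontFamilyLister` and its methods are hypothetical helpers for illustration, and the sketch assumes `fc-list` is on the PATH.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical diagnostic helper; not part of Aspose.Words.
final class FontFamilyLister {

    // Parses the output of `fc-list : family`: one entry per line, where an
    // entry may hold several comma-separated names for the same family.
    static Set<String> parseFamilies(Iterable<String> lines) {
        Set<String> families = new TreeSet<>();
        for (String line : lines) {
            for (String name : line.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    families.add(trimmed);
                }
            }
        }
        return families;
    }

    // Runs fc-list (assumed to be on the PATH) and returns the sorted family names.
    static Set<String> listInstalledFamilies() throws IOException, InterruptedException {
        Process process = new ProcessBuilder("fc-list", ":", "family")
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        process.waitFor();
        return parseFamilies(lines);
    }

    public static void main(String[] args) throws Exception {
        // Print one family per line so the output of two servers can be diffed.
        for (String family : listInstalledFamilies()) {
            System.out.println(family);
        }
    }
}
```

Running this on both servers and diffing the two outputs shows whether a font family is missing on the failing one. An empty list would suggest that no fonts are visible to fontconfig at all.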

@rchilli Could you please answer the following questions? This will help us to understand the problem:

  1. Which Linux distribution is used on the server that does not work?
  2. Do you see any errors or exceptions in the application logs or in the output?
  3. Does the problem occur only with the attached document, or does it occur with any document on the problematic server?
  4. Could you please try the code I have provided earlier in a simple console application for testing purposes?

  1. Which Linux distribution is used on the server that does not work? - CentOS 8
  2. Do you see any errors or exceptions in the application logs or in the output? - No
  3. Does the problem occur only with the attached document, or does it occur with any document on the problematic server? - Only with this resume
  4. Could you please try the code I have provided earlier in a simple console application for testing purposes? - Yes, I tried it on a few of our servers; it fails only on that specific server

@rchilli Thank you for the additional information. However, it is difficult to say what is going wrong on the problematic server without being able to reproduce the problem on our side. What output do you get when you run the above-mentioned code on the problematic server? Does your application hang, crash, or something else?

What output do you get when you run the above-mentioned code on the problematic server? - It does not return anything; it just gets stuck. I also investigated further and found that it gets stuck while getting the page count (doc.getPageCount()).

Does your application hang, crash, or something else? - It does not crash; it just gets stuck, with no response at all.
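Since the call gets stuck silently, one way to gather evidence is to run it with a timeout and print the stack of the stuck thread, which shows where it is blocked. The following is a minimal sketch using only the JDK; `HangProbe` and `callWithTimeout` are hypothetical names, not Aspose.Words API. Running `jstack <pid>` against the hung process gives similar information without any code changes.

```java
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical diagnostic helper; not part of Aspose.Words.
final class HangProbe {

    private static final String WORKER_NAME = "hang-probe-worker";

    // Runs the task on a daemon worker thread and waits at most timeoutMillis.
    // On timeout, prints the worker's current stack trace to stderr and
    // rethrows the TimeoutException.
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread thread = new Thread(runnable, WORKER_NAME);
            thread.setDaemon(true); // a stuck worker must not keep the JVM alive
            return thread;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            dumpWorkerStack();   // capture where the worker is stuck
            future.cancel(true); // best effort: the task may ignore the interrupt
            throw e;
        } catch (ExecutionException e) {
            // Surface the task's own exception where possible.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    // Prints the stack trace of the worker thread, if it is still alive.
    private static void dumpWorkerStack() {
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            if (WORKER_NAME.equals(entry.getKey().getName())) {
                System.err.println("Worker thread is stuck at:");
                for (StackTraceElement frame : entry.getValue()) {
                    System.err.println("    at " + frame);
                }
            }
        }
    }
}
```

With the `doc` object from the code earlier in this thread, the page count call could then be wrapped as `int pageCount = HangProbe.callWithTimeout(() -> doc.getPageCount(), 60000);`. The 60-second limit is an arbitrary value for illustration. The printed stack frames would be useful to share when the problem cannot be reproduced elsewhere.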

This is crucial for us; our client is waiting for a solution.

@rchilli Unfortunately, I still cannot reproduce the problem on my side. I have tried to recreate your environment in Docker, but the problem is still not reproducible. Here is my testing Dockerfile:

FROM centos:centos8

USER root

# CentOS 8 is end-of-life, so point yum at the vault.centos.org archive.
RUN sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-*
RUN sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*

RUN yum install -y \
   java-1.8.0-openjdk \
   java-1.8.0-openjdk-devel

ENV JAVA_HOME /usr/lib/jvm/java-1.8.0-openjdk/

RUN yum install -y fontconfig && fc-cache

COPY ./out/artifacts/TestJava_jar/ /tmp
WORKDIR /tmp
ENTRYPOINT ["java","-jar","TestJava.jar"]

I have tested with both 22.9 and the latest 22.12 versions of Aspose.Words for Java, and both versions work fine. Have you tried the latest 22.12 version of Aspose.Words on your side?

Hi

I tested with the latest Aspose.Words version, but it still does not work on that server.

@rchilli Is it possible to recreate the problematic environment in Docker? This will help us to understand and analyze the problem. Unfortunately, I was unable to reproduce the problem on my side, and it is impossible to tell what is going wrong on your side without the ability to reproduce the same problem on our side.

Let me check with the DevOps team and get back to you.

@rchilli Thank you, we will wait for your inputs.

We just changed the version to the latest and reinstalled the fonts; now it is parsing fine. Thanks for your support.
