Arabic text extraction from PDF

Hi, we are using Aspose.Pdf to extract Arabic text from PDF files.

The problem is that the extracted text looks garbled, as if it were encrypted. Our code:
public String getString() throws Exception {
    com.aspose.pdf.Document pdfDocument = null;
    String extractedText = "";
    try {
        // Open from the file path unless an input stream was supplied.
        if (inputStream == null) {
            pdfDocument = new com.aspose.pdf.Document(this.path);
        } else {
            pdfDocument = new com.aspose.pdf.Document(this.inputStream);
        }
        // Collect the text of every page.
        com.aspose.pdf.TextAbsorber textAbsorber = new com.aspose.pdf.TextAbsorber();
        pdfDocument.getPages().accept(textAbsorber);
        extractedText = textAbsorber.getText();
    } finally {
        // If the constructor threw, pdfDocument is still null; skip cleanup then.
        if (pdfDocument != null) {
            pdfDocument.freeMemory();
            pdfDocument.dispose();
            pdfDocument.close();
        }
    }
    return extractedText;
}
Attached are the result of the text extraction and the sample PDF file.
Could you please help us solve this issue?
Thanks in advance.
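
For anyone seeing the same symptom, it can help to check which Unicode blocks the extracted string actually contains, for example whether it is Latin-looking garbage, private-use characters, or Arabic presentation forms. The following is only a minimal sketch using the Java standard library; `UnicodeBlockHistogram` and `countByBlock` are hypothetical names and are not part of Aspose.Pdf.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical diagnostic helper (not an Aspose.Pdf API). */
final class UnicodeBlockHistogram {

    private UnicodeBlockHistogram() {}

    /** Returns Unicode block name -> number of non-whitespace code points in that block. */
    static Map<String, Integer> countByBlock(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        text.codePoints()
            .filter(cp -> !Character.isWhitespace(cp))
            .forEach(cp -> {
                // UnicodeBlock.of returns null for code points outside any block.
                Character.UnicodeBlock block = Character.UnicodeBlock.of(cp);
                String name = (block == null) ? "UNASSIGNED" : block.toString();
                counts.merge(name, 1, Integer::sum);
            });
        return counts;
    }
}
```

Calling `UnicodeBlockHistogram.countByBlock(extractedText)` on the value returned by `getString()` and printing the map should show whether any Arabic code points come back at all. It only describes the output and says nothing about why the extraction went wrong.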

Hi Feras,

Thanks for your inquiry. I have tested your scenario with your shared document using Aspose.Pdf for .NET 10.2.0 and managed to observe the reported issue. For further investigation, I have logged an issue in our issue tracking system as PDFNEWNET-38416 and also linked your request to it. We will keep you updated via this thread regarding the issue status.

Please feel free to contact us for any further assistance.

Best Regards,

Please note that we are using the Java platform, as shown by the source code we submitted.

Thanks.

Hi Feras,


Thanks for the acknowledgement.

I have tested the scenario with Aspose.Pdf for Java 10.1.0 and reproduced the same issue: Arabic text is not extracted properly from the PDF file. I have logged it in our issue tracking system as PDFNEWJAVA-34769. We will investigate this issue in detail and keep you updated on the status of the fix.

We apologize for your inconvenience.


PS: Aspose.Pdf for Java is auto-ported from Aspose.Pdf for .NET, so the fix will be made in Aspose.Pdf for .NET first and then ported to the Java version.


Is there any update, please?

Hi Feras,


Thanks for your inquiry. I am afraid your reported issue is still not resolved. It was logged only recently and is pending investigation because other issues are already being investigated and resolved ahead of it. We will notify you as soon as we make significant progress towards resolving it.

We are sorry for the inconvenience caused.

Best Regards,

Dears,
The issue PDFNEWJAVA-34769 is very important to us.
Based on the priority support we have, can you please share a delivery date?

Regards,

AlainRUSSIER: "The point PDFNEWJAVA-34769 is very important to us. based on the priority support we have, can you please share a delivery date?"

Hi Feras,

Please note that, as a general rule, issues are resolved on a first-come, first-served basis. However, problems reported under the Enterprise or Priority support model take precedence in resolution over issues reported under the normal (free) support model.

If you need your issue prioritized, you may consider opting for the Enterprise or Priority support options. Note, however, that Enterprise or Priority support does not guarantee immediate resolution of an issue, because the fix might depend on other issues or features that need to be implemented first. Under this model, the development team starts investigating the problem with high priority. For further details, please visit Support Options.

Dear,

The issue is reproduced with EVER TEAM. I believe we have purchased priority support.
Can you tell me where I can post the problem in order to give it high priority?
Regards,

Hi Alain,


Thanks for your inquiry. You may post your request to raise the priority of this issue in the Priority Support forum, using an Aspose ID with priority support privileges. We will address the issue accordingly.

Best Regards,

We will post in the Priority Support forum.

Meanwhile, please inform us if you have any new updates.

Thanks.


Hi Alain,


Thanks for your patience.

Both issues are still pending review and I am afraid they are not yet resolved. However, once you have raised the issue in the Priority Support forum, the investigation will be expedited and we will be able to share any news regarding their resolution.

The issues you found earlier (filed as PDFNEWJAVA-34769) have been fixed in Aspose.Pdf for Java 10.6.0.
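
After upgrading, one rough way to sanity-check the extraction result is to measure how many of the extracted letters belong to the Arabic script. This is only a sketch under stated assumptions: `ArabicTextCheck` and `arabicLetterRatio` are hypothetical names, the check uses only the Java standard library, and a high ratio shows that Arabic letters came back, not that they are the right letters in the right order.

```java
/** Hypothetical sanity check (not an Aspose.Pdf API). */
final class ArabicTextCheck {

    private ArabicTextCheck() {}

    /** Fraction of letters in the text that are Arabic-script letters; 0.0 if there are no letters. */
    static double arabicLetterRatio(String text) {
        long letters = text.codePoints().filter(Character::isLetter).count();
        if (letters == 0) {
            return 0.0;
        }
        long arabic = text.codePoints()
            .filter(Character::isLetter)
            .filter(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.ARABIC)
            .count();
        return (double) arabic / letters;
    }
}
```

For a PDF that is known to contain mostly Arabic text, a ratio close to zero on the string returned by `getString()` would suggest the garbling is still present. Any threshold you pick is a judgment call for your own documents.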

