Converting Hebrew PDF to HTML

Hi.

I am trying to convert a Hebrew PDF to an HTML file.
After the conversion completed, I looked at the HTML file and saw that all the letters are reversed.
I am using this code:

Document pdfDocument = new Aspose.Pdf.Document(@"c:\1\4160617.pdf");
pdfDocument.Save(@"c:\1\4160617.html", SaveFormat.Html);

The original pdf file is attached.
Please help.

Thanks

Hi Alex,

Thank you for considering Aspose.Pdf.

I am unable to see any attachment with your post. Please re-attach the template PDF file to help us test the issue at our end.

Sorry for the inconvenience,

Hi.

I attached the file.

Thanks.

Hi Alex,

Thank you for sharing the template file.

I was able to reproduce the issue in an initial test with your template file. It has been registered in our issue tracking system with issue ID PDFNEWNET-34073. You will be notified via this forum thread of any updates on this issue.

Sorry for the inconvenience,

The issues you have found earlier (filed as PDFNEWNET-34073) have been fixed in Aspose.Pdf for .NET 7.4.0.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.

Hi, I tested the new version, and it still has some problems.

I attached 2 PDF files with explanations. Please let me know what can be done about them.
Thanks.

617067.pdf
  • Text is aligned to left instead of being aligned to right
  • Parentheses are inverted
  • Quotation marks are added to some words, although there were no quotation marks in the first place.
  • Underlined words are saved in an external image file instead of being saved as CSS.
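
One possible explanation for the inverted parentheses, offered only as an assumption since nothing in this thread confirms it: if visual-order text is converted to logical order by reversing its characters, paired characters such as parentheses also have to be swapped, otherwise they end up pointing the wrong way. The C# sketch below shows the difference. `VisualOrderSketch` and its methods are hypothetical names, not part of Aspose.Pdf, and this is a simplification rather than the full Unicode bidirectional algorithm.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration only; not an Aspose.Pdf API.
public static class VisualOrderSketch
{
    // Paired characters that must be swapped when a run of text is reversed.
    private static readonly Dictionary<char, char> Mirror = new Dictionary<char, char>
    {
        ['('] = ')', [')'] = '(',
        ['['] = ']', [']'] = '[',
        ['{'] = '}', ['}'] = '{',
        ['<'] = '>', ['>'] = '<'
    };

    // Plain character reversal: brackets keep their shape and end up inverted.
    public static string ReverseNaive(string s)
    {
        char[] chars = s.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }

    // Reversal that also swaps each paired character for its counterpart.
    public static string ReverseWithMirroring(string s)
    {
        var sb = new StringBuilder(s.Length);
        for (int i = s.Length - 1; i >= 0; i--)
        {
            char c = s[i];
            sb.Append(Mirror.TryGetValue(c, out char mirrored) ? mirrored : c);
        }
        return sb.ToString();
    }
}
```

With the Latin stand-in "(abc)", `ReverseNaive` returns ")cba(" while `ReverseWithMirroring` returns "(cba)". Reversing char by char assumes plain letters without combining marks or surrogate pairs.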

4160617.pdf
  • All the problems from 617067.pdf
  • Words in the converted text are split apart: קביע ה שיפו טית ב מקר ה דנן should be קביעה שיפוטית במקרה דנן
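
To make problems like the reversed letters and the split words easier to report, a small check can compare a phrase known to be in the PDF against the text taken from the converted HTML. The C# sketch below is only an illustration under stated assumptions: `HebrewConversionCheck`, `PhraseStatus` and `Classify` are hypothetical names, the text is assumed to have already been extracted from the HTML by some other means, and no Aspose.Pdf API is involved.

```csharp
using System;
using System.Linq;

// Hypothetical diagnostic helper; not part of Aspose.Pdf.
public static class HebrewConversionCheck
{
    public enum PhraseStatus { Intact, Reversed, SpacingDiffers, NotFound }

    // Drops all whitespace so that extra or missing spaces are ignored.
    private static string StripWhitespace(string s) =>
        new string(s.Where(c => !char.IsWhiteSpace(c)).ToArray());

    // Reverses char by char; assumes plain letters without combining marks.
    private static string Reverse(string s)
    {
        char[] chars = s.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }

    // Reports how expectedPhrase shows up inside extractedText.
    public static PhraseStatus Classify(string extractedText, string expectedPhrase)
    {
        // The phrase appears exactly as expected.
        if (extractedText.Contains(expectedPhrase))
            return PhraseStatus.Intact;

        // The phrase appears with its characters in reverse order.
        if (extractedText.Contains(Reverse(expectedPhrase)))
            return PhraseStatus.Reversed;

        // The letters are in the right order but the spaces differ
        // (words split apart or merged together).
        if (StripWhitespace(extractedText).Contains(StripWhitespace(expectedPhrase)))
            return PhraseStatus.SpacingDiffers;

        return PhraseStatus.NotFound;
    }
}
```

Calling `Classify` with the split text quoted above as `extractedText` and קביעה שיפוטית במקרה דנן as `expectedPhrase` yields `SpacingDiffers`, because the letters match once whitespace is ignored.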



Hi Alex,

Thank you for sharing the details and sample files.

alexanderorlovsky:
Text is aligned to left instead of being aligned to right

I am able to reproduce this issue, and it has been registered in our issue tracking system as PDFNEWNET-34399.

alexanderorlovsky:
Parentheses are inverted

I am able to reproduce this issue, and it has been registered in our issue tracking system as PDFNEWNET-34400.

alexanderorlovsky:
Quotation marks are added to some words, although there were no quotation marks in the first place.

Could you please point out where this issue occurs? I am not able to see any such issue in the generated HTML file.

alexanderorlovsky:
Underlined words are saved in an external image file instead of being saved as CSS.

I am able to reproduce this issue, and it has been registered in our issue tracking system as PDFNEWNET-34401.

alexanderorlovsky:
Words in the converted text are split apart: קביע ה שיפו טית ב מקר ה דנן should be קביעה שיפוטית במקרה דנן

I am able to reproduce this issue, and it has been registered in our issue tracking system as PDFNEWNET-34402.

We will keep you posted on any updates to the issues reported above.

Sorry for the inconvenience,

Hi.

Can you please estimate the fix time for these bugs?
We would like to work with this component, but those fixes are crucial for our purposes.

Thank You.


Hi Alex,

Thank you for being patient.

I have asked the development team to analyze the issues and share an ETA for their resolution. As soon as I get feedback from them, I will update you via this forum thread.

Sorry for the inconvenience,

Hi.

Can you please update me regarding the status of this?

Thanks.

Hi Alex,

Thank you for being patient,

The development team is working on your reported issues, but I don't have an ETA at the moment. I have asked the development team to share updates and an ETA for the resolution. As soon as I get feedback, I will update you via this forum thread.

Sorry for the inconvenience,

The issues you have found earlier (filed as PDFNEWNET-34400) have been fixed in Aspose.Pdf for .NET 9.6.0.


This message was posted using Notification2Forum from Downloads module by Aspose Notifier.

The issues you have found earlier (filed as PDFNEWNET-34401) have been fixed in Aspose.Pdf for .NET 10.9.0.


This message was posted using Notification2Forum from Downloads module by Aspose Notifier.

The issues you have found earlier (filed as PDFNET-34399) have been fixed in Aspose.PDF for .NET 18.4.

This message was posted using BugNotificationTool from Downloads module by asad.ali

@alexanderorlovsky

Would you kindly try the latest version of the API, Aspose.PDF for .NET 20.6? We tested the scenario with this version and the issue did not occur. If you face any issue, please feel free to let us know.