Reconstructing PDF -> HTML -> PDF

@anubha16

We did the following to test the scenario at our end:

  1. Converted PDF to XLSX using Aspose.PDF for Java
  2. Opened output XLSX in MS Excel
  3. Converted it to HTML and saved it
  4. Opened the output HTML in browser
  5. Inspected the element highlighted by you in the attached image
  6. Each individual character in the row was wrapped in a separate font tag

It seems the issue is related to the PDF to XLSX conversion performed by Aspose.PDF. Therefore, an issue has been logged as PDFJAVA-40227 in our issue tracking system. We will look further into its details and keep you posted on the status of its correction. Please be patient and spare us some time.
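
As a possible stopgap until a fix is available, the per-character font tags observed in step 6 could be merged by post-processing the exported HTML. The following is a minimal sketch in plain Java and is not part of any Aspose API. The class name `FontTagMerger` and its method are hypothetical, and the sketch assumes that adjacent font elements carry identical attribute text and contain only plain text, which may not hold for every exported sheet.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not an Aspose class: merges runs of adjacent
// <font> elements whose attribute text is identical.
class FontTagMerger {

    // Group 1: an opening <font ...> tag plus its plain-text content.
    // Group 2: the attribute text, which must repeat in the next <font> tag.
    private static final Pattern ADJACENT = Pattern.compile(
            "(<font\\b([^>]*)>[^<]*)</font><font\\b\\2>",
            Pattern.CASE_INSENSITIVE);

    static String mergeAdjacentFontTags(String html) {
        String previous;
        String current = html;
        // Each pass joins neighbouring pairs; repeat until nothing changes
        // so that long runs of single-character tags collapse fully.
        do {
            previous = current;
            current = ADJACENT.matcher(previous).replaceAll("$1");
        } while (!current.equals(previous));
        return current;
    }

    public static void main(String[] args) {
        String cell = "<td><font face=\"Arial\">P</font>"
                + "<font face=\"Arial\">D</font>"
                + "<font face=\"Arial\">F</font></td>";
        // Prints: <td><font face="Arial">PDF</font></td>
        System.out.println(mergeAdjacentFontTags(cell));
    }
}
```

Font elements with different attributes are left untouched, so formatting changes inside a row are preserved. A regular expression is used here only to keep the sketch short; an HTML parser would be more robust for nested or irregular markup.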

We are sorry for the inconvenience.

Thank you for looking into this issue.

For this same use case (pdf -> xlsx -> html), I need to make changes to the HTML and then convert it back to PDF.

Attached is the new zip: SDAspose12cellsencoding-new.zip (17.0 KB)

  1. Changed SDAspose12cellsencoding_files/sheet001.htm (added “MODIFIED” in a couple of places)

  2. Re-zipped SDAspose12cellsencoding.html and SDAspose12cellsencoding_files to create SDAspose12cellsencoding-new.zip

  3. Used the online converter (html -> pdf); please see the attached PDF, SDAspose12cellsencoding.pdf (6.1 KB)

Please advise on the correct process for recreating the PDF file.

@anubha16

It seems like you are converting the HTML to PDF using Aspose.HTML. We are checking from this perspective and will get back to you shortly.

@anubha16

We tested the scenario using Aspose.HTML for Java 20.12 and encountered a java.lang.NullPointerException while converting your HTML into PDF. Therefore, we have logged an issue as HTMLJAVA-748 in our issue tracking system. Also, when converting the HTML into PDF using Aspose.HTML for .NET, we received the same output PDF that you shared for our reference.

A separate ticket, HTMLNET-3041, has been logged in our issue tracking system for the incorrect output PDF. We will look further into the details of these tickets and keep you posted on the status of their resolution. Please be patient and spare us some time.

We apologize for the inconvenience.

The issues you found earlier (filed as PDFJAVA-40227) have been fixed in Aspose.PDF for Java 21.3.

Thanks for the update.

The issues you found earlier (filed as WORDSNET-21875) have been fixed in Aspose.Words for .NET 21.4 and Aspose.Words for Java 21.4.