Convert PDF to HTML and HTML to PDF using Aspose.PDF for .NET - formatting is lost

@agrawaltejas

Please note that the investigation of the earlier logged ticket is still underway. The HTML file in the “Ravi Paladiya Resume_SeniorMSDynamicsCRMDeveloper_6 Years.pdf.zip” archive differs in appearance from the original PDF and is missing several graphic elements.

If you do not need all of the graphics, you can use the following code snippet.

Aspose.Pdf.HtmlSaveOptions saveOptions = new Aspose.Pdf.HtmlSaveOptions
{
    FixedLayout = false, // specifies Flow page mode, removes all images
    RasterImagesSavingMode = Aspose.Pdf.HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground,
    PartsEmbeddingMode = Aspose.Pdf.HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml,
    SplitIntoPages = false,
    // FontSavingMode = Aspose.Pdf.HtmlSaveOptions.FontSavingModes.DontSave, // uses system fonts instead of embedded PDF fonts
};
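// wrdf is assumed to be the Aspose.Pdf.Document loaded from the source PDF,
// and htmlOutput the path of the HTML file to write.
wrdf.Save(htmlOutput, saveOptions);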

We have attached an archive containing two HTML files (fixed and flow layout) and two PDF files that were edited in Tidy and created using the Print button in the Chrome browser.
Fixed_Flow_Html_and_Tidy_Edited_Pdf.zip (428.0 KB)
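
For reference, the two HTML variants could be produced along the following lines. This is only a minimal sketch, not necessarily how the attached files were generated: the class name, the SaveAsHtml helper, and the file names are placeholders introduced for illustration, and only the FixedLayout flag differs between the two calls.

```csharp
using Aspose.Pdf;

class FixedAndFlowHtmlExport
{
    // Hypothetical helper: converts one PDF to one HTML file in the requested layout.
    static void SaveAsHtml(string pdfPath, string htmlPath, bool fixedLayout)
    {
        // Load the document separately for each conversion.
        using (Document pdfDocument = new Document(pdfPath))
        {
            HtmlSaveOptions saveOptions = new HtmlSaveOptions
            {
                // true requests the fixed layout; false requests flow mode,
                // which, as noted above, removes the images.
                FixedLayout = fixedLayout,
                PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml,
                SplitIntoPages = false
            };
            pdfDocument.Save(htmlPath, saveOptions);
        }
    }

    static void Main()
    {
        // Placeholder file names; replace them with your own paths.
        SaveAsHtml("input.pdf", "output_fixed.html", true);
        SaveAsHtml("input.pdf", "output_flow.html", false);
    }
}
```

Comparing the two output files side by side should make it easier to judge which layout your editor handles better.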

If the above suggestions do not meet your needs, please provide the following information so that we can continue the investigation from that perspective.

Please clarify exactly what you expect from an HTML file converted using the Aspose API.

We need the following:

  • a list of forbidden tags and styles, and of the elements required in the HTML;
  • an example of an edited HTML file and an example of a PDF file created from it.

Hi Asad,

The fixed-layout HTML file does not convert well to PDF from our Kendo RichTextEditor. The flow-layout file converts to PDF well, but it lacks the formatting (colors, images, and so on) that we had in the source PDF.

So, can you provide a solution that gives us HTML with the same formatting as the source, but in the flow layout used in the file you sent in the attached zip?

@agrawaltejas

We will investigate the ticket further in light of this requirement and your feedback, and will get back to you as soon as we have results to share. Please allow us some time.

Hi Asad,

Are there any updates on this? We are eagerly waiting for this fix.

@agrawaltejas

We value your concerns. Regretfully, the ticket has not yet been resolved because of other high-priority tasks in the queue. We will inform you as soon as we have definite updates on this matter. Please allow us some time.

We are sorry for the inconvenience.

May we please have an update on this?

@agrawaltejas

Regretfully, Aspose.PDF is currently unable to convert PDF to HTML in Flow mode while preserving the original formatting and images. We have logged a feature request for this as PDFNET-48010 in our issue tracking system. We will investigate the feasibility of this feature and keep you informed about the status of its availability. Please allow us some time.

We are sorry for the inconvenience.

How much time do you need? We reported this problem over 2 months ago. We’d like to purchase your product, but this is a problem.

@agrawaltejas

We value your concerns. New features take a certain amount of time to implement, depending on how many priority tasks are in the queue and how many internal API components they affect. Regretfully, we are not in a position to share a reliable ETA. However, we will inform you as soon as we make significant progress in this regard. We greatly appreciate your patience in this matter.

We are sorry for the inconvenience.