HTML to PDF conversion CSS issue

pshea · October 2, 2017, 11:56am

Hello,

We are trying to convert an HTML file that uses Bootstrap and other CSS into a PDF file using Aspose.Pdf (DLL version 17.1.0.0).
We included the external CSS using the HtmlLoadOptions class, but the generated PDF does not match the HTML page and looks distorted.
Please refer to the attached sample input HTML file and the generated PDF file.

input_html.zip (362.3 KB)
Subform_.pdf (860.7 KB)

Please look into the issue and let us know.

Thanks

asad.ali · October 2, 2017, 2:20pm

@pshea

Thanks for contacting support.

We tested the scenario using the following code snippet with Aspose.Pdf for .NET 17.9 and observed that the PDF generated by this version was better than the one generated by version 17.1.0. For your reference, we have attached the generated output along with the code snippet we used.

// dataDir is passed as the base path for the HTML load options
var loadOptions = new HtmlLoadOptions(dataDir);
loadOptions.PageInfo.Margin = new MarginInfo(10, 10, 10, 10);
// Load the HTML file with these options and save it as PDF
Document doc = new Document(dataDir + "Subform_.html", loadOptions);
doc.Save(dataDir + "Subform_.pdf");

Subform_.pdf (883.6 KB)

Furthermore, we observed that the content was misaligned in the resultant PDF document, and we have logged this as PDFNET-43438 in our issue tracking system so that it can be corrected. We will look further into the details of the issue and keep you posted on the status of its correction. Please be patient and spare us a little time.

We are sorry for the inconvenience.

pshea · March 19, 2018, 9:15am

Hello Sir,
GeneratedPDF_using_Aspose_18.1.pdf (888.8 KB)
GeneratedPDF_using_Aspose_17.1.pdf (968.7 KB)

We have tried the latest Aspose library (18.1) for HTML to PDF conversion. It is an improvement, but the generated PDF still has design issues.
I am attaching both PDFs here: one created with the earlier Aspose version (17.1) and the other created with the latest Aspose version (18.1).

Thanks

asad.ali · March 19, 2018, 5:20pm

@pshea

Thanks for contacting support.

We tested the HTML to PDF conversion using Aspose.PDF for .NET 18.3 (the latest version) and observed that the output PDF still had issues with misalignment and overlapping content. Please note that the earlier logged issue PDFNET-43438 is not yet resolved and is pending analysis because of other pending issues in the queue.

However, every release of the API includes improvements and enhancements, so conversion results may look better in newer versions. We will definitely provide a fix for the logged issue after resolving the previously logged issues, and we will inform you as soon as we make progress towards its resolution. Please be patient and spare us a little time.

We are sorry for the inconvenience.

pshea · October 8, 2018, 12:55pm

Has your team had any luck with this issue?

asad.ali · October 8, 2018, 4:34pm

@pshea

Thanks for your inquiry.

As the issue was logged under the free support model, it has low priority, and we regret to share that it is not yet resolved because of other pending issues in the queue. Please note that issues under the free support model are resolved on a first come, first served basis, and a large number of pending issues in the queue were reported before yours. We will definitely let you know once we have definite updates regarding the issue resolution.

Furthermore, you may also check our paid support option, where issues have high priority and are resolved on an urgent basis. If your issue is a blocker, please consider reporting the issue ID in paid support to escalate its priority.

We are sorry for the inconvenience.

Jayanna · December 21, 2018, 4:39pm

We have used the Aspose SDK to convert from PDF to HTML, but it is taking a long time.
Can I know the standard time to convert from PDF to HTML?
How much time should 300 pages take? On our system it took 20 minutes to convert a 300-page PDF to HTML.

asad.ali · December 21, 2018, 7:30pm

@Jayanna

Thanks for contacting support.

There is no specification for the API regarding how much time a conversion will take. However, larger PDF documents require more time and memory to be converted into HTML. The time and memory required also depend on the structure and complexity of the document.

Furthermore, the memory consumption and performance of the API are improved in each revision. In case you are experiencing any issue with the time taken by the API, please share your sample PDF document along with your environment details. We will test the scenario in our environment and address it accordingly.

PS: In case your document is large, please upload it to Dropbox or Google Drive and share the link with us.

Jayanna · January 5, 2019, 7:24am

Thank you for your reply.

Conversion processing time is very important in our application.
Aspose is taking 20 minutes for a 300-page document, but others take less than 1 minute for documents of any structure and complexity.
This is document conversion from PDF to HTML [Aspose.PDF for Java].

Is the conversion processing time lower in any other Aspose product?

Jayanna · January 5, 2019, 7:27am

There should be a time limit. I am not happy with your standard answer that it depends on the structure and complexity of the document.
Please look at the conversion processing time and let me know how it is calculated.
What is the approximate processing time for each document, based on its size or number of pages?

asad.ali · January 5, 2019, 2:58pm

@Jayanna

Thanks for writing back.

We tested the scenario in our environment using Aspose.PDF for Java 18.12 by converting one of our sample PDFs into HTML and did not observe such a long conversion time. As requested earlier, would you please share your sample PDF document and environment details with us? This will help us replicate the issue in our environment and address it accordingly.
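
To make timings comparable between environments, it may help to measure the conversion step on its own and report seconds per page. The following is only a minimal sketch in plain Java: the class name ConversionTimer and its helper methods are hypothetical, and the Runnable is a stand-in for the real save call, which is not included here. With the figures reported in this thread (20 minutes for 300 pages), the average works out to 4 seconds per page.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for timing a conversion; not part of any Aspose API.
class ConversionTimer {

    /** Runs the task once and returns the elapsed wall-clock time in milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Average seconds per page for a given total time and page count. */
    static double secondsPerPage(long elapsedMillis, int pageCount) {
        if (pageCount <= 0) {
            throw new IllegalArgumentException("pageCount must be positive");
        }
        return (elapsedMillis / 1000.0) / pageCount;
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption); replace the body with the real
        // conversion call, e.g. doc.save(outHtmlFile, newOptions).
        Runnable conversion = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
        };

        long elapsed = timeMillis(conversion);
        System.out.println("Elapsed: " + elapsed + " ms");

        // Figures reported in this thread: 20 minutes for 300 pages.
        long reportedMillis = TimeUnit.MINUTES.toMillis(20);
        System.out.println("Reported average (s/page): " + secondsPerPage(reportedMillis, 300));
    }
}
```

Timing only the save call, separately from loading the document, would show which step dominates. A single run also includes JVM warm-up, so repeating the measurement a few times gives a fairer number.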

Jayanna · January 7, 2019, 9:11am

Here are the environment details of the conversion. Please let us know if any corrections are required.

Operating System: Windows 32 bit

Development Environment: Java

Source code used:

import com.aspose.pdf.Document;
import com.aspose.pdf.HtmlSaveOptions;
import com.aspose.pdf.LettersPositioningMethods;

public class PDFToHTMLSingleHTMLWithAllResourcesEmbedded {

    public static void main(String[] args) {
        // Load source PDF file
        Document doc = new Document("input.pdf");
        // Instantiate HTML Save options object
        HtmlSaveOptions newOptions = new HtmlSaveOptions();
        // Enable option to embed all resources inside the HTML
        newOptions.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;
        // This is just optimization for IE and can be omitted
        newOptions.LettersPositioningMethod = LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;
        newOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
        newOptions.FontSavingMode = HtmlSaveOptions.FontSavingModes.SaveInAllFormats;
        // Output file path
        String outHtmlFile = "Single_output.html";
        // Save the output file
        doc.save(outHtmlFile, newOptions);
    }

}

asad.ali · January 7, 2019, 3:57pm

@Jayanna

Thanks for providing these details.

Please also share which JDK version you are using and which IDE you are developing your application in, e.g. Eclipse, IntelliJ IDEA, etc. Also, please share your PDF document with us so that we can test the scenario in our environment and try to replicate the issue.
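
One quick way to collect the JDK and platform details requested above is to print them from the same JVM that runs the conversion. This is only a rough sketch using standard Java system properties; the class name EnvironmentReport is hypothetical, and the choice of which values to report is an assumption about what is useful here. Note that os.arch describes the JVM's architecture, which can differ from the operating system's on a 64-bit machine running a 32-bit JVM.

```java
// Hypothetical helper for reporting environment details; uses only the Java standard library.
class EnvironmentReport {

    /** Builds a plain-text summary of the JVM and OS settings relevant to a bug report. */
    static String describe() {
        Runtime rt = Runtime.getRuntime();
        StringBuilder sb = new StringBuilder();
        sb.append("java.version = ").append(System.getProperty("java.version")).append('\n');
        sb.append("java.vendor = ").append(System.getProperty("java.vendor")).append('\n');
        sb.append("os.name = ").append(System.getProperty("os.name")).append('\n');
        // Architecture as seen by the JVM (e.g. x86 for a 32-bit JVM)
        sb.append("os.arch = ").append(System.getProperty("os.arch")).append('\n');
        // Maximum heap the JVM will attempt to use, in megabytes
        sb.append("max heap (MB) = ").append(rt.maxMemory() / (1024 * 1024)).append('\n');
        sb.append("processors = ").append(rt.availableProcessors()).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

The maximum heap value is included because a 32-bit JVM has a limited heap, which may matter for large documents. Whether that plays a part in the slowdown reported here is not established in this thread.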