Aspose.Pdf could not convert html to pdf

rsamveda.ramsoft · November 8, 2015, 10:56pm

Hi,

When we use the Aspose.Pdf component to convert HTML to PDF, the title is displayed vertically in the PDF.

Code used:

Aspose.Pdf.Generator.Pdf ReportPdf = new Aspose.Pdf.Generator.Pdf();
Aspose.Pdf.Generator.Section section = ReportPdf.Sections.Add();
Aspose.Pdf.Generator.Text CurPage = new Aspose.Pdf.Generator.Text(HtmlContentStringItem);
CurPage.IsFirstParagraph = true;
CurPage.IsHtmlTagSupported = true;
CurPage.UseTextInfoStyle = true;
CurPage.TextInfo.LineSpacing = 1;
section.Paragraphs.Add(CurPage);
ReportPdf.Save(convertedObjstream);

Please find attached the input HTML file and the output PDF document.

tilal.ahmad · November 9, 2015, 12:41am

Hi Robert,

Thanks for your inquiry. Please use the new DOM approach for HTML to PDF conversion. It is improved and more efficient, and it should resolve the issue.
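As a rough illustration of the DOM approach, the sketch below loads an HTML file through `Aspose.Pdf.Document` with `HtmlLoadOptions` and saves it as PDF. It assumes the Aspose.PDF for .NET assembly is referenced; the class name, method name, and file paths are hypothetical placeholders, not taken from this thread.

```csharp
using Aspose.Pdf;

public static class HtmlToPdfSketch
{
    // Hypothetical helper: converts one HTML file to one PDF file.
    public static void Convert(string htmlPath, string pdfPath)
    {
        // HtmlLoadOptions tells the Document constructor to parse the input as HTML.
        HtmlLoadOptions options = new HtmlLoadOptions();

        // DOM approach: the whole HTML file becomes a Document.
        Document doc = new Document(htmlPath, options);

        // Write the converted document to disk as PDF.
        doc.Save(pdfPath);
    }
}
```

A call such as `HtmlToPdfSketch.Convert("input.html", "output.pdf")` would convert a single file; it does not by itself handle custom page breaks between fragments.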

Please feel free to contact us for any further assistance.

Best Regards,

rsamveda.ramsoft · November 9, 2015, 12:28pm

Hi Tilal,

We need custom page stitching, which is why we use Aspose.Pdf in addition to Aspose.Words, which we use for other PDF conversions.
The following code stitches the HTML fragments into the PDF as pages. However, it fails to convert ordinary HTML, as shown in the previous attachments.

Please suggest how to handle both scenarios.

Note: Attached is an example of the HTML text we convert to PDF, with page breaks handled manually by the following code.

Code we used:

MemoryStream convertedObjstream = new MemoryStream();
try
{
    Aspose.Pdf.Generator.Pdf ReportPdf = new Aspose.Pdf.Generator.Pdf();
    Aspose.Pdf.Generator.Section section = ReportPdf.Sections.Add();

    // Split the input HTML into one fragment per page.
    string[] stringSeparators = new string[] { MammoTrackingStringSeperator };
    string[] HtmlContentStringList;
    HtmlContentStringList = HTMLStr.Split(stringSeparators, StringSplitOptions.RemoveEmptyEntries);

    foreach (string HtmlContentStringItem in HtmlContentStringList)
    {
        Aspose.Pdf.Generator.Text CurPage = new Aspose.Pdf.Generator.Text(HtmlContentStringItem);
        CurPage.IsFirstParagraph = true;
        CurPage.IsHtmlTagSupported = true;
        CurPage.UseTextInfoStyle = true;
        CurPage.TextInfo.LineSpacing = 1;
        section.Paragraphs.Add(CurPage);
    }

    // This operation is put in a try-catch block to handle situations when the operation fails for some reason.
    try
    {
        LogInfo("DocumentConversion.ConvertHTMLStringToPDF - Before : ReportPdf.Save(convertedObjstream);");
        ReportPdf.Save(convertedObjstream);
        convertedObjstream.Position = 0;
        _outPutStream = new byte[convertedObjstream.Length];
        LogInfo("DocumentConversion.ConvertHTMLStringToPDF - Before : convertedObjstream.Read(_outPutStream, 0, (int)convertedObjstream.Length);");
        convertedObjstream.Read(_outPutStream, 0, (int)convertedObjstream.Length);
        if (convertedObjstream != null && convertedObjstream.Length > 0)
        {
            success = true;
        }
    }
    catch (Exception e)
    {
        _error = e.Message;
        // Trap the exception and log it.
        string errorMsg = "DocumentConversion - C# ConvertHTMLStringToPDF: Exception :\n\n" + e.Message + "\n\nStack Trace:\n" + e.StackTrace;
        LogInfo(errorMsg);
    }
}
finally
{
    convertedObjstream.Dispose();
    // Not good practice, but since we do a single operation and need to free up memory, we call this method.
    GC.Collect();
}
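The split-and-copy steps in the code above can be sketched with the .NET base library alone. This is a minimal sketch, not the poster's actual implementation: the class name, the helper names, and the sample HTML are hypothetical, and the `"#$NP"` marker is assumed here only because it is the separator used later in this thread. It also uses `MemoryStream.ToArray()` as an alternative to resetting `Position` and calling `Read`.

```csharp
using System;
using System.IO;
using System.Text;

public static class FragmentDemo
{
    // Assumed page-break marker (the thread later uses "#$NP").
    private const string PageSeparator = "#$NP";

    // Split the HTML into per-page fragments, dropping empty entries.
    public static string[] SplitPages(string html)
    {
        return html.Split(new[] { PageSeparator }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Copy the full contents of a MemoryStream into a new byte array.
    public static byte[] ToBytes(MemoryStream stream)
    {
        return stream.ToArray();
    }

    public static void Main()
    {
        // Hypothetical sample input with a trailing separator.
        string html = "<h1>Page 1</h1>#$NP<p>Page 2</p>#$NP";
        string[] pages = SplitPages(html);
        Console.WriteLine(pages.Length); // prints 2

        using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(pages[0])))
        {
            byte[] bytes = ToBytes(ms);
            Console.WriteLine(bytes.Length);
        }
    }
}
```

`ToArray()` returns the whole buffer regardless of the stream's current `Position`, so no rewind is needed before copying.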

Thanks,

Robert

tilal.ahmad · November 10, 2015, 9:20am

Hi Robert,

Thanks for your feedback. If you want page breaks between HTML fragments, you can add each fragment to a new Page object using an HtmlFragment object, so each fragment automatically starts on a new page. Please check the following code snippet; it should help you accomplish the task.

Document doc = new Document();
StreamReader r = File.OpenText("Test 1.html");
String html = r.ReadToEnd();
string[] stringSeparators = new string[] { "#$NP" };
string[] HtmlContentStringList;
HtmlContentStringList = html.Split(stringSeparators, StringSplitOptions.RemoveEmptyEntries);
foreach (string HtmlContentStringItem in HtmlContentStringList)
{
    Aspose.Pdf.Page page = doc.Pages.Add();
    HtmlFragment htmlFragment = new HtmlFragment(HtmlContentStringItem);
    page.Paragraphs.Add(htmlFragment);
}
doc.Save("htmlfragment.pdf");

Please feel free to contact us for any further assistance.

Best Regards,

rsamveda.ramsoft · November 10, 2015, 1:50pm

Hi Tilal,

Based on your suggestion, we modified our code and it works well.

Thank you for the suggestion.

Robert

tilal.ahmad · November 10, 2015, 10:10pm

Hi Robert,

Thanks for your feedback. It is good to know that the suggestion worked for you.

Please keep using our API and feel free to contact us for any further assistance.

Best Regards,

rsamveda.ramsoft · November 26, 2015, 10:41am

Hi Tilal,

This is another issue we are facing when using Aspose.Pdf to convert HTML to PDF; it does not occur when we use Aspose.Words.

Not working well:

Aspose.Pdf.Document doc = new Aspose.Pdf.Document();
StreamReader r = File.OpenText("Test 1.html");
String html = r.ReadToEnd();
string[] stringSeparators = new string[] { "#$NP" };
string[] HtmlContentStringList;
HtmlContentStringList = html.Split(stringSeparators, StringSplitOptions.RemoveEmptyEntries);
foreach (string HtmlContentStringItem in HtmlContentStringList)
{
    Aspose.Pdf.Page page = doc.Pages.Add();
    HtmlFragment htmlFragment = new HtmlFragment(HtmlContentStringItem);
    page.Paragraphs.Add(htmlFragment);
}
doc.Save("htmlfragment.pdf");

Working well:

SaveOptions saveOptions = SaveOptions.CreateSaveOptions(SaveFormat.Pdf);
saveOptions.PrettyFormat = true;
MemoryStream docStream = new MemoryStream(_inputDocumentStream);
MemoryStream convertedObjstream = new MemoryStream();

Aspose.Words.Document wordDoc = new Aspose.Words.Document(docStream);

wordDoc.Save(convertedObjstream, saveOptions);

Please find attached the source HTML file (CCDA file.html), the output PDF file converted using Aspose.Pdf (CCDA file converted using Aspose.Pdf Not Good.pdf), and the output PDF file converted using Aspose.Words (CCDA file converted using Aspose.Words Good.pdf).

Please suggest a solution, as we use Aspose.Pdf for HTML to PDF conversion.

Thanks

Robert

codewarior · November 27, 2015, 9:21am

Hi Robert,

Thanks for contacting support.

I have tested the scenario using Aspose.Pdf for .NET 11.0.0 and, as per my observations, the table on the second page is truncated and extends beyond the right page margin. I have logged this problem as PDFNEWNET-39762 in our issue tracking system. We will look further into the details of this problem and keep you updated on the status of the correction. Please be patient and allow us a little time. We are sorry for this inconvenience.

aspose.notifier · April 12, 2021, 7:01pm

The issues you have found earlier (filed as PDFNET-39762) have been fixed in Aspose.PDF for .NET 21.4.