Formatting HTML to PDF takes forever and times out in our application

Hi Ahmad,
Thanks for the reply.
I have used the properties that you provided and found the following issues:

1. The page number is printed over the text of the PDF when I use your approach. Can you please provide a cleaner way to add the page number at the very end of every page? (Please see the attached image.)

2. When the text content is larger than the page size, the content gets wrapped. (Please see the attached image.)

3. When I print the file using the A4 height and width, the content falls outside the printable area. (Please see the attached image.)

4. How can I change the font size and font style of the exported text for all the pages?

Please note, I am looking for a property that would change the font size and style for all the pages.


Hi Vennila,


Thanks for your inquiry. I would appreciate it if you could share your source HTML here; I will test the scenario and guide you accordingly.

Best Regards,
Hi,
I am using the latest Aspose.Pdf DLL, version 11.2.0.0. I am using the approach given below to export to PDF.

Aspose.Pdf.Generator.Pdf pdfDocument = new Aspose.Pdf.Generator.Pdf();
pdfDocument.HtmlInfo.BadHtmlHandlingStrategy = Aspose.Pdf.Generator.BadHtmlHandlingStrategy.TreatAsPlainText;
pdfDocument.HtmlInfo.ShowUnknownHtmlTagsAsText = true;
pdfDocument.HtmlInfo.CharSet = "UTF-8";
pdfDocument.IsLandscape = true;
pdfDocument.IsPageNumberForDocument = true;
pdfDocument.IsFontNotFoundExceptionThrown = false;
pdfDocument.IsImageNotFoundErrorIgnored = true;
pdfDocument.IsAutoFontAdjusted = true;
pdfDocument.PageNumberFormat = Aspose.Pdf.Generator.PageNumberFormatType.EnglishLower;
pdfDocument.PageSetup.Margin = new Aspose.Pdf.Generator.MarginInfo() { Left = 30, Top = 30, Right = 10, Bottom = 10 };
pdfDocument.Author = "me";
if (objectType == 19)
    pdfDocument.Subject = "User Details";
else if (objectType == 4)
    pdfDocument.Subject = "Details";
else if (objectType == 2)
    pdfDocument.Subject = "Company Details";
pdfDocument.Title = string.Format("{0} - {1}", returnValue.Id, returnValue.Name);
try
{
    pdfDocument.BindHTML(requestHtml.ToString(), HsCommon.clsShared.HSTempFolder);
    using (MemoryStream pdfStream = new MemoryStream())
    {
        pdfDocument.Save(pdfStream);
        pdfStream.Flush();
        returnValue.Data = pdfStream.ToArray();
        if (pdfDocument.PageCount > 0)
            returnValue.Tag = pdfDocument.PageCount;
    }
}

I have seen from previous posts that "Aspose.Pdf.Generator.Pdf" is a legacy approach. I am still able to export to PDF when I use it.

What is recommended? Will the old approach using the PDF generator keep working for a while?


I am also facing an issue with the XREF table when exporting to PDF (please review the screenshot). Can a solution or workaround be provided soon?

Thanks,


Hi,
We are having another problem with exporting to PDF. I am attaching the HTML to this reply, along with the output we received on our end. I am also providing the code I am using with the latest Aspose.Pdf component.

Please note:
As you can see in the attached PDF (d.PDF), on the second page the horizontal lines extend well beyond the page. When you view the same HTML in a viewer such as http://scratchpad.io/unsuitable-friends-3413, the content is displayed correctly.
We need the PDF to show the content as it appears in the HTML. Please advise what changes to make to the Aspose code. I have pasted the code in this reply; we use the latest control and DLL from Aspose. This is the approach Aspose suggested for exporting to PDF.

Aspose.Pdf.HtmlLoadOptions options = new Aspose.Pdf.HtmlLoadOptions(embeddedImageLocation);
options.PageInfo.Margin = new Aspose.Pdf.MarginInfo { Left = 30, Right = 10, Top = 30, Bottom = 10 };
options.PageInfo.Width = Aspose.Pdf.PageSize.A4.Width;
options.PageInfo.Height = Aspose.Pdf.PageSize.A4.Height;
options.PageInfo.IsLandscape = false;
byte[] byteArray = Encoding.UTF8.GetBytes(requestHtml.ToString());
MemoryStream stream = new MemoryStream(byteArray);
Aspose.Pdf.Document doc = new Aspose.Pdf.Document(stream, options);
doc.Info.Title = string.Format("{0} - {1}", returnValue.Id, returnValue.Name);
doc.Info.Subject = "Request Details";
doc.Info.Author = "me";
Aspose.Pdf.HeaderFooter footer = new Aspose.Pdf.HeaderFooter();
Aspose.Pdf.Text.TextFragment fTxt = new Aspose.Pdf.Text.TextFragment("$p / $P ");
fTxt.TextState.Font = Aspose.Pdf.Text.FontRepository.FindFont("Arial");
fTxt.TextState.FontSize = 10;
fTxt.HorizontalAlignment = Aspose.Pdf.HorizontalAlignment.Right;
footer.Paragraphs.Add(fTxt);
foreach (Aspose.Pdf.Page page in doc.Pages)
{
    page.Footer = footer;
}

try
{
    using (MemoryStream pdfStream = new MemoryStream())
    {
        doc.Save(pdfStream);
        pdfStream.Flush();
        returnValue.Data = pdfStream.ToArray();
        if (doc.Pages.Count > 0)
            returnValue.Tag = doc.Pages.Count;
    }
}

Other major issues:

1) I have issues with printing. In the PDF attached to this reply you may notice a lot of empty space on the right, and sometimes the content gets cut off when printing.

2) Can we set the page size to A4 and wrap all the content into that page size?

3) Can you please give an update on the XREF exception?

We had no such issues in the previous version of Aspose.

Can you please review and provide a solution?


Thanks,

Hi Vennila,


Vennila:

I have seen from previous posts that "Aspose.Pdf.Generator.Pdf" is a legacy approach. I am still able to export to PDF when I use it.

What is recommended? Will the old approach using the PDF generator keep working for a while?

As we have already suggested several times above, the old generator (Aspose.Pdf.Generator) is legacy and obsolete. We are no longer making changes or fixing issues in it, only in the new generator (Aspose.Pdf). We recommend using the new generator (Aspose.Pdf).


Vennila:

I am also facing an issue with the XREF table when exporting to PDF (please review the screenshot). Can a solution or workaround be provided soon?

Thanks,



I am afraid we cannot suggest anything without replicating the issue on our end. Please share your sample input/output documents and code here. We will look into it and guide you accordingly.

We are sorry for the inconvenience caused.

Best Regards,
Hi Mr. Ahmad,
I attached the files to my previous reply. I am attaching them again as requested.
As per your reply:
"As we already suggested above number of times, old generator(Aspose.Pdf.Generator) is legacy and it is obsolete. Now we are not making any changes or issue fixing in it but new generator(Aspose.Pdf). It is recommended to use new generator(Aspose.Pdf)."

We have used the new approach and I have pasted the code we are using to export to PDF in my previous post. I am again pasting the same code below.

Aspose.Pdf.HtmlLoadOptions options = new Aspose.Pdf.HtmlLoadOptions(embeddedImageLocation);
options.PageInfo.Margin = new Aspose.Pdf.MarginInfo { Left = 30, Right = 10, Top = 30, Bottom = 10 };
options.PageInfo.Width = Aspose.Pdf.PageSize.A4.Width;
options.PageInfo.Height = Aspose.Pdf.PageSize.A4.Height;
options.PageInfo.IsLandscape = false;
byte[] byteArray = Encoding.UTF8.GetBytes(requestHtml.ToString());
MemoryStream stream = new MemoryStream(byteArray);
Aspose.Pdf.Document doc = new Aspose.Pdf.Document(stream, options);
doc.Info.Title = string.Format("{0} - {1}", returnValue.Id, returnValue.Name);
doc.Info.Subject = "Request Details";
doc.Info.Author = "me";
Aspose.Pdf.HeaderFooter footer = new Aspose.Pdf.HeaderFooter();
Aspose.Pdf.Text.TextFragment fTxt = new Aspose.Pdf.Text.TextFragment("$p / $P ");
fTxt.TextState.Font = Aspose.Pdf.Text.FontRepository.FindFont("Arial");
fTxt.TextState.FontSize = 10;
fTxt.HorizontalAlignment = Aspose.Pdf.HorizontalAlignment.Right;
footer.Paragraphs.Add(fTxt);
foreach (Aspose.Pdf.Page page in doc.Pages)
{
    page.Footer = footer;

    doc.Pages[page.Number].SetPageSize(597.6, 842.4);
}

try
{
    using (MemoryStream pdfStream = new MemoryStream())
    {
        doc.Save(pdfStream);
        pdfStream.Flush();
        returnValue.Data = pdfStream.ToArray();
        if (doc.Pages.Count > 0)
            returnValue.Tag = doc.Pages.Count;
    }
}

My questions:
1. I need the page number to appear in the printable area. I attached a sample before and am attaching it again now. (Note: I know how to print the page number; I am not able to print it in the printable area, as you can see in the PDF.)

2. I am setting the page size using the code below.
foreach (Aspose.Pdf.Page page in doc.Pages)
{
    doc.Pages[page.Number].SetPageSize(597.6, 842.4);
}
How can I wrap the content or change the content size to fit the page size? (This was not an issue with the previous legacy code.) Please see the attached HTML and sample PDF for reference.


3. For the XREF exception, I am attaching the PDF. We get this exception in all the documents that are exported to PDF. We use PDFXChange Viewer to view the PDF documents.
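
For reference on the page size in question 2: PDF page dimensions are given in points (1/72 inch), so 597.6 x 842.4 corresponds to 8.3 x 11.7 inches, slightly different from A4 at 210 x 297 mm (about 595.28 x 841.89 points). The conversion can be sketched as below; the PdfUnits class and its method names are hypothetical helpers for illustration, not part of Aspose.Pdf.

```csharp
using System;

// Hypothetical helper (not an Aspose.Pdf type): converts physical
// lengths to PDF points, where 1 point = 1/72 inch.
public static class PdfUnits
{
    public const double PointsPerInch = 72.0;
    public const double MillimetersPerInch = 25.4;

    // Inches to points.
    public static double InchesToPoints(double inches)
    {
        return inches * PointsPerInch;
    }

    // Millimeters to points, going through inches.
    public static double MillimetersToPoints(double millimeters)
    {
        return millimeters / MillimetersPerInch * PointsPerInch;
    }
}
```

For example, PdfUnits.InchesToPoints(8.3) gives approximately 597.6, and PdfUnits.MillimetersToPoints(210) gives approximately 595.28, which is the A4 width.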
Hi Vennila,


Vennila:

1. I need the page number to appear in the printable area. I attached a sample before and am attaching it again now. (Note: I know how to print the page number; I am not able to print it in the printable area, as you can see in the PDF.)

2. I am setting the page size using the code below.
foreach (Aspose.Pdf.Page page in doc.Pages)
{
    doc.Pages[page.Number].SetPageSize(597.6, 842.4);
}
How can I wrap the content or change the content size to fit the page size? (This was not an issue with the previous legacy code.) Please see the attached HTML and sample PDF for reference.


Please try the following code snippet; it adds the page number in the printable area instead of over the page contents.

Furthermore, please note that we cannot set the width/height of the resultant PDF document to less than the minimal width/height of the HTML page. The A4 width is 595, whereas the minimal width of your HTML page is 2458.15px because of the HTML footer; you can fix this in your source HTML document. As another workaround, we can resize the contents of the resultant PDF document using the ResizeContents() method of the PdfFileEditor class, but for a wide HTML document the fidelity of the resultant PDF will be sacrificed.

using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Facades;
using Aspose.Pdf.Text;

// Load the HTML once to find the page size it is rendered at
Aspose.Pdf.HtmlLoadOptions options1 = new Aspose.Pdf.HtmlLoadOptions("E:/data/");
Aspose.Pdf.Document doc1 = new Aspose.Pdf.Document("E:/data/HTML_headerfooter.html", options1);
Aspose.Pdf.Page page1 = doc1.Pages[1];

// Build a new document and add the same HTML as an HtmlFragment
Aspose.Pdf.Document doc = new Aspose.Pdf.Document();
HtmlFragment html = new HtmlFragment(File.ReadAllText("E:/data/HTML_headerfooter.html"));
html.HtmlLoadOptionsOfInstance = new Aspose.Pdf.HtmlLoadOptions("E:/data/");
Aspose.Pdf.Page page = doc.Pages.Add();

// set page footer
page.OnBeforePageGenerate += OnPageGenerate;
page.PageInfo.Margin = new Aspose.Pdf.MarginInfo { Left = 30, Right = 10, Top = 30, Bottom = 30 };
page.SetPageSize(page1.PageInfo.Width, page1.PageInfo.Height);
page.Paragraphs.Add(html);

doc.Info.Title = string.Format("{0} - {1}", "1234", "Venneila");
doc.Info.Subject = "Request Details";
doc.Info.Author = "me";
doc.ProcessParagraphs();

// resize contents of the resultant PDF (page numbers are 1-based)
int[] page_cnt = new int[doc.Pages.Count];
for (int i = 0; i < doc.Pages.Count; i++)
    page_cnt[i] = i + 1;

PdfFileEditor pfe1 = new PdfFileEditor();
pfe1.ResizeContents(doc, page_cnt, PdfFileEditor.ContentsResizeParameters.PageResize(Aspose.Pdf.PageSize.A4.Width, Aspose.Pdf.PageSize.A4.Height));
doc.Save("E:/data/HtmltoPDFDOM_htmlfragment.pdf");

-----

// Called before each page is generated; adds a right-aligned "page / total" footer
public static void OnPageGenerate(Aspose.Pdf.Page page)
{
    page.Footer = new Aspose.Pdf.HeaderFooter();
    page.Footer.Margin = new Aspose.Pdf.MarginInfo();

    TextFragment footerText = new TextFragment();
    TextSegment footerSegment = new TextSegment("$p / $P ");
    footerSegment.TextState.FontSize = 10;
    //footerSegment.TextState.HorizontalAlignment = Aspose.Pdf.HorizontalAlignment.Right;
    footerText.Segments.Add(footerSegment);
    footerText.TextState.HorizontalAlignment = Aspose.Pdf.HorizontalAlignment.Right;

    page.Footer.Paragraphs.Add(footerText);
}
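
To put the fidelity trade-off mentioned above in numbers, here is a rough sketch of the arithmetic only. It assumes the 2458.15 HTML width and the 595 A4 width are compared in the same units, as in the note above, and that the contents are scaled uniformly. The PageFit class and its methods are hypothetical helpers, not Aspose.Pdf APIs.

```csharp
using System;

// Hypothetical helper (not an Aspose.Pdf type): plain arithmetic that
// estimates how much wide content shrinks when fitted to a narrower page.
public static class PageFit
{
    // Uniform scale factor needed so that contentWidth fits in targetWidth.
    // Content that already fits is left at scale 1.0 (never scaled up).
    public static double ScaleToFit(double contentWidth, double targetWidth)
    {
        if (contentWidth <= 0 || targetWidth <= 0)
            throw new ArgumentException("Widths must be positive.");
        return Math.Min(1.0, targetWidth / contentWidth);
    }

    // Effective font size once the content has been scaled.
    public static double ScaledFontSize(double fontSize, double scale)
    {
        return fontSize * scale;
    }
}
```

With these inputs, PageFit.ScaleToFit(2458.15, 595) is roughly 0.24, so text set at 10 points would end up at about 2.4 points after resizing. That is why narrowing the HTML itself (here, the wide footer) is likely the better fix when readability matters.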


Vennila:

3. For the XREF exception, I am attaching the PDF. We get this exception in all the documents that are exported to PDF. We use PDFXChange Viewer to view the PDF documents.

We are looking into the issue and will update you as soon as possible.

Best Regards,
Hi Vennila,

Vennila:


3. For the XREF exception, I am attaching the PDF. We get this exception in all the documents that are exported to PDF. We use PDFXChange Viewer to view the PDF documents.

Thanks for your patience. I have noticed the XREF error you reported with Aspose.Pdf generated PDF documents in PDF-XChange Viewer, logged it in our issue tracking system as PDFNEWNET-40263, and linked your request to it. We will keep you updated on the issue status via this thread.

We are sorry for the inconvenience.

Best Regards,