HTML Table breaking wrongly over page break after PDF conversion

Hi,

I’ve just started using Aspose PDF, and I’m experiencing a problem where a table built up in HTML, then passed through the Aspose PDF generator, breaks over a page. See attached image for the issue - I want the table to have one clean straight line along the bottom before continuing on the next page.

Original HTML is as follows:

<table style="text-align: center; vertical-align: middle; white-space:nowrap; overflow-wrap: break-word;">

    <tr>
        <td colspan="2" class="subheader">Subheader: subheading</td>
        <td class="detailsheader">Comments</td>
        <td class="detailsheader"></td>
    </tr>

    <tr>
        <td class="attributeheader">Attribute 1</td><td class="detail">Attribute 1 Value</td>
        <td class="comments" rowspan="6">Fusce ultricies sagittis quam, vitae ornare felis vestibulum in. Nulla ut accumsan quam. Pellentesque eget nulla sed eros molestie volutpat. Quisque maximus gravida egestas. Aenean scelerisque tellus nec sollicitudin consectetur. Integer massa sapien, consectetur sed dignissim vitae, hendrerit placerat lorem. Duis sem felis, tristique nec augue aliquam, sollicitudin dictum mi. Sed aliquet orci mi, quis commodo turpis </td>
        <td class="emptycell" rowspan="6"></td>
    </tr>
    <tr>
        <td class="attributeheader">Attribute 2</td><td class="detail">Attribute 2 Value</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
    </tr>
    <tr>
        <td class="attributeheader">Attribute 3</td><td class="detail">Attribute 3 Value</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
    </tr>
    <tr>
        <td class="attributeheader">Attribute 4</td><td class="detail">Attribute 4 Value</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
    </tr>
    <tr>
        <td class="attributeheader">Attribute 5</td><td class="detail">Attribute 5 Value</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
    </tr>
    <tr>
        <td class="attributeheader">Attribute 6</td><td class="detail">Attribute 6 Value</td>
        <td>&nbsp;</td>
        <td>&nbsp;</td>
    </tr>

</table>

Our .NET code that parses the HTML is as follows. Our HTML is in the ‘paragraph’ element.

private static byte[] ConvertHtmlToPdf(DocumentMetadata metadata)
{
    // memory stream for pdf
    var ms = new MemoryStream();

    // Create new pdf
    var pdf = new Pdf();

    // margin
    pdf.PageSetup.Margin.Left = 25;
    pdf.PageSetup.Margin.Right = 25;
    pdf.PageSetup.Margin.Top = 30;

    // Create the main document
    var documentSection = new Section(pdf);

    documentSection.FirstPageInfo = new PageSetup(documentSection);

    foreach (var paragraph in metadata.Paragraphs)
    {
        if (paragraph.PageBreak == PageBreak.Before)
        {
            documentSection.Paragraphs.Add(new Text(CPAGEBREAK));
        }

        var documentTextReader = new StringReader(paragraph.Content);
        var documentText = new Text(documentTextReader.ReadToEnd());

        documentTextReader.Close();
        documentTextReader.Dispose();

        documentText.IsHtmlTagSupported = true;
        documentText.IsKeptTogether = true;
        documentText.IsKeptWithNext = true;

        documentSection.Paragraphs.Add(documentText);

        if (paragraph.PageBreak == PageBreak.After)
        {
            documentSection.Paragraphs.Add(new Text(CPAGEBREAK));
        }
    }

    if (metadata.ShowHeaderOnPage != ShowHeaderOnPage.None)
    {
        // Create a header
        var header = new HeaderFooter();

        // Read the header
        var htmlReader = new StringReader(metadata.HeaderContent);
        var headerText = new Text(htmlReader.ReadToEnd());

        htmlReader.Close();
        htmlReader.Dispose();

        headerText.IsHtmlTagSupported = true;
        header.Paragraphs.Add(headerText);
        header.Margin.Top = metadata.HeaderMarginTop;
        header.Margin.Bottom = metadata.HeaderMarginBottom;

        // Add the header to each page
        documentSection.EvenHeader = header;
        documentSection.OddHeader = header;

        if (metadata.ShowHeaderOnPage == ShowHeaderOnPage.AllExceptFirstPage)
        {
            // Do not show on the first page
            documentSection.OddHeader.IsSubsequentPagesOnly = true;
        }
    }

    if (metadata.ShowPageNumbers)
    {
        documentSection.StartingPageNumber = 1;

        var footer = new HeaderFooter(documentSection);
        documentSection.OddFooter = footer;
        documentSection.EvenFooter = footer;

        var footerText = new Text("Page $p of $P");
        footerText.TextInfo.Alignment = AlignmentType.Center;
        footer.Paragraphs.Add(footerText);
    }

    pdf.Sections.Add(documentSection);

    // Save the pdf to the memory stream
    pdf.Save(ms);

    var msArray = ms.ToArray();

    ms.Close();
    ms.Dispose();

    return msArray;
}

Hi Stewart,


Thanks for your inquiry. Please use the new DOM approach for adding HTML text to a PDF document. It will resolve the issue. Please also note that Aspose.Pdf.Generator is the old approach and we will discontinue it in the near future. We therefore recommend the new generator, Aspose.Pdf, which can be used both for creating a PDF from scratch and for manipulating an existing one.

Please feel free to contact us for any further assistance.

Best Regards,

Hi Tilal,

Thanks for getting back to me.

I had a quick look at the new HtmlLoadOptions. We’re licensed, and currently use version 6.3.0.0. Should we just be able to upgrade to the latest version?

I looked at the trial version of 10.4.0.0, and I can see how the examples convert from html text into PDF. However, if you look at the code sample I included above, we do quite a bit of work using the Aspose Section class, including setting Headers and Footers, using Page Breaks.

Do you have an example how to do this using the new HtmlLoadOptions?

Thanks

Hi Stewart,

Lumo:


I had a quick look at the new HtmlLoadOptions. We're licensed, and currently use version 6.3.0.0. Should we just be able to upgrade to the latest version?


You can check the subscription expiry date in your license; you are entitled to upgrade to any version released before that date. However, if your subscription is too old, you will need to renew it to use the latest version of Aspose.Pdf.

Lumo:


I looked at the trial version of 10.4.0.0, and I can see how the examples convert from html text into PDF. However, if you look at the code sample I included above, we do quite a bit of work using the Aspose Section class, including setting Headers and Footers, using Page Breaks.

Do you have an example how to do this using the new HtmlLoadOptions?


Please note that the Page object is the new DOM's alternative to Section, and you can set the header and footer of a PDF document using the new DOM as well. As for page breaks, can you please share some details of your requirements?

However, if you are using HTML text instead of an HTML file, you can use HtmlFragment to add the HTML text to the PDF. For page breaks, you can use the new IsInNewPage property to force a page break.
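
A minimal sketch of what this could look like with the new DOM, assuming a version of Aspose.Pdf for .NET that includes HtmlFragment. The class name, method name and parameters (bodyHtmlBlocks, headerHtml, breakBeforeEachBlock) are hypothetical stand-ins for the values read from DocumentMetadata in the original code, and the margins are copied from it. The footer with page numbers is left out, and how the header behaves on pages created by overflow should be verified against the version in use.

```csharp
using System.IO;
using Aspose.Pdf;

public static class HtmlToPdfSketch
{
    // bodyHtmlBlocks and headerHtml are hypothetical inputs standing in for
    // metadata.Paragraphs and metadata.HeaderContent in the original code.
    public static byte[] Convert(string[] bodyHtmlBlocks, string headerHtml, bool breakBeforeEachBlock)
    {
        var document = new Document();

        // Page takes the place of the legacy Section.
        Page page = document.Pages.Add();

        // Same margins as the Generator-based code.
        page.PageInfo.Margin.Left = 25;
        page.PageInfo.Margin.Right = 25;
        page.PageInfo.Margin.Top = 30;

        // Header built from HTML text.
        var header = new HeaderFooter();
        header.Paragraphs.Add(new HtmlFragment(headerHtml));
        page.Header = header;

        for (int i = 0; i < bodyHtmlBlocks.Length; i++)
        {
            var fragment = new HtmlFragment(bodyHtmlBlocks[i]);

            // Force a page break before every block except the first, if requested.
            fragment.IsInNewPage = breakBeforeEachBlock && i > 0;

            page.Paragraphs.Add(fragment);
        }

        // Save to memory and return the bytes, as the original method does.
        using (var stream = new MemoryStream())
        {
            document.Save(stream);
            return stream.ToArray();
        }
    }
}
```

Here each HTML block is added as its own HtmlFragment, so a page break becomes a property of the block itself instead of a separate page-break paragraph as in the Generator code.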


Best Regards,

Hi Tilal,

tilal.ahmad:

You can check the subscription expiry date in your license; you are entitled to upgrade to any version released before that date. However, if your subscription is too old, you will need to renew it to use the latest version of Aspose.Pdf.


We are licensed for versions released up to 12th December 2012. From the resource pages on the Aspose website, it looks like this allows us to use version 7.5.0, which is listed on this page:

http://www.aspose.com/community/files/51/.net-components/aspose.pdf-for-.net/full.aspx?ppage=7

However, when I tried to download the DLLs, it took me to the download page for the latest downloads.

Can you please check this link?

Thanks


Hi Stewart,


Thanks for your feedback. I am afraid that, as per our policy, we do not maintain releases older than a year. The version you are looking for is quite old, so we recommend that you download and try the latest version of Aspose.Pdf for .NET. It has a lot of fixes, enhancements and new features. You may request a temporary license for evaluation and renew your subscription accordingly.

Please feel free to contact us for any further assistance.

Best Regards,