HTML table width of 100% is not working when converting to PDF

Hi Team,

I am having an issue with an HTML table whose width is set to 100%: when I convert the HTML to PDF, the table covers only about 60% of the page width.

We are sure that the issue is with the PDF conversion, as we found that other Aspose customers are facing an identical issue.
Please refer to "Table Formatting Issues Saving to PDF".

I am attaching the HTML and the output PDF that shows the issue: Width Issue.zip (44.5 KB)

Regards,
Mukesh Singh

@msingh02

Thanks for your inquiry. We have tested the scenario using Aspose.Words for .NET 17.10 and were unable to reproduce the issue. Please note that Aspose.Words mimics MS Word behavior: if you render the shared HTML file to PDF using MS Word, you will get the same results.

Aspose-Test_AW1710.pdf (43.8 KB)
TestHtml_msw.pdf (68.5 KB)

I am able to replicate this issue in version 17.10 as well; see output.pdf (50.2 KB).
To replicate it, use the code below:

using (var memoryStream = new MemoryStream(Encoding.UTF8.GetBytes(HtmlToPdfApplyCssFixes(File.ReadAllText(@"C:\Dev\Mukesh\TestHtml.html")))))
{
    Aspose.Words.Document docw = new Aspose.Words.Document(memoryStream);

    foreach (Aspose.Words.Section sec in docw.Sections)
    {

        Aspose.Words.PageSetup ps = sec.PageSetup;
        //  1 inch equals 72 points
        ps.TopMargin = 0;
        ps.RightMargin = 0;
        ps.BottomMargin = 0;
        ps.LeftMargin = 0;
        ps.PageWidth = Aspose.Pdf.Generator.PageSize.LetterWidth;
        ps.PageHeight = Aspose.Pdf.Generator.PageSize.LetterHeight;
    }
    foreach (Aspose.Words.Tables.Table table in docw.GetChildNodes(NodeType.Table, true))
    {

        table.PreferredWidth = PreferredWidth.FromPercent(100);
    }
    docw.Save("d:\\output.pdf");
}

It looks OK in your PDF because you are using the default margins. Please let me know the fix or a workaround for it.

Regards,
Mukesh Singh

@msingh02

Thanks for sharing the additional information. Please set the AutoFit behavior of the table as follows; it will help you accomplish the task.

.....
foreach (Aspose.Words.Tables.Table table in docw.GetChildNodes(NodeType.Table, true))
{
    table.PreferredWidth = PreferredWidth.FromPercent(100);
    table.AutoFit(AutoFitBehavior.AutoFitToContents);
} 
.....

Thanks @tilal.ahmad

The code you suggested works for part of the document. However, after applying it, the heading layout is disturbed, which was fine without the AutoFit call.

I am attaching the HTML and the PDF documents with and without AutoFit.
Testing.zip (85.4 KB)

Thanks,
Mukesh Singh

@msingh02

Thanks for your feedback. Please use the AutoFitToWindow value of AutoFitBehavior; it will help you get the expected results. Aspose-Test_margin0.pdf (27.4 KB)

//table.PreferredWidth = PreferredWidth.FromPercent(100);
table.AutoFit(AutoFitBehavior.AutoFitToWindow);

Thanks @tilal.ahmad, it worked for the given case. We have some HTML with chart images where the content is cropped on the right-hand side. Is there a method or setting to explicitly shrink the HTML content in those cases so that it fits within the width of the page?

@msingh02

Thanks for your inquiry. We would appreciate it if you could share the problematic HTML file here as a ZIP file. We will look into it and guide you accordingly.

Please find the problematic HTML attached: Testing 2.zip (869.3 KB). A few of the images in this HTML are rendered outside of the PDF page. Please suggest whether there is any setting to restrict images to the page width.

@msingh02

Thanks for your inquiry. Please check the following sample code snippet. Hopefully it will help you adjust the images to the page width.

foreach (Aspose.Words.Tables.Table table in docw.GetChildNodes(NodeType.Table, true))
{

    //table.PreferredWidth = PreferredWidth.FromPercent(100);
    table.AutoFit(AutoFitBehavior.AutoFitToWindow);
}
UpdateShapes(docw);
docw.UpdatePageLayout();
.....
.....
public static void UpdateShapes(Aspose.Words.Document doc)
{
    // The layout objects expose the rendered size of each node.
    LayoutCollector collector = new LayoutCollector(doc);
    LayoutEnumerator enumerator = new LayoutEnumerator(doc);

    foreach (Section section in doc.Sections)
    {
        double pageWidth = section.PageSetup.PageWidth;

        // Check only the shapes in this section against this section's page width.
        foreach (Shape shape in section.GetChildNodes(NodeType.Shape, true))
        {
            enumerator.Current = collector.GetEntity(shape);
            float width = enumerator.Rectangle.Width;

            // Resize the image if it is rendered wider than the page.
            if (width > pageWidth)
            {
                ResizeLargeImage(shape);
            }
        }
    }
}

public static void ResizeLargeImage(Shape image)
{
    // Return if this shape is not an image.
    if (!image.HasImage)
        return;

    // Calculate the free space based on an inline or floating image. If inline we must take the page margins into account.
    PageSetup ps = image.ParentParagraph.ParentSection.PageSetup;
    double freePageWidth = image.IsInline ? ps.PageWidth - ps.LeftMargin - ps.RightMargin : ps.PageWidth;
    double freePageHeight = image.IsInline ? ps.PageHeight - ps.TopMargin - ps.BottomMargin : ps.PageHeight;

    Boolean exceedsMaxPageSize = image.Width > freePageWidth || image.Height > freePageHeight;

    if (exceedsMaxPageSize)
    {
        // Calculate the ratio to fit the page size based on which side is longer.
        Boolean widthLonger = (image.Width > image.Height);
        double ratio = widthLonger ? freePageWidth / image.Width : freePageHeight / image.Height;

        // Set the new size.
        image.Width = image.Width * ratio;
        image.Height = image.Height * ratio;
    }
}
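
A note on the ratio in ResizeLargeImage above: it is taken from whichever side of the image is longer, so the other side can still exceed the page when the image and the page have different proportions. For example, a 1000 x 1100 pt image on a Letter page (612 x 792 pt) with zero margins is scaled by 792 / 1100 = 0.72, which leaves it 720 pt wide. The following is only a sketch of an alternative that takes the smaller of the two ratios; ImageFit and FitRatio are hypothetical names, not part of Aspose.Words.

```csharp
using System;

// Hypothetical helper, not part of Aspose.Words.
public static class ImageFit
{
    // Returns the factor by which a (width x height) box must be scaled
    // to fit inside (maxWidth x maxHeight). Never scales up: the result is at most 1.
    public static double FitRatio(double width, double height, double maxWidth, double maxHeight)
    {
        if (width <= 0 || height <= 0)
            throw new ArgumentException("width and height must be positive");

        // The smaller ratio is the one that makes both sides fit.
        double ratio = Math.Min(maxWidth / width, maxHeight / height);
        return Math.Min(ratio, 1.0);
    }
}
```

Inside ResizeLargeImage, the result of FitRatio(image.Width, image.Height, freePageWidth, freePageHeight) could then be multiplied into image.Width and image.Height in place of the existing ratio.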

@tilal.ahmad It worked for the oversized image. Can you help me with the overlapping images as well? If you look at the price chart image, it is not displayed fully. I appreciate all your efforts to help us.

Thanks,
Mukesh

@msingh02

Thanks for your feedback. We would appreciate it if you could share your expected PDF document here. It will help us address your requirement exactly.

@tilal.ahmad Please find the expected output, ExpectedOutput.pdf (576.0 KB), and the actual output, ActualOutput.pdf (571.3 KB), attached. The difference between the two PDF documents is the price chart image: in the expected PDF you can see the complete image. It would be great if you could provide a complete example that produces our expected result.

Thanks,
Mukesh

@tilal.ahmad Another issue is that content on the right-hand side is being cut off. The table on the first page, and the content and borders on the 4th and 5th pages, are cut off as well.

@msingh02

Thanks for sharing the additional information. We have tested the scenario and reproduced the reported issues. We have logged the following tickets in our issue tracking system for further investigation and rectification. We will notify you as soon as these issues are resolved. We are sorry for the inconvenience.

WORDSNET-16159: Table data trim issue
WORDSNET-16160: Tables missing border issue
WORDSNET-16161: image overlapping issue

@msingh02

Thanks for your patience. We have investigated WORDSNET-16159, the table data trimming in HTML to DOCX conversion, and found that it is not a bug but expected behavior. The text blocks in the source HTML document are placed in fixed-width table cells (the width is specified via CSS rules) that do not auto-resize when the pages of the resulting document are too narrow to accommodate all of the content. Browsers demonstrate the same behavior (the table is cropped) when the browser window is too narrow, as shown in the attached ChromeReport.png.

If you need to convert HTML to DOCX, you can either decrease the widths of the table elements in the source HTML document or make the pages of the resulting DOCX document wide enough via PageSetup.
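
For the second option, the page width to set can be estimated from the table's CSS width. The following is only a sketch, assuming the fixed widths are given in CSS pixels and that one CSS pixel maps to 0.75 points (96 pixels per inch, 72 points per inch); PageWidthHelper and RequiredPageWidth are hypothetical names, not part of Aspose.Words.

```csharp
// Hypothetical helper, not part of Aspose.Words.
public static class PageWidthHelper
{
    // Assumption: 96 CSS pixels per inch and 72 points per inch.
    private const double PointsPerCssPixel = 72.0 / 96.0;

    // Returns the page width in points needed to hold a table of the given
    // CSS pixel width plus the left and right page margins (in points).
    public static double RequiredPageWidth(double tableWidthPx, double leftMargin, double rightMargin)
    {
        return tableWidthPx * PointsPerCssPixel + leftMargin + rightMargin;
    }
}
```

For example, a table that is 1000 px wide on a page with zero margins needs 750 points, which is wider than a Letter page (612 points). The result could then be assigned to section.PageSetup.PageWidth for each section before saving.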

Furthermore, we will keep you updated about resolution progress of other issues logged against HTML to PDF conversion within this forum thread.

The issues you found earlier (filed as WORDSNET-16160) have been fixed in the Aspose.Words for .NET 23.10 update, which is also available on NuGet.