Convert XLSX to HTML | Insert HTML in DOCX & finally Preserve Table Layout when Converting DOCX to PDF using C# .NET

Hi ASPOSE Team,

The PDF formatting issues still exist. We have updated ASPOSE to version 20.5 for .NET and are still experiencing the same problems. Please see below:

After setting the column widths for the Excel sheet with AutoFitColumns(options), the output is correct when there is an empty column after a column that has data. When there is no empty column, however, the text gets cut off and pushes all the other data down.

I also tried AutoFitRows(), but it does not seem to help.

Looking forward to hearing back from you soon.

Thank you.

@DevD2020,

To evaluate your issue accurately, we need your template file, the output PDF file, and runnable sample code. Please provide your source xls/xlsx file and the output PDFs; we will look into this issue and update you soon. Also, it would be better to create a separate thread for each issue. This helps us evaluate each issue precisely and resolve it sooner.

PS. Please zip the files prior to attaching.

PDF_Format_Issues.zip (84.4 KB)

Please see the attached files. The red arrows mark the errors we’re experiencing: the text is not formatted in the PDF the way it is in the Word file. We are basically converting the Excel file to HTML, inserting that into Word, and then saving as PDF. If you can create a new thread, please do so; otherwise, I’m not sure how to create one.

Thanks!

@DevD2020,

After an initial test with the latest (20.5) version of Aspose.Words for .NET, we were unable to reproduce this issue on our end. Please see the output PDF document that we generated from “WordFile.docx” on our end by using the following simple C# code:

C# Code:

Document doc = new Document("E:\\PDF_Format_Issues\\WordFile.docx");
doc.Save("E:\\PDF_Format_Issues\\20.5.pdf");

Please also create a standalone simple and runnable Console Application (source code without compilation errors) that helps us to reproduce your current problem on our end and attach it here for further testing. Please do not include Aspose.Words and Aspose.Cells DLL files in it to reduce the file size. Thanks for your cooperation.

I’m not sure what “unable to reproduce” means. Does it mean that the issue isn’t fixable? PDF_Format_Issue.zip (11.3 KB)

Can you please try to upload the Excel file (see attached) and let me know the Word/PDF results? I also included how the file is being saved as a PDF after conversion from Excel to Word. Do we need to specify where the file should be saved?

I have created a new thread, but I’m not sure whether anyone is currently looking into it.

@DevD2020,

We have produced a PDF (20.5.pdf (55.9 KB)) document by using the following code on our end:

private static void ConvertExcelToHtml()
{
    DocumentBuilder builder = new DocumentBuilder();
    builder.PageSetup.Orientation = Aspose.Words.Orientation.Landscape;

    Workbook excelWorkbook = new Workbook(@"E:\Temp\PDF_Format_Issues\Excel_Error.xlsx");

    // force margins and orientation to the excel workbook
    Aspose.Cells.PageSetup ps = null;
    bool visibleSheet = false;

    var lastSheetIndex = excelWorkbook.Worksheets.Where(x => x.IsVisible).LastOrDefault()?.Index ?? 0;

    // set the options so that we export the active worksheet one at a time, and only the print area
    Aspose.Cells.HtmlSaveOptions opts = new Aspose.Cells.HtmlSaveOptions(Aspose.Cells.SaveFormat.Html);
    opts.ExportActiveWorksheetOnly = true;
    opts.ExportPrintAreaOnly = true;
    var printOptions = new ImageOrPrintOptions();
    var wsMax = excelWorkbook.Worksheets.Count();

    for (int wsIdx = 0; wsIdx < wsMax; wsIdx++)
    {
        var ws = excelWorkbook.Worksheets[wsIdx];
        // Auto-fit the columns of the existing worksheet; without this, the column widths are not recognized and the PDF formatting breaks
        AutoFitterOptions options = new AutoFitterOptions();

        options.AutoFitMergedCells = true;

        ws.AutoFitColumns(options);
        ws.AutoFitRows();

        // set this sheet as active so that it is the one exported
        excelWorkbook.Worksheets.ActiveSheetIndex = ws.Index;
        if (ws.IsVisible)
        {
            visibleSheet = true;
            ps = ws.PageSetup;


            var pageBreaks = ws.GetPrintingPageBreaks(printOptions);
            int p = 0;
            foreach (var pb in pageBreaks)
            {
                // Aspose will not export trailing columns/rows that have no data, so clamp the page bounds to the last data column/row.
                var endCol = pb.EndColumn <= ws.Cells.MaxDataColumn ? pb.EndColumn : ws.Cells.MaxDataColumn;
                var endRow = pb.EndRow <= ws.Cells.MaxDataRow ? pb.EndRow : ws.Cells.MaxDataRow;

                // stops horizontally positioned empty pages from attempting to render
                if (pb.StartColumn <= endCol)
                {
                    // set the print area before getting the html
                    ps.PrintArea = string.Format("{0}:{1}", CellsHelper.CellIndexToName(pb.StartRow, pb.StartColumn), CellsHelper.CellIndexToName(endRow, endCol));

                    using (var stream = new MemoryStream())
                    {
                        // insert the excel sheet html into the word document
                        excelWorkbook.Save(stream, opts);
                        builder.InsertHtml(System.Text.Encoding.UTF8.GetString(stream.ToArray()));

                        File.WriteAllText(@"E:\Temp\PDF_Format_Issues\html_" + p + ".html", System.Text.Encoding.UTF8.GetString(stream.ToArray()));

                        // insert a page break between each exported page, but not after the last one
                        if (!(ws.Index == lastSheetIndex && p == pageBreaks.Length - 1))
                        {
                            builder.InsertBreak(BreakType.PageBreak);
                        }
                    }
                }

                p++;
            }
        }
    }

    builder.Document.Save(@"E:\Temp\PDF_Format_Issues\20.5.docx");
    builder.Document.Save(@"E:\Temp\PDF_Format_Issues\20.5.pdf");
}
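
The print-area step in the loop above (clamp each printing page to the last row and column that contain data, then build an A1-style range) can be tried in isolation. The following is only a minimal sketch with no Aspose dependency: `PrintAreaSketch`, `ColumnName`, `CellName` and `ClampedPrintArea` are hypothetical names introduced here for illustration, and `CellName` is a stand-in for `CellsHelper.CellIndexToName`, assuming zero-based row and column indexes. Unlike the code above, which only checks the start column, the sketch also skips pages whose start row lies past the data.

```csharp
using System;

public static class PrintAreaSketch
{
    // Hypothetical helper: zero-based column index to Excel letters (0 -> "A", 26 -> "AA").
    public static string ColumnName(int column)
    {
        if (column < 0) throw new ArgumentOutOfRangeException(nameof(column));
        string name = string.Empty;
        // Bijective base-26: peel off the least significant letter on each pass.
        for (int n = column + 1; n > 0; n = (n - 1) / 26)
        {
            name = (char)('A' + (n - 1) % 26) + name;
        }
        return name;
    }

    // Hypothetical stand-in for CellsHelper.CellIndexToName (zero-based indexes assumed).
    public static string CellName(int row, int column)
    {
        return ColumnName(column) + (row + 1);
    }

    // Clamps a page's end row/column to the last data row/column.
    // Returns null when the page lies entirely outside the data.
    public static string ClampedPrintArea(
        int startRow, int startColumn, int endRow, int endColumn,
        int maxDataRow, int maxDataColumn)
    {
        int lastRow = Math.Min(endRow, maxDataRow);
        int lastColumn = Math.Min(endColumn, maxDataColumn);
        if (startColumn > lastColumn || startRow > lastRow)
        {
            return null;
        }
        return CellName(startRow, startColumn) + ":" + CellName(lastRow, lastColumn);
    }
}
```

For example, `PrintAreaSketch.ClampedPrintArea(0, 0, 49, 9, 20, 5)` yields `A1:F21`, and a page that starts at column index 6 when the last data column is 5 yields `null`, so nothing would be exported for it.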

We can see that the table column cells have incorrect widths in the rendered PDF. We have logged this problem in our issue tracking system as WORDSNET-20379. We will look further into the details of this problem and keep you updated on the status of the fix. We apologize for the inconvenience.

Thank you. Do you happen to know how long it will take to get this issue resolved?

@DevD2020,

I am afraid your issue is currently in the queue, pending analysis. There are no estimates available at the moment. Once the analysis of this issue is completed and the root cause is determined, we may be able to calculate and share an ETA with you. We apologize for any inconvenience.

Okay, please keep me posted as soon as possible, as this problem is causing a major issue in our application. Thank you.

@DevD2020,

Sure, we will keep you posted on any further updates and let you know when this issue is resolved.

Hi @awais.hafeez @Amjad_Sahi - Do you happen to have any updates? If not, when can we expect this issue to be resolved? Thank you.

@DevD2020,

Regarding WORDSNET-20379, we have completed the analysis of this issue, but I am afraid that, because of its complexity, the implementation of the fix has been postponed until a later date. There are no estimates available at the moment. We will inform you via this thread as soon as this issue is resolved. We apologize for the inconvenience.

@awais.hafeez Do you have any updates?

@DevD2020,

I am afraid WORDSNET-20379 is not resolved yet, and there is no further news about this issue. This is a complex issue, and because of that the implementation of the fix has been postponed. Also, there is no ETA available at the moment. We apologize for the inconvenience.

Hi Awais - Are there any updates on this issue?

@DevD2020,

The issue actually depends on the resolution of an extremely complex internal issue (WORDSNET-832). Therefore, WORDSNET-20379 is postponed at least until WORDSNET-832 is resolved; we will review your issue after that. Unfortunately, there is no ETA available at the moment. We will inform you via this forum thread as soon as this issue is resolved or any further updates are available. We apologize for the inconvenience.

The issues you found earlier (filed as WORDSNET-20379) have been fixed in the Aspose.Words for .NET 23.10 update, which is also available on NuGet.