AutoFitColumns() is spanning data across wide page

Hello,

We are using AutoFitColumns() method while converting Excel to PDF to get the whole data, but it is getting spanned across wider pages and columns are having too much spaces. Please check it at 3rd page of converted PDF. Also, how to get rid of blank pages coming in the converted PDF? We are using Aspose.Total license version 21.3.0 Please find original and converted files in attachment.

1538041204.zip (646.4 KB)

@Vaidehi123,

The problem “data getting spanned wider pages… too much spaces” looks due to “AllColumnsInOnePagePerSheet” which you might have set in code apart from calling AutoFitColumns. When you set this boolean attribute on, all the columns are pasted on the page which consequently widens the pages. So if you do not want this display then do no use this boolean attribute.

Please note, generally, Aspose.Cells renders a sheet (to PDF) as it is displayed in the print preview of the sheet (in MS Excel). For your needs, you should try to specify your desired Page Setup options (using Page Setup dialog) and now take the print preview to see how it will be rendered into PDF. Now use relevant PageSetup APIs (provided by Aspose.Cells) in your code before rendering to PDF to get similar results. If you still could not evaluate, kindly do provide your updated Excel file (which shows your desired view when taking the print preview of the sheet) and expected PDF file (by MS Excel). We will check it soon.

@Amjad_Sahi Thanks for your response. We can not exclude “AllColumnsInOnePagePerSheet” boolean because of the client’s requirement. I have tried using the Pagesetup options, but still did not get desired output. I have attached original Excel file which gives the desired print preview and the converted pdf using MS Excel. PDF converted by MS Excel is just the same as the desired output except its having blank pages in between. Below is the code which I tried.

string dataDir = System.IO.Path.GetFullPath(@"..\..\");
            Aspose.Cells.License license = new Aspose.Cells.License();
            license.SetLicense(dataDir + "\\Aspose.Total.lic");
            var loadOptions = new LoadOptions();
            loadOptions.CheckExcelRestriction = false;
            Workbook workbook = new Workbook(pInFile, loadOptions);
            Aspose.Cells.PdfSaveOptions pdfSaveOptions = new Aspose.Cells.PdfSaveOptions();
            // option to set all the columns of excel in one page.
            pdfSaveOptions.AllColumnsInOnePagePerSheet = true;
            pdfSaveOptions.MergeAreas = true;
            /* Retain the structure of original excel */
            pdfSaveOptions.ExportDocumentStructure = true;
            /* Formula calculation for any digit formulla applied in excels */
            /*it is best to call Workbook.CalculateFormula() just before rendering the spreadsheet to PDF. 
           * This ensures  that the formula dependent values are recalculated, and the correct
           * values are rendered in the PDF.*/
            workbook.CalculateFormula();
            /* For some sheet, print area is applied, to display all the data in pdf, 
             * below setting is needed */
            
            foreach (Worksheet wks in workbook.Worksheets)
            {
                
                foreach (Cell cell in wks.Cells)
                {
                   
                    Style style = cell.GetStyle();
                    
                    /* if cell is numeric, consider its category type as numeric and update the format */
                    if (cell.Type == CellValueType.IsNumeric &&
                        cell.NumberCategoryType != NumberCategoryType.Number)
                    {
                        
                        style.SetCustom("#,##0.00", true);
                    }
                    style.VerticalAlignment = TextAlignmentType.Center;
                    /* set fonts */
                    style.Font.Name = "Helvetica";
                    cell.SetStyle(style);
                }
               
                wks.PageSetup.PrintArea = "";
                wks.PageSetup.Orientation = PageOrientationType.Portrait;
                wks.PageSetup.IsHFScaleWithDoc = true;
                wks.PageSetup.IsHFAlignMargins = false;
                wks.PageSetup.IsHFDiffFirst = false;
                wks.PageSetup.IsHFDiffOddEven = false;
                wks.PageSetup.Order = PrintOrderType.DownThenOver;
                wks.PageSetup.SetFitToPages(1, 0);
                wks.PageSetup.CenterHorizontally = true;
                wks.PageSetup.BottomMargin = 1;
                wks.PageSetup.LeftMargin = 1;
                wks.PageSetup.RightMargin = 1;
                wks.PageSetup.TopMargin = 1;
                PictureCollection pk = wks.Pictures;
                if (pk?.Count() <= 0)
                {
                    wks.Cells.DeleteBlankRows();
                }
                wks.AutoFitColumns();
            }
            workbook.Save(pOutFile, pdfSaveOptions);

1538041204.zip (556.5 KB)

@Vaidehi123,

I used your newly attached template Excel file and converted to PDF via Aspose.Cells for .NET (latest version) and the resultant PDF looks to me almost same. Could you share what is the major difference b/w Aspose.Cells rendering and MS Excel’s rendering using your template (new) Excel file. Give us some screenshots to evaluate it on our end.

Moreover, those blank pages are sheet ends, so you cannot escape from it (even using MS Excel rendering). One workaround though is copy all the data from different sheets into single sheet using Range.Copy() method but it might have other drawbacks. May be you could use Aspose.PDF API to re-process the output PDF to remove those blank areas by it.

@Amjad_Sahi I have tried with latest version to check if it resolves the issue but it won’t work. I have attached screen shot of actual converted and desired pdf page. Also, attached actual converted PDF.
Files.zip (730.4 KB)

@Vaidehi123,

You did not do what I asked previously. I asked you to just render using the following lines to convert your updated Excel file to PDF and check if you find any significant issue when comparing with MS Excel output (PDF).
e.g.
Sample code:

var stringPath = "e:\\test2\\1538041204.xlsx";
var workbook = new Aspose.Cells.Workbook(stringPath);
workbook.Save("out1.pdf", Aspose.Cells.SaveFormat.Pdf); 

If you use your previous or existing code, surely, it will try to paste all the columns (in the sheet) in output PDF (pages), so consequently data would be pasted and pages would be widens. Please see my reply for reference.

@Amjad_Sahi It’s working as expected by excluding other settings now.

For blank pages, I tried to reprocess the generated PDF from excel to exclude blank areas but its giving exception like “At most 4 elements (for any collection) can be viewed in evaluation mode.” But I am using the aspose total license version 21.3.0. I am using below code. Please let me know am I missing something here? Attaching the processed PDF file also.1538041202.pdf (557.8 KB)

            string dataDir = System.IO.Path.GetFullPath(@"..\..\");
            Aspose.Cells.License license = new Aspose.Cells.License();
            license.SetLicense(dataDir + "\\Aspose.Total.lic");
            Document pdfDocument = new Document("\\Files\\1538041202.pdf");
                for (int i = 1; i <= pdfDocument.Pages.Count; i++)
                {
                    bool isBlank = pdfDocument.Pages[i].IsBlank(0.01d);
                    if (isBlank)
                    {
                        pdfDocument.Pages.Delete(i);
                    }
                }
                pdfDocument.Save(pOutFile);

@Vaidehi123,

It looks like you have not set the license for Aspose.PDF for .NET, please set the license in code (as you did for Aspose.Cells for .NET APIs) and give it a try again.

@Amjad_Sahi Yes, that was it. Thanks!

@Vaidehi123,

You are welcome.

@Amjad_Sahi We have some files having text data only in one row, while converting to PDF, data is cutting, Please find the issue in first page of converted PDF. Do we have any solution for it without using AutoFitColumns()? (AutoFitColumns() and AutoFitRows() are problamtic in some other documents so we are not using it.) Please find the code below.
1412050112.zip (102.9 KB)

            Aspose.Cells.License license = new Aspose.Cells.License();
            string workingDirectory = Environment.CurrentDirectory;
            string dataDir = System.IO.Path.GetFullPath(@"..\..\");
            license.SetLicense(dataDir + "\\Aspose.Total.Product.Family.lic");
            FileFormatInfo fileFormatInfo = FileFormatUtil.DetectFileFormat(pInFile);
            Aspose.Cells.LoadOptions loadOptions;
            if (fileFormatInfo.FileFormatType == FileFormatType.Html)
            {
                Aspose.Cells.HtmlLoadOptions htmlLoadOptions = new Aspose.Cells.HtmlLoadOptions
                {
                    AutoFitColsAndRows = true
                };
                loadOptions = htmlLoadOptions;
            }
            else
            {
                loadOptions = new Aspose.Cells.LoadOptions();
            }
            loadOptions.CheckExcelRestriction = false;
            Workbook workbook = new Workbook(pInFile, loadOptions);
            workbook.Settings.CheckExcelRestriction = false;
            Aspose.Cells.PdfSaveOptions pdfSaveOptions = new Aspose.Cells.PdfSaveOptions();
            // option to set all the columns of excel in one page.
            pdfSaveOptions.AllColumnsInOnePagePerSheet = true;
            pdfSaveOptions.MergeAreas = true;
            /* Retain the structure of original excel */
            pdfSaveOptions.ExportDocumentStructure = true;
            /* Formula calculation for any digit formulla applied in excels */
            /*it is best to call Workbook.CalculateFormula() just before rendering the spreadsheet to PDF. 
           * This ensures  that the formula dependent values are recalculated, and the correct
           * values are rendered in the PDF.*/
            workbook.CalculateFormula();
            /* For custom named range, need below code to display number instead of ## */
            Range[] range = workbook.Worksheets?.GetNamedRanges();
            if (range != null && range.Length > 0)
            {
                foreach (var r in range)
                {
                    if (!r.Address.Contains(":"))
                    {
                        r.Worksheet.Cells[r.Address].PutValue(r.Value?.ToString());
                    }

                }
            }
            /* For some sheet, print area is applied, to display all the data in pdf, 
             * below setting is needed */
            foreach (Worksheet wks in workbook.Worksheets)
            {
                foreach (Aspose.Cells.Cell cell in wks.Cells)
                {
                    
                    Style style = cell.GetStyle();
                    /* if cell is numeric, consider its category type as numeric and update the format */
                    if (cell.Type == CellValueType.IsNumeric || cell.Type == CellValueType.IsDateTime)
                    {
                        var updatedValue = (cell.DisplayStringValue.Contains("#") ||
                            cell.DisplayStringValue.ToLower().Contains("e")) ? cell.Value : cell.DisplayStringValue;
                        cell.PutValue(updatedValue.ToString());
                        style.HorizontalAlignment = TextAlignmentType.Right;
                    }
                    style.VerticalAlignment = TextAlignmentType.Center;
                    cell.SetStyle(style);
                }

                wks.PageSetup.PrintArea = "";
                wks.PageSetup.BottomMargin = 1;
                wks.PageSetup.LeftMargin = 1;
                wks.PageSetup.RightMargin = 1;
                wks.PageSetup.TopMargin = 1;
            }
            workbook.Save(pOutFile, pdfSaveOptions);
            return 0;

@Vaidehi123,

Thanks for the files.

I compare your input Excel file Vs your output PDF and both are matched, see the screenshot attached for your reference. You can see long data is also cut in input Excel file and so it is cut in the output PDF.
sc_shot1.png (265.8 KB)

In MS Excel, there are only three ways to show all text data in different cells of the column (e.g. A column), there is no other way around. So, Aspose.Cells also have these three ways.

1). Use auto fit columns operation to show long data in different cells. If you do not want to auto-fit all the columns (AutFitColumns()), you may only auto-fit first column using AutoFitColumn() method (e.g., for the column which has long data/text) in worksheet.

2). Extend the column’s width (via Worksheet.Cells.SetColumnWidth()) to a value which should be equal to longest text width. For example, for your file, the column width of A column can be set to 102 which is little greater than the longest text/data in the column. If you opt to go with this option/approach, you will need to scan each cell in the column and check the width via Cell.GetWidthOfValue() and then set the column’s width to this value.

3). Wrap text/data in the cells of the specified column. Doing this, your long data might be pasted in two lines. So, if you accept this, you may try it. See the sample code segment for your reference:
e.g.
Sample code:

    .......
    // Adding a new Style to the styles
    Style style = workbook.CreateStyle();
    style.IsTextWrapped = true;

    // Creating StyleFlag
    StyleFlag styleFlag = new StyleFlag();
    styleFlag,WrapText = true;

    // Accessing a column (A  column) from the Columns collection
    Column column = worksheet.Cells.Columns[0];

    // Applying the style to the column
    column.ApplyStyle(style, styleFlag);
    .......

Apart from above solutions, I do not find any other better way to show long data/text completely. So, you have to choose any one of the above approaches. If you still think there is any other better way to do it in MS Excel manually, kindly share details with sample Excel file, we can check it further.

@Amjad_Sahi I tried with 3rd option it is not wrapping data to next but hiding the cutting data. Please have a look in the attached converted file 1498548888.pdf (90.3 KB)

@Vaidehi123,

Could you please share your exact (runnable) code (after embedding the suggested code segment) with your input Excel file (you provided earlier) using 3rd option, we will check it soon.

Please find the exact code below:

            Aspose.Cells.License license = new Aspose.Cells.License();
            string workingDirectory = Environment.CurrentDirectory;
            string dataDir = System.IO.Path.GetFullPath(@"..\..\");
            license.SetLicense(dataDir + "\\Aspose.Total.Product.Family.lic");
            FileFormatInfo fileFormatInfo = FileFormatUtil.DetectFileFormat(pInFile);
            Aspose.Cells.LoadOptions loadOptions;
            if (fileFormatInfo.FileFormatType == FileFormatType.Html)
            {
                Aspose.Cells.HtmlLoadOptions htmlLoadOptions = new Aspose.Cells.HtmlLoadOptions
                {
                    AutoFitColsAndRows = true
                };
                loadOptions = htmlLoadOptions;
            }
            else
            {
                loadOptions = new Aspose.Cells.LoadOptions();
            }
            loadOptions.CheckExcelRestriction = false;
            Workbook workbook = new Workbook(pInFile, loadOptions);
            workbook.Settings.CheckExcelRestriction = false;
            Aspose.Cells.PdfSaveOptions pdfSaveOptions = new Aspose.Cells.PdfSaveOptions();
            // option to set all the columns of excel in one page.
            pdfSaveOptions.AllColumnsInOnePagePerSheet = true;
            pdfSaveOptions.MergeAreas = true;
            /* Retain the structure of original excel */
            pdfSaveOptions.ExportDocumentStructure = true;
            /* Formula calculation for any digit formulla applied in excels */
            /*it is best to call Workbook.CalculateFormula() just before rendering the spreadsheet to PDF. 
           * This ensures  that the formula dependent values are recalculated, and the correct
           * values are rendered in the PDF.*/
            workbook.CalculateFormula();
            /* For custom named range, need below code to display number instead of ## */
            Range[] range = workbook.Worksheets?.GetNamedRanges();
       
            if (range != null && range.Length > 0)
            {
                foreach (var r in range)
                {
                    if (!r.Address.Contains(":"))
                    {
                        r.Worksheet.Cells[r.Address].PutValue(r.Value?.ToString());
                    }

                }
            }
            /* For some sheet, print area is applied, to display all the data in pdf, 
             * below setting is needed */
            foreach (Worksheet wks in workbook.Worksheets)
            {

                foreach (Aspose.Cells.Cell cell in wks.Cells)
                {

                    Style style = cell.GetStyle();

                    /* if cell is numeric, consider its category type as numeric and update the format */
                    if (cell.Type == CellValueType.IsNumeric || cell.Type == CellValueType.IsDateTime)
                    {
                        if(cell.DisplayStringValue.Contains("#") ||
                            cell.DisplayStringValue.ToLower().Contains("e"))
                        {
                            cell.PutValue(cell.Value.ToString());
                        }
                     
                        style.HorizontalAlignment = TextAlignmentType.Right;
                    }
                    style.VerticalAlignment = TextAlignmentType.Center;
                    cell.SetStyle(style);
                }

                Style wrapTextStyle = workbook.CreateStyle();
                wrapTextStyle.IsTextWrapped = true;
                // Creating StyleFlag
                StyleFlag styleFlag = new StyleFlag();
                styleFlag.WrapText = true;
                // Accessing a column (A  column) from the Columns collection
                Column column = wks.Cells.Columns[0];
                // Applying the style to the column
                column.ApplyStyle(wrapTextStyle, styleFlag);

                wks.PageSetup.PrintArea = "";
                wks.PageSetup.BottomMargin = 1;
                wks.PageSetup.LeftMargin = 1;
                wks.PageSetup.RightMargin = 1;
                wks.PageSetup.TopMargin = 1;
            }
            workbook.Save(pOutFile, pdfSaveOptions);

@Vaidehi123,

Thanks for your code segment.

If you are saving to Excel file format, you do not need to auto-fit rows or extend the rows height to display the wrapped text in the column, MS Excel automatically extends the rows height a bit when you open the output Excel file into it. Since you are rendering to PDF file format, so you have to extend the rows height a bit to display/show the wrapped text properly. Please add the following lines of code to your code, it will work fine:
e.g.
Sample code:

....
                Style wrapTextStyle = workbook.CreateStyle();
                wrapTextStyle.IsTextWrapped = true;
                // Creating StyleFlag
                StyleFlag styleFlag = new StyleFlag();
                styleFlag.WrapText = true;
                // Accessing a column (A  column) from the Columns collection
                Column column = wks.Cells.Columns[0];
                // Applying the style to the column
                column.ApplyStyle(wrapTextStyle, styleFlag);

                wks.PageSetup.PrintArea = "";
                wks.PageSetup.BottomMargin = 1;
                wks.PageSetup.LeftMargin = 1;
                wks.PageSetup.RightMargin = 1;
                wks.PageSetup.TopMargin = 1;

                for (int row = 9; row < 88; row++)
                {
                    wks.Cells.SetRowHeight(row, 48);
                }
            }
            workbook.Save(pOutFile, pdfSaveOptions);

Hope, this helps a bit.

@Amjad_Sahi Thank you for the solution. Above solution will work for only this particular file. Is there any generic solution which can work for any file?

@Vaidehi123,

I am afraid, without using AutoFitRows/AutoFitColumns (as you want to enforce not to use it), there is no generic solution or approach to accomplish your task. This is same as in MS Excel.

You may share your sample files and sample code to demonstrate the issue, we can check it in details.

@Amjad_Sahi Please find excel and converted PDF DadosDocumento.zip (320.9 KB)
file attached below, Please let us know if there is any generic solution to get all the text visible in converted PDF?

@Vaidehi123,

In MS Excel, if you talk about generic solution for Excel to PDF rendering, there is no other reliable way but to use auto-fit column(s) for the task, see below bullet for reference. What problem you find regarding auto-fit columns when used in code to render your attached file “DadosDocumento.xlsx” to PDF file format? Could you also share your current sample code where auto-fit columns operation is giving problem for your attached file. We will check your issue soon.

  • Use auto fit columns operation to show long data in different cells. If you do not want to auto-fit all the columns (AutFitColumns()), you may only auto-fit first column using AutoFitColumn() method (e.g., for the column which has long data/text) in worksheet.