Arabic excel with cell length more than 32k is not converting to pdf properly

I have one excel with Arabic and English language and some of the merged cell length greater than 32k.
I am trying to convert that excel to pdf using CheckExcelRestriction = false flag, but conversion is not done properly.

Code is below:

            var loadOptions = new LoadOptions();
            loadOptions.CheckExcelRestriction = false;
            Workbook workbook = new Workbook(pInFile, loadOptions);
            Aspose.Cells.PdfSaveOptions pdfSaveOptions = new Aspose.Cells.PdfSaveOptions();
            pdfSaveOptions.AllColumnsInOnePagePerSheet = true;
            pdfSaveOptions.MergeAreas = true;
            workbook.Settings.CheckExcelRestriction = false;
            pdfSaveOptions.ExportDocumentStructure = true;
            workbook.CalculateFormula();
            AutoFitterOptions options = new AutoFitterOptions();
         
            foreach (Worksheet wks in workbook.Worksheets)
            {
                int maxRow = wks.Cells.MaxRow;
                int maxColumn = wks.Cells.MaxColumn;
                wks.PageSetup.PrintArea = "A1:" + CellsHelper.CellIndexToName(maxRow, maxColumn);
                wks.PageSetup.BottomMargin = 1;
                wks.PageSetup.LeftMargin = 1;
                wks.PageSetup.RightMargin = 1;
                wks.PageSetup.TopMargin = 1;
               
                wks.AutoFitRows(options);
                wks.AutoFitColumns();
                foreach (Cell cell in wks.Cells)
                {
                    Style style = cell.GetStyle();
                    style.VerticalAlignment = TextAlignmentType.Center;
                    style.IsTextWrapped = true;
                    style.IsJustifyDistributed = true;
                    cell.SetStyle(style);
                }
            }
            workbook.Save(pOutFile, pdfSaveOptions);

Please find attached original and converted files.
1522217932.pdf (1.0 MB)
Capture1.PNG (31.1 KB)
Capture2.PNG (20.6 KB)

@Vaidehi123,

Please zip and attach your input Excel file as well. Moreover, please provide your expected (output) PDF file using MS Excel. We will check your issue soon.

@Amjad_Sahi Please find the zip folder in attachment.Files.zip (375.0 KB)

@Vaidehi123,

I checked the files in the zipped archive. Probably, the PDF is the output PDF file using MS Excel. I think you may try the following sample code, it will render/wrap the bigger (merged) cells properly to display all the data.
e.g.
Sample code:

var loadOptions = new LoadOptions();
            loadOptions.CheckExcelRestriction = false;
            Workbook workbook = new Workbook("e:\\test2\\1522221182.xls", loadOptions);
            workbook.CalculateFormula();
            AutoFitterOptions options = new AutoFitterOptions();
            options.AutoFitMergedCellsType = AutoFitMergedCellsType.EachLine;

            foreach (Worksheet wks in workbook.Worksheets)
            {
                int maxRow = wks.Cells.MaxRow;
                int maxColumn = wks.Cells.MaxColumn;
                wks.PageSetup.PrintArea = "A1:" + CellsHelper.CellIndexToName(maxRow, maxColumn);
                wks.PageSetup.BottomMargin = 1;
                wks.PageSetup.LeftMargin = 1;
                wks.PageSetup.RightMargin = 1;
                wks.PageSetup.TopMargin = 1;

                foreach (Cell cell in wks.Cells)
                {
                        Style style = cell.GetStyle();
                        style.VerticalAlignment = TextAlignmentType.Center;
                        style.IsTextWrapped = true;
                        style.IsJustifyDistributed = true;
                        cell.SetStyle(style);
                    
                }
                
                wks.AutoFitRows(options);
                wks.AutoFitColumns();
                
               
            }

            Aspose.Cells.PdfSaveOptions pdfSaveOptions = new Aspose.Cells.PdfSaveOptions();
            pdfSaveOptions.AllColumnsInOnePagePerSheet = true;
            pdfSaveOptions.MergeAreas = true;
            workbook.Settings.CheckExcelRestriction = false;
            pdfSaveOptions.ExportDocumentStructure = true;
            
            workbook.Save("e:\\test2\\out1.pdf", pdfSaveOptions);

Let us know if it does not render as per your expectations.

Hi @Amjad_Sahi … Thanks for your reply. I have tried above code but still its not rendering as per the original excel file. I am attaching the converted PDF file generated from above code. Could you please help?1522221182.pdf (1.3 MB)

Just to update, we have a licensed version of aspose total 21.3.0 and we are using that for conversion.

@Vaidehi123,

Please notice, I am able to find the issue as you mentioned using your template file and following sample code. I found Arabic Excel file with cell length more than 32k is not converting to PDF properly, the text is not wrapped in those cells. I tried many variations and options but to no avail:
e.g.
Sample code:

  var loadOptions = new LoadOptions();
            loadOptions.CheckExcelRestriction = false;
            Workbook workbook = new Workbook("e:\\test2\\1522221182.xls", loadOptions);

            Aspose.Cells.PdfSaveOptions pdfSaveOptions = new Aspose.Cells.PdfSaveOptions();
            pdfSaveOptions.AllColumnsInOnePagePerSheet = true;
            pdfSaveOptions.MergeAreas = true;
            workbook.Settings.CheckExcelRestriction = false;
            pdfSaveOptions.ExportDocumentStructure = true;
            workbook.CalculateFormula();
            AutoFitterOptions options = new AutoFitterOptions();
            options.AutoFitMergedCellsType = AutoFitMergedCellsType.EachLine;

            foreach (Worksheet wks in workbook.Worksheets)
            {
                int maxRow = wks.Cells.MaxRow;
                int maxColumn = wks.Cells.MaxColumn;
                wks.PageSetup.PrintArea = "A1:" + CellsHelper.CellIndexToName(maxRow, maxColumn);
                wks.PageSetup.BottomMargin = 1;
                wks.PageSetup.LeftMargin = 1;
                wks.PageSetup.RightMargin = 1;
                wks.PageSetup.TopMargin = 1;

                wks.AutoFitRows(options);
                wks.AutoFitColumns();

                foreach (Cell cell in wks.Cells)
                {
                        Style style = cell.GetStyle();
                        style.VerticalAlignment = TextAlignmentType.Center;
                        style.IsTextWrapped = true;
                        style.IsJustifyDistributed = true;
                        cell.SetStyle(style);

                }

            }

            workbook.Save("e:\\test2\\out1.pdf", pdfSaveOptions);

I have logged a ticket with an id “CELLSNET-50215” for your issue. We will look into it soon.

Once we have an update on it, we will let you know.

Thanks @Amjad_Sahi!

@Vaidehi123,

You are welcome.

@Vaidehi123,

We found your source file is a complex html with Arabic text. I am afraid, we need more time to analyze it and then resolve your issue.

We will keep you posted on updates (once available) on it.

Thanks @Amjad_Sahi for the update

@Vaidehi123

We have made some improvement in the fix v22.1.1.
Please try it with the following code, and hightlight the issues that we need to fix.

var loadOptions = new LoadOptions();
loadOptions.CheckExcelRestriction = false;
Workbook workbook = new Workbook("1522221182.xls", loadOptions);

workbook.CalculateFormula();

//use style pool to change style for all Cells.
int styleCount = workbook.CountOfStylesInPool;
for (int i = 0; i < styleCount; i++)
{
    Style style = workbook.GetStyleInPool(i);
    style.VerticalAlignment = TextAlignmentType.Center;
    style.IsTextWrapped = true;
    style.IsJustifyDistributed = true;
}
Style defaultStyle = workbook.DefaultStyle;
defaultStyle.VerticalAlignment = TextAlignmentType.Center;
defaultStyle.IsTextWrapped = true;
defaultStyle.IsJustifyDistributed = true;

AutoFitterOptions options = new AutoFitterOptions();
options.AutoFitMergedCellsType = AutoFitMergedCellsType.EachLine;
foreach (Worksheet wks in workbook.Worksheets)
{
    int maxRow = wks.Cells.MaxRow;
    int maxColumn = wks.Cells.MaxColumn;
    wks.PageSetup.PrintArea = "A1:" + CellsHelper.CellIndexToName(maxRow, maxColumn);
    wks.PageSetup.BottomMargin = 1;
    wks.PageSetup.LeftMargin = 1;
    wks.PageSetup.RightMargin = 1;
    wks.PageSetup.TopMargin = 1;

    wks.AutoFitRows(options);
    wks.AutoFitColumns();
}


Aspose.Cells.PdfSaveOptions pdfSaveOptions = new Aspose.Cells.PdfSaveOptions();
pdfSaveOptions.AllColumnsInOnePagePerSheet = true;
pdfSaveOptions.MergeAreas = true;
workbook.Settings.CheckExcelRestriction = false;
pdfSaveOptions.ExportDocumentStructure = true;

workbook.Save("output.pdf", pdfSaveOptions);
}

Aspose.Cells22.1.1 For .Net2_AuthenticodeSigned.Zip (5.6 MB)
Aspose.Cells22.1.1 For .Net4.0.Zip (5.6 MB)
Aspose.Cells22.1.1 For .NetStandard20.Zip (5.6 MB)

@Peyton.Xu Thanks for your reply. Above code is able to converted the required Arebic pdf but it’s disturbing the other PDF conversion. In our system, input file format and language is not identical. I have attached the original and 1522206818.zip (239.9 KB)
converted file which is converted using above code. Please share your thoughts.

@Vaidehi123

Please share your original code that works OK for the file.

@Peyton.Xu Please find the code below.

            string dataDir = System.IO.Path.GetFullPath(@"..\..\");
            Workbook workbook = new Workbook(pInFile);
            Aspose.Cells.PdfSaveOptions pdfSaveOptions = new Aspose.Cells.PdfSaveOptions();
            // option to set all the columns of excel in one page.
            pdfSaveOptions.AllColumnsInOnePagePerSheet = true;
            pdfSaveOptions.MergeAreas = true;
            /* Retain the structure of original excel */
            pdfSaveOptions.ExportDocumentStructure = true;
            /* Formula calculation for any digit formulla applied in excels */
            /*it is best to call Workbook.CalculateFormula() just before rendering the spreadsheet to PDF. 
           * This ensures  that the formula dependent values are recalculated, and the correct
           * values are rendered in the PDF.*/
            workbook.CalculateFormula();
            /* For custom named range, need below code to display number instead of ## */
            Range[] range = workbook.Worksheets?.GetNamedRanges();
         
           // workbook.Worksheets.get
            if(range != null && range.Length > 0)
            {
                foreach(var r in range)
                {
                    if (!r.Address.Contains(":"))
                    {
                        r.Worksheet.Cells[r.Address].PutValue(r.Value?.ToString());
                    }
                 
                }
            }
            foreach (Worksheet wks in workbook.Worksheets)
            {
               
                foreach (Aspose.Cells.Cell cell in wks.Cells)
                {
                    Style style = cell.GetStyle();
                    /* if cell is numeric, consider its category type as numeric and update the format */
                    if (cell.Type == CellValueType.IsNumeric || cell.Type == CellValueType.IsDateTime)
                    {
                        var updatedValue = cell.DisplayStringValue.Contains("#") ? cell.Value : cell.DisplayStringValue;
                        cell.PutValue(updatedValue.ToString());
                    }
                    style.VerticalAlignment = TextAlignmentType.Center;
                    cell.SetStyle(style);
                }
                wks.PageSetup.PrintArea = "";
                wks.PageSetup.BottomMargin = 1;
                wks.PageSetup.LeftMargin = 1;
                wks.PageSetup.RightMargin = 1;
                wks.PageSetup.TopMargin = 1;
            }
            workbook.Save(pOutFile, pdfSaveOptions);

@Vaidehi123,

Thanks for sharing the original code.

We will evaluate it and get back to you soon.

@Vaidehi123

Please try to use FileFormatUtil.DetectFileFormat(string filePath) to detect file format, and do some adaptations for Html file format.

 FileFormatInfo fileFormatInfo = FileFormatUtil.DetectFileFormat(pInFile);
 LoadOptions loadOptions;
 if (fileFormatInfo.FileFormatType == FileFormatType.Html)
 {
     HtmlLoadOptions htmlLoadOptions = new HtmlLoadOptions();
     htmlLoadOptions.AutoFitColsAndRows = true;

     loadOptions = htmlLoadOptions;
 }
 else
 {
     loadOptions = new LoadOptions();
 }
 Workbook workbook = new Workbook(pInFile, loadOptions);
 Aspose.Cells.PdfSaveOptions pdfSaveOptions = new Aspose.Cells.PdfSaveOptions();
 // option to set all the columns of excel in one page.
 pdfSaveOptions.AllColumnsInOnePagePerSheet = true;
 pdfSaveOptions.MergeAreas = true;
 /* Retain the structure of original excel */
 pdfSaveOptions.ExportDocumentStructure = true;
 /* Formula calculation for any digit formulla applied in excels */
 /*it is best to call Workbook.CalculateFormula() just before rendering the spreadsheet to PDF. 
* This ensures  that the formula dependent values are recalculated, and the correct
* values are rendered in the PDF.*/
 workbook.CalculateFormula();
 /* For custom named range, need below code to display number instead of ## */
 Range[] range = workbook.Worksheets?.GetNamedRanges();

 // workbook.Worksheets.get
 if (range != null && range.Length > 0)
 {
     foreach (var r in range)
     {
         if (!r.Address.Contains(":"))
         {
             r.Worksheet.Cells[r.Address].PutValue(r.Value?.ToString());
         }

     }
 }
 foreach (Worksheet wks in workbook.Worksheets)
 {

     foreach (Aspose.Cells.Cell cell in wks.Cells)
     {
         Style style = cell.GetStyle();
         /* if cell is numeric, consider its category type as numeric and update the format */
         if (cell.Type == CellValueType.IsNumeric || cell.Type == CellValueType.IsDateTime)
         {
             var updatedValue = cell.DisplayStringValue.Contains("#") ? cell.Value : cell.DisplayStringValue;
             cell.PutValue(updatedValue.ToString());
         }
         style.VerticalAlignment = TextAlignmentType.Center;
         cell.SetStyle(style);
     }
     wks.PageSetup.PrintArea = "";
     wks.PageSetup.BottomMargin = 1;
     wks.PageSetup.LeftMargin = 1;
     wks.PageSetup.RightMargin = 1;
     wks.PageSetup.TopMargin = 1;
 }
 workbook.Save(pOutFile, pdfSaveOptions);

The code works with Aspose.Cells v21.1.1 for both of your shared files. There is only a little issue for the source html file with Arabic text: some lines are not wrapped and are rendered as a single long line.

@Peyton.Xu - Thanks for the updated code. Could you please provide me your converted pdf file using above code? Because I have some issues in converted file like all table columns are not in one row etc. I just want to compare the converted files to get more idea.

@Vaidehi123

Files generated by Aspose.Cells v21.1.1 are attached:
1522206818.pdf (470.5 KB)
1522221182.pdf (455.7 KB)

@Peyton.Xu Thanks for sharing. Need some confirmation from your end.

  1. I need to get licensed version of Aspose.Total 22.1.0
  2. Issue of text wrapping is not resolvable? Is it the limitation of Aspose or Excel?

@Vaidehi123

Currently, we can’t fix the text wrapping issue while reading your shared html source file into our Workbook object. It is an issue of Aspose.Cells.

@Amjad_Sahi will share you more details.