PDF to Excel C# | Columns Splitting into 2 Columns

We are using the Aspose.PDF dll to convert the PDF data to xlsx format.

The report has 4 columns in the PDF, but the XLSX file exported with Aspose has 5. Some columns are split in two, seemingly at random, and their data is divided between the two resulting columns. Could you please help us resolve this issue?

DLL : Aspose.PDF
Version : 21.4.0

@sbdream21

Please make sure you are using the latest version of the API, i.e. 21.5. If you still face the issue, please share your sample PDF with us so that we can test the scenario in our environment and address it accordingly.

Upgrade did not work. Attached is a sample PDF.

AVA.pdf (64.6 KB)

@sbdream21

We were able to replicate the issue at our end using version 21.5 and the code snippet below:

using Aspose.Pdf;

// dataDir: directory path (with trailing separator) that contains AVA.pdf.
Document pdfDocument = new Document(dataDir + "AVA.pdf");
Aspose.Pdf.ExcelSaveOptions excelsave = new ExcelSaveOptions();
excelsave.Format = ExcelSaveOptions.ExcelFormat.XLSX;
//excelsave.MinimizeTheNumberOfWorksheets = true;
// Convert with the legacy engine, as used when reproducing the issue.
excelsave.ConversionEngine = ExcelSaveOptions.ConversionEngines.LegacyEngine;
pdfDocument.Save(dataDir + "AVA.out.xlsx", excelsave);

We have logged the issue as PDFNET-49938 in our issue tracking system. We will look into its details and keep you posted on the status of the fix. Please be patient and give us some time.

We are sorry for the inconvenience.

Thank you. Note that some cells are being merged as well (see Northern Sonoma and North Coast). We are seeing data merged and split across multiple columns consistently in most reports.

Do you have an ETA on when I might receive an update? We have some promotion deadlines coming up and I’m trying to plan what will be in our release.

@sbdream21

We have already recorded this information under the ticket ID that has been shared earlier.

The issue is logged under the free support model and will be resolved on a first-come, first-served basis. We are afraid that we cannot share an ETA until the issue has been investigated completely. We will inform you once we have definite updates regarding its resolution.

We are sorry for the inconvenience.

I noticed there is an issue status of resolved for PDFNET-49938. Is there a new version coming out with this fix?

@sbdream21

The issue has been resolved in version 21.6 of the API, which will be released next month. You will be notified once the new version is available for download.

Great! Thank you.

Will this fix also address column headings being merged across multiple cells, which causes blank cells to be inserted into columns? There are also cell merging issues within the data. See line 34 in the attached file: "Nutrient - ME Powder 1.01655" is being merged into multiple cells.

Lot Ingrs.pdf (80.7 KB)

@sbdream21

With Aspose.PDF for .NET 21.6, the conversion result for your new file will be better, but some issues will remain. We have therefore logged another ticket, PDFNET-50004, in our issue tracking system. We will let you know once it is fixed. Please give us some time.

The issues you have found earlier (filed as PDFNET-49938) have been fixed in Aspose.PDF for .NET 21.6.

Thank you! We’ll install the latest version and take a look.

We are still seeing cell merge issues. In the Lot Pcts attachment, the percent values are being merged across columns F-H. The Appellation, percents, and Variety are being merged in C8, D8, and E8. Percents are also being merged in the lower section, e.g. cells B22-D22.

Lot Pcts 20210617114607991.pdf (73.6 KB)

In the Tankboard example, column headings are being merged, causing data to shift. See J4-J5 (WIP UOM) and H4 (Clarity and Contract).

TankBoard_20210617115104541.pdf (89.3 KB)
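
Merged ranges like the ones reported above can also be listed directly from a converted workbook, which makes it easier to compare output between library versions. The following is only a minimal sketch and is not part of Aspose.PDF: it assumes the output is a standard XLSX package and reads the worksheet XML with the .NET base class library. The class name MergedCellReport and the default file name are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

class MergedCellReport
{
    static void Main(string[] args)
    {
        // Hypothetical default: the XLSX written by the conversion step.
        string xlsxPath = args.Length > 0 ? args[0] : "AVA.out.xlsx";

        // Main SpreadsheetML namespace used by worksheet parts.
        XNamespace ns = "http://schemas.openxmlformats.org/spreadsheetml/2006/main";

        using (ZipArchive zip = ZipFile.OpenRead(xlsxPath))
        {
            // Worksheet parts live under xl/worksheets/ and end in .xml
            // (relationship parts end in .rels and are skipped).
            var sheets = zip.Entries.Where(e =>
                e.FullName.StartsWith("xl/worksheets/", StringComparison.Ordinal) &&
                e.FullName.EndsWith(".xml", StringComparison.Ordinal));

            foreach (ZipArchiveEntry entry in sheets)
            {
                using (Stream stream = entry.Open())
                {
                    XDocument sheet = XDocument.Load(stream);

                    // Each <mergeCell ref="B22:D22"/> element is one merged range.
                    var merges = sheet.Descendants(ns + "mergeCell")
                                      .Select(m => (string)m.Attribute("ref"))
                                      .ToList();

                    Console.WriteLine($"{entry.FullName}: {merges.Count} merged range(s)");
                    foreach (string range in merges)
                    {
                        Console.WriteLine("  " + range);
                    }
                }
            }
        }
    }
}
```

Running it against the output of two different versions and comparing the printed ranges shows whether a particular merge is still present. It only detects merged cells, not columns that were split.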

@sbdream21

We have logged two tickets in our issue tracking system for the rectification of the issues in your recently shared files. The ticket IDs are as follows:

  • PDFNET-50098
  • PDFNET-50099

We will surely look into details of the tickets and let you know as soon as they are resolved. Please be patient and spare us some time.

We are sorry for the inconvenience.

PDFNET-49938 is not resolved yet. I tried converting the attached file via your online converter, and the issue still exists on line 6 for Oregon [Williamette Valley].

AVAHierarchy.pdf (64.4 KB)

@sbdream21

After testing the scenario with version 21.8 of the API, we have logged another issue, PDFNET-50474, in our issue tracking system. We will look into its details and keep you posted on the status of the fix. Please be patient and give us some time.

We are sorry for the inconvenience.