Convert PDF to CSV using Aspose.PDF for .NET - returns an empty string

We are currently trying to convert PDFs to CSV. While some PDFs convert correctly, others just return an empty string. The same thing happens when we try to convert to text. Any help would be appreciated. Thanks in advance!


Code:

public HttpResponseMessage GetText(string path)
{
    var pdfDocument = new Document(path);
    var excelsave = new ExcelSaveOptions { MinimizeTheNumberOfWorksheets = true };
    var exceldoc = new Aspose.Cells.Workbook();

    using (var stream = new MemoryStream())
    {
        pdfDocument.Flatten();
        pdfDocument.Save(stream, excelsave);
        exceldoc = new Aspose.Cells.Workbook(stream);
    }

    // The rest of the code…
}

Hi Robert,


Thanks for your inquiry. We would appreciate it if you could share some sample problematic PDF documents here. We will look into them and provide you with information accordingly.

We are sorry for the inconvenience caused.

Best Regards,

Hello,


I have emailed you an example document. Thanks for the fast response!

Hi Robert,


Thanks for sharing the source PDF document. Your PDF is image-only, i.e. a non-searchable document, so you are getting an empty string. We converted your sample to a searchable PDF using Google tesseract-ocr and rendered it to Excel; however, the output Excel file is corrupted. We have therefore logged ticket PDFNEWNET-38689 in our issue tracking system for further investigation and resolution. We will keep you updated on the resolution progress in this forum thread.
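To tell in advance whether a given PDF is image-only, one option is to extract its text first and check whether anything comes back. The following is only a minimal sketch under that assumption: PdfTextCheck, IsLikelyImageOnly and HasNoTextLayer are hypothetical names made up for illustration, not part of Aspose.PDF, and a whitespace-only result is treated as "no text layer", which may not hold for every document.

```csharp
using System;
using Aspose.Pdf;
using Aspose.Pdf.Text;

public static class PdfTextCheck
{
    // Hypothetical helper (not an Aspose API): treats empty or
    // whitespace-only extraction output as a sign of an image-only PDF.
    public static bool IsLikelyImageOnly(string extractedText)
    {
        return string.IsNullOrWhiteSpace(extractedText);
    }

    // Hypothetical helper: extracts text from all pages with TextAbsorber
    // and applies the check above.
    public static bool HasNoTextLayer(string path)
    {
        using (var pdfDocument = new Document(path))
        {
            var absorber = new TextAbsorber();
            pdfDocument.Pages.Accept(absorber);
            return IsLikelyImageOnly(absorber.Text);
        }
    }
}
```

If HasNoTextLayer returns true, the document would most likely need OCR before a conversion to Excel, CSV or text can produce any content.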

We are sorry for the inconvenience caused.

Best Regards,

The issues you have found earlier (filed as PDFNET-38689) have been fixed in Aspose.PDF for .NET 18.11.

@mrawesome

We would like to share with you that you can now generate CSV files directly from PDF documents using Aspose.PDF for .NET 20.7. Please consider using the following code snippet:

PDF to CSV

ExcelSaveOptions options = new ExcelSaveOptions();
options.ConversionEngine = ExcelSaveOptions.ConversionEngines.NewEngine;
options.Format = ExcelSaveOptions.ExcelFormat.CSV;

Document pdfDocument = new Document("Currencies.pdf");
pdfDocument.Save("Currencies.csv", options);