We are using Aspose.PDF for .NET to convert PDFs that we've generated to Excel. Some of the PDFs convert correctly, with numeric formatting left unchanged, while others have their separators reinterpreted as European formatting. We do not see any properties in the PDFs that would cause Aspose to switch to translating the numeric formatting of the values written to Excel.
Example:
Value in PDF: 1,080
Value in Excel: 1.08
Why would Aspose decide to convert if we’re not forcing any locale settings?
All PDFs are generated from the same system, so we know this is not a system locale conflict.
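One way to back up that claim is to log the culture the converting process actually runs under. The following is a minimal sketch using only `System.Globalization`; the class name `CultureCheck` and the helper `Describe` are hypothetical names introduced for illustration, and nothing here touches the Aspose API.

```csharp
using System;
using System.Globalization;

class CultureCheck
{
    // Hypothetical helper: summarizes the number separators a culture uses.
    public static string Describe(CultureInfo culture)
    {
        NumberFormatInfo nf = culture.NumberFormat;
        return $"{culture.Name}: decimal='{nf.NumberDecimalSeparator}' group='{nf.NumberGroupSeparator}'";
    }

    static void Main()
    {
        // Cultures in effect for the current thread of the converting process.
        Console.WriteLine(Describe(CultureInfo.CurrentCulture));
        Console.WriteLine(Describe(CultureInfo.CurrentUICulture));
    }
}
```

If both lines report a period as the decimal separator, the process culture is unlikely to be the source of the reinterpretation, which would point back at the conversion itself.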
@leelierz
The issue you’re experiencing with numeric formatting during the conversion of PDF to Excel using Aspose.PDF may stem from how the library interprets numeric values within the PDF. Different PDFs can have varying internal representations of numbers, leading to discrepancies in formatting upon conversion.
Here are a few considerations to help address the issue:
- Locale Settings: Aspose.PDF may infer locale settings based on the content of the PDF. If some PDFs contain numbers formatted with commas as decimal separators, Aspose might interpret them according to European-style formatting.
- PDF Content: Examine the content of the PDFs that convert correctly versus those that do not. There may be subtle differences in how numbers are represented or encoded in the problematic PDFs.
- Conversion Options: While there may not be direct properties in the PDF to control numeric formatting, you can utilize the ExcelSaveOptions class to specify how you want the numbers to be formatted during conversion.
Here’s a basic example of how to set up the conversion:
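using Aspose.Pdf; // ExcelSaveOptions is a class in this namespace, not a namespace of its own

// Load the source PDF.
Document pdfDocument = new Document("input.pdf");
ExcelSaveOptions options = new ExcelSaveOptions();
// Example option from the original reply; verify that it exists in your version before relying on it.
options.ConvertFormFieldsToText = true;
// Convert and write the workbook.
pdfDocument.Save("output.xlsx", options);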
If the issue persists, consider checking the Aspose forums or documentation for any updates or specific settings that might assist with numeric formatting during conversion.
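To make the reported symptom concrete, here is a small sketch in plain .NET showing how the same text, "1,080", yields two different numbers depending on which culture's separators are assumed. It illustrates the ambiguity only; it is not a description of how Aspose.PDF parses values internally. The class name `SeparatorDemo` and the helper `ParseAs` are hypothetical, and the sketch assumes culture data for en-US and de-DE is available on the machine (that is, the runtime is not in invariant-globalization mode).

```csharp
using System;
using System.Globalization;

class SeparatorDemo
{
    // Hypothetical helper: parses text using the number conventions of the named culture.
    public static decimal ParseAs(string text, string cultureName)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return decimal.Parse(text, NumberStyles.Number, culture);
    }

    static void Main()
    {
        // en-US treats the comma as a thousands separator: one thousand eighty.
        decimal us = ParseAs("1,080", "en-US");
        // de-DE treats the comma as a decimal separator: one and eighty thousandths.
        decimal de = ParseAs("1,080", "de-DE");

        Console.WriteLine(us.ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(de.ToString(CultureInfo.InvariantCulture));
    }
}
```

The second interpretation matches the 1.08 that appears in the Excel output, which is consistent with the value being read under European conventions somewhere in the conversion.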
Thank you for your response. The fact that Aspose behaves as described in Bullet #1 with no clear way to override it is very bad, especially when the PDF contains no visible numbers that use a comma as a decimal separator.
Bullet #3 is not supported under Aspose.PDF for .NET based on current documentation.
@leelierz
Would you kindly confirm whether you are using version 25.9 of the API? Please share your sample source and output file(s) for our reference, along with the code snippet that you used. We will test the scenario in our environment and address it accordingly.
Thank you for looking into this.
Here is the code snippet:
string excelOutputFileName = outFolder + System.IO.Path.GetFileNameWithoutExtension(file) + ".xlsx";
using (var document = new Aspose.Pdf.Document(file))
{
document.Save(excelOutputFileName);
}
I've also tried it using the ExcelSaveOptions object, with no difference. This is also where we don't see the property they mentioned for converting numbers to text:
string excelOutputFileName = outFolder + System.IO.Path.GetFileNameWithoutExtension(file) + ".xlsx";
using (var document = new Aspose.Pdf.Document(file))
{
ExcelSaveOptions saveOptions = new Aspose.Pdf.ExcelSaveOptions { Format = ExcelSaveOptions.ExcelFormat.XLSX };
document.Save(excelOutputFileName, saveOptions);
}
146564-Protect Cost National-Black Label Premium Maintenance Plan ESP.pdf (57.8 KB)
x146564-Protect Cost National-Black Label Premium Maintenance Plan ESP.zip (12.3 KB)
A source PDF and the Excel output that show the behavior we've been discussing are attached.
@leelierz
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFNET-60821
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.