We recently encountered a major issue with the formatting of exported Excel files: the files were not well formatted. Since we have more than 2,500 reports and modifying the existing code was not feasible, we decided to use your Aspose.PDF product to convert PDF files to Excel without making any changes to the source code.
For our English clients, the PDF to Excel conversion works perfectly. However, we also have clients who use other languages, such as Marathi. For Marathi PDFs, we use Unicode fonts like YatraManav and LokWeb.
The issue we are facing is that when these Marathi PDFs are converted to Excel, the resulting Excel files contain garbled or unreadable text instead of properly rendered Marathi characters.
This is a critical concern for us. We kindly and urgently request your support in resolving this issue. If there are any specific configurations, font settings, or workarounds that could help enable accurate Marathi text export from PDF to Excel, we would highly appreciate your guidance.
It seems you’re experiencing issues with the conversion of Marathi PDFs to Excel using Aspose.PDF, particularly with Unicode fonts like YatraManav and LokWeb. When converting documents that contain non-English characters, such as Marathi, it’s crucial to ensure that the fonts are properly embedded in the PDF.
Here are some steps and configurations you can try to improve the conversion quality:
Ensure Fonts are Embedded: Make sure that the fonts used in the PDF are embedded. If the fonts are not embedded, the conversion process may not render the characters correctly.
Use the Latest Version: Ensure you are using the latest version of Aspose.PDF, as updates often include bug fixes and improvements for handling various languages and fonts.
Check Encoding: Ensure that the PDF is correctly encoded. Sometimes, issues arise from incorrect encoding settings in the source PDF.
Test with Different Output Formats: If Excel output is problematic, consider converting the PDF to another format (like CSV or DOCX) first, and then to Excel. This can sometimes yield better results.
If these suggestions do not resolve the issue, you may want to reach out to Aspose support for more tailored assistance.
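To narrow down the "Check Encoding" step, it can help to test whether the text extracted from the PDF (or read back from the converted Excel cells) actually consists of Devanagari code points. The following is a minimal sketch using only the .NET base class library. The class and method names are hypothetical and not part of Aspose.PDF, and how you obtain the text to inspect is left to your own extraction step. If the text taken from the PDF itself already contains no Devanagari characters, the problem most likely sits in the PDF's text layer rather than in the Excel export.

```csharp
using System;
using System.Linq;

// Hypothetical helper (not an Aspose.PDF API): measures how much of a string
// is made of characters from the Unicode Devanagari block used for Marathi.
public static class DevanagariCheck
{
    // True if the character lies in the Devanagari block (U+0900..U+097F).
    public static bool IsDevanagari(char c)
    {
        return c >= '\u0900' && c <= '\u097F';
    }

    // Fraction of non-whitespace characters that are Devanagari.
    // Returns 0 for null, empty, or whitespace-only input.
    public static double DevanagariRatio(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
        {
            return 0.0;
        }

        int total = text.Count(c => !char.IsWhiteSpace(c));
        int devanagari = text.Count(c => IsDevanagari(c));
        return (double)devanagari / total;
    }
}
```

A ratio close to 0 for a cell that should hold Marathi text suggests the characters were stored under non-Devanagari code points, which matches the garbled output described above.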
We have already tried all the recommended steps, including embedding fonts, using the latest Aspose.PDF version, and testing different output formats. The main issue appears to be related to encoding.
Could you please advise whether there is a specific solution or workaround for handling encoding issues, especially for Marathi Unicode fonts like YatraManav and LokWeb, during PDF to Excel conversion? Please also provide any patches, workarounds, or code samples specific to Marathi Unicode handling.
We have recently encountered a significant issue related to the formatting of Excel files exported from our system. The output files are not properly formatted. As we maintain over 2,500 reports, modifying the existing codebase was not a viable option. To address this, we chose to use your Aspose.PDF product for converting PDF files to Excel without altering any of the original code.
The conversion process works seamlessly for our English-language reports. However, we also serve clients who require reports in other languages, such as Marathi. For these reports, we use Unicode fonts like YatraManav and LokWeb.
The problem we’re encountering is that when these Marathi PDFs are converted to Excel, the resulting files display garbled or unreadable text, rather than properly rendered Marathi characters.
This issue is critically impacting our multilingual reporting capabilities. We therefore urgently request your assistance in resolving this matter. If there are any specific configurations, encoding settings, or workarounds that would ensure accurate Marathi text conversion from PDF to Excel, we would greatly appreciate your recommendations.
To address the issue of garbled text when converting Marathi PDFs to Excel using Aspose.PDF, you can try the following steps:
Ensure Font Embedding: Verify that the fonts used in your Marathi PDFs (such as YatraManav and LokWeb) are embedded in the PDF. If the fonts are not embedded, the conversion may not render the text correctly.
Use the Latest Version: Make sure you are using the latest version of Aspose.PDF, as updates often include bug fixes and improvements related to text rendering and encoding.
Use ExcelSaveOptions: While Aspose.PDF does not have specific encoding settings for Excel conversion, you can utilize the ExcelSaveOptions class to specify how the PDF should be converted to Excel. Here’s a basic example:
using Aspose.Pdf; // ExcelSaveOptions is a class in this namespace, so no separate using is needed
// Load your PDF document
Document pdfDocument = new Document("input.pdf");
// Create Excel save options
ExcelSaveOptions saveOptions = new ExcelSaveOptions();
saveOptions.AutoFitColumns = true; // Adjust column widths
// Save the document as Excel
pdfDocument.Save("output.xlsx", saveOptions);
Post-Processing: If the text still appears garbled, consider post-processing the Excel file to replace garbled text with the correct Marathi characters using a mapping or dictionary approach.
Contact Support: If the issue persists, consider reaching out to Aspose support for more tailored assistance, especially since this is a critical issue for your multilingual reporting.
These steps should help improve the conversion quality for your Marathi reports. If you have further questions or need additional assistance, feel free to ask!
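The mapping or dictionary approach mentioned under Post-Processing could be sketched roughly as follows. This assumes you can build a table from each garbled character sequence to the intended Marathi text for your particular font. No such table is given in this thread, so the class name, method name, and any table entries you supply are hypothetical. The sketch tries longer keys first so that multi-character sequences are not broken up by shorter ones.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper (not an Aspose.PDF API): rewrites a string using a
// caller-supplied table of garbled sequence -> intended text.
public static class GlyphRemapper
{
    public static string Remap(string input, IReadOnlyDictionary<string, string> map)
    {
        if (string.IsNullOrEmpty(input) || map.Count == 0)
        {
            return input;
        }

        // Longest keys first, so "ab" is preferred over "a" at the same position.
        List<string> keys = map.Keys
            .Where(k => k.Length > 0)
            .OrderByDescending(k => k.Length)
            .ToList();

        var result = new StringBuilder(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            int position = i;
            string match = keys.FirstOrDefault(
                k => string.CompareOrdinal(input, position, k, 0, k.Length) == 0);

            if (match != null)
            {
                // Known sequence: emit its replacement and skip past it.
                result.Append(map[match]);
                i += match.Length;
            }
            else
            {
                // Unknown character: copy it through unchanged.
                result.Append(input[i]);
                i++;
            }
        }

        return result.ToString();
    }
}
```

Characters that have no entry in the table pass through unchanged, so English text and numbers in the same cell are left alone unless they collide with a key. That collision risk is the main weakness of this approach, and the table has to be built separately for each font.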
Thank you for your detailed response and suggestions.
We would like to clarify that our current process involves directly converting PDF files to Excel using Aspose.PDF. We are not generating Excel reports programmatically, but rather converting pre-generated PDF reports (which include Marathi text) into Excel format.
Given this, we are specifically looking for a solution where we do not have to recreate or manually build Excel files. Instead, we want to ensure that the PDF to Excel conversion preserves the Marathi Unicode text accurately.
We have already tried:
Embedding fonts properly in the PDF (YatraManav and LokWeb),
Using the latest version of Aspose.PDF,
Applying ExcelSaveOptions with autofit settings.
However, the issue still seems to be related to encoding or font mapping during conversion.
Could you please confirm:
Whether Aspose.PDF supports direct and accurate Unicode (Marathi) text conversion from PDF to Excel?
Is there any way to specify custom font substitution or encoding behavior during the conversion process?
Are there any advanced features or configuration options available in Aspose.PDF to handle such multilingual text scenarios more accurately?
We would highly appreciate any guidance or examples you can provide, as this is a critical requirement for our multilingual reporting.
As requested, please find the necessary information below to help you reproduce and test the scenario in your environment. Kindly review this at your earliest convenience and let us know the outcome.
Dim pdfdoc As Aspose.Pdf.Document = Nothing
Dim pdfFullPath As String = Server.MapPath("…/" & pPDFAOActionReports & "/" & pclsConcurrency.strUsersId & "_" & strReportName & ".pdf")
pdfdoc = New Aspose.Pdf.Document(pdfFullPath)
Dim ExcelDoc As New Aspose.Pdf.ExcelSaveOptions()
ExcelDoc.Format = Aspose.Pdf.ExcelSaveOptions.ExcelFormat.XLSX
ExcelDoc.MinimizeTheNumberOfWorksheets = True
ExcelDoc.InsertBlankColumnAtFirst = False
Dim excelFilePath As String = pdfFullPath.Replace(".pdf", ".xlsx")
pdfdoc.Save(excelFilePath, ExcelDoc)
If you require any additional information, please don’t hesitate to let us know — we’ll be happy to assist further.
Thanks for sharing the requested information. Can you please highlight the garbled or unreadable text in the Excel file? You can share a screenshot as well.
The Excel file you have shared seems to contain garbage values (Marathi text). We need it to be in the same format and clarity as the PDF. image.png (61.0 KB)
As requested, I have highlighted the unreadable/garbled Marathi text in the Excel file and compared it with the correct format from the PDF. I’ve attached two screenshots for your reference — the issues are marked clearly in red columns to show the differences between the Excel and PDF content.
Additionally, I’ve reattached both the Excel file and the PDF for your convenience.
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFNET-60797
You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.