Problems converting PDF to Excel

We have purchased Aspose.Pdf for Java and are trying to generate an Excel file from a PDF, but the conversion does not work as expected. The output is a SpreadsheetML file, not a regular Excel file, and we get an invalid header signature exception when opening it. We want a regular Excel file from the PDF.


Please help us with this.

Hi Vinaya Sagar,


Thanks for contacting support.

Aspose.Pdf converts PDF files to SpreadsheetML (XML) format, and you may get a prompt when opening the resulting file. To resolve this issue, you may consider opening the resultant SpreadsheetML file with Aspose.Cells for Java and then saving the output in the required Excel format. For more information, please visit

We are sorry for this inconvenience.
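
As a quick diagnostic for the "invalid header signature" symptom, the first bytes of the output file show which container it actually uses. An exception of this kind typically means that the reader expected the OLE2 signature of a binary .xls file and found something else, such as XML text. Below is a minimal sketch in plain Java that does not use any Aspose API; `SpreadsheetSignature` and `detect` are hypothetical names introduced here for illustration only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Aspose.Pdf or Aspose.Cells.
class SpreadsheetSignature {

    // First eight bytes of an OLE2 compound file, the container of binary .xls.
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Classifies a file by its leading bytes: "OLE2", "ZIP", "XML" or "UNKNOWN".
    static String detect(byte[] head) {
        if (head.length >= OLE2.length
                && Arrays.equals(Arrays.copyOf(head, OLE2.length), OLE2)) {
            return "OLE2"; // binary .xls
        }
        if (head.length >= 2 && head[0] == 'P' && head[1] == 'K') {
            return "ZIP"; // .xlsx is a ZIP archive
        }
        String text = new String(head, StandardCharsets.UTF_8);
        if (text.startsWith("\uFEFF")) {
            text = text.substring(1); // skip a UTF-8 byte order mark
        }
        if (text.trim().startsWith("<?xml")) {
            return "XML"; // e.g. SpreadsheetML saved under an .xls name
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SpreadsheetSignature <file>");
            return;
        }
        byte[] head = new byte[64];
        int n;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            n = in.read(head); // may return fewer than 64 bytes, or -1 if empty
        }
        System.out.println(detect(Arrays.copyOf(head, Math.max(n, 0))));
    }
}
```

If this reports XML for a file named with an .xls extension, the file is most likely SpreadsheetML and needs the additional conversion step described above before it behaves like a regular Excel file.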

Hello,
For the .NET library this issue is still present.
- We can't open a PDF in Cells.Workbook,
- Converting to HTML or MHTML and then opening that in Cells.Workbook does not work as well as needed
Why does the method Aspose.Pdf.Document.Save("filename.xls", SaveFormat.Excel) not work correctly? I mean that in this case we lose all formatting data in the saved file…
Could you please give some code examples for solving this problem?
Thanks in advance

palasheev:
- We can't open a PDF in Cells.Workbook,
Hi Kevin,

Thanks for using our APIs.

Please note that Aspose.Pdf for .NET can only create new PDF files and manipulate existing ones; it does not offer a feature to open Excel workbooks.
palasheev:
- Converting to HTML or MHTML and then opening that in Cells.Workbook does not work as well as needed
Do you mean you are saving the PDF file in HTML format using Aspose.Pdf for .NET and then trying to open the HTML/MHTML file using Aspose.Cells? Could you please share the source PDF file so that we can test the scenario in our environment?
palasheev:
Why does the method Aspose.Pdf.Document.Save("filename.xls", SaveFormat.Excel) not work correctly? I mean that in this case we lose all formatting data in the saved file…
Could you please give some code examples for solving this problem?
Thanks in advance
Please share the source PDF file so that we can test the scenario in our environment. We are sorry for the inconvenience.

Thank you for the quick reply, Nayyer.

codewarior:
Please note that Aspose.Pdf for .NET can only create new PDF files and manipulate existing ones; it does not offer a feature to open Excel workbooks.

My initial goal is to convert a PDF document to an Excel document, not vice versa.
As I understand it, in your previous message you said that to resolve the issue we can save the PDF document to SpreadsheetML, then open it in Aspose.Cells.Workbook, and then save it as an Excel file. In this context, I meant that we can't open a PDF document in Aspose.Cells.Workbook directly.

But the file saved in MobiXml format does not open in Aspose.Cells.Workbook (“Line 0: in the SpreadsheetML file.”; searching online did not tell me how to resolve this).
Saving with SaveFormat.Xml ends with an error (“Tagged pdf expected. Please use tagged pdf file for converting to xml format or use MobiXml for untagged pdf”).

codewarior:
Do you mean you are saving the PDF file in HTML format using Aspose.Pdf for .NET and then trying to open the HTML/MHTML file using Aspose.Cells?

Yes, I tried many Load/Save format types, including cases with intermediate file saving.
Saving the PDF document directly to an Excel document loses the formatting.

Attached you can find my simple PDF file.

palasheev:
Thank you for the quick reply, Nayyer.
codewarior:
Please note that Aspose.Pdf for .NET can only create new PDF files and manipulate existing ones; it does not offer a feature to open Excel workbooks.
My initial goal is to convert a PDF document to an Excel document, not vice versa.
As I understand it, in your previous message you said that to resolve the issue we can save the PDF document to SpreadsheetML, then open it in Aspose.Cells.Workbook, and then save it as an Excel file. In this context, I meant that we can't open a PDF document in Aspose.Cells.Workbook directly.
Hi Kevin,

Thanks for sharing the details.

Yes, your understanding is correct. You cannot open a PDF document using an Aspose.Cells.Workbook object.
palasheev:
But the file saved in MobiXml format does not open in Aspose.Cells.Workbook ("Line 0: in the SpreadsheetML file."; searching online did not tell me how to resolve this).
The MobiXml file cannot be opened with an Aspose.Cells object, as it expects the input file to be in MS Excel format.
palasheev:
Saving with SaveFormat.Xml ends with an error ("Tagged pdf expected. Please use tagged pdf file for converting to xml format or use MobiXml for untagged pdf").
The error is correct because, in order to convert a PDF file to XML format, the source file must be a tagged PDF.

palasheev:
codewarior:
Do you mean you are saving the PDF file in HTML format using Aspose.Pdf for .NET and then trying to open the HTML/MHTML file using Aspose.Cells?
Yes, I tried many Load/Save format types, including cases with intermediate file saving.
Saving the PDF document directly to an Excel document loses the formatting.
I have tested the scenario using Aspose.Pdf for .NET 11.1.0 and managed to reproduce the issue: formatting problems appear in the resultant Excel file, e.g. the logo is missing and text formatting is ignored. I have logged it as PDFNEWNET-40124 in our issue tracking system so that it can be corrected. We will look further into the details of this problem and keep you posted on the status of the correction. Please be patient and allow us a little time. We are sorry for this inconvenience.
codewarior:
I have logged it as PDFNEWNET-40124 in our issue tracking system so that it can be corrected. We will look further into the details of this problem and keep you posted on the status of the correction. Please be patient and allow us a little time. We are sorry for this inconvenience.

Of course, Nayyer. Thank you very much.

Hello,
Is there any news yet on the issue mentioned here?

@tmdolphin

The issue is not resolved yet. However, please share your source and output files so that we may try to reproduce the issue on our end.