Convert PDF to Excel | XLS | XLSX | Aspose.PDF for Java

Hi Support,

I am facing an issue while converting a PDF to an Excel file using the Aspose API. My scenario is to create a single-sheet Excel file containing all the pages that exist in the PDF. I am creating both XLS and XLSX files. Opening the XLS file shows an error (for reference, please check XLSFileError.png). If we open the file through the option “Do you want to open it anyway”, it shows all the content in a single sheet, but when we close the file it always asks “Want to save your changes”. It seems the Aspose API is creating corrupted Excel files. We are also unable to open the XLSX file created by the Aspose API; the error we get is shown in “XLSXFileError.png”.

I am attaching two PDF files with the converted XLS and XLSX files. I am also attaching the Java code and the error PNG files for reference. Please check and let me know how we can solve this.


Thanks

Vishal

Hi Vishal,


Thanks for contacting support.

This warning message from MS Excel 2007 and higher is not a bug. It appears because Aspose.Pdf saves in the MS Excel 2003 XML format. Giving that output a .xls extension is acceptable to older versions of MS Excel, but MS Excel 2007 and higher expect a .xls file to be a binary document from an older Excel version, and expect XML content to have a .xml extension.

If you save an Excel document as XML with a .xls extension in MS Excel 2003 and then try to open it in MS Excel 2007 or higher, you will get the same warning message. So it is not a bug in Aspose.Pdf for Java. To avoid the warning, you can change the output file extension. Code snippet:

[Java]

Document doc = new Document("C:\\pdftest\\FA16_Size_Matrix_Artwork 1.5.16.pdf");

com.aspose.pdf.ExcelSaveOptions excelsave = new com.aspose.pdf.ExcelSaveOptions();

doc.save("C:\\pdftest\\FA16_Size_Matrix_Artwork 1.5.16.xml", excelsave);


See the attached files; you can open them in MS Excel 2007 and higher without any warning messages.
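To check what a converted file really contains, regardless of its extension, one option is to look at its first few bytes. The sketch below is not part of Aspose.PDF or POI; the class name `SpreadsheetSniffer` and its `Kind` enum are hypothetical, and the XML check is only a heuristic that looks for an `<?xml` declaration after an optional UTF-8 byte order mark.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: guesses the container format from the leading bytes.
final class SpreadsheetSniffer {
    enum Kind { OLE2_BINARY, ZIP_CONTAINER, XML_TEXT, UNKNOWN }

    // Signature of OLE2 compound documents (binary .xls files use this container).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // Local file header of a ZIP archive (.xlsx files are ZIP packages).
    private static final byte[] ZIP_MAGIC = { 0x50, 0x4B, 0x03, 0x04 };
    private static final byte[] UTF8_BOM = { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF };
    private static final byte[] XML_DECL = "<?xml".getBytes(StandardCharsets.US_ASCII);

    static Kind classify(byte[] head) {
        if (startsWith(head, 0, OLE2_MAGIC)) return Kind.OLE2_BINARY;
        if (startsWith(head, 0, ZIP_MAGIC)) return Kind.ZIP_CONTAINER;
        // Heuristic only: skip an optional UTF-8 BOM, then look for "<?xml".
        int offset = startsWith(head, 0, UTF8_BOM) ? UTF8_BOM.length : 0;
        if (startsWith(head, offset, XML_DECL)) return Kind.XML_TEXT;
        return Kind.UNKNOWN;
    }

    static Kind classify(Path file) throws IOException {
        byte[] head = new byte[16];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (filled < head.length) {
                int n = in.read(head, filled, head.length - filled);
                if (n < 0) break;
                filled += n;
            }
        }
        return classify(Arrays.copyOf(head, filled));
    }

    private static boolean startsWith(byte[] data, int offset, byte[] prefix) {
        if (data.length - offset < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if (data[offset + i] != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java SpreadsheetSniffer <path-to-file>
        System.out.println(classify(Paths.get(args[0])));
    }
}
```

If the reply above is right about the Excel 2003 XML output, a converted file should be reported as `XML_TEXT` whatever extension it was saved with.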

Thanks for the reply. If I now save the file with an xlsx extension, it is created without any error, but I am not able to open it. Can you please check this and let me know why this is happening? I have also attached a sample xlsx file.

Please provide an update on this as this is blocking our development.



Hi Vishal,


Thanks for sharing the details.

As shared earlier, please use xml as the extension instead of xlsx; the resultant file can then be viewed in MS Excel.

Hi Support,


Actually, I am converting the PDF to an XLS file and then want to process that XLS file using the Apache POI API. But when I do this, it gives the error “java.io.IOException: Invalid header signature; read 7311066695147732796, expected -2226271756974174256”. Please let me know how to solve this.

Please update on this.

Hi Vishal,


Thanks for using our APIs.

I have tested the scenario and managed to reproduce the same problem. I have logged it as PDFJAVA-35953 in our issue tracking system. We will look further into the details of this problem and keep you posted on the status of the fix. Please be patient and allow us a little time. We are sorry for the inconvenience.

[Java]

Document doc = new Document("C:\\FA16_Size_Matrix_Artwork 1.5.16.pdf");

com.aspose.pdf.ExcelSaveOptions excelsave = new com.aspose.pdf.ExcelSaveOptions();

doc.save("C:\\FA16_Size_Matrix_Artwork 1.5.16_new.xml", excelsave);

InputStream inp = new FileInputStream("C:\\FA16_Size_Matrix_Artwork 1.5.16_new.xml");

org.apache.poi.hssf.usermodel.HSSFWorkbook wb = new org.apache.poi.hssf.usermodel.HSSFWorkbook(inp);
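The two numbers in the reported exception can be decoded to see what was at the start of the file. The sketch below assumes POI compares the first eight bytes of the stream as a little-endian long, which is consistent with the values in the message; the class name `HeaderSignatureDecoder` is hypothetical and uses only the JDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: turns the long values from the exception back into bytes.
final class HeaderSignatureDecoder {

    // Returns the eight bytes that produce this value when read as a little-endian long.
    static byte[] littleEndianBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(value)
                .array();
    }

    // Formats bytes as space-separated uppercase hex pairs.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long read = 7311066695147732796L;      // "read" value from the exception
        long expected = -2226271756974174256L; // "expected" value from the exception

        // The bytes actually found at the start of the file, shown as ASCII text.
        System.out.println(new String(littleEndianBytes(read), StandardCharsets.US_ASCII));
        // prints: <?xml ve

        // The signature the binary workbook reader was looking for.
        System.out.println(toHex(littleEndianBytes(expected)));
        // prints: D0 CF 11 E0 A1 B1 1A E1
    }
}
```

In other words, the stream handed to HSSFWorkbook begins with the text of an XML declaration rather than the signature of a binary workbook.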

Thanks for the update. We are waiting for your update on this.



Hi Vishal,


Thanks for your patience.

The issue was reported only recently, so it is still pending review and has not yet been resolved. The product team will investigate and fix it according to the development schedule, and we will let you know as soon as we have a definite update on its resolution. Please be patient and allow us a little time. We are sorry for the delay and inconvenience.

Hi Support,


Let me know if there is any update.

Thanks
Vishal

Hi Vishal,


Thanks for your patience.

I am afraid the earlier reported issue is not yet resolved, as the team has been busy fixing other previously reported high-priority issues. However, as soon as we have a definite update, we will let you know.

Your patience and understanding are greatly appreciated in this regard.

Hi Support,


Please provide an update on this.

Hi Vishal,


Thanks for your patience.

The issue was reported only recently, so it is still pending review and has not yet been resolved. Please note that issues are resolved on a first-come, first-served basis, as we believe it is the fairest policy for all customers. As soon as we have further updates, we will let you know.

Please provide an update on this. We purchased your license to implement our use cases, but this issue is blocking our development.


Thanks
Vishal

Hi Vishal,


Thanks for your patience.

I am afraid the issue is not yet resolved. However, I have again asked the product team to try to accommodate the issue in their development schedule, and as soon as we have further updates, we will let you know.

We are sorry for this delay and inconvenience.

Hi Vishal,


Thanks for your patience.

We have further investigated the earlier reported issue and it does not seem to be a bug. We produce a raw XML file, and formats such as Office 2003 XML are not supported by POI. However, you can use Aspose.Cells 16.11 to convert the raw XML file into XLSX format.

[Java]

com.aspose.cells.Workbook workbook = new com.aspose.cells.Workbook(myDir + "FA16_Size_Matrix_Artwork 1.5.16_new.xml");

workbook.save(myDir + "out1.xlsx", com.aspose.cells.SaveFormat.XLSX);


We therefore suggest using the latest stable version of POI (poi-bin-3.15-20160924) and the following code to read the converted XLSX file:

OPCPackage pkg = OPCPackage.open(new File(myDir + "out1.xlsx"));

XSSFWorkbook wb = new XSSFWorkbook(pkg);

//or

InputStream inp = new FileInputStream(myDir + "out1.xlsx");

org.apache.poi.xssf.usermodel.XSSFWorkbook wb = new org.apache.poi.xssf.usermodel.XSSFWorkbook(inp);

@vishu88in

We would like to share with you that you can now convert PDF to XLSX using Aspose.PDF for Java. Please use the following code snippet to convert PDF to XLSX:

PDF to XLSX in Java using Aspose.PDF

Document doc = new Document(dataDir + "input.pdf");
ExcelSaveOptions ex = new ExcelSaveOptions();
ex.setFormat(ExcelSaveOptions.ExcelFormat.XLSX);
doc.save(dataDir + "Testxlsx.xlsx", ex);