I am trying to convert a .png to a PDF document object and then convert that PDF to Excel, which is exported using the following code. The .png to PDF step works (I even tested exporting that file); however, when I save the PDF document object to Excel, the image from the .png is not displayed in the Excel output file.
Any ideas?
MemoryStream ms = new MemoryStream();
finalOutputDoc.Save(ms, Aspose.Pdf.SaveFormat.Excel);
// Rewind the stream so the workbook is read from the start
ms.Position = 0;
Aspose.Cells.Workbook wb = new Workbook(ms);
wb.Save("mergedoutput.xlsx", FileFormatType.Xlsx, Aspose.Cells.SaveType.OpenInExcel, this.Response);
Hi Al Belmondo,
Thanks for contacting support.
I have tested the scenario and was able to reproduce the problem. I have logged it as PDFNEWNET-36478 in our issue tracking system. We will look further into the details and keep you updated on the status of the fix. Please be patient and allow us a little time. We are sorry for this inconvenience.
Hi Al Belmondo,
We noticed this issue only recently, so the development team needs a little time to investigate and determine its cause. As soon as we have made definite progress towards its resolution, we will be happy to update you on the status of the fix.
Hi Al Belmondo,
Thanks for your patience.
We have further investigated the issue reported earlier and it does not seem to be a bug. Please note that PDF to Excel worksheet conversion was developed to extract table data from a PDF document. This means the PDF should contain corresponding text to be represented in the Excel document. In your particular scenario, however, you are adding an image to a PDF and then trying to convert it into Excel format. For such scenarios, you may consider using Aspose.Cells to add the image directly to the Excel workbook.
If you have any further queries, please feel free to contact us.
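The Aspose.Cells route mentioned here could look roughly like the sketch below. It is not taken from the thread: it assumes a recent Aspose.Cells for .NET release, and the file names "logo.png" and "mergedoutput.xlsx" are placeholders. It places the image straight onto a worksheet instead of routing it through a PDF.

```csharp
using Aspose.Cells;

// Sketch only: put a PNG directly on a worksheet with Aspose.Cells.
// "logo.png" and "mergedoutput.xlsx" are placeholder file names.
class AddImageToWorkbook
{
    static void Main()
    {
        // A new workbook starts with one empty worksheet.
        Workbook wb = new Workbook();
        Worksheet sheet = wb.Worksheets[0];

        // Anchor the picture's upper-left corner at cell A1 (row 0, column 0).
        sheet.Pictures.Add(0, 0, "logo.png");

        // Write the workbook to disk in XLSX format.
        wb.Save("mergedoutput.xlsx", SaveFormat.Xlsx);
    }
}
```

In a web application the final Save call would be swapped for whichever Response-based overload your Aspose.Cells version provides.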
The issues you have found earlier (filed as PDFNEWNET-36478) have been fixed in Aspose.Pdf for .NET 9.0.0.
It doesn’t seem like the issue of pulling images in a pdf into excel has been resolved in this update. I am not seeing any of the images in my pdf in my excel sheets.
Thanks for the update…
Do you think you have plans in the future to pull images (along with the text which you currently support) within a pdf into excel? It would definitely be helpful…
The problem arises when I have a very large PDF file with a lot of pages. I love the PDF to Excel conversion you currently support, but now it seems I will have to walk through each PDF page in C# looking for images (which may also fall in between text) and manually create the Excel workbook sheets, which seems like a large undertaking.
In the meantime, do you have any sample code that scans the pages of a PDF and exports the text/images on each page to Excel sheets using a combination of Aspose.Pdf and Aspose.Cells?
Hi Al Belmondo,
Thanks for sharing the details.
I have shared the information with the development team and have asked them for feedback on whether we can support this feature. As soon as we have further updates, we will let you know.
Hi Al Belmondo,
I have further discussed this requirement with the development team. In order to properly understand it, can you please share some sample PDF files, Excel workbooks and images? This will help us in implementing this requirement.
Basically, just take any old .png and convert it to a PDF document using Aspose.Pdf. Then try to convert that PDF to Excel, and you will see the .png is not included in the Excel output.
Hi Al Belmondo,
We performed basic conversion testing following the approach described above. However, as per your comments, the Excel file will also contain text along with images, which is why I asked you to share some sample files.
See the attached PDF as an example of a PDF with an image that will not get pulled into Excel. When I simply take this PDF and convert it to Excel using the following code, the image is not pulled into Excel.
This is an issue b/c I would expect the image to also be pulled into excel… This works fine with text in the PDF… the issue is whenever there is an image it is not pulled into Excel
Aspose.Pdf.Document pdfDoc = new Aspose.Pdf.Document(file.InputStream);
MemoryStream ms = new MemoryStream();
pdfDoc.Save(ms, Aspose.Pdf.SaveFormat.Excel);
// Rewind the stream so the workbook is read from the start
ms.Position = 0;
Aspose.Cells.Workbook wb = new Workbook(ms);
wb.Save("mergedoutput.xlsx", FileFormatType.Xlsx, Aspose.Cells.SaveType.OpenInExcel, this.Response);
Hi Al Belmondo,
Thanks for sharing the additional information. As stated above, Aspose.Pdf currently does not support rendering images in PDF to Excel conversion. We have therefore logged a new feature request as PDFNEWNET-36628 in our issue tracking system for further investigation and implementation. We will notify you as soon as it is implemented.
We are sorry for the inconvenience caused.
Best Regards,
Thank you… looking forward to this update.
Hi Al Belmondo,
Thanks for your inquiry. I am afraid that, as the issue was logged only recently, it is still pending investigation in the queue along with other priority tasks. As soon as the investigation is completed, we will be in a good position to share an ETA with you.
We are sorry for the inconvenience caused.
Best Regards,
The issues you have found earlier (filed as PDFNET-36628) have been fixed in Aspose.PDF for .NET 23.7.