Thanks for your reply.
Attached are two PDF files that need to be converted to DOCX and HTML files using Aspose.PDF for Java. Aspose2Column.zip (882.5 KB)
Note: Tables should be editable tables, not images.
The entire output document (DOCX/HTML) should use a one-column layout.
Can you please convert them and send the Java code for both DOCX and HTML?
Regards,
Berlin
We have converted one of your PDF documents (white-paper-c11-737224.pdf) into DOCX and HTML using the following code snippet with Aspose.PDF for Java 19.12. The output files are attached for your reference. Please review them and share your feedback if you notice any anomaly, and we will proceed accordingly to assist you.
PDF to DOCX in Java
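import com.aspose.pdf.Document;
import com.aspose.pdf.DocSaveOptions;
// dataDir is the folder that holds the input PDF (defined elsewhere).
Document doc = new Document(dataDir + "white-paper-c11-737224.pdf");
DocSaveOptions options = new DocSaveOptions();
// Save as DOCX rather than the legacy DOC format.
options.setFormat(DocSaveOptions.DocFormat.DocX);
// Flow mode favors an editable result over exact page fidelity.
options.setMode(DocSaveOptions.RecognitionMode.Flow);
doc.save(dataDir + "19.12.docx", options);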
The issues were logged recently in our issue tracking system and are still pending analysis. We will investigate and resolve them on a first-come, first-served basis and let you know as soon as they are resolved. Please allow us some time.
We would like to share that your issues are being investigated and are expected to be resolved in Aspose.PDF for Java 20.3, which will be available at the end of March. We will inform you as soon as we have further updates in this regard. Please allow us a little time.
Regretfully, the tickets are not yet resolved. Implementing the required features depends on several internal components of the API and requires more time. For now, the tickets are under feedback status. We will inform you as soon as the required functionality is implemented. Please allow us some time.
I’ve recently purchased Aspose.Words and Aspose.PDF for .NET, and I’m having the same problem: some elements are converted to images during PDF to HTML conversion, most importantly tables. I’d like to know if there has been any progress on this request. On the blog, all I can see is that this topic was split into two posts, but I can’t find more information. I can see just one link (Issues while processing) docx…) without a solution, as far as I can tell.
Conversion from DOCX to HTML works as expected (it produces editable tables, that is, TABLE, TR and TD elements in the resulting HTML), but PDF to HTML does not.
In PDF to HTML conversion, tables are converted to images, and sometimes these ‘tables’ are merged with other objects in the same image. Our goal is to convert the PDF and insert it as HTML in a WYSIWYG editor (Kendo), so the user can edit the content if needed, while keeping it as close to the original as possible. Editable tables are the main concern so far.
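To tell whether a given HTML output contains real table markup or only a rasterized table, a quick check along the following lines might help. This is only a sketch using the Java standard library: the class name HtmlTableCheck and its methods are hypothetical and not part of any Aspose API, and the regex-based tag counting is a rough heuristic, not a full HTML parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an Aspose class): inspects converted HTML for table markup.
public class HtmlTableCheck {

    // Counts opening tags with the given name, ignoring case and attributes.
    public static int countOpeningTags(String html, String tagName) {
        Pattern pattern = Pattern.compile(
                "<" + Pattern.quote(tagName) + "(\\s[^>]*)?/?>",
                Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(html);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Treats the output as having an editable table only if TABLE, TR and TD are all present.
    public static boolean hasEditableTable(String html) {
        return countOpeningTags(html, "table") > 0
                && countOpeningTags(html, "tr") > 0
                && countOpeningTags(html, "td") > 0;
    }

    public static void main(String[] args) {
        // Sample strings for illustration only; real input would be the converted HTML file.
        String editable = "<table><tr><td>A</td><td>B</td></tr></table>";
        String rasterized = "<div><img src=\"page1.png\"/></div>";

        System.out.println(hasEditableTable(editable));          // true
        System.out.println(hasEditableTable(rasterized));        // false
        System.out.println(countOpeningTags(rasterized, "img")); // 1
    }
}
```

Running a check like this on the PDF to HTML output and on the DOCX to HTML output would make the difference described above easy to report: table elements in one, only IMG elements in the other.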
The ticket related to the issue you are facing is PDFJAVA-39095, and it is pending resolution. We have logged your comments under the ticket and will consider them during its investigation. We will inform you as soon as we have definite updates regarding its resolution. Please allow us some time.