Hi,
We are trying to convert a PDF file into HTML using the following code.
Document doc = new Document(@"C:\Test.pdf");
doc.Save(@"c:\test.html", SaveFormat.Html);
It does the conversion and saves an HTML version, but the file contains no structural HTML tags, only "div" tags with "span" tags inside them.
For example, the two lines below would typically be represented as separate rows in a table, but the converted file just uses styles that position the data at different places on the page.
<div class="stl_01" style="left:35.2809em;top: 8.7022em; "><span class="stl_07 stl_08 stl_09" style="word-spacing:0.0027em;">Date: 18-Feb-2019 </span></div>
<div class="stl_01" style="left:36.1852em;top: 9.7918em; "><span class="stl_07 stl_08 stl_10" style="word-spacing:0.0025em;">Name: ABCDE FGHIJ </span></div>
Is there any option to convert the PDF to traditional HTML?