Is it possible to easily print out the DOM structure of a PDF document to a string? We basically just want to see a high-level document overview so we know what we’re working with and the best way to attempt parsing some documents.
I’m interested in a high-level document structure overview, something that looks similar to your DOM structure outline. Is something like this easily available in outline format? We have many different types of PDFs that we need to parse, and it would be nice if there were a way to see the high-level structure of the PDF object formats of each.
This is currently not supported, but for the sake of implementation, I have logged this
requirement as an investigation task in our issue tracking system as PDFNEWNET-34596.
We will investigate this requirement further
in detail and will keep you updated on its status.
We apologize for the inconvenience.