Printing out DOM structure to a string?

Is it possible to easily print out the DOM structure of a PDF document to a string? We basically just want to see a high-level overview of each document so we know what we're working with and the best way to approach parsing it.

Hi Mark,


Thanks for contacting support.

Can you please share some details regarding your requirement? In particular, which document elements/objects would you like to see in string format, and, if possible, why would you like to represent the document structure as a string?

This will help us in understanding your requirement and replying accordingly.

I'm interested in a high-level overview of the document structure, something that looks similar to your DOM structure outline. Is something like this easily available in outline format? We have many different types of PDFs that we need to parse, and it would be nice to have a way to see the high-level structure of the PDF objects in each.

Hi Mark,


Thanks for sharing the details.

I am afraid the requested feature is currently not supported. However, I have logged this requirement as an investigation task in our issue tracking system as PDFNEWNET-34596. We will investigate this requirement in detail and keep you updated on its status.

We apologize for the inconvenience.