Can't convert PDF to XML

We tried the "Convert PDF to XML via Java | Aspose.PDF" example and got an error: Exception in thread "main" class com.aspose.pdf.exceptions.PdfException: Tagged pdf expected. Please use tagged pdf file for converting to xml format or use MobiXml for untagged pdf.

this is the code

@eyalsadeh Your question is related to Aspose.PDF, so I have moved your request into the appropriate forum category. My colleagues from Aspose.PDF team will help you shortly.


Can you please share what type of output XML you expect from the API? Can you also share your sample source and expected output files for our reference? We will investigate the feasibility and share our feedback with you.

We expect to get the text of the PDF, its location, size, and page, and other relevant data in the PDF. Our PDF is attached.
template.pdf (410.1 KB)

This is the code that we used.
import com.aspose.pdf.Document;
import com.aspose.pdf.SaveFormat;

public class Main {
    public static void main(String[] args) throws Exception {
        // load PDF with an instance of Document
        Document document = new Document("template.pdf");
        // save document in XML format
        document.save("output.xml", SaveFormat.Xml);
    }
}


Please let us know what we should do.
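
For reference, the output described above (text, location, size, page) could be shaped roughly like the XML produced by the sketch below. This is only an illustration of a possible target format. The class `TextLayoutXml`, the `TextItem` type, and the element and attribute names (`document`, `page`, `text`, `x`, `y`, `size`) are all hypothetical and are not part of Aspose.PDF. The sample values are made up.

```java
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical illustration only: none of these names come from Aspose.PDF.
class TextLayoutXml {

    // One run of text with its page number, position, and font size.
    static final class TextItem {
        final int page;
        final double x;
        final double y;
        final double size;
        final String text;

        TextItem(int page, double x, double y, double size, String text) {
            this.page = page;
            this.x = x;
            this.y = y;
            this.size = size;
            this.text = text;
        }
    }

    // Groups the items by page and writes one <page> element per page,
    // with one <text> child per item.
    static String toXml(List<TextItem> items) {
        Map<Integer, List<TextItem>> byPage = new TreeMap<>();
        for (TextItem item : items) {
            byPage.computeIfAbsent(item.page, k -> new ArrayList<>()).add(item);
        }
        StringWriter out = new StringWriter();
        try {
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartDocument();
            xml.writeStartElement("document");
            for (Map.Entry<Integer, List<TextItem>> entry : byPage.entrySet()) {
                xml.writeStartElement("page");
                xml.writeAttribute("number", Integer.toString(entry.getKey()));
                for (TextItem item : entry.getValue()) {
                    xml.writeStartElement("text");
                    xml.writeAttribute("x", Double.toString(item.x));
                    xml.writeAttribute("y", Double.toString(item.y));
                    xml.writeAttribute("size", Double.toString(item.size));
                    xml.writeCharacters(item.text); // the writer escapes &, <, >
                    xml.writeEndElement();
                }
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.writeEndDocument();
            xml.close();
        } catch (XMLStreamException ex) {
            throw new IllegalStateException("could not write XML", ex);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<TextItem> items = new ArrayList<>();
        items.add(new TextItem(1, 72.0, 700.5, 12.0, "Hello & welcome")); // made-up sample data
        items.add(new TextItem(2, 72.0, 650.0, 10.0, "Second page"));
        System.out.println(toXml(items));
    }
}
```

Grouping by page keeps each text run next to its coordinates, so a consumer can look up where any piece of text sits in the document.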


We are afraid that this feature is not yet available in the API. As the error message states, you can generate MobiXml from an untagged PDF instead, but it may not contain all the information that you need at the moment. Therefore, we have logged a feature request as PDFJAVA-43311 in our issue tracking system.

We will look into its details and keep you posted with the status of its resolution. Please be patient and spare us some time.

We are sorry for the inconvenience.

OK, if the PDF is simpler, will this feature work?


We are afraid that it would not work, because this feature has not been implemented in the API. Can you please share an expected output XML for our reference? It would help us in investigating the ticket.

OK, then why do you provide a code example that does this? Are you sure about this?

Basically, we need to see the locations of the text in the PDF document.


We are checking it and will get back to you shortly.