Free Support Forum - aspose.com

Convert pdf to xml to json

  1. https://products.aspose.app/pdf/conversion/pdf-to-xml – I am not able to extract XML from here.

  2. Does that mean we can't use the product, or do we need to buy it before testing?

I can provide the PDF file if required.

I am trying to convert PDF to JSON. The conversion runs, but the output puts font sizes and coordinates before the words:

{"@x":"52.875","@y":"168.469","@width":"100.835","@height":"47.131","#text":
{"@width":"612","@height":"792","font":[{"@size":"42","@face":"RFPLAD+Arial","@src":

@vmamilla

Would you please provide a sample PDF document along with the expected output XML file in .zip format? We will test the scenario in our environment and share our feedback with you accordingly.

Hi Asad,

These are the documents. When I try to convert PDF to XML, I get a 404 error.

Thanks,
Venkat.

employeeguide.pdf (6.5 MB)
guide-to-fmla.pdf (2.3 MB)

@vmamilla

We converted your files using Aspose.PDF for .NET 20.11 and the following code snippet, and did not notice any issue.

// Load the source PDF
Document doc = new Document(dataDir + "guide-to-fmla.pdf");
// Save it in MobiXml format
doc.Save(dataDir + "guide-to-fmla.xml", SaveFormat.MobiXml);

guide-to-fmla.zip (4.1 MB)

If you are facing an issue with the Free Apps domain, please create a post in the respective forum, where you will be assisted accordingly.

Hi Asad,

My question is that I am not able to parse the file here: https://products.aspose.app/pdf/conversion/pdf-to-xml

From code, I am able to parse only 4 pages, and every element shows this kind of font size and coordinate data:

{"@x":"52.875","@y":"168.469","@width":"100.835","@height":"47.131","#text":
{"@width":"612","@height":"792","font":[{"@size":"42","@face":"RFPLAD+Arial","@src":

How can we remove these? I need to convert PDF to JSON. I am going from PDF to XML and then to JSON, and I am getting all the special characters and fonts.

If I try PDF to TXT from .NET, it also converts only one line:

Evaluation Only. Created with Aspose.PDF. Copyright 2002-2020 Aspose Pty Ltd. Guide to the Famil

I am using this code in .NET:

// Open document
Document pdfDocument = new Document(_dataDir + "demo.pdf");
TextAbsorber ta = new TextAbsorber();
ta.Visit(pdfDocument);
// Save the extracted text in text file
File.WriteAllText(_dataDir + "input_Text_Extracted_out.txt", ta.Text);

guide-to-fmla.pdf (2.3 MB)

Thanks,
Venkat.

@vmamilla

As requested earlier, you need to post this issue in the respective forum in order to get it addressed properly.

Furthermore, the 4-page limitation is due to the trial version. You can download a free 30-day temporary license in order to evaluate the API without any restriction.

The font and coordinate data appears because the API supports conversion to the MobiXml format only. We are afraid that you cannot convert to XML or JSON at the moment. However, we will further investigate the feasibility if you can share a sample of the expected output format.
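In the meantime, if the JSON you already produce from the MobiXml output is acceptable apart from the "@"-prefixed attribute keys, one possible workaround is to post-process it and keep only the "#text" values. The following is a minimal sketch and not part of Aspose.PDF: it assumes .NET 6 or later (for System.Text.Json.Nodes), the class name TextOnly is hypothetical, and the sample input is made up for illustration because the real output shape is not fully shown in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json.Nodes;

public static class TextOnly
{
    // Walk the JSON tree, skip "@"-prefixed attribute keys (font size,
    // coordinates, and so on) and collect every "#text" string value.
    public static void Collect(JsonNode? node, List<string> output)
    {
        switch (node)
        {
            case JsonObject obj:
                foreach (var pair in obj)
                {
                    if (pair.Key == "#text" && pair.Value is JsonValue value
                        && value.TryGetValue<string>(out var text))
                    {
                        output.Add(text!);
                    }
                    else if (!pair.Key.StartsWith('@'))
                    {
                        Collect(pair.Value, output);
                    }
                }
                break;
            case JsonArray array:
                foreach (var item in array)
                {
                    Collect(item, output);
                }
                break;
        }
    }

    public static void Main()
    {
        // Hypothetical input, shaped loosely like the fragments quoted above.
        string json = "{\"page\":{\"@width\":\"612\",\"text\":["
                    + "{\"@x\":\"52.875\",\"#text\":\"Guide\"},"
                    + "{\"@x\":\"60\",\"#text\":\"to FMLA\"}]}}";

        var words = new List<string>();
        Collect(JsonNode.Parse(json), words);
        Console.WriteLine(string.Join(" ", words)); // prints "Guide to FMLA"
    }
}
```

The sketch discards layout information entirely, so it only suits cases where the plain words are all you need in the final JSON.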

The single-line text output is also due to the trial version limitation. Please apply a valid license in order to use the API without any restriction.
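For reference, a license is usually applied once, before any document is loaded. The lines below are a minimal sketch that assumes the Aspose.PDF for .NET package is referenced; the license file name is a placeholder for wherever your temporary or purchased license file is stored.

```csharp
using Aspose.Pdf;

// Apply the license before creating any Document instance.
// "Aspose.PDF.lic" is a placeholder path, not a file shipped with the API.
License license = new License();
license.SetLicense("Aspose.PDF.lic");

// Documents opened after this point are processed under the license.
Document pdfDocument = new Document("demo.pdf");
```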