I need to open a PDF file in Aspose.Words and then perform some operations using the Aspose.Words API, so I am using the following code to open the document:
PdfLoadOptions pdfLoadOptions = new PdfLoadOptions();
Document document = new Document(path, pdfLoadOptions);
However, I am unable to find the table of contents, list of tables, or table of figures in the opened document. Is there any option to recognize these fields from a PDF? I saved the document and compared the result: the TOC is imported, but only as plain text, so I cannot update it.
When you open a PDF document using Aspose.Words, the Table of Contents (TOC), Table of Figures, and similar fields may not be recognized as interactive elements. Instead, they are imported as plain text, which means you cannot update them as you would in a Word document.
Currently, Aspose.Words does not support the direct recognition or manipulation of TOC fields from PDF files. The TOC is typically generated in Word documents and may not retain its functionality when converted to PDF and then back to Word format.
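If you need a working TOC in the Word document, you might consider creating it programmatically after importing the content. Here's a basic example of how to insert a TOC field into a document using Aspose.Words:
Document doc = new Document("path/to/your/document.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
// Insert a TOC field at the start of the document; it collects Heading 1-3 paragraphs
builder.MoveToDocumentStart();
builder.InsertTableOfContents("\\o \"1-3\" \\h \\z \\u");
builder.InsertBreak(BreakType.PageBreak);
// Build the TOC entries and page numbers
doc.UpdateFields();
// Save the document
doc.Save("path/to/your/updated_document.docx");
In this example, InsertTableOfContents adds the TOC field and UpdateFields populates it. Note that a TOC field built this way only collects paragraphs that use heading styles or have outline levels. Since your original document is a PDF, the imported paragraphs may not carry such formatting, so you may need to apply it yourself and remove the plain-text TOC left over from the PDF.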
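A TOC field can only pick up paragraphs that use heading styles or outline levels, and paragraphs imported from a PDF may not have them. The following is a minimal sketch of one way to promote likely headings after import. The file names and the 16-point font size threshold are assumptions made for illustration only; a real document will probably need a different heuristic.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Loading;

class PromoteHeadings
{
    // Hypothetical threshold: text at least this large is treated as a heading.
    const double HeadingFontSize = 16.0;

    static void Main()
    {
        // "input.pdf" is a placeholder path.
        Document doc = new Document("input.pdf", new PdfLoadOptions());

        int promoted = 0;
        foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true))
        {
            // Skip paragraphs without any text runs.
            if (para.Runs.Count == 0)
                continue;

            // Assumption: the first run's font size is representative of the paragraph.
            if (para.Runs[0].Font.Size >= HeadingFontSize)
            {
                para.ParagraphFormat.StyleIdentifier = StyleIdentifier.Heading1;
                promoted++;
            }
        }

        Console.WriteLine($"Paragraphs promoted to Heading 1: {promoted}");
        doc.Save("with_headings.docx");
    }
}
```

Once paragraphs carry heading styles, a TOC field inserted into the document and refreshed with UpdateFields should be able to list them. Text that the PDF import places inside tables or text boxes may need separate handling.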
For further assistance, you may want to explore the Aspose.Words documentation or community forums for more specific use cases related to TOC handling.
If you have any more questions or need further clarification, feel free to ask!
@GFari
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): WORDSNET-27353
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
Please note that Aspose.Words is designed to work with MS Word documents. MS Word documents are flow documents, and their structure is very similar to the Aspose.Words Document Object Model. PDF documents, on the other hand, are fixed-page format documents. When a PDF document is loaded, its fixed-page structure is converted into the flow Document Object Model. Unfortunately, such a conversion does not guarantee 100% fidelity and can be quite resource-intensive.
Also, a PDF document has no actual TOC concept. A TOC is represented there as a set of cross-references, and they are exported as such to DOCX.
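To see what the PDF import actually produced, it can help to list the fields in the loaded document. The sketch below counts fields by type and reports whether any TOC field is present. The file name is a placeholder, and which field types appear will depend on the PDF and on the Aspose.Words version.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Fields;
using Aspose.Words.Loading;

class ListImportedFields
{
    static void Main()
    {
        // "input.pdf" is a placeholder path.
        Document doc = new Document("input.pdf", new PdfLoadOptions());

        // Count how many fields of each type exist in the imported document.
        var counts = new Dictionary<FieldType, int>();
        foreach (Field field in doc.Range.Fields)
        {
            counts.TryGetValue(field.Type, out int n);
            counts[field.Type] = n + 1;
        }

        foreach (KeyValuePair<FieldType, int> pair in counts)
            Console.WriteLine($"{pair.Key}: {pair.Value}");

        // A real TOC would show up as FieldType.FieldTOC.
        bool hasToc = counts.ContainsKey(FieldType.FieldTOC);
        Console.WriteLine(hasToc ? "TOC field found." : "No TOC field found.");
    }
}
```

If no FieldTOC entry is reported, the table of contents seen in the output is not an updatable field, which matches the behavior described above.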