Remove Page Containing Table of Contents TOC Field from Word DOCX and PDF Files using C# .NET or Java

Hello,

We have been using an Aspose license for a long time. (Company Name: Everteam)
We are having trouble removing only the page that contains a table of contents from a PDF or a Word document.

For instance, if we have a PDF document (or Word document) with 10 pages and the first page (or any other page) contains a table of contents, we would like to remove that particular page and end up with a 9-page PDF.

Sample documents are attached: Documents Samples.zip (105.4 KB)

@Emilee,
For Aspose.PDF, you can delete a particular PDF page using this example:
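
The example referenced here did not survive in the thread. A minimal sketch of what deleting a page by number might look like with Aspose.PDF for .NET is given below. The file names `input.pdf` and `output.pdf` are placeholders, and page 1 is only assumed to be the TOC page for illustration.

```csharp
using Aspose.Pdf;

class RemovePdfPage
{
    static void Main()
    {
        // Placeholder paths; substitute your own files.
        Document pdfDocument = new Document("input.pdf");

        // Page numbers in the Pages collection are 1-based.
        // Page 1 is assumed to hold the TOC in this sketch.
        pdfDocument.Pages.Delete(1);

        pdfDocument.Save("output.pdf");
    }
}
```

With a 10-page input, the saved file should contain 9 pages.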

In case you do not know which page contains the TOC, you can search for a relevant keyword to find that page and then delete it:
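
The original example for this step is also missing from the thread. One possible sketch, assuming the TOC page can be recognized by a heading such as "Table of Contents" (a hypothetical keyword, as are the file names), uses `TextFragmentAbsorber` to locate matching pages and then deletes them:

```csharp
using System.Collections.Generic;
using System.Linq;
using Aspose.Pdf;
using Aspose.Pdf.Text;

class RemoveTocPageByKeyword
{
    static void Main()
    {
        // Placeholder paths and keyword; adjust them to your documents.
        Document pdfDocument = new Document("input.pdf");
        TextFragmentAbsorber absorber = new TextFragmentAbsorber("Table of Contents");

        // Search every page for the keyword.
        pdfDocument.Pages.Accept(absorber);

        // Record the page number of each match.
        List<int> matchedPages = new List<int>();
        foreach (TextFragment fragment in absorber.TextFragments)
            matchedPages.Add(fragment.Page.Number);

        // Delete from the highest page number down so that earlier
        // deletions do not shift the numbers of pages still to be removed.
        foreach (int pageNumber in matchedPages.Distinct().OrderByDescending(p => p))
            pdfDocument.Pages.Delete(pageNumber);

        pdfDocument.Save("output.pdf");
    }
}
```

Note that a plain keyword search also matches body pages that merely mention the phrase, so a more specific keyword or an extra check may be needed before deleting.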

For Aspose.Words, we will share our feedback soon.

@Emilee,

I am afraid there is no concept of a page in an MS Word document; pages are laid out on the fly when you open the document in MS Word. However, you can use Aspose.Words for .NET to remove any node (Paragraph, Shape, TOC field, etc.) on a particular page by using the following code:

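using System;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Fields;
using Aspose.Words.Layout;

Document doc = new Document("E:\\Temp\\Documents Samples\\file-sample_100kB.doc");

// Collect the TOC fields first so that removing them does not disturb the enumeration.
List<FieldToc> tocList = new List<FieldToc>();
foreach (Field field in doc.Range.Fields)
{
    if (field.Type == FieldType.FieldTOC)
        tocList.Add((FieldToc)field);
}

foreach (FieldToc toc in tocList)
{
    // LayoutCollector maps document nodes to the pages they are rendered on.
    LayoutCollector collector = new LayoutCollector(doc);
    int tocPageNumber = collector.GetStartPageIndex(toc.Start.ParentParagraph);
    Console.WriteLine("Removing TOC at Page# " + tocPageNumber);
    toc.Remove();

    // Iterate over a snapshot (ToArray) because paragraphs are removed inside the loop.
    foreach (Paragraph para in doc.GetChildNodes(NodeType.Paragraph, true).ToArray())
    {
        // Remove paragraphs that start on the TOC page and do not continue onto another page.
        if (collector.GetStartPageIndex(para) == tocPageNumber && collector.GetNumPagesSpanned(para) == 0)
            para.Remove();
    }
}

doc.Save("E:\\Temp\\Documents Samples\\20.4.docx");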

I hope this helps you achieve what you are looking for.