Adding TOC to existing PDF when the TOC will span multiple pages

Hi,


I’m currently following the instructions in Manipulate PDF Document in C#|Aspose.PDF for .NET to add a TOC to an existing PDF document. The PDF document that I am adding the TOC to is a combination of a number of pregenerated PDFs as well as some custom created PDFs.

The TOC is added fine following the steps in Add TOC in Existing PDF page except when the TOC content spills over into a 2nd page.

What do I need to do to allow the custom TOC to span over multiple pages?

Hi Chris,


Thanks for using our APIs.

Could you please describe the issue you see when the TOC spans two pages? For example, are the contents not displayed, or is the TOC formatting lost?

If possible, please share some sample PDF files so that we can test the scenario at our end. We are sorry for the inconvenience.

Hi,


When the TOC spans two pages, the doc.Save call fails with a System.NullReferenceException. The simplest way to reproduce this is to add a bottom margin to the heading in the code on the http://www.aspose.com/docs/display/pdfnet/Add+TOC+in+existing+PDF page.

Setting the margin to 150 keeps the 4 TOC items on one page; changing it to 300 forces the TOC onto a 2nd page, which causes the crash. Note that this also happens if you make the page height smaller or add enough TOC entries that the TOC naturally runs onto a 2nd page. I’ve attached our sample PDF, but it happens with any PDF I try.

The code used is below. It matches the code on the Aspose wiki except for the line heading2.Margin.Bottom = 300;
using Aspose.Pdf;
using Aspose.Pdf.Text;

var license = new License();
license.SetLicense("Aspose.Pdf.lic");

var doc = new Document(@"D:\temp\TocError\test.pdf");

// Get access to first page of PDF file
var tocPage = doc.Pages.Insert(1);

// Create object to represent TOC information
var tocInfo = new TocInfo();
var title = new TextFragment("Table Of Contents");
title.TextState.FontSize = 20;
title.TextState.FontStyle = FontStyles.Bold;

// Set the title for TOC
tocInfo.Title = title;
tocPage.TocInfo = tocInfo;

// Create string objects which will be used as TOC elements
var titles = new string[4];
titles[0] = "First page";
titles[1] = "Second page";
titles[2] = "Third page";
titles[3] = "Fourth page";
for (var i = 0; i < 4; i++)
{
    // Create Heading object
    var heading2 = new Heading(1);
    var segment2 = new TextSegment();
    heading2.TocPage = tocPage;
    heading2.Segments.Add(segment2);
    heading2.Margin.Bottom = 300;

    // Specify the destination page for the heading object
    heading2.DestinationPage = doc.Pages[i + 1];

    // Destination coordinate: top of the destination page
    heading2.Top = doc.Pages[i + 1].Rect.Height;

    // Text shown for this TOC entry
    segment2.Text = titles[i];

    // Add heading to page containing TOC
    tocPage.Paragraphs.Add(heading2);
}

doc.Save(@"D:\temp\TocError\test-output.pdf");


Hi Chris,


Thanks for sharing the details.

I have tested the scenario and was able to reproduce the problem. I have logged it as PDFNEWNET-37019 in our issue tracking system. We will look into the details and keep you updated on the status of the fix. Please be patient and give us a little time. We are sorry for the inconvenience.

Do you know of any workaround we can use to generate the TOC?

Hi Chris,


We have only recently noticed this problem, and until we have investigated it in detail we cannot determine its cause, so I am afraid we are not able to share a workaround at the moment. As soon as we have further updates, we will let you know.

Ok thanks.


Just in case others are having a similar problem, this is what we’ve used as a workaround. We estimated roughly how many TOC entries fit on a page before the TOC wraps and set that as a config value. We then do a test run, generating the TOC for our data against an intermediate document and forcing a new TOC page every X entries, and test-save the result to a stream to see if it crashes. If the save succeeds, we use that number of entries per page to generate our actual TOC. If it fails (this can happen if a TOC entry is long and wraps), we try a smaller number, but only down to a configured error limit to prevent endless retries.

The code we used is:
private bool AddTableOfContents(Document outputDocument, ICollection<TableOfContentsEntry> tableOfContentsEntries)
{
    var tocEntriesPerPage = this.config.TableOfContentsEntriesPerPage;
    var success = false;
    var retry = true;

    do
    {
        if (success)
        {
            // If a test run was successful then generate the TOC against the actual output document
            success = DoAddTableOfContents(outputDocument, tableOfContentsEntries, tocEntriesPerPage);
            retry = false;
        }
        else
        {
            // Do a test run generation of the TOC against an intermediate document
            using (var document = new Document())
            {
                document.Pages.Add(outputDocument.Pages);
                success = DoAddTableOfContents(document, tableOfContentsEntries, tocEntriesPerPage);
            }

            if (!success)
            {
                // Decrease the entries per page and try again - but stop if we have hit the error limit.
                tocEntriesPerPage--;
                retry = tocEntriesPerPage >= this.config.TableOfContentsEntriesPerPageErrorLimit;
            }
        }
    }
    while (retry);

    return success;
}

private bool DoAddTableOfContents(Document document, ICollection<TableOfContentsEntry> tableOfContentsEntries, int tocEntriesPerPage)
{
    // Determine how many TOC pages we expect and generate the blank ones within the document
    var expectedTocPageCount = (int)Math.Ceiling(tableOfContentsEntries.Count / (double)tocEntriesPerPage);
    var tocPages = new Page[expectedTocPageCount];
    for (var i = 0; i < expectedTocPageCount; i++)
    {
        tocPages[i] = document.Pages.Insert(this.config.TableOfContentsPagePosition + i);
    }

    // Go through a chunked list of the TOC entries adding to the relevant TOC page
    var count = 0;
    foreach (var chunkedEntries in tableOfContentsEntries.Chunk(tocEntriesPerPage))
    {
        var tocPage = tocPages[count];
        AddTableOfContentsPage(document, chunkedEntries, tocPage, expectedTocPageCount - 1, count == 0);
        count++;
    }

    try
    {
        // Test the save of the document - workaround for the TOC issue when spanning pages.
        using (var testStream = new MemoryStream())
        {
            document.Save(testStream);
        }
    }
    catch (NullReferenceException)
    {
        // NullReferenceException occurs due to an error within Aspose when the TOC page contains more content than will fit in a single page
        return false;
    }

    return true;
}

public class TableOfContentsEntry
{
    /// <summary>
    /// Gets or sets a value indicating whether the entry is a sub-item.
    /// </summary>
    public bool IsSubitem { get; set; }

    /// <summary>
    /// Gets or sets the page number.
    /// </summary>
    public int PageNumber { get; set; }

    /// <summary>
    /// Gets or sets the title.
    /// </summary>
    public string Title { get; set; }
}

// Extension methods must be declared in a static class; the class name here is arbitrary.
public static class EnumerableExtensions
{
    public static IEnumerable<List<T>> Chunk<T>(this IEnumerable<T> source, int max)
    {
        var result = new List<T>(max);
        foreach (var item in source)
        {
            result.Add(item);
            if (result.Count == max)
            {
                yield return result;
                result = new List<T>(max);
            }
        }

        // Return the final, partially filled chunk
        if (result.Count > 0)
        {
            yield return result;
        }
    }
}
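
For anyone who wants the back-off logic on its own, here is a minimal sketch with no Aspose dependency. It is not the code we run in production: the class name TocPaging, both method names, and the trialSave delegate are hypothetical, with trialSave standing in for "build the TOC on a copy of the document and save it to a MemoryStream". It also assumes that once a given entries-per-page value saves cleanly, the real run with the same value will too.

```csharp
using System;

public static class TocPaging
{
    // Number of TOC pages needed for entryCount entries (ceiling division).
    public static int PageCount(int entryCount, int entriesPerPage)
    {
        if (entriesPerPage < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(entriesPerPage));
        }

        return (entryCount + entriesPerPage - 1) / entriesPerPage;
    }

    // Counts down from start and returns the first entries-per-page value for
    // which trialSave succeeds, or null if every value down to lowerLimit fails.
    // trialSave is a hypothetical stand-in for the test save to a stream.
    public static int? FindEntriesPerPage(int start, int lowerLimit, Func<int, bool> trialSave)
    {
        var floor = Math.Max(1, lowerLimit); // never try fewer than one entry per page
        for (var n = start; n >= floor; n--)
        {
            if (trialSave(n))
            {
                return n;
            }
        }

        return null;
    }
}
```

For example, if the trial save only succeeds with 18 or fewer entries per page, FindEntriesPerPage(25, 5, n => n <= 18) returns 18, and PageCount(40, 18) then gives 3 TOC pages for 40 entries.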



Hi Chris,


Thanks for sharing the information.

Your approach seems to be correct, and it will be helpful for other users encountering similar issues. As soon as we have updates regarding the resolution of the earlier reported issue, we will let you know. Please note that, as a general rule, when the TOC entries reach the end of a page they should flow onto the subsequent page instead of crashing the application.

The issues you have found earlier (filed as PDFNEWNET-37019) have been fixed in Aspose.Pdf for .NET 9.4.0.

