Apply Continuous Page Start Numbering C# .NET | Create Booklet of Several Word Documents, Convert & Merge to Single PDF

I am creating a booklet of documents by taking several Word documents, converting them to PDF, and then merging them into a single PDF document. I need continuous page numbering in the document, but need the location within the document where the number shows up to be determined by the MS Word template, not overlaid into the PDF. I tried using Aspose.Words to set the PageSetup.PageStartingNumber property at the start of each section in the two documents (I counted the pages for each section by making a temporary new document from it and checking the page count for that document), but it doesn’t seem to work - that is, the page numbering in a document still starts with Page 1 even when I have set the PageStartingNumber for the only section in that document to 4. It starts with Page 1 regardless of whether I save the document directly to disk as a .docx file or output it as bytes, convert to PDF and merge with other PDFs. Am I misunderstanding the usage of this property? How can I accomplish my goal of continuous page numbering? Thanks much for your assistance!

@shammann,

You can join all the Word documents together by using the Document.AppendDocument method and then save the final Document to PDF. Please refer to the following articles:

Joining and Appending Documents
Controlling How Page Numbering is Handled

Thank you! I’ve got page numbering working now. I can’t use Document.AppendDocument because my documents have different styles and appending causes them to share styles, so I instead output each doc to PDF separately. For the first section in each doc I set section.PageSetup.PageStartingNumber and set RestartPageNumbering to true; for the other sections in each doc I set RestartPageNumbering to false. That is working for the page number.
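The per-document approach described above could be sketched roughly as follows. This is only an illustration, not the poster's actual code: the class name ContinuousNumbering and its three methods are hypothetical, while Document.PageCount, PageSetup.PageStartingNumber, PageSetup.RestartPageNumbering and Document.UpdatePageLayout are Aspose.Words members. It assumes the output PDFs are written next to the input files.

```csharp
using System.Collections.Generic;
using System.IO;
using Aspose.Words;

public static class ContinuousNumbering
{
    // Pure helper: given each document's page count, return the page number
    // each document should start on so numbering runs continuously.
    public static int[] StartingPageNumbers(IReadOnlyList<int> pageCounts)
    {
        var starts = new int[pageCounts.Count];
        int next = 1;
        for (int i = 0; i < pageCounts.Count; i++)
        {
            starts[i] = next;
            next += pageCounts[i];
        }
        return starts;
    }

    // First section restarts at startPage; later sections continue from it.
    public static void ApplyStartPage(Document doc, int startPage)
    {
        bool isFirst = true;
        foreach (Section section in doc.Sections)
        {
            section.PageSetup.RestartPageNumbering = isFirst;
            if (isFirst)
                section.PageSetup.PageStartingNumber = startPage;
            isFirst = false;
        }
    }

    // Hypothetical driver: converts each input to its own PDF (assumed to be
    // saved beside the source file) with continuous page numbers.
    public static void SaveAll(IReadOnlyList<string> inputPaths)
    {
        var docs = new List<Document>();
        var counts = new List<int>();
        foreach (string path in inputPaths)
        {
            var doc = new Document(path);
            docs.Add(doc);
            counts.Add(doc.PageCount); // builds the page layout
        }

        int[] starts = StartingPageNumbers(counts);
        for (int i = 0; i < docs.Count; i++)
        {
            ApplyStartPage(docs[i], starts[i]);
            // The layout was already built by PageCount, so rebuild it
            // after changing the numbering and before rendering.
            docs[i].UpdatePageLayout();
            docs[i].Save(Path.ChangeExtension(inputPaths[i], ".pdf"));
        }
    }
}
```

For example, page counts of 3, 5 and 2 give starting numbers 1, 4 and 9. The separately saved PDFs still have to be merged afterwards with whatever PDF tool is in use.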

However, the NUMPAGES doesn’t work since it restarts for each document. I see the documentation at the Joining and Appending Documents link you cite above, and I see how it creates a bookmark and replaces NUMPAGES references with references to the bookmark. I can see how I can take out the part that creates a bookmark at the end of each section and just leave in the part that creates it at the end of the document. However, I don’t really understand how it’s setting the value for the bookmark, and I need to set it differently, based on a parameter I pass in and am not sure how to do that. Do you have any guidance that might assist?
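One possible way to drive the total from a parameter, instead of the bookmark technique, is to replace each NUMPAGES field with literal text. The sketch below is an assumption about how that might be done, not the approach from the linked article: the class TotalPages and the method ReplaceNumPages are hypothetical names, and it assumes doc.Range.Fields also covers fields in headers and footers. The inserted text may not pick up the exact character formatting of the original field, so the result should be checked against the template.

```csharp
using System.Linq;
using Aspose.Words;
using Aspose.Words.Fields;

public static class TotalPages
{
    // Replaces every NUMPAGES field in the document with the given total,
    // written as plain text at the field's position.
    public static void ReplaceNumPages(Document doc, int totalPages)
    {
        var builder = new DocumentBuilder(doc);

        // Snapshot the matching fields first, because removing a field
        // changes the underlying collection.
        var numPagesFields = doc.Range.Fields
            .Where(f => f.Type == FieldType.FieldNumPages)
            .ToList();

        foreach (Field field in numPagesFields)
        {
            // Position the builder just after the field, write the fixed
            // total there, then delete the field itself.
            builder.MoveToField(field, true);
            builder.Write(totalPages.ToString());
            field.Remove();
        }
    }
}
```

Calling TotalPages.ReplaceNumPages(doc, 42) on each document before saving it to PDF would then show 42 wherever the template used NUMPAGES.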

Many thanks!

PS - Sorry for the late response. I got pulled off for another project and only just was able to return to this.

@shammann,

Aspose.Words provides three ImportFormatMode options (UseDestinationStyles, KeepSourceFormatting and KeepDifferentStyles) that you can pass to the Document.AppendDocument method. Please use the one that suits your needs.

Please ZIP and upload your sample input Word document and your expected Word document (DOCX file) showing the correct behavior here for testing. Please create this expected document by using MS Word. We will then investigate the structure of your expected document to see how you want the final output to be generated. Thanks for your cooperation.

Thanks. I am using ImportFormatMode.KeepSourceFormatting, but the third and fourth documents end up having the pages end at different points in the created document than in the source. Here are the four documents I’m joining:
PolicyBook.zip (183.9 KB)

@shammann,

Thanks for sharing your input Word documents. Please also attach the following resources here for testing:

  • Aspose.Words generated output DOCX file showing the undesired behavior.
  • Your expected document, for our reference. We will investigate the structure of your expected document to see how you want the final output to be generated. You can create the expected document by using Microsoft Word.
  • A comparison screenshot highlighting (encircling) the problematic areas in the Aspose.Words generated output DOCX, relative to your expected document.
  • A standalone console application (source code without compilation errors) that helps us reproduce your problem on our end.

As soon as you get these pieces of information ready, we will start further investigation into your issue and provide you more information. Thanks for your cooperation.

I’ve attached the output as MergedPolicyBook_PageEndingsWrong.pdf. The page breaks aren’t in the right places - that is to say, they do not match the page breaks in the source documents. Note that the page NUMBERS in that document are what I intend - continuous page numbering with the numpages also correct.

I have also attached a document that is close to an “expected” document: MergedPolicyBook_PageEndingsCorrect.pdf. That is the document that results if I convert each document to PDF separately and then merge the PDFs. The page breaks are in the right places when I do that. However, the page numbers come out wrong in this case. I can’t provide you with the actual “expected” document that you requested because I am not expert enough in Word to know how to combine two documents that have different formatting and keep the source formatting (including page length) in both documents.

Page 20 is an example of a page that differs between the two documents. Note that in addition to the page break issue, there is a footer logo showing in the merged document that should not be showing there. I have included a comparison screenshot of page 20. There are also several other similar problems with page endings on pages 25 through the end.

I am working on the console app you requested. The output to pdf is custom, so I’m not sure if I can give you that, but it just outputs the doc as it appears in Word, so you may not need that.

MergedPolicyBook_PageEndingsWrong.pdf (203.4 KB)
MergedPolicyBook_PageEndingsCorrect.pdf (359.5 KB)

Page20Bottom_ExpectedVsActual.png (123.0 KB)

@shammann,

Regarding page numbering, you can use the following code to work around this issue:

Document doc1 = new Document("D:\\temp\\PolicyBook\\Chapter1.docx");
Document doc2 = new Document("D:\\temp\\PolicyBook\\Chapter2.docx");
Document doc3 = new Document("D:\\temp\\PolicyBook\\Chapter3.docx");
Document doc4 = new Document("D:\\temp\\PolicyBook\\Chapter4.docx");

doc1.AppendDocument(doc2, ImportFormatMode.KeepSourceFormatting);
doc1.AppendDocument(doc3, ImportFormatMode.KeepSourceFormatting);
doc1.AppendDocument(doc4, ImportFormatMode.KeepSourceFormatting);

foreach(Section sec in doc1.Sections)
{
    sec.PageSetup.RestartPageNumbering = true;
    sec.HeadersFooters.LinkToPrevious(false);
}

doc1.Save("D:\\temp\\PolicyBook\\18.7.pdf");

Please also check the following article:
Controlling How Page Numbering is Handled

Regarding the layout issues, we have logged this problem in our issue tracking system. The ID of this issue is WORDSNET-17115. We will further look into the details of this problem and will keep you updated on the status of correction. We apologize for your inconvenience.

Thanks. Yes, I am able to get page numbering working when I use AppendDocument, but since it is not an option to have the layout wrong, that unfortunately does not do me any good. It’s a good point though that I should set LinkToPrevious to false. Do you still need the console app, or was this sufficient to show you the layout issues? Please do be sure to let me know if the layout issues are fixed so that I can consider using AppendDocument.

@shammann,

Yes, if your code is different than the one shared in my previous post. Otherwise, it is fine.

The layout issues logged as WORDSNET-17115 are in the queue, pending analysis. We will inform you via this thread as soon as WORDSNET-17115 is resolved.

Thanks! My code is slightly different to get continuous page numbering. I’m also seeing slightly different page breaks with this than I did in my app, but I’m not sure why and if you fixed this example then it should be able to work in my app so I think this is fine for you to work with. Here’s your code changed slightly for continuous page numbering:

        License license = new License();
        var stream = new MemoryStream();
        var writer = new StreamWriter(stream);
        writer.Write("</License>");
        writer.Flush();
        stream.Position = 0;
        license.SetLicense(stream);
        Document doc1 = new Document("C:\\temp\\Chapter1.docx");
        Document doc2 = new Document("C:\\temp\\Chapter2.docx");
        Document doc3 = new Document("C:\\temp\\Chapter3.docx");
        Document doc4 = new Document("C:\\temp\\Chapter4.docx");

        doc1.AppendDocument(doc2, ImportFormatMode.KeepSourceFormatting);
        doc1.AppendDocument(doc3, ImportFormatMode.KeepSourceFormatting);
        doc1.AppendDocument(doc4, ImportFormatMode.KeepSourceFormatting);

        bool isFirstSection = true;

        foreach (Section sec in doc1.Sections)
        {
            if (isFirstSection)
            {
                sec.PageSetup.RestartPageNumbering = true;
                isFirstSection = false;
            }
            else
                sec.PageSetup.RestartPageNumbering = false;

            sec.HeadersFooters.LinkToPrevious(false);
        }

        doc1.Save("C:\\temp\\PolicyBook_18.7.pdf");

@shammann,

Thanks for the additional information. We will inform you via this thread as soon as WORDSNET-17115 is resolved.

Checking in: what is the status of the fix for this layout issue? The bug has been outstanding since July.

Thanks!

@shammann

There are actually several problems with your documents. Regarding WORDSNET-17115, the scenario of combining four documents is relevant only to what you describe as incorrect footer with a logo here:

In this case, I think it is your expectation that needs correction. The first paragraph on page 20 of the combined document defines the footer. The paragraph comes from the original doc2, so the footer matches doc2. We see no issues here.

There may be an issue with the footer of the next page, 21. It should be empty as there is no footer in doc3. However, you do not seem to be complaining about it. If this is an issue, then we may have to submit a new ticket for it.

However, we have submitted two separate issues (WORDSNET-17199 and WORDSNET-17200) for layout problems we see with Aspose.Words generated PDF output. They are reproducible just on conversion to PDF without even combining the documents. We have linked these two issues to your thread.

We believe that currently there is nothing to fix as per WORDSNET-17115. The incorrect page 20 footer issue is not a bug as Aspose.Words mimics the behavior of MS Word and the remaining issues should be covered by WORDSNET-17199 and WORDSNET-17200. However, we will revisit/review WORDSNET-17115 after fixing the other two issues.

Regarding WORDSNET-17199, the problem here is that on conversion of this simplified 17115a.zip (38.2 KB) to PDF, the contents of page 2 are closer to the top of the page in Aspose.Words generated output than in MS Word. Perhaps, Aspose.Words manages to put one more empty paragraph on page 1.

Regarding WORDSNET-17200, the problem here is that on conversion of this simplified 17115b.zip (25.2 KB) to PDF, the footer takes two lines in Aspose.Words output. In MS Word, it takes one line only.

We will inform you via this thread as soon as these issues are resolved or any further updates are available. We apologize for your inconvenience.

Thanks for the reply! Regarding your comments on page 20 and the following page and what I am / am not complaining about - every single way in which any of the documents does not look identical to the original in the merged document is a problem. I cited the issue on page 20 as an example only, because you requested that I point out a specific difference, but I cannot use your merged document feature unless the resulting combined document contains the original documents identical in every way, except for the change in the page numbering.

@shammann,

We have logged your concerns in our issue tracking system and will keep you posted on any further updates.

The issues you found earlier (filed as WORDSNET-17199) have been fixed in the Aspose.Words for .NET 19.7 and Aspose.Words for Java 19.7 updates.