Missing nodes on pages in certain cases

Hi Team!

I would like to insert a text shape on every page of DOC and DOCX documents. I have implemented a simple algorithm for this and attached it to this post as a small demo project, but I see some strange behaviour when the document contains a paragraph that spans multiple pages.

You can run the attached demo project, which tries to add a shape to every page of 4 different input files and writes the output files to the project's output folder. Please look at the output files and the console logs. In the logs, always focus on the "Paragraph 2" section for each file, because that is the only paragraph in the input files. I guess the other 3 paragraphs were inserted by your library, since I used an unlicensed version of Aspose.Words. With a licensed library this should no longer happen, I think.

The problem is that in test1.doc and test2.doc some pages are skipped, i.e., they have no shape on them at the end of the process. My idea was to retrieve the start page index of the current node and, if there is no shape on that page yet, insert one. However, nodes are not guaranteed to line up one per page: I can see from the logs that for test1.doc, for example, the start page index of the 3rd node is 1 but its end page index is 3. I could handle this by taking the end page index into account in addition to the start page index, but what about page 2 in this case? Is there a way to retrieve a node from that page too, so that I can add a shape to it?
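The gap can be made concrete with a small, library-independent sketch. It does not call Aspose.Words; the (start, end) pairs are hypothetical stand-ins for the start and end page indices seen in the logs, assumed to be 1-based, and the function names are made up for illustration.

```python
from typing import List, Sequence, Tuple

PageSpan = Tuple[int, int]  # (start page index, end page index), assumed 1-based


def pages_without_node_start(spans: Sequence[PageSpan], page_count: int) -> List[int]:
    """Pages on which no node starts, so a 'start page only' pass skips them."""
    starts = {start for start, _ in spans}
    return [page for page in range(1, page_count + 1) if page not in starts]


def pages_only_crossed(spans: Sequence[PageSpan]) -> List[int]:
    """Pages that some node passes through without any node starting or ending there."""
    boundaries = {page for span in spans for page in span}
    crossed = {page for start, end in spans for page in range(start + 1, end)}
    return sorted(crossed - boundaries)


# Hypothetical data shaped like the test1.doc case: the 3rd node starts on
# page 1 and ends on page 3.
spans = [(1, 1), (1, 1), (1, 3)]
missed_by_start_only = pages_without_node_start(spans, page_count=3)  # [2, 3]
still_missed_with_end = pages_only_crossed(spans)                     # [2]
```

With these made-up values, checking the end page index recovers page 3, but page 2 stays unreachable because no node starts or ends on it.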

The situation is similar for test2.doc. For test2.docx and test3.doc, however, the output files contain the shape on every page, i.e., there are nodes on every page, although these documents also consist of a single paragraph. What causes this difference compared to the other 2 files?

The issue occurs on both Linux and Windows. I used the latest Aspose.Words library (22.4). MS Office 2016 on Windows was used to create and open the input and output documents.

Could you please provide any help or a solution for this issue, or is it a problem related to the structure of Word documents that cannot be resolved?

Thank you in advance,
Tamas Boldizsar

word_watermark.zip (95.3 KB)

@tamas.boldizsar The problem occurs because your documents contain Run nodes with very long text that spans several pages. See the screenshot.

So, you should split the run into parts to get the desired output. For example, you can use code similar to the code suggested here.
Also, if you simply need to insert a watermark into the document, you can use the built-in feature Watermark.SetText. But in this case the watermark is inserted into the document header and appears in the background.
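The splitting step itself can be sketched independently of the library. This is not the code referred to above and it does not touch Aspose.Words; the function name split_text and the max_len value are assumptions chosen for illustration. The idea is that each chunk would become its own run, so that shorter runs are less likely to span a whole page.

```python
from typing import List


def split_text(text: str, max_len: int) -> List[str]:
    """Split text into consecutive chunks of at most max_len characters.

    Breaks just after a space when one is available inside the window,
    otherwise cuts mid-word. Joining the chunks restores the original text.
    """
    if max_len < 1:
        raise ValueError("max_len must be positive")
    chunks: List[str] = []
    pos = 0
    while pos < len(text):
        end = min(pos + max_len, len(text))
        if end < len(text):
            # Prefer the last space inside the window so words stay whole.
            cut = text.rfind(" ", pos, end)
            if cut > pos:
                end = cut + 1  # keep the space with the preceding chunk
        chunks.append(text[pos:end])
        pos = end
    return chunks


# Usage with a hypothetical chunk size of 8 characters.
parts = split_text("aaa bbb ccc ddd", 8)  # ["aaa bbb ", "ccc ddd"]
```

A suitable chunk size depends on the font and page layout, so it would need tuning against the actual documents.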


@alexey.noskov

Thank you for the suggestion! I will try it.

Best regards,
Tamas Boldizsar
