Moving a paragraph to the top of the pages

One of our customers has many Word documents. Each document represents a work area, and at the end of the process we merge the documents. That part works fine.
The merged document (a company budget) is extremely complicated and has to be printed, so to make it readable they added the previous chapter and sub-chapter names at the top of each page as a context reference. Not a brilliant idea, but you know customers.
The problem, as you can imagine, is that when I merge the documents, the titles and sub-titles that should be at the top of each page can move, and as a side effect someone has to go and correct them by hand. Because the process is dynamic, this happens many times, so I would like to write some code that does one of the following:
1- Remove the chapter and sub-chapter and create them dynamically at the top of each page. This requires finding the current chapter and sub-chapter and repeating them at the top of each page.
How can I verify the position of the found paragraph and move it to the top of the next page?
2- Move the chapter and sub-chapter to the top of each page.
How can I move the cursor to the top of each page and add the corresponding text?

Many thanks for the help
Leo

@ahindila

The Aspose.Words.Layout namespace provides classes that let you find out on which page, and where on that page, particular document elements are positioned once the document is formatted into pages.

You can bookmark the paragraph and get its position using BookmarkStart and BookmarkEnd nodes.
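
As a rough illustration of the bookmark approach, the sketch below maps a BookmarkStart node to its layout position with LayoutCollector and LayoutEnumerator and prints the page and the rectangle of the line that contains it. The file name "input.docx" and the bookmark name "MyBookmark" are placeholders, not names from the attached documents, and the code has not been run against them, so treat it as a starting point rather than a verified solution.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Layout;

class BookmarkPosition
{
    static void Main()
    {
        // Placeholder names (assumptions): replace with your own file and bookmark.
        Document doc = new Document("input.docx");
        Bookmark bookmark = doc.Range.Bookmarks["MyBookmark"];
        if (bookmark == null)
        {
            Console.WriteLine("Bookmark not found.");
            return;
        }

        // Create the collector first so it records the node-to-layout mapping,
        // then build the page layout explicitly.
        LayoutCollector collector = new LayoutCollector(doc);
        doc.UpdatePageLayout();
        LayoutEnumerator enumerator = new LayoutEnumerator(doc);

        // Map the BookmarkStart node to its layout entity.
        object entity = collector.GetEntity(bookmark.BookmarkStart);
        if (entity == null)
        {
            Console.WriteLine("The bookmark could not be mapped to the layout.");
            return;
        }
        enumerator.Current = entity;

        // Walk up the layout tree until we reach the line that holds the bookmark.
        while (enumerator.Type != LayoutEntityType.Line && enumerator.MoveParent())
        {
        }

        // Rectangle is in points, relative to the top-left corner of the page.
        var rect = enumerator.Rectangle;
        Console.WriteLine($"Page {enumerator.PageIndex}: top = {rect.Top}, left = {rect.Left}");
    }
}
```

A small top value, close to the page's top margin, would suggest the bookmarked text sits on the first line of its page. The exact threshold depends on the page setup, so it is left to the caller.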

We suggest inserting a text box at the top of each page and putting your desired content into it.

It would be great if you could ZIP and attach your simplified input and expected output documents. We will then provide more information about your query.

Test.zip (177.9 KB)
I tried your suggestion but it doesn't work correctly. I attach the document and some code.
What I am doing is searching for the text “המשך”, which is Hebrew for "Cont."
I iterate through the paragraphs, find the paragraph that meets the condition, add a bookmark and save, but the bookmark does not end up exactly in the right place. Any suggestions?

var builder = new Aspose.Words.DocumentBuilder(doc);
// lc and bookmarkIndex were declared outside the posted snippet; shown here so it reads as a whole.
var lc = new LayoutCollector(doc);
int bookmarkIndex = 0;
NodeCollection paragraphs = doc.GetChildNodes(NodeType.Paragraph, true);
for (int i = 0; i < paragraphs.Count; i++)
{
    var paragraph = (Paragraph)paragraphs[i];
    try
    {
        if (paragraph.GetText().Contains(paragraphOptions.SearchText)) // condition
        {
            // Build the bookmark name from the start and end page indexes,
            // e.g. "11" means the paragraph starts and ends on page 1.
            var page = $"{lc.GetStartPageIndex(paragraph)}{lc.GetEndPageIndex(paragraph)}";
            var bk = $"BkInPage{page}index{bookmarkIndex}";
            for (int s = 0; s < doc.Sections.Count; s++) // in case there are multiple sections
            {
                try
                {
                    Debug.WriteLine($">>{paragraph.GetText()}  in section {s} in page {page}");
                    builder.MoveToSection(s);
                    var pIndex = paragraphs.IndexOf(paragraph);
                    builder.MoveToParagraph(pIndex, 0);
                    builder.StartBookmark(bk);
                    //builder.Writeln("Text inside a bookmark.");
                    builder.EndBookmark(bk);
                    break;
                }
                catch (Exception ex)
                {
                    Debug.WriteLine(ex.Message);
                }
            }
            bookmarkIndex++;
        }
    }
    catch (Exception ex)
    {
        Debug.WriteLine(ex.Message);
    }
}

@ahindila

To ensure a timely and accurate response, please attach the following resources here for testing:

  • Please attach the expected output Word file that shows the desired behavior.
  • Please create a standalone console application (source code without compilation errors) that helps us reproduce your problem on our end, and attach it here for testing.

As soon as you have these pieces of information ready, we will start investigating your issue and provide more information. Thanks for your cooperation.

PS: To attach these resources, please zip and upload them.

Attached are the original document and the created document. The code used is a modified version of your example for .NET, so I am also sending the class that creates the result. In the target document you can see My_Bookmark1000, the first bookmark created in the loop by the code; like all the rest, it does not land on the expected sentence. So I added in Word a first bookmark named “ExpectedBookmark” that is in the correct place. I hope it is clear now. What I decided to implement for the customer, as a first step, is an iterative validation process: each time they run the process, wherever the text is not in the right place (that is, not at the top of the page) they will get a bookmark, so they do not need to search page by page.
Bookmarks.zip (289.6 KB)

@kushnir_l-1

In your case, we suggest using the find and replace feature of Aspose.Words. Please implement the IReplacingCallback interface as shown below to get the desired output, and read the following article.

Find and Replace

Document doc = new Document(MyDir + @"original.docx");
FindReplaceOptions findReplaceOptions = new FindReplaceOptions();
findReplaceOptions.ReplacingCallback = new FindandInsertBookmark();

doc.Range.Replace("(המשך)", "", findReplaceOptions);

doc.Save(MyDir + "output.docx");

private class FindandInsertBookmark : IReplacingCallback
{
    int index = 1;
    ReplaceAction IReplacingCallback.Replacing(ReplacingArgs e)
    {
        // This is a Run node that contains either the beginning or the complete match.
        Node currentNode = e.MatchNode;

        // The first (and may be the only) run can contain text before the match, 
        // In this case it is necessary to split the run.
        if (e.MatchOffset > 0)
            currentNode = SplitRun((Run)currentNode, e.MatchOffset);

        ArrayList runs = new ArrayList();

        // Find all runs that contain parts of the match string.
        int remainingLength = e.Match.Value.Length;
        while (
            (remainingLength > 0) &&
            (currentNode != null) &&
            (currentNode.GetText().Length <= remainingLength))
        {
            runs.Add(currentNode);
            remainingLength = remainingLength - currentNode.GetText().Length;

            // Select the next Run node. 
            // Have to loop because there could be other nodes such as BookmarkStart etc.
            do
            {
                currentNode = currentNode.NextSibling;
            }
            while ((currentNode != null) && (currentNode.NodeType != NodeType.Run));
        }

        // Split the last run that contains the match if there is any text left.
        if ((currentNode != null) && (remainingLength > 0))
        {
            SplitRun((Run)currentNode, remainingLength);
            runs.Add(currentNode);
        }

        DocumentBuilder builder = new DocumentBuilder((Document)e.MatchNode.Document);
        builder.MoveTo((Run)runs[0]);
        builder.StartBookmark($"My Bookmark{index}");
        builder.Writeln("Text inside a bookmark.");
        var startBk = builder.EndBookmark($"My Bookmark{index}");
        index++;

        // Signal to the replace engine to do nothing, because we have already done everything we wanted.
        return ReplaceAction.Skip;
    }
}

private static Run SplitRun(Run run, int position)
{
    Run afterRun = (Run)run.Clone(true);
    afterRun.Text = run.Text.Substring(position);
    run.Text = run.Text.Substring(0, position);
    run.ParentNode.InsertAfter(afterRun, run);
    return afterRun;
}

I actually found a way to place the bookmarks correctly by iterating with the builder. But I still need a way to know where the text is on the page. To recap the issue: the text I am looking for should be on the first line of the page. I have been investing days in this, and even with the bookmarks I still don't have their top position. Any clue? Thanks

@ahindila

The Aspose.Words.Layout namespace provides classes that let you find out on which page, and where on that page, particular document elements are positioned once the document is formatted into pages.

You can use the LayoutCollector.GetStartPageIndex method to check whether your desired paragraph is the first paragraph on a page. This method returns the 1-based number of the page on which a node (e.g. a paragraph node) begins.

In your code, when you find the text “(המשך)”, please get the current paragraph and the paragraph before it using the Node.PreviousSibling property. Once you have these two paragraphs, use LayoutCollector.GetStartPageIndex to get the page number of each. If the page numbers are not the same, it means that the text “(המשך)” is on the first line of a page.
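
The check described above could look roughly like the sketch below. It is only an illustration: "original.docx" is a placeholder path, and comparing the end page of the previous node with the start page of the current paragraph is our assumption (it covers a previous paragraph that spans two pages) rather than the exact comparison described in the text. The result also depends on the fonts available on the machine, because they affect how the document is laid out into pages.

```csharp
using System;
using System.Linq;
using Aspose.Words;
using Aspose.Words.Layout;

class FirstOnPageCheck
{
    static void Main()
    {
        // Placeholder path (assumption); the marker text is the one used in this thread.
        Document doc = new Document("original.docx");
        LayoutCollector collector = new LayoutCollector(doc);

        var paragraphs = doc.GetChildNodes(NodeType.Paragraph, true).OfType<Paragraph>();
        foreach (Paragraph para in paragraphs)
        {
            if (!para.GetText().Contains("(המשך)"))
                continue;

            int page = collector.GetStartPageIndex(para);
            Node previous = para.PreviousSibling;

            // Without a previous sibling there is nothing to compare against,
            // so report the paragraph separately instead of guessing.
            if (previous == null)
            {
                Console.WriteLine($"Page {page}: no previous sibling, check manually.");
                continue;
            }

            // If the previous node ends on another page, this paragraph opens its page.
            bool startsPage = collector.GetEndPageIndex(previous) != page;
            Console.WriteLine($"Page {page}: starts the page = {startsPage}");
        }
    }
}
```

Paragraphs reported with "starts the page = False" are the ones where the marker has drifted away from the top of the page and would need a bookmark or a manual fix.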

It doesn't work. I don't understand why, but it looks like a bug.
Using your code on this document, I found that on the third page the first line with (המשך) is fine, but when I check the condition it doesn't work. Actually, the condition is not quite as you described, but rather:
if (lc.GetStartPageIndex(currentNode) != lc.GetStartPageIndex(currentNode.ParentNode.PreviousSibling))
I attach the code and document, and you can see that the previous sibling is reported on page 3 instead of page 2, as it should be. I thought it was a lack of sync, so I used the Transverse to check, but it says that the number 142857 is on the same page as the last (המשך).
I am exhausted. Maybe you can help me get the rectangle of each paragraph containing המשך? Any ideas? If I knew the position of the paragraph on the page, that would be fine.

Bookmarks.zip (31.7 KB)

@ahindila

Please make sure that you have installed the ‘David’ font. We suggest using the latest version, Aspose.Words for .NET 20.7, and applying the license before running the code. If you do not have a license, please get a 30-day temporary license and apply it.

The paragraph with the text ‘142857’ is on the 3rd page of the document. Please check the attached image for details. 3rd page.png (22.3 KB)

You can use the following code example to get the page of the desired text. Hope this helps.

Document doc = new Document(MyDir + "BB.docx");
LayoutCollector collector = new LayoutCollector(doc);
NodeCollection nodes = doc.FirstSection.Body.GetChildNodes(NodeType.Paragraph, true);
foreach (Paragraph paragraph in nodes.Cast<Paragraph>().Where(p => p.ToString(SaveFormat.Text).Contains("(המשך)")))
{
    Console.WriteLine(collector.GetEndPageIndex(paragraph));
    Console.WriteLine(collector.GetEndPageIndex(paragraph.PreviousSibling));
}