How to check whether the current SDT content is split across two pages?

AlpeshChaudhari12345 · January 22, 2023, 10:51am

Hi team,
How can I check whether the current SDT content is split across two or more pages?

Snippet:

Aspose.Words.Document doc = new Aspose.Words.Document(@"C:\SDT_Check.docx");
var sdts = doc.GetChildNodes(NodeType.StructuredDocumentTag, true)
    .Where(x => ((Aspose.Words.Markup.StructuredDocumentTag)x).Tag.StartsWith("Rangescop_"));

Attachments:
SDT_Check.docx (26.7 KB)

I also want to add a bookmark before the first child of the SDT on the first page and on each following page.

alexey.noskov · January 22, 2023, 7:57pm

@AlpeshChaudhari12345 You can use LayoutCollector to detect the page index where a node starts or ends. In your case, you can use code like the following to insert a bookmark into the SDT at the beginning of each page of the SDT's content:

Document doc = new Document(@"C:\Temp\in.docx");
LayoutCollector collector = new LayoutCollector(doc);

// Take the first SDT whose tag starts with "Rangescop_" (null if there is none).
StructuredDocumentTag sdt = doc.GetChildNodes(NodeType.StructuredDocumentTag, true).Cast<StructuredDocumentTag>()
    .FirstOrDefault(x => x.Tag.StartsWith("Rangescop_"));

// Check the page numbers where the SDT starts and ends.
int sdtPageStart = collector.GetStartPageIndex(sdt);
int sdtPageEnd = collector.GetEndPageIndex(sdt);

if (sdtPageStart != sdtPageEnd)
{
    // Split every Run in the SDT so that each one contains at most one word.
    List<Run> runs = sdt.GetChildNodes(NodeType.Run, true).Cast<Run>().ToList();
    foreach (Run r in runs)
    {
        Run current = r;
        while (current.Text.IndexOf(' ') >= 0)
            current = SplitRun(current, current.Text.IndexOf(' ') + 1);
    }

    // Now update page layout and reset LayoutCollector to work with the updated document model.
    collector.Clear();
    doc.UpdatePageLayout();

    NodeCollection updatedRuns = sdt.GetChildNodes(NodeType.Run, true);
    int currentPageIndex = -1;
    // Loop through the runs and detect where page index changes.
    for (int i = 0; i < updatedRuns.Count; i++)
    {
        // Insert a bookmark before the Run where the page index changes.
        Node currentRun = updatedRuns[i];
        int runPageIndex = collector.GetStartPageIndex(currentRun);
        if (runPageIndex != currentPageIndex)
        {
            BookmarkStart start = new BookmarkStart(doc, string.Format("sdt_start_{0}", runPageIndex));
            BookmarkEnd end = new BookmarkEnd(doc, start.Name);
            currentRun.ParentNode.InsertBefore(start, currentRun);
            currentRun.ParentNode.InsertBefore(end, currentRun);
            currentPageIndex = runPageIndex;
        }
    }
}

doc.Save(@"C:\Temp\out.docx");

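// Splits the run at the given character position and returns the new run that holds the text after it.
private static Run SplitRun(Run run, int position)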
{
    Run afterRun = (Run)run.Clone(true);
    run.ParentNode.InsertAfter(afterRun, run);
    afterRun.Text = run.Text.Substring(position);
    run.Text = run.Text.Substring(0, position);
    return afterRun;
}

out.docx (21.9 KB)
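
The code above handles only the first matching SDT. To check every SDT whose tag starts with "Rangescop_", the same two LayoutCollector calls can be run in a loop. This is a minimal sketch, assuming the same input path as above and that it is compiled as a C# top-level program; the console output format is purely illustrative.

```csharp
using System;
using System.Linq;
using Aspose.Words;
using Aspose.Words.Layout;
using Aspose.Words.Markup;

// Assumption: same input file as in the snippet above.
Document doc = new Document(@"C:\Temp\in.docx");
LayoutCollector collector = new LayoutCollector(doc);

// Collect every SDT whose tag starts with the "Rangescop_" prefix.
var sdts = doc.GetChildNodes(NodeType.StructuredDocumentTag, true)
    .Cast<StructuredDocumentTag>()
    .Where(x => x.Tag.StartsWith("Rangescop_"));

foreach (StructuredDocumentTag sdt in sdts)
{
    int startPage = collector.GetStartPageIndex(sdt);
    int endPage = collector.GetEndPageIndex(sdt);

    // The SDT is split when it starts and ends on different pages.
    bool isSplit = startPage != endPage;
    Console.WriteLine("{0}: pages {1}-{2}, split: {3}", sdt.Tag, startPage, endPage, isSplit);
}
```

The document is not modified inside the loop, so a single LayoutCollector instance is reused for all lookups.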

AlpeshChaudhari12345 · January 24, 2023, 9:00am

@alexey.noskov thanks…

AlpeshChaudhari12345 · January 31, 2023, 5:24am

collector.GetStartPageIndex(sdt);
This function takes a long time to find the page index. Is there any other way to find the page index?
Or how can I optimize this call?

alexey.noskov · January 31, 2023, 6:53am

@AlpeshChaudhari12345 As you may know, MS Word documents are flow documents and have no concept of a page. The consumer application (such as MS Word or OpenOffice) builds the document's page layout on the fly, and Aspose.Words does the same. Unfortunately, determining a node's page index requires building the document's layout, which can be quite time- and resource-consuming depending on the document's size and complexity. I am afraid there is no way to optimize this.

AlpeshChaudhari12345 · January 31, 2023, 7:16am

OK, thanks. How can I check whether the current table is split across multiple pages, and how can I get the index of the first row on the second page?

alexey.noskov · January 31, 2023, 7:25am

@AlpeshChaudhari12345 The same way - using LayoutCollector:

Document doc = new Document(@"C:\Temp\in.docx");
LayoutCollector collector = new LayoutCollector(doc);

// Get table.
Table table = doc.FirstSection.Body.Tables[0];

// Check whether the table is on one page.
bool isOnTheSamePage = collector.GetStartPageIndex(table) == collector.GetEndPageIndex(table);

To determine the first row index on the next page, you should loop through all rows in the table and compare their page start indices.
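
The row loop described above could look roughly like the following. This is a sketch only, assuming that GetStartPageIndex accepts Row nodes in the same way it accepts the Table node above, and that the table of interest is the first one in the first section; the variable name firstRowOnNextPage is introduced here for illustration.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Layout;
using Aspose.Words.Tables;

// Assumption: same input file and table as in the snippet above.
Document doc = new Document(@"C:\Temp\in.docx");
LayoutCollector collector = new LayoutCollector(doc);
Table table = doc.FirstSection.Body.Tables[0];

// Page on which the table begins.
int tableStartPage = collector.GetStartPageIndex(table);

// Index of the first row that starts on a later page; -1 if no row does.
int firstRowOnNextPage = -1;
for (int i = 0; i < table.Rows.Count; i++)
{
    if (collector.GetStartPageIndex(table.Rows[i]) > tableStartPage)
    {
        firstRowOnNextPage = i;
        break;
    }
}

Console.WriteLine(firstRowOnNextPage >= 0
    ? "First row on the next page has index " + firstRowOnNextPage
    : "No row starts on a later page.");
```

Note that a row which itself breaks across pages starts on the earlier page, so this sketch reports the first row that begins entirely on a later page.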

The code provided in this thread might be useful for you:
https://forum.aspose.com/t/how-to-insert-paragraph-before-table-continuation-with-headingformat/246739