Is it possible to get the page number of a node?

WolveFred · July 5, 2010, 10:14am

Hi,

Is it possible to obtain the page number of an Aspose Node (for example, a Paragraph)?

Thanks.

AndreyN · July 5, 2010, 11:19am

Hi

Thank you for your interest in Aspose.Words. An MS Word document is a flow document and does not contain any information about how it is laid out into lines and pages. Our rendering engine lays documents out into lines and pages.
Unfortunately, there is no public API that allows you to determine where a page starts or ends. There is also no way to find the position of a particular node on a page. Your request has been linked to the appropriate issue. You will be notified as soon as this feature is supported.
Best regards,

adam.skelton · July 5, 2010, 6:53pm

Hi there,
This is possible using the current public API, although it takes a little extra coding. Please find an example implementation below. This method has only been lightly tested, so if you run into an unexpected error, please post here and I will take a look.

private static int GetPageNumberOfNode(Node node)
{
    Document doc = (Document)node.Document;
    DocumentBuilder builder = new DocumentBuilder(doc);
    // Insert a temporary PAGE field at the node's position.
    builder.MoveTo(node);
    Field field = builder.InsertField(@"PAGE \* Arabic", "1");
    // Rebuild the layout so the field result reflects the actual page.
    doc.UpdatePageLayout();
    doc.UpdateFields();
    // Read the result, then remove the temporary field even if the lookup failed.
    string result = field.Result;
    field.Remove();
    if (string.IsNullOrEmpty(result))
        throw new Exception("Can't find the page number of a node inside a field or some special bookmarks");
    return int.Parse(result);
}

Thanks,

WolveFred · July 6, 2010, 2:44am

Many thanks!

But you sent me the GetFieldCode() method instead of the GetFieldValue() method. I’m trying to guess what its body should be.

Edit: OK, I suppose that it should be: return field.Result;

adam.skelton · July 6, 2010, 4:41am

Hi there,
I have edited my original post to include the correct GetFieldValue method; sorry for any inconvenience. Yes, you are correct: instead of the entire field code, only the result content between the FieldSeparator and FieldEnd should be returned.
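The GetFieldValue method itself does not appear in this thread. As a minimal sketch, assuming the field sits inside a single paragraph and contains no nested fields, it could collect the text between the FieldSeparator and the FieldEnd as described above. The class name FieldValueSketch is hypothetical, and the body is a reconstruction rather than the original code:

```csharp
using System.Text;
using Aspose.Words;
using Aspose.Words.Fields;

public static class FieldValueSketch
{
    // Returns the text between the FieldSeparator and the FieldEnd of a field.
    // Assumption: all field nodes are siblings in one paragraph, with no nested fields.
    public static string GetFieldValue(FieldStart fieldStart)
    {
        StringBuilder result = new StringBuilder();
        bool inResult = false;
        for (Node node = fieldStart; node != null; node = node.NextSibling)
        {
            if (node.NodeType == NodeType.FieldSeparator)
            {
                // Everything after the separator belongs to the field result.
                inResult = true;
                continue;
            }
            if (node.NodeType == NodeType.FieldEnd)
                break;
            if (inResult)
                result.Append(node.GetText());
        }
        return result.ToString();
    }
}
```

If the field has no separator, this sketch returns an empty string, so a caller should check for that before parsing the value as a number.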
Thanks,

WolveFred · July 6, 2010, 5:26am

OK, many thanks, that works.

Just one problem: as mentioned in the code comments, it is impossible to get the page number of a field. But that is exactly my case: I want to get the page number of a cross-reference (these are Word fields, although not yet Aspose Field objects, by the way).

So I had the idea of getting the page number of a node next to the cross-reference, and that works. But I am not sure I am picking the ideal node. If I take the parent Paragraph, the call always succeeds, but if that paragraph spans two pages, the page number can be wrong (it can be the previous page). So I am wondering what procedure I should use to pick the ideal node for getting the page number of the cross-reference.

WolveFred · July 6, 2010, 5:37am

Another comment: I noticed that the doc.UpdatePageLayout() method takes a long time, for example 3 or 4 s for a big, complex document (33 pages). That is a problem for my application because I have to get the page numbers of several nodes. The solution is easy (call doc.UpdatePageLayout() only once), and here it is:

/// <summary>
/// Get the page number of nodes
/// </summary>
public static List<int> GetPageNumberOfNodes(List<Node> nodeList)
{
    List<int> indexList = new List<int>();
    // Nothing to do for an empty list (nodeList[0] would throw otherwise).
    if (nodeList == null || nodeList.Count == 0)
        return indexList;
    Document doc = (Document)nodeList[0].Document;
    DocumentBuilder builder = new DocumentBuilder(doc);
    List<Field> fieldList = new List<Field>();
    // Insert one temporary PAGE field at each node.
    foreach (Node run in nodeList)
    {
        if (run.GetAncestor(NodeType.HeaderFooter) != null)
            throw new Exception("Can't find the page number of a node inside a header or footer, it will return the last page. This is the correct behaviour");
        builder.MoveTo(run);
        Field field = builder.InsertField(" PAGE ");
        fieldList.Add(field);
    }
    // Rebuild the layout once for all inserted fields.
    doc.UpdatePageLayout(); // Long operation
    // Read each field's result, then remove the temporary field.
    foreach (Field field in fieldList)
    {
        string number = GetFieldValue(field.Start);
        if (number.Equals("XXX"))
            throw new Exception("Can't find the page number of a node inside a field or bookmark");
        int pageNum = int.Parse(number);
        field.Remove();
        indexList.Add(pageNum);
    }
    return indexList;
}

adam.skelton · July 6, 2010, 6:11am

Hi there,
Thanks for this additional information. Could you please attach your template document here, and perhaps a short example of what you are looking to achieve, so I can test on my side and provide a suggestion?
Regarding the code you posted, it’s great that you were able to build on it further, well done. You need to be careful, though: if you test a large number of nodes at once, you may be inserting a large number of fields into the document at one time. This might have a stacking effect and cause the document page layout and also the page numbering to change.
The UpdatePageLayout method takes a good deal of time because it internally rebuilds the page layout of the document. We are looking into making it run faster, although it may be difficult to gain any speed as it is a complex algorithm that already runs quite efficiently.
You can find the thread in which I posted this code, and where this method originally came from, through this link here. In that implementation the method is developed to find a node from every page, but as there could be thousands to tens of thousands of nodes in a document, only one field is inserted at a time, using a binary search.
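As an illustration of that binary search idea, here is a minimal sketch, not the code from the linked thread. It assumes the nodes are indexed in document order, so the page number never decreases as the index grows. The names PageSearch, FindFirstNodeOnPage and pageOfNode are hypothetical; pageOfNode stands in for a call such as GetPageNumberOfNode, which inserts a single temporary field per probe.

```csharp
using System;

public static class PageSearch
{
    // Returns the index of the first node that falls on targetPage, or -1 if none does.
    // pageOfNode(i) is assumed to return the page number of the i-th node in document order.
    public static int FindFirstNodeOnPage(int nodeCount, int targetPage, Func<int, int> pageOfNode)
    {
        int low = 0;
        int high = nodeCount - 1;
        int found = -1;
        while (low <= high)
        {
            int mid = low + (high - low) / 2;
            int page = pageOfNode(mid); // one probe = one temporary field
            if (page < targetPage)
            {
                low = mid + 1;
            }
            else
            {
                // Remember a hit, then keep looking to the left for an earlier node.
                if (page == targetPage)
                    found = mid;
                high = mid - 1;
            }
        }
        return found;
    }
}
```

With this approach each page needs only about log2(n) probes instead of one field per node, which keeps the number of temporary fields in the document at any one time to a minimum.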
Thanks,

WolveFred · July 6, 2010, 7:52am

OK, here is my test document. It contains two cross-references: one at the top of the page, and a second in the fourth paragraph, in bold. The second is more problematic because its paragraph spans two pages.

Here is the test code:

// Get all Runs
List<Run> RunList = doc.GetChildNodes(NodeType.Run, true).Cast<Run>().ToList();
// Get all cross_references
List<Run> cross_reference_run = RunList.Where(r => r.Text.StartsWith(" REF _Ref")).ToList();
// Get all page numbers of cross-references
var page_number_list = WordUtility.GetPageNumberOfNodes(cross_reference_run.Cast<Node>().ToList());

(see GetPageNumberOfNodes() in my previous post)

Regarding your remark:
"This might have a stacking effect and cause the document page layout and also the page numbering to change."
You are right, this could be a problem. But I think there will not be many cross-references per document, perhaps 3 or 4, or 10 at most.

adam.skelton · July 6, 2010, 9:30am

Hi there,
For this situation you can use the code supplied below. It finds the FieldEnd, assuming the node passed to the method is inside a field. It then inserts the PAGE field at the next sibling of the FieldEnd. If there is no node after the FieldEnd, it creates a run to insert the field into. You might want to keep track of any temporary runs created this way and remove them at the same time as the temporary fields.
This code fragment replaces the “builder.MoveTo(run)” in the GetPageNumberOfNodes method.

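// tempRuns is assumed to be a List<Node> declared alongside fieldList.
FieldEnd endnode = FindFieldEndFromNode(run);
if (endnode == null)
    throw new Exception("No FieldEnd found at or after the node");
Node nextRun = endnode.NextSibling;
if (nextRun == null)
{
    // No node follows the field end: add a temporary empty run to hold the PAGE field.
    nextRun = new Run(doc);
    Paragraph para = endnode.ParentParagraph;
    para.InsertAfter(nextRun, endnode);
    tempRuns.Add(nextRun);
}
builder.MoveTo(nextRun);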

private static FieldEnd FindFieldEndFromNode(Node node)
{
    Node nextNode = node;
    while (nextNode != null)
    {
        if (nextNode.NodeType == NodeType.FieldEnd)
            return (FieldEnd)nextNode;
        nextNode = nextNode.NextPreOrder(nextNode.Document);
    }
    return null;
}

Thanks,

WolveFred · July 6, 2010, 9:54am

Many many thanks, that works perfectly.

I have only been posting on the Aspose forums since yesterday, and I am impressed by the responsiveness and efficiency of the team. Thanks

adam.skelton · July 6, 2010, 4:09pm

Hi there,
That’s great to hear. Thank you, please feel free to come back and ask a question any time.
Thanks,

aspose.notifier · December 3, 2012, 7:14am

The issues you have found earlier (filed as WORDSNET-3518) have been fixed in this .NET update and this Java update.

This message was posted using Notification2Forum from Downloads module by aspose.notifier.

aspose.notifier · February 3, 2013, 11:48am

The issues you have found earlier (filed as WORDSNET-2978) have been fixed in this .NET update and this Java update.
