Find a PageNumber Field inserted into a Header

NTro · March 30, 2011, 5:16am

Hello
I’m still uncertain about PageNumber fields and the page layout engine.
When I have a PageNumber field inserted into a header paragraph (but have not kept a reference to it), and later want to find the actual value (i.e. the page number shown) for a paragraph on, say, page 3 of 6: is that possible?
Can I somehow read the value 3 from Field.Result for individual pages?
Sub-problem: how do I find the PageNumber field?
What I have so far is: See Code below.
I know that the rendering engine does not (up to now) provide any information about the dynamic page breaks.

But I just thought that, because I have a page number field, I could use it as a source of information.
So: is it possible or not? And how can I find the number field anyway?
Thanks for any hints.
Code ------------------------------------------------------------------------------------

Aspose.Words.Section origSec = (Aspose.Words.Section)myFoundPara.GetAncestor(NodeType.Section);
if (origSec != null)
{
    HeaderFooter hd = origSec.HeadersFooters[HeaderFooterType.FooterPrimary];
    if (hd != null)
    {
        // ??
        foreach (Node nd in hd.ChildNodes)
        {
            if (nd.NodeType == NodeType.FieldStart)
            {
                // DOES NOT EXIST:
                // Aspose.Words.Fields.Field fld = new Aspose.Words.Fields.Field(nd);
                // fld.Result == 3 ???
            }
        }
        // ?? hd.Paragraphs[0].ChildNodes
    }
}

alexey.noskov · March 30, 2011, 9:25am

Hi
Thanks for your request. I think you can try using the same approach as Adam suggested in this thread:
https://forum.aspose.com/t/66186
Hope this helps.
Best regards,

adam.skelton · March 30, 2011, 4:07pm

Hi there,
I’m afraid the link Alexey posted is to the old version of the code; you should use the new version found here.
In addition, you cannot find a single page number for a header or footer, as they can be repeated on many pages. For example, if you have a section with four pages and a primary header, a page field in that header is repeated on all four pages, making it impossible to find a single page number where the header resides. When you read the result of a field in a header or footer, you will always get the last page (“4” in the above situation).
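As for locating the field itself: field nodes sit inside the paragraphs of a header or footer, so a loop over its direct children will not see them. The following is only a minimal sketch, assuming the Aspose.Words `GetChildNodes(NodeType, bool)` overload with the recursive flag set and the `FieldStart.FieldType` property; the names `PageFieldLocator` and `FindPageFields` are made up for illustration.

```csharp
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Fields;

static class PageFieldLocator
{
    // Hypothetical helper: collects the FieldStart node of every PAGE field
    // found anywhere inside the given header or footer.
    public static List<FieldStart> FindPageFields(HeaderFooter headerFooter)
    {
        List<FieldStart> result = new List<FieldStart>();

        // 'true' makes the search recursive, so FieldStart nodes nested
        // inside paragraphs (and tables) are included.
        foreach (FieldStart start in headerFooter.GetChildNodes(NodeType.FieldStart, true))
        {
            if (start.FieldType == FieldType.FieldPage)
                result.Add(start);
        }

        return result;
    }
}
```

This only locates the field. Any value read from it is still subject to the limitation described above: one field in the header serves every page of the section.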
However, you could still find all pages which contain this header or footer. Using the PageNumberFinder class, you can get the parent section of a header or footer and call GetPage(Section) and GetPageEnd(Section), which return the start and end page numbers of the section (and therefore of the header or footer).
Please note you will need a few extra lines of code when dealing with a different first page header, even page header, etc. I think how they need to be handled is fairly straightforward.
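A rough sketch of that page-range lookup, under stated assumptions: `PageNumberFinder` is the helper class from the forum attachment (not part of the Aspose.Words library), it is assumed here to be constructed from a `Document`, and `GetPage`/`GetPageEnd` are assumed to return `int` page numbers. The wrapper `HeaderPageRange.PrintPages` is a made-up name, and different first page or even page headers are not handled.

```csharp
using System;
using Aspose.Words;

static class HeaderPageRange
{
    // Hypothetical helper: prints the range of pages on which a primary
    // header or footer is repeated, i.e. the page range of its section.
    public static void PrintPages(Document doc, HeaderFooter headerFooter)
    {
        // A header or footer always belongs to exactly one section.
        Section section = (Section)headerFooter.GetAncestor(NodeType.Section);
        if (section == null)
            return;

        // Assumption: the constructor takes the document to analyse.
        PageNumberFinder finder = new PageNumberFinder(doc);

        int firstPage = finder.GetPage(section);    // page where the section starts
        int lastPage = finder.GetPageEnd(section);  // page where the section ends

        Console.WriteLine("Header/footer appears on pages {0} to {1}.", firstPage, lastPage);
    }
}
```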
Thanks,

NTro · April 1, 2011, 5:12am

Hello, thanks for the link to the PageNumberFinder class.
I put it in my project, and it does exactly what I was looking for!
For the small documents I tested (about 10 pages), everything is OK.
Then, in my performance tests with a document of about 425 pages, I found performance problems.
Perhaps my document is overloaded with nodes/tables, I do not know…
Generation time is about 10 sec: still fast, no problem here.
Doing the page layout takes about 60 sec: still OK, no problem.
But in the PageNumberFinder I found two places with performance problems.
The first problem is, I guess, perhaps an error (?) in the code (converting a List to an array in a foreach loop for every call?). I guess my code resolved this (see tempRuns -> Remove/Clear).
The second problem is UpdateFields(), which in this case has to update the 127’000 fields (PAGE fields) generated in the finder code: this takes more than 20 minutes!

Could it be that the internal code of UpdateFields() contains a similar error (?) to the one with tempRuns?

Could you please look into this?
PS: The upper limit our project/program should handle in the end is actually twice as big as the test document mentioned (i.e. it goes up to about 800 generated pages, A4).
Best regards, NTro
Code sample from PageNumberFinder.FindPagesOfNodes(), modified and commented:

// --------------------------------------------------------------------------
// ((...))
// Workaround: Is it ok? Still correct, doing what it should do? I guess yes.
// -> Ok, fast, no performance problem.
Run[] runArr = tempRuns.ToArray();
foreach(Run r1 in runArr)
{
    r1.Remove();
}
tempRuns.Clear();
// Update the document to calculate the page numbers of each node. Due to a change in
// field update in recent versions this must be done different depending on the version of
// Aspose.Words used.
if (isNewFieldEngine)
{
    // NTro: Performance problem 2 here, unresolved!
    // -----------------------------------------------------
    // Test document: 425 pages, 59'000 paragraphs, 4865 tables; here nodes.Count = 178'000.
    // Generating the document was fast: about 10 sec.
    // fieldList.Count = 127'000 (i.e. that many Field objects were inserted above).
    // Time for UpdatePageLayout() or UpdateFields() called on the original document,
    // outside this PageNumberFinder: about 1 min (slow, but still OK, usable).
    // But this next line, UpdateFields() (i.e. over the 127'000 fields), takes > 25 min!
    // currentDoc.UpdateFields();
}
else
{
    currentDoc.UpdatePageLayout();
}
// -----------------------------------------------------

NTro · April 1, 2011, 5:14am

Sorry, I forgot the first few lines, with the original code for problem 1 from my previous post.
This is the original code, which I commented out:

//// Remove any temporary runs created above.
// foreach (Run run in tempRuns.ToArray())
// run.Remove();

adam.skelton · April 1, 2011, 5:53am

Hi Nathan,
Thanks for this additional information.
Yes, I agree there is an unnecessary call to ToArray in the foreach loop which should be removed. The way you modified it looks fine.
Regarding the update time, it’s hard to tell if this is expected or not. In order to find the page number of each field, the document has to be laid out in memory into lines and pages. Aspose.Words can do this pretty fast, but with this number of pages and the addition of so many fields it may take some time. I will do some further investigation myself and perhaps pass this on to the developer responsible for the field update. I will keep you informed.
Normally you would want to find the page numbers of all nodes in one go (by inserting many fields). This is much more efficient than running an update for every field. However, things may be faster in your situation if the class focuses only on a selected node or nodes.
If you explain generally how you are using the class and which nodes you are looking to find, I can most likely create some code which targets just those and reduces the time required to update the fields.
Thanks,