Get page number quickly

Hi all.

We want to get the page number of every paragraph in a large Word document.

I am doing it the way described here: https://forum.aspose.com/t/65265

The code below takes about 10 minutes to run. How can I make it faster?

using System;
using System.Collections.Generic;
using Aspose.Words;

public class CatalogInfo
{
    public int PageNum;
    public double Left;
    public double Top;
    public string Capiton;
    public OutlineLevel Outline;
}

public void split_test(string docPath)
{
    Aspose.Words.Document doc = new Aspose.Words.Document(docPath);

    // Get all paragraphs in the document.
    NodeCollection nc = doc.GetChildNodes(NodeType.Paragraph, true);

    Aspose.Words.DocumentBuilder builder = new Aspose.Words.DocumentBuilder(doc);

    Dictionary<Aspose.Words.Fields.Field, CatalogInfo> fieldList =
        new System.Collections.Generic.Dictionary<Aspose.Words.Fields.Field, CatalogInfo>();
    foreach (Aspose.Words.Paragraph p in nc)
    {
        if (p.ParagraphFormat.OutlineLevel != OutlineLevel.BodyText)
        {
            // Skip paragraphs in headers and footers.
            if (p.GetAncestor(NodeType.HeaderFooter) != null)
                continue;

            builder.MoveTo(p);

            CatalogInfo c = new CatalogInfo();
            c.Capiton = p.Range.Text;
            c.PageNum = 0;
            c.Outline = p.ParagraphFormat.OutlineLevel;
            c.Top = builder.PageSetup.LeftMargin; // HeaderDistance

            // Slow: each InsertField call takes 2-3 seconds.
            Aspose.Words.Fields.Field field = builder.InsertField("PAGE");
            fieldList.Add(field, c); // fieldList ends up with about 250 entries.
        }
    }

    doc.UpdatePageLayout(); // Slow, but only called once.

    foreach (KeyValuePair<Aspose.Words.Fields.Field, CatalogInfo> field in fieldList)
    {
        string number = GetFieldValue(field.Key.Start);

        if (number.Equals("XXX"))
            throw new Exception("Can't find the page number of a node inside a field or bookmark");

        int pageNum = int.Parse(number);
        field.Key.Remove();

        Console.WriteLine(field.Value.Capiton);
        Console.WriteLine(field.Value.Outline);
        Console.WriteLine(pageNum);
        Console.WriteLine(System.Environment.NewLine);
    }
}

// Returns the text of the last run between the field separator and the field end,
// i.e. the field's displayed result.
private static string GetFieldValue(Aspose.Words.Fields.FieldStart fieldStart)
{
    //StringBuilder builder = new StringBuilder();
    string rtn = "";
    bool isAtSeparator = false;
    for (Node node = fieldStart; node != null && node.NodeType != NodeType.FieldEnd; node = node.NextPreOrder(node.Document))
    {
        if (node.NodeType == NodeType.FieldSeparator)
            isAtSeparator = true;

        if (isAtSeparator && node.NodeType == NodeType.Run)
            rtn = node.GetText();
            //builder.Append(node.GetText());
    }

    return rtn;
}

----

Thanks.

Hi there,


Thanks for your inquiry.

You can try using the PageNumberFinder class attached to this thread here: https://forum.aspose.com/t/77148

Hopefully this is faster. Please let us know how it goes.

Thanks,

The issues you have found earlier (filed as WORDSNET-2978) have been fixed in this .NET update and this Java update.



The issues you have found earlier (filed as ) have been fixed in this update.