How can I search for specific text, collect the matching nodes, and then get their pageIndex?
In the attached file I get the list of nodes, but when I call layoutCollector.GetEntity it returns null.
Thanks a lot in advance.
public static Dictionary<HeaderFooter, List<Node>> GetFootersWithNodes(Document doc, List<Node> nodes)
{
var layoutCollector = new LayoutCollector(doc);
var layoutEnumerator = new LayoutEnumerator(doc);
// Dictionary: footer -> list of nodes
Dictionary<HeaderFooter, List<Node>> footerNodesMap = new Dictionary<HeaderFooter, List<Node>>();
// Before the layout collector, we need to call the "UpdatePageLayout" method to give us
// an accurate figure for any layout-related metric, such as the page count.
layoutCollector.Clear();
doc.UpdatePageLayout();
foreach (Node node in nodes)
{
Section section = node.GetAncestor(NodeType.Section) as Section;
if (section == null) continue;
int pageIndex = ReportEngineHelper.GetNodePageIndex(layoutCollector, layoutEnumerator, node);
int firstPageIndex = ReportEngineHelper.GetNodePageIndex(layoutCollector, layoutEnumerator, section.Body.FirstChild);
bool differentFirstPage = section.PageSetup.DifferentFirstPageHeaderFooter;
bool oddEvenPages = section.PageSetup.OddAndEvenPagesHeaderFooter;
bool isFirstPage = (pageIndex == firstPageIndex);
HeaderFooterType footerType;
if (isFirstPage && differentFirstPage)
{
footerType = HeaderFooterType.FooterFirst;
}
else if (!isFirstPage && oddEvenPages && (pageIndex % 2 == 0)) // even page
{
footerType = HeaderFooterType.FooterEven;
}
else
{
footerType = HeaderFooterType.FooterPrimary;
}
HeaderFooter footer = section.HeadersFooters[footerType];
if (footer != null)
{
if (!footerNodesMap.TryGetValue(footer, out List<Node> nodeList))
{
nodeList = new List<Node>();
footerNodesMap[footer] = nodeList;
}
nodeList.Add(node);
}
}
return footerNodesMap;
}
private static int GetNodePageIndex(LayoutCollector layoutCollector, LayoutEnumerator layoutEnumerator, Node node)
{
object renderObject = layoutCollector.GetEntity(node.GetAncestor(NodeType.Paragraph)); //row.FirstCell.FirstParagraph);
layoutEnumerator.Current = renderObject;
//// Get the layout entity of the node
//object entity = layoutCollector.GetEntity(node);
//// Point the layoutEnumerator at the entity
//layoutEnumerator.Current = entity;
// Now you can read the PageIndex property
int pageIndex = layoutEnumerator.PageIndex;
return pageIndex;
}
@Jackomino
To find specific text in a document and collect nodes along with their page indices using Aspose.Words, you can utilize the LayoutCollector and LayoutEnumerator classes. The method GetStartPageIndex from the LayoutCollector class can be particularly useful for obtaining the page index of a node.
Here’s a refined approach based on your implementation:
- Ensure Layout is Updated: Before using the LayoutCollector, make sure to call doc.UpdatePageLayout() to ensure that the layout metrics are accurate.
- Get Node’s Page Index: Instead of trying to get the entity directly from the node, you can use the GetStartPageIndex method directly on the LayoutCollector, which returns the 1-based index of the page where the node begins.
Here’s an updated version of your GetNodePageIndex method:
private static int GetNodePageIndex(LayoutCollector layoutCollector, Node node)
{
// Use the GetStartPageIndex method to get the page index directly from the node
int pageIndex = layoutCollector.GetStartPageIndex(node);
return pageIndex;
}
Example Usage
Here’s how you can use this in your existing method:
foreach (Node node in nodes)
{
Section section = node.GetAncestor(NodeType.Section) as Section;
if (section == null) continue;
int pageIndex = GetNodePageIndex(layoutCollector, node);
// Continue with your logic to determine the footer type and map nodes...
}
Important Notes
- If layoutCollector.GetEntity(node) returns null, it might be because the node type does not correspond to a layout entity. In such cases, directly using GetStartPageIndex is more reliable.
- Ensure that the nodes you are passing to the GetNodePageIndex method are valid and can be mapped to a page.
This approach should help you effectively retrieve the page indices for the nodes you are interested in. If you continue to face issues, consider checking the types of nodes you are working with and ensure they are compatible with the layout collector’s methods.
@Jackomino LayoutCollector.GetEntity returns null if the node is in the document header/footer or inside a shape. Headers/footers are repeated on each page of the section, so their layout is built separately from the main body, which makes it difficult to provide layout information for them. The same applies to content inside shapes: shape layout is built separately from the main body, so LayoutCollector and LayoutEnumerator do not work with content inside shapes either.
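A minimal sketch of a guard based on the explanation above: it checks whether a node sits inside a header/footer or a shape before asking the collector for layout information. The class and method names (LayoutGuards, IsInMainBodyFlow, TryGetPageIndex) are hypothetical and not part of Aspose.Words. Returning 0 for unmapped nodes is a choice made here to mirror what, to my understanding, GetStartPageIndex itself reports for a node that cannot be mapped to a page.

```csharp
using Aspose.Words;
using Aspose.Words.Layout;

public static class LayoutGuards
{
    // Hypothetical helper: true when the node is NOT nested inside a
    // header/footer, a shape, or a group shape. GetAncestor only looks at
    // parents, so a Shape node that itself sits in the body still passes.
    public static bool IsInMainBodyFlow(Node node)
    {
        return node.GetAncestor(NodeType.HeaderFooter) == null
            && node.GetAncestor(NodeType.Shape) == null
            && node.GetAncestor(NodeType.GroupShape) == null;
    }

    // Hypothetical helper: returns the 1-based start page of the node,
    // or 0 when the node lies outside the main body flow (assumption:
    // 0 is used here as the "not mapped to a page" value).
    public static int TryGetPageIndex(LayoutCollector collector, Node node)
    {
        if (!IsInMainBodyFlow(node))
            return 0;

        return collector.GetStartPageIndex(node);
    }
}
```

With a guard like this, nodes found in footers or text boxes can be skipped or handled separately instead of producing a null entity.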
Thanks a lot for information, really helpful.
We decided not to use the footer, since it cannot differ from page to page without section breaks, and those would be hard to maintain in our Word templates. So now we create a floating TextBox shape to write into.
We’re struggling with the internal margins and the border line… but we keep pushing!
Jack
@Jackomino Do you mean the shape's internal margins and borders? If so, you can use the TextBox class properties to configure the margins, and the Shape.Stroke property to set the borders.
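As an illustration of that suggestion, here is a hedged sketch that sets the internal margins through the TextBox properties and draws a border through Shape.Stroke. The helper name CreateBorderedTextBox and all numeric values are assumptions chosen for the example; sizes are in points.

```csharp
using System.Drawing;
using Aspose.Words;
using Aspose.Words.Drawing;

public static class TextBoxStyling
{
    // Hypothetical helper: builds a floating text box with explicit
    // internal margins and a thin visible border. Values are illustrative.
    public static Shape CreateBorderedTextBox(Document doc, string text)
    {
        Shape box = new Shape(doc, ShapeType.TextBox)
        {
            Width = 400,
            Height = 20,
            WrapType = WrapType.None
        };

        // Internal margins: the gap between the frame and the text.
        box.TextBox.InternalMarginLeft = 2;
        box.TextBox.InternalMarginRight = 2;
        box.TextBox.InternalMarginTop = 0;
        box.TextBox.InternalMarginBottom = 0;

        // Border line: switch the stroke on, then set color and weight.
        box.Stroke.On = true;
        box.Stroke.Color = Color.Black;
        box.Stroke.Weight = 0.75;

        // A text box needs at least one paragraph to hold its text.
        box.AppendChild(new Paragraph(doc));
        box.FirstParagraph.AppendChild(new Run(doc, text));
        return box;
    }
}
```

To hide the border instead, setting box.Stroke.On = false should have the same visual effect as the transparent stroke color used elsewhere in this thread.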
@Jackomino So, have you managed to achieve what you need?
Unfortunately not. I’m trying to understand Word document layout better: I assumed that with BottomMargin + LeftMargin the reference point would be the one marked by the arrow in the attached image.

Using
marker = new Shape(document, ShapeType.TextBox)
{
Width = 400,
Height = 20,
Left = 0,
Top = 0, // Depends on page dimensions and margins
//RelativeHorizontalPosition = RelativeHorizontalPosition.Page,
//RelativeVerticalPosition = RelativeVerticalPosition.Page,
RelativeHorizontalPosition = RelativeHorizontalPosition.LeftMargin,
RelativeVerticalPosition = RelativeVerticalPosition.BottomMargin,
//HorizontalAlignment = HorizontalAlignment.Left,
//VerticalAlignment = VerticalAlignment.Top,
BehindText = false,
WrapType = WrapType.None,
TextBox =
{
InternalMarginTop = 0,
InternalMarginBottom = 0
},
FillColor = Color.Transparent,
StrokeColor = Color.Transparent // Use the "StrokeColor" property to set the color of the outline of the shape.
};
marker.AppendChild(new Paragraph(document));
marker.FirstParagraph.ParagraphFormat.Alignment = ParagraphAlignment.Left;//Center;
marker.FirstParagraph.AppendChild(new Run(document, textToShow));
The TextBox is not drawn at that point but in a different position.
I’m checking the “timing” of the TextBox creation.
@alexey.noskov, thanks a lot for your interest!
I noticed that the frame is just a shape added to the page, not a specific recognizable item. I’ll try to get footer top point and use it as reference point
@Jackomino If you would like to set an absolute position for the shape, you should use RelativeHorizontalPosition.Page and RelativeVerticalPosition.Page accordingly. In this case the Shape.Top and Shape.Left properties specify the absolute position of the shape, measured from the top and left edges of the page.
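A short sketch of page-relative positioning as described above. The helper name PlaceRelativeToPage is hypothetical. Resetting the alignment properties to None is an assumption on my part: as far as I know, an explicit alignment takes precedence over Left/Top, so clearing it keeps the offsets in effect.

```csharp
using Aspose.Words;
using Aspose.Words.Drawing;

public static class ShapePositioning
{
    // Hypothetical helper: positions a floating shape at an absolute
    // offset (in points) from the top-left corner of the page.
    public static void PlaceRelativeToPage(Shape shape, double leftPoints, double topPoints)
    {
        // The shape must float for Left/Top to apply.
        shape.WrapType = WrapType.None;

        // Measure both offsets from the page edges.
        shape.RelativeHorizontalPosition = RelativeHorizontalPosition.Page;
        shape.RelativeVerticalPosition = RelativeVerticalPosition.Page;

        // Assumption: clear any alignment so it does not override Left/Top.
        shape.HorizontalAlignment = HorizontalAlignment.None;
        shape.VerticalAlignment = VerticalAlignment.None;

        shape.Left = leftPoints;
        shape.Top = topPoints;
    }
}
```

For example, ShapePositioning.PlaceRelativeToPage(marker, ConvertUtil.MillimeterToPoint(20), ConvertUtil.MillimeterToPoint(270)) would put the marker 20 mm from the left edge and 270 mm from the top edge of the page; the millimeter values are only an illustration.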
Is there a way to get the position of a shape, created by someone in a Word document template, relative to the page corner?
That would convert the positioning data to absolute, page-relative positioning, the same conversion Word performs in the Layout > Position dialog (attached), which converts the values.
Our idea is to grab the shape the customer creates, clone it, remove it, and then insert the saved clone on each created page.
Thanks
@Jackomino If the shape is in the main document body, you can use LayoutCollector and LayoutEnumerator to get the absolute coordinates of the shape:
Document doc = new Document(@"C:\Temp\in.docx");
LayoutCollector collector = new LayoutCollector(doc);
LayoutEnumerator enumerator = new LayoutEnumerator(doc);
Shape s = (Shape)doc.GetChild(NodeType.Shape, 0, true);
enumerator.Current = collector.GetEntity(s);
Console.WriteLine(enumerator.Rectangle);
We’re using mail merge + LINQ Reporting.
I’m trying to simulate a watermark on each page so that I can specify a different text per page: the text for modified pages differs from the text for unmodified ones.
We start from a template where a shape is placed at a fixed position relative to the page.
Then we look for a specific non-printable character to identify each modified node, collect the node's page, and store it in a hash.
Then I loop over the main body paragraphs page by page and insert a clone of the template shape, placing it at the same coordinates on every page, with a different text depending on whether the page contains a modification.
int pageToInsert = dataFirstPageNumber;
foreach (Node nodeParagraph in document.GetChildNodes(NodeType.Paragraph, true))
{
var para = (Paragraph)nodeParagraph;
// Process only paragraphs in the main body.
if (para.GetAncestor(NodeType.Body) == null)
{
continue;
}
string modifiedMarkerText = GetModificationMarkerText(modifiedPageNumbers, pageToInsert, defaultModText, defaultNoModText);
if (!modifiedMarkerText.IsNullOrEmpty())
{
Shape shapeToUse = GeneratePipingClassReportTask.GetMarkerShapeToUse(document, shapeModMarker, modifiedMarkerText);
int paraPage = layoutCollector.GetEndPageIndex(para);
if (paraPage == pageToInsert)
{
pageToInsert++;
para.AppendChild(shapeToUse.Clone(true));
}
}
}
public static Shape GetShapeByTextContent(Document document, string textContext, ShapeType? shapeType = null)
{
Shape foundShape = null;
// Get all shapes in the document.
foreach (Node node in document.Sections)
{
IEnumerable<Shape> shapes = ((Section)node).Body.GetChildNodes(NodeType.Shape, isDeep: true).Cast<Shape>();
foreach (Shape shape in shapes)
{
if (shapeType != null)
{
if (shape.ShapeType != shapeType)
{
continue;
}
}
string shapeTextContent = shape.GetText().Replace("\r", string.Empty);
if (shapeTextContent != textContext)
{
continue;
}
foundShape = (Shape)shape.Clone(isCloneChildren: false);
foundShape.AppendChild(new Paragraph(document));
foundShape.FirstParagraph.ParagraphFormat.Alignment = ParagraphAlignment.Center;
foundShape.RelativeHorizontalPosition = RelativeHorizontalPosition.Page;
foundShape.RelativeVerticalPosition = RelativeVerticalPosition.Page;
foundShape.WrapType = WrapType.None;
foundShape.ZOrder = 100;
foundShape.BehindText = true;
foundShape.IsLayoutInCell = false;
shape.Remove();
}
}
return foundShape;
}
Unfortunately, on one page the shape is not displayed, and on another the shape is offset.
template
output ok
output non ok - missing
output non ok - offset
Sorry for the images, but I cannot share the data, and I don’t know how to share our flow with you so that you can better understand the workflow.
Thanks a lot again for the help you have given us, and for the help to come.
Best regards,
Jack
I noticed that in offset case the anchor point is changed
offset image.png (8.9 KB)
ok image.png (5.3 KB)
@Jackomino Unfortunately, it is difficult to analyze the problem by screenshots. If possible, please attach your problematic input and output documents. We will check the issue. You can anonymize the documents if they contain any confidential information. We simply need real documents for testing.
Hi,
here you can find
Anonim_template: the starting template using via MailMerge + ReportLinq. It contains the shape cloned and placed on result pages
Anonim_Result: the resulting report
Thanks a lot for the support !
Jack
Examples.zip (201.9 KB)
@Jackomino Thank you for the additional information. The problem occurs because the shape is anchored in a table cell. MS Word applies different layout rules to shapes in table cells. For older versions of MS Word, you can set the following property:
shape.IsLayoutInCell = false;
And optimize the document for an older version of MS Word, since this option does not work in newer versions:
doc.CompatibilityOptions.OptimizeFor(MsWordVersion.Word2010);
Alternatively you can adjust shape position according to the cell position, where shape is anchored. Something like this:
Document doc = new Document("C:\\Temp\\in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
LayoutCollector collector = new LayoutCollector(doc);
LayoutEnumerator enumerator = new LayoutEnumerator(doc);
foreach (Shape s in doc.GetChildNodes(NodeType.Shape, true))
{
if (s.GetAncestor(NodeType.Body) == null)
continue;
double left = 0;
double top = 0;
if (s.GetAncestor(NodeType.Cell) != null)
{
enumerator.Current = collector.GetEntity(s);
while (enumerator.Type != LayoutEntityType.Cell)
enumerator.MoveParent();
left = -enumerator.Rectangle.X;
top = -enumerator.Rectangle.Y;
if (s.RelativeHorizontalPosition == RelativeHorizontalPosition.Page)
s.Left += left;
if (s.RelativeVerticalPosition == RelativeVerticalPosition.Page)
s.Top += top;
}
}
doc.Save("C:\\Temp\\out.docx");
Unfortunately, there is no simple solution for this problem.
Hi, I tried the settings you suggested (shape.IsLayoutInCell = false; and CompatibilityOptions) and the position seems better, but now I have 2 shapes on the same page.
The section/paragraph spans many pages…
Now I will try the suggested code.
Thanks again
@Jackomino Most likely the document layout engine determines the page index incorrectly. This might occur if the fonts used in your document are not available in the environment where the document is processed. In this case the fonts are substituted, and because of the difference in font metrics the document layout might differ.
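One way to check whether font substitution is actually happening is to attach a warning callback before the layout is built. The sketch below is an assumption about how that check could look, not something taken from the attached documents; the class name FontSubstitutionLog and the helper CollectFor are hypothetical.

```csharp
using System.Collections.Generic;
using Aspose.Words;

// Hypothetical collector: records font-substitution warnings that
// Aspose.Words raises while it builds the page layout.
public class FontSubstitutionLog : IWarningCallback
{
    public List<string> Messages { get; } = new List<string>();

    public void Warning(WarningInfo info)
    {
        // Keep only warnings about substituted fonts.
        if (info.WarningType == WarningType.FontSubstitution)
            Messages.Add(info.Description);
    }

    // Hypothetical convenience method: attaches a log to the document,
    // forces a layout pass, and returns the collected warnings.
    public static FontSubstitutionLog CollectFor(Document doc)
    {
        var log = new FontSubstitutionLog();
        doc.WarningCallback = log;
        doc.UpdatePageLayout();
        return log;
    }
}
```

If the log is non-empty on the server but empty on a developer machine, the page index differences are probably caused by missing fonts rather than by the shape insertion logic.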