I have written some code based on your samples to output the absolute x and y position of row text in a Word document. The code is shown below. The issue I am facing is that the y position output for the row text is incorrect.
The LayoutCollector.GetEntity method works only for Paragraph nodes and for indivisible inline nodes, e.g. BookmarkStart or Shape. It doesn't work for Run, Cell, Row or Table nodes, or for nodes within a header/footer.
If you need to navigate to a Run of text, you can insert a bookmark right before it and then navigate to the bookmark instead.
If you still face the problem, please ZIP and attach your input Word document and expected output here for testing. We will investigate the issue and provide you with more information on it.
The issue I think I have is that the evaluation version of Aspose puts some evaluation text in the docx, which shifts everything down (see below). Is there any way to get round this behaviour, as it's making it hard for me to evaluate the suitability of your product? Otherwise I can provide a zip. I'll have to work out how to do this and check whether I can share the Word doc, but I presume you're aware of this behaviour of the evaluation version of your component?
Please get the 30-day temporary license and apply it before importing the document into Aspose.Words' DOM. Please read the following article about applying a license.
Thank you, the 30 day temporary licence does indeed appear to return the correct information. I have 2 further questions:
You say that the LayoutCollector.GetEntity method works only for Paragraph nodes. Is the following code problematic when the Node passed is a Row object? (It seems to provide the correct results in my demo.)
Is it possible in code to determine the page number of the Node? I am worried about false positives being matched where the elements are vertically aligned and in the same section but on different pages.
If you need to navigate to a Cell node, you can move to a Paragraph node in this cell and then ascend to a parent entity. The same approach can be used for Row and Table nodes. Please check the moveXXX methods of the LayoutEnumerator class.
You can use the LayoutCollector.GetStartPageIndex method to get the page number where the node begins and the LayoutCollector.GetEndPageIndex method to get the page number where the node ends.
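To guard against the false positives described above, the page index can be compared before the Y coordinates. The following is a minimal sketch in plain Java and is not part of the Aspose.Words API: the class RowAlignment, the method sameLine, the sample values and the 1-point tolerance are hypothetical and chosen only for illustration. It assumes the page index has already been read with LayoutCollector.getStartPageIndex and the Y coordinate from the rectangle reported by the LayoutEnumerator.

```java
/** Hypothetical helper for illustration; not part of Aspose.Words. */
final class RowAlignment {
    private RowAlignment() {}

    /**
     * Returns true when two layout positions are on the same page and their
     * Y coordinates differ by no more than the given tolerance (in points).
     */
    static boolean sameLine(int pageA, double yA, int pageB, double yB, double tolerance) {
        if (pageA != pageB) {
            return false; // equal Y values on different pages are not an alignment
        }
        return Math.abs(yA - yB) <= tolerance;
    }

    public static void main(String[] args) {
        // Same page, Y values about half a point apart: treated as aligned.
        System.out.println(sameLine(5, 534.609, 5, 534.099, 1.0));
        // Identical Y values but different pages: not treated as aligned.
        System.out.println(sameLine(5, 534.609, 6, 534.609, 1.0));
    }
}
```

A small tolerance is used rather than an exact comparison because layout coordinates are floating-point values and two visually aligned entities may differ by a fraction of a point.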
Hi, I have hit a new problem in the layout enumerator and have created a zip file to show the issue. You can see in the document xxxx_asset_management.docx on p6 that "Landlord's Surveyor" and the text "a surveyor or member of a firm etc" look horizontally aligned, but using the Aspose tool I get the following info about the alignment:
a surveyor or member of a firm of surveyors who shall be a fellow or associate of the Royal Institution of Chartered Surveyors or the Incorporated Society of Valuers and Auctioneers or suitably experienced and such surveyor may be a person employed by the Landlord or a company which is a Group Company;
You are facing the expected behavior of Aspose.Words. You can use the following code example to get the height and width of the paragraph. Hope this helps you.
Document doc = new Document(MyDir + "rubicon_asset_management.docx");
String str = "a surveyor or member of a firm of surveyors who shall be a fellow or associate of the Royal Institution of Chartered Surveyors or the Incorporated Society of Valuers and Auctioneers or suitably experienced and such surveyor may be a person employed by the Landlord or a company which is a Group Company";

for (Paragraph para : (Iterable<Paragraph>) doc.getChildNodes(NodeType.PARAGRAPH, true))
{
    if (para.toString(SaveFormat.TEXT).contains(str))
    {
        LayoutCollector collector = new LayoutCollector(doc);
        LayoutEnumerator enumerator = new LayoutEnumerator(doc);

        // Position the enumerator on the layout entity of the paragraph.
        enumerator.setCurrent(collector.getEntity(para));
        enumerator.moveParent(); // move to container line
        System.out.println("Para width = " + enumerator.getRectangle().getWidth());
        double bottom = enumerator.getRectangle().getY() + enumerator.getRectangle().getHeight();

        // Step backwards until there is no previous sibling: move to the first line.
        while (enumerator.movePrevious())
        {
        }

        double top = enumerator.getRectangle().getY();
        System.out.println("Para height = " + (bottom - top));
        break;
    }
}
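The example above derives the paragraph height from the bottom of one line and the top of the first line. The same idea can be expressed as the union of the line rectangles, which also yields the width of the widest line rather than the width of a single line. The sketch below uses only java.awt.geom; the class ParagraphBounds, the method unionOf and the sample rectangles are hypothetical, and it assumes each line rectangle has already been collected from the enumerator as a java.awt.geom.Rectangle2D.

```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of Aspose.Words. */
final class ParagraphBounds {
    private ParagraphBounds() {}

    /** Returns the smallest rectangle that encloses every line rectangle. */
    static Rectangle2D unionOf(List<? extends Rectangle2D> lines) {
        if (lines.isEmpty()) {
            throw new IllegalArgumentException("at least one line rectangle is required");
        }
        Rectangle2D.Double bounds = new Rectangle2D.Double();
        bounds.setRect(lines.get(0));          // start from the first line
        for (Rectangle2D line : lines.subList(1, lines.size())) {
            bounds.add(line);                  // grow to include each further line
        }
        return bounds;
    }

    public static void main(String[] args) {
        // Made-up sample: two stacked lines, the second one shorter.
        List<Rectangle2D.Float> lines = Arrays.asList(
            new Rectangle2D.Float(100f, 50f, 200f, 12f),
            new Rectangle2D.Float(100f, 62f, 150f, 12f));
        Rectangle2D box = unionOf(lines);
        System.out.println("Para width = " + box.getWidth());
        System.out.println("Para height = " + box.getHeight());
    }
}
```

This only makes sense when all lines are on the same page, since rectangle coordinates are relative to the page they belong to.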
Thanks for the code sample. Unfortunately I am still having problems with the enumerator locating the correct x and y position. I added bookmarks to the Word document before the paragraphs containing the text "Landlord's Surveyor" and "a surveyor or member of a firm of surveyors etc". The x and y reported by the bookmarks are:
landlordSurveyorLH
X : 150.9499969482422
Y : 534.6090087890625
landlordSurveyor
X : 286.79998779296875
Y : 534.0989990234375
With the enumerator I get the following outputs:
Text = Landlord's Surveyor
Para width = 86.6500015258789
Para height = 10.850000381469727
x1 = 150.9499969482422
y1 = 503.04998779296875
x2 = 150.9499969482422
y2 = 415.1499938964844
Page start = 5
Page end = 5
Text = a surveyor or member of a firm of surveyors who shall be a fellow or associate of the Royal Institution of Chartered Surveyors or the Incorporated Society of Valuers and Auctioneers or suitably experienced and such surveyor may be a person employed by the Landlord or a company which is a Group Company;
Para width = 200.0
Para height = 12.050000190734863
x1 = 286.79998779296875
y1 = 618.6500244140625
x2 = 286.79998779296875
y2 = 295.45001220703125
Page start = 5
Page end = 5
We have tested the scenario and have managed to reproduce the same issue on our side. We have logged this problem in our issue tracking system as WORDSNET-21776. You will be notified via this forum thread once this issue is resolved.
We would like to inform you that the issue you are facing is actually not a bug in Aspose.Words, so we have closed this issue (WORDSNET-21776) as "Not a Bug".
Please use the latest version of Aspose.Words for Java (21.6) and the following code example to get the position of the bookmark and text.
public static void ExportLayoutContent(Document doc) throws Exception
{
    LayoutEnumerator enumerator = new LayoutEnumerator(doc);
    enumerator.reset(); // start from the first page

    // Print each top-level entity together with its whole subtree.
    while (true)
    {
        WriteCurrent(enumerator);
        if (!enumerator.moveNext())
            break;
    }
}

private static void WriteCurrent(LayoutEnumerator e) throws Exception
{
    // Entity type, kind and rectangle; the text is printed for spans only.
    System.out.println(e.getType() + " (" + e.getKind() + ")\t" + e.getRectangle() + "\t" + (e.getType() == LayoutEntityType.SPAN ? e.getText() : ""));

    // Recurse into the children, then return to the current entity.
    if (e.moveFirstChild())
    {
        do
        {
            WriteCurrent(e);
        }
        while (e.moveNext());
        e.moveParent();
    }
}

// Usage:
Document doc = new Document(MyDir + "rubicon_bookmarked.docx");
ExportLayoutContent(doc);
The following is the output of the code example. The position of the bookmark and the text is the same.