I have attached a Word document as a sample input document. I have highlighted the last word of each line in yellow. I would like to extract these words.
Sampeldoc.docx (16.2 KB)
@tanzeel123 Unfortunately, there is no easy way to achieve this. Working with document layout is quite difficult, since MS Word documents are flow documents and do not contain any information about document layout. The consumer application builds the document layout on the fly.
Aspose.Words provides the LayoutCollector and LayoutEnumerator classes, which allow you to work with document layout. In your case, you need to split the Run nodes in each paragraph into parts so that each Run contains only one word. You can use the following method to split a Run node into two parts:
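// Splits the Run at the given character position and returns the new Run holding the trailing text.
private static Run SplitRun(Run run, int position)
{
    // Clone the Run so that the new part keeps the same formatting.
    Run afterRun = (Run)run.Clone(true);
    run.ParentNode.InsertAfter(afterRun, run);
    // The clone keeps the text after the position; the original keeps the text before it.
    afterRun.Text = run.Text.Substring(position);
    run.Text = run.Text.Substring(0, position);
    return afterRun;
}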
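As a rough illustration of how this method could be applied, the sketch below walks over every Run in the document and splits it after each space, so that each resulting Run holds at most one word. The names RunSplitter and SplitRunsIntoWords are hypothetical, and the sketch assumes that words are separated by plain spaces only (tabs, non-breaking spaces and hyphenation points are not handled).

```csharp
using Aspose.Words;

static class RunSplitter // hypothetical helper class, not part of Aspose.Words
{
    // Same splitting logic as the SplitRun method shown above.
    private static Run SplitRun(Run run, int position)
    {
        Run afterRun = (Run)run.Clone(true);
        run.ParentNode.InsertAfter(afterRun, run);
        afterRun.Text = run.Text.Substring(position);
        run.Text = run.Text.Substring(0, position);
        return afterRun;
    }

    // Splits every Run so that each resulting Run ends right after a space.
    // Assumption: words are delimited by the ' ' character only.
    public static void SplitRunsIntoWords(Document doc)
    {
        // ToArray() takes a snapshot, so inserting new Runs does not disturb the loop.
        foreach (Node node in doc.GetChildNodes(NodeType.Run, true).ToArray())
        {
            Run current = (Run)node;
            while (true)
            {
                int space = current.Text.IndexOf(' ');
                // Stop when there is no space left or the space is the last character.
                if (space < 0 || space == current.Text.Length - 1)
                    break;
                // Keep the space with the leading word and continue with the remainder.
                current = SplitRun(current, space + 1);
            }
        }
    }
}
```

Calling RunSplitter.SplitRunsIntoWords(doc) before building the layout should leave each word in its own Run, with the trailing space kept at the end of that Run.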
Then you can insert temporary bookmarks between the Run nodes and use them as markers for determining coordinates:
Document doc = new Document(@"C:\Temp\in.docx");
LayoutCollector collector = new LayoutCollector(doc);
LayoutEnumerator enumerator = new LayoutEnumerator(doc);
// Determine the coordinates of the bookmark.
enumerator.Current = collector.GetEntity(doc.Range.Bookmarks["test"].BookmarkStart);
Console.WriteLine(enumerator.Rectangle);
Comparing the Y coordinates of the temporary bookmarks gives a hint about where the line breaks are; this way you can determine the last Run in each line.
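Putting the pieces together, the comparison could be sketched roughly as follows. This is only an outline under several assumptions: every Run already contains a single word, the text of interest is in the main document body (GetEntity may return null for nodes in headers, footers or other places the collector does not map), and a change of more than YTolerance points in the Y coordinate means a new line. LineEndFinder, GetLastWordsOfLines, YTolerance and the "tmp_" bookmark names are hypothetical, and the tolerance value may need tuning for your documents.

```csharp
using System;
using System.Collections.Generic;
using Aspose.Words;
using Aspose.Words.Layout;

static class LineEndFinder // hypothetical helper class, not part of Aspose.Words
{
    private const float YTolerance = 0.5f; // assumed threshold, in points

    // Assumes every Run in the document already holds a single word.
    public static List<string> GetLastWordsOfLines(Document doc)
    {
        Node[] runs = doc.GetChildNodes(NodeType.Run, true).ToArray();
        List<BookmarkStart> starts = new List<BookmarkStart>();
        List<Node> temporaryNodes = new List<Node>();

        // Wrap every Run in a temporary bookmark to use as a layout marker.
        for (int i = 0; i < runs.Length; i++)
        {
            string name = "tmp_" + i;
            BookmarkStart start = new BookmarkStart(doc, name);
            BookmarkEnd end = new BookmarkEnd(doc, name);
            runs[i].ParentNode.InsertBefore(start, runs[i]);
            runs[i].ParentNode.InsertAfter(end, runs[i]);
            starts.Add(start);
            temporaryNodes.Add(start);
            temporaryNodes.Add(end);
        }

        // Create the layout objects after the bookmarks have been inserted.
        LayoutCollector collector = new LayoutCollector(doc);
        LayoutEnumerator enumerator = new LayoutEnumerator(doc);

        List<string> result = new List<string>();
        Run previous = null;
        float previousY = 0;
        int previousPage = 0;

        for (int i = 0; i < runs.Length; i++)
        {
            Run run = (Run)runs[i];
            object entity = collector.GetEntity(starts[i]);
            if (entity == null)
                continue; // no layout position is available for this marker

            enumerator.Current = entity;
            float y = enumerator.Rectangle.Y;
            int page = enumerator.PageIndex;

            // A new paragraph, a new page or a different Y suggests the previous Run ended a line.
            if (previous != null &&
                (run.ParentParagraph != previous.ParentParagraph ||
                 page != previousPage ||
                 Math.Abs(y - previousY) > YTolerance))
            {
                result.Add(previous.Text.Trim());
            }

            previous = run;
            previousY = y;
            previousPage = page;
        }

        // The last Run that was mapped always ends a line.
        if (previous != null)
            result.Add(previous.Text.Trim());

        // Remove the temporary bookmarks again.
        foreach (Node node in temporaryNodes)
            node.Remove();

        return result;
    }
}
```

Runs that contain only whitespace will produce empty entries, so you may want to filter those out or skip such Runs when choosing the last word of a line.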