Get Position Coordinates of Word Document Elements (Paragraphs, Images, Tables, etc.) using Java API

Hello,
we are building a converter that produces Word documents with Aspose.Words, and the output must match the format of the input document as closely as possible. This means we need to cross-check the positions of the generated items (paragraphs, images, tables) while building the document, so that we know immediately whether the conversion is on the right track.
We are of course aware that the Word format is flowed and non-positional (unlike image and PDF outputs), and that the final rendering is up to the Word application. However, looking at the documentation (and some information found on this forum), we found that LayoutCollector/LayoutEnumerator may help here.
We have built a small Java program to check the reliability of the coordinates returned by the Aspose.Words layout engine, which according to your information should match any positional output such as PDF. The test program writes a log of the coordinate checks, and from it we have found some issues, especially with the coordinates returned for items that cross a page boundary or sit near the end of a page.
The test program generates two very simple examples: a document with one huge paragraph spanning 3 pages, and the same 3-page document made of paragraphs of at most 6 lines each.
The log reports the Y coordinate, the line number and the generated page number.
In both cases we have found that:

  1. the coordinate of the first line is correct on page 1, but after a page change (from page 2 onward) it no longer starts at the first available row; it seems to be offset by an additional line (for an unknown reason).
  2. in the huge paragraph example, the last line of page 2 is reported as written on page 2 because there is enough space, while in Word (and in the output PDF) it is written on page 3.
  3. in the 6-line paragraphs example, at the beginning of page 2 the paragraph line position is inconsistent (as explained in point 1); in addition, it returns either the same value for different lines, or values that grow or shrink when the position of the same line is requested at different times.
    In any case, the generated PDF output matches 100% the Word output rendered by Microsoft Word, while the layout values do not match the PDF result, contrary to what has been asserted many times on this forum.

So could you please help us understand whether the above are layout issues to be fixed, or whether we are getting the Y coordinates in the wrong way?
Attached is the Java test program, which you can run directly as it doesn't require any additional external resource (just adjust the output files directory and the license file location).
Many thanks in advance.

TestAsposeGetRectangle.zip (1.3 KB)

Aspose.Words version used: 19.7

@renato.mauro,

We have managed to reproduce the same behavior on our end and have logged your use case in our issue tracking system. Your ticket number is WORDSNET-19725. We will look further into the details of this problem and will keep you updated on the status of the linked issue. We apologize for any inconvenience.

@renato.mauro,

Regarding WORDSNET-19725, we have completed the work on your issue and will most likely close it as Not a Bug. Please see the analysis details below:

All these issues would be resolved if you try to change your code as follows:

UpdateDocumentCurrentY();

// Save current document as is
doc.save("E:\\Temp\\line_" + i + ".xps");

String sCurrY = String.format("%.6f", m_docCurrY);
System.out.println("CurLine = " + i + ": CurrY = " + sCurrY + " CurrPg = " + m_nCurrentPageNumber);

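Essentially, this is the paragraph pagination rules at work, specifically the Widow/Orphan control rule: a paragraph is not allowed to leave a single line stranded alone on a page. Hope this helps.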
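
To illustrate why the widow rule moves the last line of page 2 onto page 3 in the huge paragraph example, here is a minimal sketch in plain Java. It is a simplified model and not the Aspose.Words or Microsoft Word layout algorithm: the class `WidowOrphanSketch`, the method `paginate`, and the figures of 81 lines and 40 lines per page are all hypothetical, chosen only for illustration. It covers only the widow half of the rule for a single paragraph that starts at the top of page 1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of widow control for ONE paragraph.
// Not the real Aspose.Words / Word layout engine.
class WidowOrphanSketch {

    /**
     * Returns the 1-based page number of every line of a paragraph
     * that starts at the top of page 1.
     */
    static List<Integer> paginate(int totalLines, int linesPerPage, boolean widowControl) {
        List<Integer> pageOfLine = new ArrayList<>();
        int page = 1;
        int remaining = totalLines;
        while (remaining > 0) {
            // Fill the page as far as the raw space allows.
            int take = Math.min(remaining, linesPerPage);
            int left = remaining - take;
            // Widow rule: if exactly one line would spill onto the next page,
            // hold one more line back so that the next page gets two lines.
            // Only do so if this page still keeps at least two lines itself.
            if (widowControl && left == 1 && take > 2) {
                take--;
            }
            for (int i = 0; i < take; i++) {
                pageOfLine.add(page);
            }
            remaining -= take;
            page++;
        }
        return pageOfLine;
    }

    public static void main(String[] args) {
        int totalLines = 81;   // assumed paragraph length
        int linesPerPage = 40; // assumed page capacity
        List<Integer> raw = paginate(totalLines, linesPerPage, false);
        List<Integer> controlled = paginate(totalLines, linesPerPage, true);
        // Line 80 is the 40th line of page 2 by raw space alone.
        System.out.println("Line 80 without widow control: page " + raw.get(79));
        System.out.println("Line 80 with widow control:    page " + controlled.get(79));
    }
}
```

In this model, line 80 fits on page 2 when only the free space is considered, but with widow control it is pushed to page 3 so that line 81 does not sit there alone. That is the same pattern as point 2 above: a position computed from available space says "page 2", while the final layout says "page 3".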