How can I get the meta-information of a bookmark?

There is a bookmark in our Word document, and the input is the bookmark name.

I need to get meta-information about the bookmark, such as the page it is on, its x- and y-position, its width and height, etc.

I can use the following code to get the bookmark:

Bookmark bookmark = doc.getRange().getBookmarks().get(bookmarkName);

How can I then get this meta-information for the bookmark?

@zwei You can use the LayoutCollector and LayoutEnumerator classes to get layout information for a bookmark or any other node in the document. For example, see the following code:

import com.aspose.words.*;
import java.awt.geom.Rectangle2D;

Document doc = new Document("C:\\Temp\\in.docx");
LayoutCollector collector = new LayoutCollector(doc);
LayoutEnumerator enumerator = new LayoutEnumerator(doc);

// Get layout information of the bookmarks.
for (Bookmark bk : doc.getRange().getBookmarks())
{
    System.out.println("Bookmark name: " + bk.getName());
    System.out.println("Bookmark start page: " + collector.getStartPageIndex(bk.getBookmarkStart()));
    System.out.println("Bookmark end page: " + collector.getEndPageIndex(bk.getBookmarkEnd()));

    // Use LayoutEnumerator to calculate the rectangle occupied by the bookmark.
    // The code is simplified and handles only the case where the whole bookmark is on one page.
    enumerator.setCurrent(collector.getEntity(bk.getBookmarkStart()));

    // Print bounding box of the BookmarkStart
    Rectangle2D rect = enumerator.getRectangle();
    System.out.println("Page:" + enumerator.getPageIndex() + " X=" + rect.getX() + "; Y=" + rect.getY() + "; Width=" + rect.getWidth() + "; Height=" + rect.getHeight());

    // Do the same for the bookmark end.
    enumerator.setCurrent(collector.getEntity(bk.getBookmarkEnd()));
    rect = enumerator.getRectangle();
    System.out.println("Page:" + enumerator.getPageIndex() + " X=" + rect.getX() + "; Y=" + rect.getY() + "; Width=" + rect.getWidth() + "; Height=" + rect.getHeight());

    // If you need the bounding box of the whole area occupied by the bookmark,
    // you can calculate it as the union of the start and end rectangles,
    // provided the bookmark start and end are on the same page and in the same text column.
    // Otherwise, additional logic is required to calculate the bounding box
    // of a bookmark that spans different pages or text columns.

    System.out.println("=============================");
}
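
The union approach described in the comments above can be sketched with java.awt.geom alone. This is a minimal sketch, not Aspose.Words API: the class BookmarkBounds and the method unionOnSamePage are hypothetical names, and the page indices and rectangles are assumed to come from LayoutEnumerator as in the snippet above. It handles only the case where start and end are on the same page. It cannot detect text columns, so the caller has to make sure both rectangles are in the same column.

```java
import java.awt.geom.Rectangle2D;
import java.util.Optional;

// Hypothetical helper (not part of Aspose.Words).
final class BookmarkBounds {

    // Returns the union of the start and end rectangles when both are on the
    // same page, or an empty Optional when the bookmark spans several pages.
    // Assumption: both rectangles use the same unit and the same page origin.
    static Optional<Rectangle2D> unionOnSamePage(int startPage, Rectangle2D start,
                                                 int endPage, Rectangle2D end) {
        if (startPage != endPage) {
            return Optional.empty();
        }
        return Optional.of(start.createUnion(end));
    }

    public static void main(String[] args) {
        // Illustrative values only; real values would come from LayoutEnumerator.
        Rectangle2D start = new Rectangle2D.Double(70, 550, 0, 14);
        Rectangle2D end = new Rectangle2D.Double(190, 550, 0, 14);

        Optional<Rectangle2D> box = unionOnSamePage(1, start, 1, end);
        System.out.println(box.map(Object::toString).orElse("bookmark spans several pages"));
    }
}
```

For a bookmark that wraps over several lines, this union only covers the horizontal range between the start and end markers, so the lines in between may extend beyond it.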

Thank you very much!

The width/height results from Aspose are very good; however, the X and Y positions deviate from what we expect.

The test file is this one: AndyField2.docx (494,9 KB)

The position Aspose reports for the bookmark “SIGN_Geschaeftsfuehrer” is X=70.9000015258789; Y=556.8989868164062

The position Aspose reports for the bookmark “SIGN_Bereichsleiter” is X=311.8999938964844; Y=556.8989868164062

However, the expected position of “SIGN_Geschaeftsfuehrer” is x=68, y=520, and the expected position of “SIGN_Bereichsleiter” is x=305, y=520
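
To make the size of the deviation concrete, the differences between the reported and expected values quoted above can be computed directly. This is only arithmetic on the numbers in this post. PositionDeviation is a hypothetical name, and the sketch assumes that both sets of values use the same unit and the same page origin, which this thread does not confirm.

```java
import java.util.Locale;

// Hypothetical helper for comparing reported and expected positions.
final class PositionDeviation {

    // Returns {reportedX - expectedX, reportedY - expectedY}.
    static double[] deviation(double reportedX, double reportedY,
                              double expectedX, double expectedY) {
        return new double[] { reportedX - expectedX, reportedY - expectedY };
    }

    public static void main(String[] args) {
        double[] gf = deviation(70.9000015258789, 556.8989868164062, 68, 520);
        double[] bl = deviation(311.8999938964844, 556.8989868164062, 305, 520);

        // Locale.ROOT keeps the decimal separator independent of the machine locale.
        System.out.printf(Locale.ROOT, "SIGN_Geschaeftsfuehrer: dX=%.1f dY=%.1f%n", gf[0], gf[1]);
        System.out.printf(Locale.ROOT, "SIGN_Bereichsleiter:    dX=%.1f dY=%.1f%n", bl[0], bl[1]);
    }
}
```

The Y deviation is the same for both bookmarks (about 36.9), while the X deviation differs (about 2.9 and 6.9).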

@zwei As far as I can see, the coordinates determined by Aspose.Words are correct. For testing purposes, I drew a rectangle at the detected coordinates:

import com.aspose.words.*;
import java.awt.Color;
import java.awt.geom.Rectangle2D;

Document doc = new Document("C:\\Temp\\in.docx");
LayoutCollector collector = new LayoutCollector(doc);
LayoutEnumerator enumerator = new LayoutEnumerator(doc);

// Get layout information of the bookmarks.
for (Bookmark bk : doc.getRange().getBookmarks())
{
    System.out.println("Bookmark name: " + bk.getName());
    System.out.println("Bookmark start page: " + collector.getStartPageIndex(bk.getBookmarkStart()));
    System.out.println("Bookmark end page: " + collector.getEndPageIndex(bk.getBookmarkEnd()));

    // Use LayoutEnumerator to calculate the rectangle occupied by the bookmark.
    // The code is simplified and handles only the case where the whole bookmark is on one page.
    enumerator.setCurrent(collector.getEntity(bk.getBookmarkStart()));

    // Print bounding box of the BookmarkStart
    Rectangle2D startRect = enumerator.getRectangle();
    System.out.println("Page:" + enumerator.getPageIndex() + " X=" + startRect.getX() + "; Y=" + startRect.getY() + "; Width=" + startRect.getWidth() + "; Height=" + startRect.getHeight());

    // Do the same for the bookmark end.
    enumerator.setCurrent(collector.getEntity(bk.getBookmarkEnd()));
    Rectangle2D endRect = enumerator.getRectangle();
    System.out.println("Page:" + enumerator.getPageIndex() + " X=" + endRect.getX() + "; Y=" + endRect.getY() + "; Width=" + endRect.getWidth() + "; Height=" + endRect.getHeight());

    // Bounding box of the bookmark: union of the start and end rectangles.
    Rectangle2D resultRect = startRect.createUnion(endRect);
    // For testing purposes draw a red rectangle at the detected coordinates.
    Shape rect = new Shape(doc, ShapeType.RECTANGLE);
    rect.getStroke().setColor(Color.RED);
    rect.getStroke().setWeight(2);
    rect.setFilled(false);
    rect.setWrapType(WrapType.NONE);
    rect.setRelativeHorizontalPosition(RelativeHorizontalPosition.PAGE);
    rect.setRelativeVerticalPosition(RelativeVerticalPosition.PAGE);
    rect.setLeft(resultRect.getX());
    rect.setTop(resultRect.getY());
    rect.setWidth(resultRect.getWidth());
    rect.setHeight(resultRect.getHeight());
    // Since there is only one page, add the test shape in the first paragraph.
    doc.getFirstSection().getBody().getFirstParagraph().appendChild(rect);

    System.out.println("=============================");
}

doc.save("C:\\temp\\out.docx");

out.docx (414.0 KB)

If you open the out.docx file, you will find that the positions of the two rectangles are completely wrong. The two rectangles should be at “Signatur”, but that is not where Aspose places them: aspose-pos-result.JPG (104,7 KB)

However, if I switch MS Word to “Compatibility Mode”, out.docx is displayed correctly: aspose-pos-compatible.JPG (114,7 KB)

My company must work with AdobeSign and DocuSign for online signatures, and we must send the X and Y positions of the signature bookmarks to them. I hope you can understand that accurate positions are mission-critical for us.

@zwei Aspose.Words returns the correct X and Y positions when the document is viewed in Print Layout. You can observe the same correct positions if you save the output document as PDF:
out.pdf (90.6 KB)

But in the protected mode of MS Word, there is a serious error: aspose-result-2.JPG (38,0 KB)

We will now test the Aspose results together with AdobeSign/DocuSign. If AdobeSign/DocuSign work fine with the Aspose results, we will leave it as it is. Otherwise, it will be a mission-critical problem.

However, this error disappears in MS Word Compatibility Mode: aspose-result-3.JPG (76,0 KB)

This is the Aspose result file: out2.docx (19,4 KB)

If you open it in MS Word “Protected View”, the displayed result contains a serious error.

@zwei The Aspose.Words layout engine uses the rules that apply to a document in Print Layout. Read and Web Layout use different rules.

We will test Aspose with AdobeSign/DocuSign, and I will keep you informed of the test results.

Unfortunately, our live test of Aspose with AdobeSign/DocuSign failed.

Here is an example from the DocuSign live test: asposeTest.JPG (44,9 KB)

We take the X and Y positions generated by Aspose, then send the Docx file and the X/Y positions to the DocuSign cloud service, and the result is not good. Note that in this test our Docx has only one page; in reality, our documents usually have more than 100 pages, and the signature is always placed on the last page. In that case, the signature position will be totally unacceptable.

This is the docx file used for the DocuSign live test: AndyField2.docx (494,9 KB)

Here is another Aspose/DocuSign live test, and the result is totally unacceptable:
aspose-result-4.JPG (42,6 KB)

Here is the docx file for this test; the signature is placed on the last page: Beispielvertrag_klein.docx (27,3 KB)

Accurate X and Y positions of signatures are a mission-critical requirement for our company. Is it possible for Aspose to improve this feature in a short time? So far, the Aspose evaluation edition has passed almost every test; the only problem is the X and Y positions of signatures.

@zwei The problem might occur because the fonts used in your document are not available in the environment where you process the document.
As you may know, MS Word documents are flow documents and do not contain any information about document layout. Consumer applications, such as MS Word or OpenOffice, build the document layout on the fly. Aspose.Words uses its own layout engine to build the document layout when rendering the document to fixed-page formats (PDF, XPS, images, etc.). The same layout engine provides document layout information through the LayoutCollector and LayoutEnumerator classes.
To build the proper document layout, the fonts used in the original document are required. If Aspose.Words cannot find the fonts used in the document, the fonts are substituted. This might lead to layout differences (incorrect coordinates returned by LayoutEnumerator), since the substitute fonts might have different font metrics. You can implement IWarningCallback to get a notification when font substitution is performed.

The Java implementation runs on the same machine where MS Word is installed, so I don’t think fonts are the problem.

@zwei Unfortunately, I cannot reproduce the problem on my side. Here is the output produced on my side:
out.pdf (61.5 KB)
As you can see, the rectangles are drawn at the correct coordinates. Please try using IWarningCallback and check whether fonts are substituted. Also, please try rendering the document to PDF using Aspose.Words on your side and check whether the output document layout matches what you see in MS Word.

PDF generation with Aspose works fine; we have tested it. However, what we are testing here is whether Aspose can work together with AdobeSign/DocuSign.

To reproduce our scenario, you need access to AdobeSign and DocuSign.

We send our Docx file and the x/y positions of the signatures generated by Aspose to the AdobeSign/DocuSign API, and that produces our test result.
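
When coordinates are handed to another service, a mismatch in unit or page origin is one thing worth ruling out. The LayoutEnumerator rectangle is given in points, relative to the top-left corner of the page. The sketch below shows two common conversions: points to pixels at a given DPI, and top-left origin to bottom-left origin. It is a hedged illustration only. SignatureCoordinates and its methods are hypothetical names, and this thread does not state which convention AdobeSign or DocuSign expects, so that would need to be checked against their API documentation.

```java
import java.util.Locale;

// Hypothetical conversion helpers (not part of Aspose.Words or any signing API).
final class SignatureCoordinates {

    static final double POINTS_PER_INCH = 72.0;

    // Converts a length in points to pixels at the given resolution.
    static double pointsToPixels(double points, double dpi) {
        return points * dpi / POINTS_PER_INCH;
    }

    // Converts a top-left based Y (distance from the page top to the box top)
    // into the distance from the page bottom to the bottom edge of the box.
    // All arguments must use the same unit.
    static double bottomEdgeFromPageBottom(double yFromTop, double boxHeight, double pageHeight) {
        return pageHeight - yFromTop - boxHeight;
    }

    public static void main(String[] args) {
        // Illustrative values only: a box on a page that is 842 points high.
        double yFromTop = 556.9;
        double boxHeight = 13.8;
        double pageHeight = 842.0;

        System.out.printf(Locale.ROOT, "Y in pixels at 96 DPI: %.2f%n",
                pointsToPixels(yFromTop, 96.0));
        System.out.printf(Locale.ROOT, "Bottom edge from page bottom (points): %.2f%n",
                bottomEdgeFromPageBottom(yFromTop, boxHeight, pageHeight));
    }
}
```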

@zwei LayoutCollector and LayoutEnumerator use the same layout engine that Aspose.Words uses for rendering documents to PDF and other fixed-page formats. Also, as you can see in the demo document, Aspose.Words returns the correct coordinates of objects in your source document. So I suppose the bug is in AdobeSign and DocuSign. Can you pass a PDF document to these tools?