Differentiation between Comments and Reference links

LeahGreen · December 30, 2024, 8:34am

Hello Everyone and @alexey.noskov,
In my Java code I remove comments from my document.

When converting Word to PDF using Aspose.Words:

com.aspose.words.NodeCollection comments = (com.aspose.words.NodeCollection)doc.getChildNodes(NodeType.COMMENT, true);
if (comments != null) {
     comments.clear();
}

And when converting PDF to PDF using Aspose.PDF:

com.aspose.pdf.PageCollection pageCollection = document.getPages();
int index;
PdfAnnotationEditor annotationEditor = new PdfAnnotationEditor();
annotationEditor.bindPdf(document);
for (Page page : pageCollection) {
	for (index = 1; index <= page.getAnnotations().size(); index++) {
		String x = page.getAnnotations().get_Item(index).toString();
		if(!x.contains("pdf.WidgetAnnotation")) {
			annotationEditor.deleteAnnotation(page.getAnnotations().get_Item(index).getName());
		}
	}
}

Removing the comments/annotations also removes the reference links (links within the document) from my document. This is unwanted behavior.
Is there a way to differentiate between comments (which I want to remove) and reference links (which should remain)?

Thanks in advance!

alexey.noskov · December 30, 2024, 8:44am

@LeahGreen I am afraid your question is not clear enough. Could you please attach your input, output, and expected documents here for our reference? We will check the issue and provide you with more information.

LeahGreen · December 30, 2024, 9:16am

Thanks @alexey.noskov,
I am attaching an example document here.

In the title, "Hello this is an example document", you can see I added a comment on the word “Is”.
This is the kind of comment I would like to remove from my documents when I generate them to PDF.
In the text “Please refer to the Conclusion for more details.”, if you Ctrl + click the word “Conclusion”, a cross link takes you to the conclusion at the bottom of the page. This is the kind of cross link I do not want to remove from my documents when generating to PDF.

In the previous message I shared the code that removes the comments from my documents.
Currently both comments and cross links are being removed.
Is there a way to remove only comments and not cross links?

I hope the question is clearer now.
Thank you!

Example doc.docx (18.9 KB)

alexey.noskov · December 30, 2024, 9:19am

@LeahGreen The following code works fine:

Document doc = new Document("C:\\Temp\\in.docx");
doc.getChildNodes(NodeType.COMMENT, true).clear();
doc.save("C:\\Temp\\out.pdf");

out.pdf (20.6 KB)

LeahGreen · December 30, 2024, 9:54am

@alexey.noskov
Ok, thanks. I will see what has to be changed by me and test it.

What about PDF to PDF?
This is the code that I am using:

 com.aspose.pdf.PageCollection pageCollection = document.getPages();
 int index;
 PdfAnnotationEditor annotationEditor = new PdfAnnotationEditor();
 annotationEditor.bindPdf(document);
 for (Page page : pageCollection) {
 	for (index = 1; index <= page.getAnnotations().size(); index++) {
 		String x = page.getAnnotations().get_Item(index).toString();
 		if(!x.contains("pdf.WidgetAnnotation")) {
 			annotationEditor.deleteAnnotation(page.getAnnotations().get_Item(index).getName());
 		}
 	}
 }

But it removes both comments and cross links.

I have attached a PDF doc as an example, with a comment and cross links.
Thanks again.

LeahGreen · December 30, 2024, 10:38am

@alexey.noskov any idea for me?

alexey.noskov · December 30, 2024, 11:20am

@LeahGreen The question is not related to Aspose.Words. I will move it to the Aspose.PDF forum. My colleagues will help you shortly.

LeahGreen · December 30, 2024, 11:44am

Thank you @alexey.noskov.

LeahGreen · December 30, 2024, 4:28pm

Hello Aspose.PDF team.
Do you have an answer for me?

asad.ali · December 30, 2024, 8:30pm

@LeahGreen

To remove only the text annotations and leave the link annotations intact, you can do something like the following:

try (com.aspose.pdf.Document doc = new com.aspose.pdf.Document(dataDir + "test_document_hyperlinks_bookmarks.pdf")) {
    // Iterate through pages in the document
    for (Page page : doc.getPages()) {
        // Iterate through annotations on the page
        for (com.aspose.pdf.Annotation annot : page.getAnnotations()) {
            // Skip LinkAnnotations
            if (annot instanceof LinkAnnotation) {
                continue;
            }
            // Otherwise this annotation is a candidate for deletion (the delete call is left out of this snippet)
        }
    }
    // Save the updated document
    doc.save(dataDir + "flattened.pdf");
}
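
The snippet above leaves the deletion step as a comment. As a rough sketch of how that step might be structured, the example below uses small stand-in classes and plain Java lists. `Annotation`, `Link`, `Note`, and `removeNonLinks` are hypothetical names chosen for illustration; they are not Aspose.PDF types. The sketch collects the non-link annotations first and removes them after the loop, so the collection is never modified while it is being iterated.

```java
import java.util.ArrayList;
import java.util.List;

class AnnotationFilterSketch {
    // Hypothetical stand-ins for annotation types; these are NOT Aspose.PDF classes.
    static class Annotation {
        final String name;
        Annotation(String name) { this.name = name; }
    }
    static class Link extends Annotation {
        Link(String name) { super(name); }
    }
    static class Note extends Annotation {
        Note(String name) { super(name); }
    }

    // Removes every annotation that is not a Link and returns the number removed.
    static int removeNonLinks(List<Annotation> annotations) {
        List<Annotation> toDelete = new ArrayList<>();
        for (Annotation annot : annotations) {
            if (annot instanceof Link) {
                continue; // keep cross links
            }
            toDelete.add(annot);
        }
        // Delete after the loop so the iteration itself is not disturbed.
        annotations.removeAll(toDelete);
        return toDelete.size();
    }

    public static void main(String[] args) {
        List<Annotation> page = new ArrayList<>();
        page.add(new Note("comment-1"));
        page.add(new Link("link-1"));
        page.add(new Note("comment-2"));

        int removed = removeNonLinks(page);
        System.out.println("removed=" + removed + ", remaining=" + page.size());
    }
}
```

With the real library, the same shape would apply per page: gather the annotations to drop while looping over `page.getAnnotations()`, then delete them afterwards with whichever deletion call your Aspose.PDF version provides.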

If you still face any issues, please share a sample PDF file so that we can proceed accordingly.

LeahGreen · December 31, 2024, 8:11am

Hi @asad.ali
Thank you so much for your response.
I tried the code you suggested. It works great for the cross references: they are no longer deleted during generation.
At the same time, the comments also remain, while I want them to be removed.

I am attaching a PDF document with one comment and one reference link. Please guide me on how to remove the comment and keep the cross-reference link.
Thanks again.
Example doc.pdf (40.9 KB)

LeahGreen · December 31, 2024, 12:56pm

Hi @asad.ali
any comment?

asad.ali · December 31, 2024, 10:15pm

@LeahGreen

We checked your document and it appears to contain two annotations, both of which are LinkAnnotation. Therefore, the API cannot detect any comment/TextAnnotation to delete. We believe that the engine you are using to convert the source Word file into PDF is not writing the comments as text annotations in the converted PDF documents.

Do you have many such documents, or is this the only instance? We can log an investigation ticket in our issue management system to analyze the feasibility of differentiating between comments and links in such files. Please provide your feedback so that we can proceed accordingly.
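
A finding like "both annotations are LinkAnnotation" can be checked by tallying the annotations on a page by their runtime class. The sketch below is a minimal, library-independent illustration. `tallyByType`, `Link`, and `Note` are hypothetical names introduced here, and the stand-in classes are not Aspose.PDF types.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class AnnotationTallySketch {
    // Hypothetical stand-ins for annotation types; these are NOT Aspose.PDF classes.
    static class Link {}
    static class Note {}

    // Counts the items by the simple name of their runtime class.
    static Map<String, Integer> tallyByType(Iterable<?> items) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Object item : items) {
            counts.merge(item.getClass().getSimpleName(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // A page holding two link-like entries and no note-like entry.
        List<Object> page = Arrays.asList(new Link(), new Link());
        System.out.println(tallyByType(page)); // prints {Link=2}
    }
}
```

Because the loops in this thread iterate `page.getAnnotations()` with a for-each, the same helper could presumably be pointed at that collection. A result containing only link entries would match the finding above and would explain why a type-based filter cannot separate the comment from the cross reference in this file.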

LeahGreen · January 1, 2025, 8:15am

Hi @asad.ali,
I used Word to create the document, added a comment and a cross reference, and finally used Export to PDF.
This is a scenario that our customers use very frequently, and the issue has come up over and over.

I would truly appreciate your investigation and an added ability to distinguish comments from links.

I attached screenshots of the Word options that I used.
Screenshot 2025-01-01 095711.png (15.3 KB)
Screenshot 2025-01-01 095933.png (16.3 KB)

Thanks again.

asad.ali · January 1, 2025, 9:04pm

@LeahGreen

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFJAVA-44623

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

LeahGreen · January 9, 2025, 8:44am

Hi @asad.ali
Do you have a clear root cause for this bug so we can communicate it to our customers?

Also, is there a known-issue article on the Aspose site where the issue is explained?

Thanks!

asad.ali · January 9, 2025, 8:45pm

@LeahGreen

From our initial investigation, it looks like the Word document is converted this way, i.e., annotations are added differently in the exported PDF document. The ticket is pending further analysis and will be investigated and resolved on a first come, first served basis. As soon as we make some progress, we will inform you. Please be patient and give us some time.

We are sorry for the inconvenience.