How do we identify comments in a PDF in Aspose.PDF for Java

How do we identify comments in the text of a PDF?

The following code snippet extracts the comment text along with the rest of the PDF text, which is something we do not want:

License license = new License();
license.setLicense("C:\\Aspose.Total.Java.lic");
com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document("D:/Hhh.pdf");
TextFragmentAbsorber tfa = new TextFragmentAbsorber();
pdfDocument.getPages().accept(tfa);
TextFragmentCollection tfc = tfa.getTextFragments();
TextFragment tf;
String extractedText;
// Prints every text fragment on every page, including the comment text.
for (Iterator iterator = tfc.iterator(); iterator.hasNext();) {
    tf = (TextFragment) iterator.next();
    extractedText = tf.getText();
    System.out.println(extractedText);
}

The following code does not extract the comment:

IEnumerator<Page> ien = pdfDocument.getPages().iterator();
Page page;
while (ien.hasNext())
{
    page = ien.next();
    AnnotationCollection ac = page.getAnnotations();
    IEnumerator iea = ac.iterator();
    // The following condition is never met
    while (iea.hasNext())
    {
        // I am expecting comment object here.
        // But the execution never reaches this block although
        // there are comments in the pdf.
    }
}

Hi Sujith,


Thanks for your inquiry. You can get the text inside an annotation's rectangle as follows. Hopefully it will help you accomplish the task. However, if the issue persists, please share your sample PDF document here and we will look into it and guide you accordingly.

com.aspose.pdf.Document document = new com.aspose.pdf.Document("input.pdf");
com.aspose.pdf.Page page = document.getPages().get_Item(1);
AnnotationCollection annots = page.getAnnotations();
for (int j = 1; j <= annots.size(); j++) {
    com.aspose.pdf.Annotation annot = annots.get_Item(j);
    com.aspose.pdf.Rectangle rect = annot.getRect();
    if (annot.getName() != null) {
        TextAbsorber absorber = new TextAbsorber();
        absorber.getTextSearchOptions().setLimitToPageBounds(true);
        absorber.getTextSearchOptions().setRectangle(rect);
        page.accept(absorber);
        String text = absorber.getText();
        System.out.println(text);
    }
}


We are sorry for the inconvenience caused.

Best Regards,

Hello Tillal,

This does not work.

The following line does not compile.

absorber.getTextSearchOptions()

We have a license for aspose.pdf-4.1.2.jar. I have also tried with the latest version (aspose.pdf-11.2.0.jar). It does not compile with either.

I don't need the lines that fail to compile in order to extract comments, so I commented them out. Even then, the code does not extract the comments as needed: the size of the AnnotationCollection annots is zero.

I am attaching the sample PDF (WC500107886.pdf).

Regards

Sujith Babu

Hi Sujith,


Thanks for your feedback. The code compiles without any issue at my end. However, I have checked your shared document: the comments are not stored as annotations, so the code will not work for your sample document. You may try the following documentation link to extract text from a specific page area.
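
To illustrate the idea of limiting extraction to a page area, here is a minimal plain-Java sketch. It does not use the Aspose API: the Fragment and Region classes, the textInside method, and the coordinates are hypothetical stand-ins for text fragments with known positions and for a rectangle that covers only the body text. It assumes you already know where on the page the comment text sits.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RegionTextFilter {

    /** Hypothetical stand-in for a piece of page text and its lower-left position. */
    static final class Fragment {
        final String text;
        final double x;
        final double y;

        Fragment(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Hypothetical axis-aligned page area (lower-left and upper-right corners). */
    static final class Region {
        final double llx;
        final double lly;
        final double urx;
        final double ury;

        Region(double llx, double lly, double urx, double ury) {
            this.llx = llx;
            this.lly = lly;
            this.urx = urx;
            this.ury = ury;
        }

        /** True when the point lies inside the region or on its border. */
        boolean contains(double x, double y) {
            return x >= llx && x <= urx && y >= lly && y <= ury;
        }
    }

    /** Returns the text of every fragment positioned inside the region, in input order. */
    static List<String> textInside(List<Fragment> fragments, Region region) {
        List<String> kept = new ArrayList<>();
        for (Fragment f : fragments) {
            if (region.contains(f.x, f.y)) {
                kept.add(f.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up positions: the comment sits outside the body area.
        List<Fragment> fragments = Arrays.asList(
                new Fragment("Body text", 72, 500),
                new Fragment("Reviewer comment", 450, 700));
        Region bodyArea = new Region(50, 100, 400, 600);
        System.out.println(textInside(fragments, bodyArea)); // prints [Body text]
    }
}
```

In the snippet posted earlier in this thread, the same restriction is what the rectangle passed to setRectangle on the absorber's text search options is meant to achieve.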


Please feel free to contact us for any further assistance.

Best Regards,

Hello Tillal,

This helps.
Thanks
Sujith Babu

Hi Sujith,


Thanks for your feedback. It is good to know that you have managed to accomplish your requirements.

Please keep using our API and feel free to contact us for any further assistance. We will be more than happy to extend our support.

Best Regards,