How do we identify comments in PDF text?
The following code snippet also extracts the comments in the PDF, which is something we do not need:
License license = new License();
license.setLicense("C:\\Aspose.Total.Java.lic");
com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document("D:/Hhh.pdf");
// Collect every text fragment on every page
TextFragmentAbsorber tfa = new TextFragmentAbsorber();
pdfDocument.getPages().accept(tfa);
TextFragmentCollection tfc = tfa.getTextFragments();
for (Iterator iterator = tfc.iterator(); iterator.hasNext(); )
{
    TextFragment tf = (TextFragment) iterator.next();
    String extractedTest = tf.getText();
    System.out.println(extractedTest);
}
The following code does not extract the comments:
IEnumerator<Page> ien = pdfDocument.getPages().iterator();
AnnotationCollection ac = page.getAnnotations();
IEnumerator iea = ac.iterator();
// The following condition is never met
// I am expecting comment object here.
// But the execution never reaches this block although
// there are comments in the pdf.
Hi Sujith,
Thanks for your inquiry. You can get the text from an annotation's rectangle as follows. Hopefully this will help you accomplish the task. However, if the issue persists, please share your sample PDF document here; we will look into it and guide you accordingly.
com.aspose.pdf.Document document = new com.aspose.pdf.Document("input.pdf");
com.aspose.pdf.Page page = document.getPages().get_Item(1);
AnnotationCollection annots = page.getAnnotations();
for (int j = 1; j <= annots.size(); j++) {
    com.aspose.pdf.Annotation annot = annots.get_Item(j);
    com.aspose.pdf.Rectangle rect = annot.getRect();
    if (annot.getName() != null) {
        TextAbsorber absorber = new TextAbsorber();
        absorber.getTextSearchOptions().setLimitToPageBounds(true);
        absorber.getTextSearchOptions().setRectangle(rect);
        page.accept(absorber);
        String text = absorber.getText();
        System.out.println(text);
    }
}
We are sorry for the inconvenience caused.
Best Regards,
Hello Tillal,
This does not work.
The following line does not compile.
absorber.getTextSearchOptions()
We have a license for aspose.pdf-4.1.2.jar. I have also tried
the latest version (aspose.pdf-11.2.0.jar). It does not compile with either.
I don’t need the non-compiling code above to extract comments,
so I commented it out. But after that, the code still does not extract the
comments as needed: the size of the AnnotationCollection annots is zero.
I am attaching the sample PDF (WC500107886.pdf).
Regards
Sujith Babu
Hi Sujith,
Thanks for your feedback. The code compiles without any issue on my end. However, I have checked the document you shared: the comments are not stored as annotations, so the code will not work for your sample document. You may try the following documentation link to extract text from a specific page area.
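To illustrate what limiting extraction to a page area means, here is a minimal, library-independent sketch. It assumes each piece of text comes with a bounding box and keeps only the pieces that lie entirely inside a chosen region. The Box and Fragment types and the textInside method are hypothetical stand-ins written for this example; they are not Aspose.PDF classes, and the coordinates are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types for illustration; none of these are Aspose.PDF classes.
class RegionTextFilter {

    /** Axis-aligned rectangle given by its lower-left and upper-right corners. */
    static final class Box {
        final double llx, lly, urx, ury;

        Box(double llx, double lly, double urx, double ury) {
            this.llx = llx;
            this.lly = lly;
            this.urx = urx;
            this.ury = ury;
        }

        /** True when other lies entirely inside this box (edges included). */
        boolean contains(Box other) {
            return other.llx >= llx && other.lly >= lly
                && other.urx <= urx && other.ury <= ury;
        }
    }

    /** A run of text together with its bounding box on the page. */
    static final class Fragment {
        final String text;
        final Box bounds;

        Fragment(String text, Box bounds) {
            this.text = text;
            this.bounds = bounds;
        }
    }

    /** Returns, in input order, the text of fragments lying entirely inside region. */
    static List<String> textInside(List<Fragment> fragments, Box region) {
        List<String> result = new ArrayList<>();
        for (Fragment f : fragments) {
            if (region.contains(f.bounds)) {
                result.add(f.text);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample data: body text on the left, a note in the right margin.
        List<Fragment> fragments = Arrays.asList(
            new Fragment("Body text", new Box(50, 700, 300, 715)),
            new Fragment("Reviewer note", new Box(400, 700, 550, 715)));

        // Keep only what falls in the left part of the page.
        Box bodyArea = new Box(0, 0, 350, 800);
        System.out.println(textInside(fragments, bodyArea)); // prints [Body text]
    }
}
```

With a real PDF library, the region would play the role of the rectangle passed to the text search options, so that comment text located outside the chosen area is left out of the result.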
Please feel free to contact us for any further assistance.
Best Regards,
Hi Sujith,
Thanks for your feedback. It is good to know that you have managed to accomplish your requirements.
Please keep using our API and feel free to contact us for any further assistance; we will be more than happy to extend our support.
Best Regards,