I need to delete every page in an Aspose.PDF document that contains a particular text tag (<test_attachments>). I tried using TextFragmentAbsorber to find the text in the pages and then delete those pages.
However, the TextFragmentAbsorber is unable to find the text tag; the returned TextFragmentCollection is always empty.
I also tried deleting a page by a specific page number. When I checked the document, the PDF file size shows as 10, but the object has only 3 pages, which is why I get a 'com.aspose.pdf.exceptions.IndexOutOfRangeException' exception.
Please help me understand whether I am missing anything while trying to delete a page from the PDF document.
Language: Java 8
Aspose.PDF: 22.6
Here is the code:
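import java.io.ByteArrayOutputStream;
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;
import com.aspose.pdf.TextFragment;
import com.aspose.pdf.TextFragmentAbsorber;
import com.aspose.pdf.TextFragmentCollection;

private void deleteTagAndPages(ByteArrayOutputStream pdfOutputStream, SortedSet<Integer> attachmentTagLocations)
        throws Exception {
    com.aspose.pdf.Document pdfDocument = new com.aspose.pdf.Document(pdfOutputStream.toByteArray());
    TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("<test_attachments>");
    pdfDocument.getPages().accept(textFragmentAbsorber);
    TextFragmentCollection textFragmentCollection = textFragmentAbsorber.getTextFragments();
    // Collect the matching page numbers first, highest first, so that deleting
    // one page does not shift the numbers of the pages still to be deleted.
    SortedSet<Integer> pagesToDelete = new TreeSet<>(Comparator.reverseOrder());
    for (TextFragment textFragment : textFragmentCollection) {
        pagesToDelete.add(textFragment.getPage().getNumber());
    }
    for (int pageNumber : pagesToDelete) {
        log.info("Pdf attachment page to be deleted " + pageNumber);
        pdfDocument.getPages().delete(pageNumber);
    }
    //pdfDocument.getPages().delete(attachmentTagLocations.first());
}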
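To illustrate one way page deletion by number can go out of range, here is a minimal sketch that uses a plain java.util.List as a stand-in for the page collection. It does not call Aspose.PDF at all: the class PageDeleteSketch and its methods are hypothetical names for illustration, and it assumes 1-based page numbers that shift down after each deletion. Whether this is what causes the exception in my case is only a guess.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Comparator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical stand-in for a page collection; not part of Aspose.PDF.
public class PageDeleteSketch {

    // Deletes pages in the order given. Each removal shifts later pages
    // down by one, so a later page number may no longer exist.
    static List<String> deleteInGivenOrder(List<String> pages, List<Integer> pageNumbers) {
        List<String> remaining = new ArrayList<>(pages);
        for (int pageNumber : pageNumbers) {
            remaining.remove(pageNumber - 1); // 1-based page number to 0-based index
        }
        return remaining;
    }

    // Deletes pages from the highest number down, so earlier removals
    // never change the position of pages still waiting to be deleted.
    static List<String> deleteDescending(List<String> pages, Collection<Integer> pageNumbers) {
        List<String> remaining = new ArrayList<>(pages);
        SortedSet<Integer> descending = new TreeSet<>(Comparator.reverseOrder());
        descending.addAll(pageNumbers);
        for (int pageNumber : descending) {
            remaining.remove(pageNumber - 1);
        }
        return remaining;
    }

    public static void main(String[] args) {
        List<String> pages = Arrays.asList("p1", "p2", "p3");
        List<Integer> tagged = Arrays.asList(2, 3);

        try {
            deleteInGivenOrder(pages, tagged);
        } catch (IndexOutOfBoundsException e) {
            // After page 2 is removed only two pages remain, so page 3 is gone.
            System.out.println("Ascending delete failed: " + e.getMessage());
        }

        System.out.println("Descending delete left: " + deleteDescending(pages, tagged));
    }
}
```

In this sketch, deleting pages 2 and 3 in ascending order fails on the second removal, while deleting them from the highest number down leaves only the first page.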