PDF modification - delete image based on text

Hi,

We have a PDF document with multiple images in it. Some of those images are inserted along with a particular text (say 'TEST') as a label.

While reading the PDF, we can find that text separately: we get a TextFragment and can clear the text in it.

We can then delete all images separately using getResources().getImages()...

Our requirement is to delete only the particular text and delete only the image associated with it.

So how can we find out whether an image node is present after a particular text fragment?

For example, if we find the TextFragment for the particular text 'TEST', is it possible to retrieve the image inserted immediately after it? Can we get the parent paragraph of the TextFragment and delete the image from that paragraph?

Please provide a sample PDF document with text and images, and a possible workaround (using Aspose.Pdf or Aspose.Pdf.Kit) for deleting a particular image based on the text preceding it.

Thanks.

-Sonali

Hi Sonali,


Thanks for your inquiry. I am afraid Aspose.Pdf does not currently support deleting an image on the basis of text. However, we have logged a new feature request as PDFNEWJAVA-34028 in our issue tracking system. We will notify you as soon as it is implemented.

In the meantime, please share your sample document here so that we can ensure your scenario is addressed exactly in the feature implementation.

We are sorry for the inconvenience caused.

Best Regards,

Hi,

When can we expect this to be available? Is it possible to provide us with a fix before this weekend? (We are using Aspose.Pdf 4.1.) A patch would be fine instead of waiting for the next release.

Thanks.

-Sonali

Hi Sonali,

As we have only recently become aware of this issue, the development team requires a little time to investigate and figure out the approach to implement this new feature. As soon as we make some definite progress toward its implementation, we will be more than happy to update you with the status.

Hi Sonali,

Thanks for your patience. We have further investigated your requirement and would like to suggest that you can find and delete the desired images in any direction from a text pattern by comparing the coordinates of the resulting TextFragment with those of the images. Please find below a code snippet that deletes an image on page 12 that is placed below the text “Don’t let me hear the name again”, no more than 10 units away. Hopefully it will help you to accomplish the task.

import com.aspose.pdf.Document;
import com.aspose.pdf.ImagePlacement;
import com.aspose.pdf.ImagePlacementAbsorber;
import com.aspose.pdf.TextFragmentAbsorber;

// Load the source document.
Document document = new Document("Alice in Wonderland.pdf");

// Search all pages for the text that labels the image.
TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber("Don't let me hear the name again");
document.getPages().accept(textFragmentAbsorber);

// Take the first match (the collection is indexed from 1) and record its position.
double textYCoord = textFragmentAbsorber.getTextFragments().get_Item(1).getPosition().getYIndent();
int textPageNumber = textFragmentAbsorber.getTextFragments().get_Item(1).getPage().getNumber();

// Collect the placement of every image in the document.
ImagePlacementAbsorber imageAbsorber = new ImagePlacementAbsorber();
document.getPages().accept(imageAbsorber);

for (ImagePlacement imagePlacement : (Iterable<ImagePlacement>) imageAbsorber.getImagePlacements()) {
    double pictureUpperYCoord = imagePlacement.getRectangle().getURY();
    int picturePageNumber = imagePlacement.getPage().getNumber();

    // Same page, and the text's Y position is less than 10 units above the image's top edge.
    // Compare the text's Y coordinate here, not its page number.
    if (textPageNumber == picturePageNumber && textYCoord < pictureUpperYCoord + 10) {
        imagePlacement.getImage().delete();
        System.out.println("Image on page " + textPageNumber + " was deleted");
    }
}

document.save("Alice in WonderlandNew.pdf");
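The coordinate comparison itself can be isolated and checked without the library. The following is only a minimal sketch of that test in plain Java: the class name ImageProximity, the method isDirectlyBelow and the sample coordinates are hypothetical and not part of Aspose.Pdf. It assumes the default PDF coordinate system, in which Y grows upward from the bottom of the page, so an image below the text has a top edge with a smaller Y value than the text. Unlike the snippet above, it also rejects images whose top edge lies above the text.

```java
public final class ImageProximity {
    private ImageProximity() {}

    /**
     * Returns true when the image is on the same page as the text and its
     * top edge lies at or below the text, no more than maxGap units away.
     * Assumes Y grows upward from the bottom of the page.
     */
    public static boolean isDirectlyBelow(int textPage, double textY,
                                          int imagePage, double imageTopY,
                                          double maxGap) {
        if (textPage != imagePage) {
            return false;
        }
        // Positive gap: the image's top edge is below the text.
        double gap = textY - imageTopY;
        return gap >= 0 && gap <= maxGap;
    }

    public static void main(String[] args) {
        // Hypothetical coordinates, for illustration only.
        System.out.println(isDirectlyBelow(12, 400.0, 12, 395.0, 10.0)); // true: 5 units below
        System.out.println(isDirectlyBelow(12, 400.0, 12, 300.0, 10.0)); // false: 100 units below
        System.out.println(isDirectlyBelow(12, 400.0, 12, 450.0, 10.0)); // false: image top is above the text
        System.out.println(isDirectlyBelow(12, 400.0, 11, 395.0, 10.0)); // false: different page
    }
}
```

In the loop above, such a check would take textPageNumber, textYCoord, picturePageNumber and pictureUpperYCoord as its arguments. The tolerance of 10 is a value to tune for the layout of your own documents.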

Please feel free to contact us for any further assistance.

Best Regards,