Hidden Text Detection always returns true

We are currently on release 21.8 and use ….getTextState().isInvisible() to detect whether a document contains hidden text.

We are trying to upgrade to the latest Aspose release (21.11), but after the upgrade the invisible-text check returns true for every text fragment it encounters.

I have created a test case that uses one of our test files to reproduce the issue. Find the code below and the file attached to this post.

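import com.aspose.pdf.Document;
import com.aspose.pdf.Page;
import com.aspose.pdf.TextFragmentAbsorber;
import com.aspose.pdf.TextFragmentCollection;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.junit.jupiter.MockitoExtension;
import org.springframework.core.io.ClassPathResource;

import java.io.IOException;
import java.util.Iterator;

@ExtendWith(MockitoExtension.class)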
public class AsposeTest {

    private static final Logger logger = LogManager.getLogger(AsposeTest.class);

    @Test
    void testForInvisibleText() throws IOException {
        //  Load file however you please
        Document pdf = new Document(new ClassPathResource("pdf/testExtraction.pdf").getInputStream());

        Page page = pdf.getPages().get_Item(1);
        TextFragmentAbsorber textFragmentAbsorber = new com.aspose.pdf.TextFragmentAbsorber();
        page.accept(textFragmentAbsorber);
        TextFragmentCollection textFragmentCollection = textFragmentAbsorber.getTextFragments();

        int fragmentsCount = textFragmentCollection.size();
        int invisibleCount = 0;

        // Check each fragment's text state and count the ones flagged as invisible
        Iterator fragments = textFragmentCollection.iterator();
        while (fragments.hasNext()) {
            com.aspose.pdf.TextFragment fragment = (com.aspose.pdf.TextFragment) fragments.next();
            boolean invisible = fragment.getTextState().isInvisible();
            logger.info("\"{}\" is invisible? {}", fragment.getText(), invisible);
            if (invisible) {
                invisibleCount++;
            }
        }

        logger.info("Found {} instances of invisible text in {} fragments", invisibleCount, fragmentsCount);
    }
}

testExtraction.pdf (402.5 KB)

Running this test on release 21.8 finds no hidden text; on 21.11 every text fragment is reported as hidden.

@bpalkoPA

We were able to reproduce the issue in our environment. For the sake of further investigation, we have logged it as PDFJAVA-41066 in our issue management system. We will further look into its details and keep you posted with the status of its correction. Please be patient and spare us some time.

We are sorry for the inconvenience.

The issues you have found earlier (filed as PDFJAVA-41066) have been fixed in Aspose.PDF for Java 22.4.
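
For anyone who cannot upgrade right away, one option is to guard code that relies on isInvisible() with a check that the library version in use already contains the fix. The following is only a minimal sketch: AsposeVersionGuard and its methods are hypothetical names, not part of Aspose.PDF, and the version string is assumed to be supplied by the caller (for example from the build configuration). It compares version components numerically, so 21.11 sorts after 21.8. It only answers "does this version include the 22.4 fix?"; it does not model which older releases are unaffected (21.8, for instance, predates the fix but did not show the problem).

```java
import java.util.Arrays;

// Hypothetical helper, not part of Aspose.PDF: decides whether a given
// release string is at or after 22.4, the release reported to fix PDFJAVA-41066.
final class AsposeVersionGuard {

    // Release that contains the fix, according to this thread.
    private static final int[] FIXED_IN = {22, 4};

    private AsposeVersionGuard() {
    }

    // Splits "21.11" into {21, 11}; throws NumberFormatException on non-numeric parts.
    static int[] parse(String version) {
        return Arrays.stream(version.trim().split("\\."))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    // Component-wise numeric comparison; missing components count as 0.
    static int compare(int[] a, int[] b) {
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return Integer.compare(x, y);
            }
        }
        return 0;
    }

    // True when the given version is 22.4 or later.
    static boolean hasInvisibleTextFix(String version) {
        return compare(parse(version), FIXED_IN) >= 0;
    }
}
```

A caller could skip or flag the hidden-text check when AsposeVersionGuard.hasInvisibleTextFix(version) returns false and the version in use is known to be affected, such as 21.11.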