We are currently on release 21.8 and use ….getTextState().isInvisible() to determine whether a document contains hidden text.
We are trying to upgrade to the latest Aspose release (21.11), but after upgrading, the invisible-text check returns true for every text fragment it encounters.
I have created a test case that uses one of our test files to reproduce the issue. Find the code below and the file attached to this post.
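import java.io.IOException;
import java.util.Iterator;

import com.aspose.pdf.Document;
import com.aspose.pdf.Page;
import com.aspose.pdf.TextFragment;
import com.aspose.pdf.TextFragmentAbsorber;
import com.aspose.pdf.TextFragmentCollection;
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.junit.jupiter.MockitoExtension;
import org.springframework.core.io.ClassPathResource;

@ExtendWith(MockitoExtension.class)
public class AsposeTest {
    private static final Logger logger = LogManager.getLogger(AsposeTest.class);

    @Test
    void testForInvisibleText() throws IOException {
        // Load the file however you please
        Document pdf = new Document(new ClassPathResource("pdf/testExtraction.pdf").getInputStream());
        Page page = pdf.getPages().get_Item(1);

        // Collect every text fragment on the first page
        TextFragmentAbsorber textFragmentAbsorber = new TextFragmentAbsorber();
        page.accept(textFragmentAbsorber);
        TextFragmentCollection textFragmentCollection = textFragmentAbsorber.getTextFragments();
        int fragmentsCount = textFragmentCollection.size();

        // Count the fragments that the library reports as invisible
        int invisibleCount = 0;
        Iterator<?> fragments = textFragmentCollection.iterator();
        while (fragments.hasNext()) {
            TextFragment fragment = (TextFragment) fragments.next();
            logger.info("\"{}\" is invisible? {}", fragment.getText(), fragment.getTextState().isInvisible());
            if (fragment.getTextState().isInvisible()) {
                invisibleCount++;
            }
        }
        logger.info("Found {} instances of invisible text in {} fragments", invisibleCount, fragmentsCount);
    }
}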
testExtraction.pdf (402.5 KB)
Running this test on release 21.8 finds no hidden text; on 21.11, every text fragment is reported as hidden.
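
The test above only logs the count, so the difference between releases shows up in the log rather than as a failure. A minimal sketch of how the counting step could be isolated and asserted on is below. It is an illustration under assumptions, not our production code: `InvisibleTextCounter` and `countInvisible` are hypothetical names, and plain strings with a stand-in predicate replace `TextFragment` and `getTextState().isInvisible()` so the block runs without the Aspose library.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of Aspose.PDF.
public class InvisibleTextCounter {

    // Counts the items for which the supplied visibility check returns true.
    public static <T> long countInvisible(Iterable<T> fragments, Predicate<T> isInvisible) {
        long count = 0;
        for (T fragment : fragments) {
            if (isInvisible.test(fragment)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-in data: strings play the role of text fragments, and the
        // predicate plays the role of getTextState().isInvisible().
        List<String> fragments = List.of("Title", "Body text", "Footer");
        long hidden = countInvisible(fragments, f -> false);
        System.out.println("Found " + hidden + " instances of invisible text");
    }
}
```

With the real library, the predicate would presumably be `f -> f.getTextState().isInvisible()` applied to the fragments from `TextFragmentAbsorber`, and the test could assert that the count is 0 for testExtraction.pdf, which is the behavior we see on 21.8.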