Generate accessible/tagged PDFs in Java using Aspose.PDF - Tagged Table Issues

When using a screen reader such as JAWS, tagged tables are being skipped. To rule out a mistake on my side, I also tried an Aspose PDF example. In Adobe Acrobat I can see the tag structure and it looks correct, so I’m not sure what is preventing screen readers from recognizing and reading tables generated with the PDF API for Java.
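
For readers following along, "looks correct" for a table tag tree usually means the nesting described for table structure elements in the PDF specification: a Table contains TR rows (directly or inside THead/TBody/TFoot), and each TR contains TH or TD cells. The sketch below is not from this thread and does not use Aspose.PDF; the `Tag` and `TableTagCheck` classes are hypothetical names introduced only to illustrate that nesting check on tag names copied by hand from Acrobat's Tags panel. It is a simplified check (it ignores caption position and does not forbid mixing bare rows with row groups), and a tree that passes it can still be skipped by a screen reader, which is exactly the problem reported here.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, library-independent tag tree node (NOT an Aspose.PDF class).
// It only stores a structure tag name and its child tags.
final class Tag {
    final String name;
    final List<Tag> children;

    Tag(String name, Tag... children) {
        this.name = name;
        this.children = Arrays.asList(children);
    }
}

// Simplified nesting check for table tags; an illustration, not a validator.
final class TableTagCheck {

    // A Table must contain at least one row, either directly or in a row group.
    static boolean isWellFormedTable(Tag table) {
        if (!"Table".equals(table.name)) {
            return false;
        }
        boolean sawRows = false;
        for (Tag child : table.children) {
            switch (child.name) {
                case "Caption":
                    break; // optional caption, position not checked here
                case "TR":
                    if (!isWellFormedRow(child)) return false;
                    sawRows = true;
                    break;
                case "THead":
                case "TBody":
                case "TFoot":
                    if (!isWellFormedRowGroup(child)) return false;
                    sawRows = true;
                    break;
                default:
                    return false; // anything else directly under Table is rejected
            }
        }
        return sawRows;
    }

    // A row group must be non-empty and contain only well-formed rows.
    static boolean isWellFormedRowGroup(Tag group) {
        if (group.children.isEmpty()) {
            return false;
        }
        for (Tag row : group.children) {
            if (!"TR".equals(row.name) || !isWellFormedRow(row)) return false;
        }
        return true;
    }

    // A row must be non-empty and contain only TH or TD cells.
    static boolean isWellFormedRow(Tag row) {
        if (row.children.isEmpty()) {
            return false;
        }
        for (Tag cell : row.children) {
            if (!"TH".equals(cell.name) && !"TD".equals(cell.name)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Tag table = new Tag("Table",
                new Tag("THead", new Tag("TR", new Tag("TH"), new Tag("TH"))),
                new Tag("TBody", new Tag("TR", new Tag("TD"), new Tag("TD"))));
        System.out.println(isWellFormedTable(table));
    }
}
```

Because a check like this only covers tag nesting, it would not reveal differences in how the table content is attached to the tags, so it is at most a first sanity check before testing with JAWS or NVDA.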

@jtmille3

Aspose.PDF mimics the behavior of Adobe Reader and generates PDF documents according to the standards specified by Adobe. However, we will investigate why you are facing this issue and what is causing it. For this purpose, we have logged an investigation ticket, PDFJAVA-39918, in our issue tracking system. We will inform you as soon as it is resolved. Please be patient and give us some time.

Thanks @asad.ali. I’ve been working with @jtmille3 on this problem. One of the problems seems to be that the generated document passes Acrobat’s built-in accessibility checker; however, when you actually try to read it with a screen reader, the table gets skipped or most of its contents don’t get read. Using Adobe’s built-in read-aloud functionality is not enough to determine whether a document is fully accessible. I should be able to use common screen readers such as JAWS or NVDA to read the data in each cell and navigate cell by cell to read the data in other cells.

Also, @asad.ali, if you take your generated PDF, delete all of the tags and the page structure in Acrobat, and then have Acrobat auto-tag the document, the table reads correctly.

@grkrau

Thanks for sharing your findings further.

We have updated the logged ticket accordingly and will check the provided information during the investigation. We will notify you once the ticket is resolved. Please give us some time.

We are sorry for the inconvenience.

Any status update for ticket PDFJAVA-39918, @asad.ali ?

I am also experiencing the same issue in the Aspose.PDF library for .NET. My internet searching finally led me here. Tagged PDFs with tagged tables are passing accessibility validation, but screen readers like NVDA are not recognizing the content in the table. Other content within the tagged PDF (headers, paragraphs, spans, etc.) is recognized by the screen reader.

The issue can be recreated by using NVDA to read the PDF output from the example code provided here: https://docs.aspose.com/pdf/net/working-with-table-in-tagged-pdfs/. While the syntax will be different, I’m assuming the logic around tagged tables in the Java library will be quite similar to the .NET library.

Thank you!

@nettango

Your understanding is correct. The classes in the Java API have been ported from the Aspose.PDF for .NET API, and every issue that is fixed in the .NET API will become part of the Java API. We have recorded your concerns and will let you know as soon as a fix for this issue is available in Aspose.PDF for .NET as well. Please give us some time.

We are sorry for the inconvenience.

Hello again - any status update for ticket PDFJAVA-39918 , @asad.ali ? Or is there a .NET equivalent ticket I can follow?

This issue is quite easy to replicate and is holding up required features in a production environment. Any status update would be appreciated.

Thanks!

@nettango, @grkrau, @jtmille3

We didn’t find any problem with the created tags. It seems that Adobe Acrobat doesn’t allow the screen reader to read the tags. Do you have any PDF document that JAWS/NVDA can read? Could you share a document that works as you expect? That would help us with the investigation and with understanding what we can do about it.

Sure thing! I have attached 2 sample documents - one where tables are readable by NVDA and another that is not. The readable version was edited by a contact who specializes in accessibility for the visually impaired. I’m not entirely sure what the differences are, but it was related to the table container in the content panel. Both files are considered valid by the Validate() function on the Document class and by PDF Accessibility Checker (PAC) 3.

I do see what you mean about this issue possibly being isolated to how Adobe is presenting the PDF. I was able to get NVDA to read the tables while viewing tagged PDFs using Chrome, but it was not consistent. Adobe is the most common application for our users to load PDFs, so we would like to get screen readers to recognize tables from Adobe.

sample_no_read_table.pdf (118.7 KB)
sample_yes_read_table.pdf (106.1 KB)

@nettango

Thanks for sharing the requested information. We will include it in our investigation and let you know as soon as we have some results. Please give us some time.