Working with Tagged PDF documents using Aspose.PDF - Making stamped content accessible

When creating a stamp from a page in one PDF file and stamping it into a page in another PDF file, is there a method to make the stamped content accessible such that the native Adobe Reader screen reader will read the stamped content?

Thanks
Tim

@twf

Once a stamp has been added to a PDF document, you can convert it into a tagged/accessible PDF document using the following code snippet:

// Requires: using Aspose.Pdf;
Document pdf = new Document(dataDir + "attachment.pdf");
// Convert to PDF/UA-1; the conversion log is written to out.log
pdf.Convert(dataDir + "out.log", PdfFormat.PDF_UA_1, ConvertErrorAction.Delete);
pdf.Save(dataDir + "output.pdf");

In case you face any issue, please feel free to let us know.

Thanks @asad.ali. This did not work. However I have learned some more.

Firstly, the PDF document I am creating the stamp from works with the Adobe Reader screen reader function.

Secondly, if I generate a new PDF with Aspose.PDF, stamp it from the other document, and then save, the newly stamped PDF document works with the Adobe Reader screen reader function.

Thirdly, the real PDF I am stamping is created by converting a Word document to PDF using Word’s Save As PDF function. If one of the options, Document Structure Tags for Accessibility, is checked, then content stamped into the resulting PDF does not work with the Adobe Reader screen reader function. If it is unchecked, the resulting PDF does work when stamped.

It is important to have this option checked, as it gives the assistive technology important information about the flow of the document. We are dealing with local government documents available to the public. Yet checking it prevents the stamped content from being usable by the assistive technology.

How would one bring stamped content in, then modify the Tagged Content to ensure the reader function can see the stamped content?

Thanks again
Tim

@twf

Would you kindly share the sample Word document with us, along with a screenshot of the checkbox that you check while converting it to PDF? Also, please explain a bit more about how you stamp into it by sharing a sample code snippet. It would also help if you could share the problematic output along with the expected output PDF. We will investigate the scenario at our end and share our feedback with you accordingly.

Yes certainly.

Attached are the following:

  1. Doc1.Docx - A 2 page Word document. Page 1 has “Hello World”, page 2 is blank.
  2. Doc1_Tagged.PDF - The Word document converted to PDF (via Word) with tagged option set
  3. Doc1_Untagged.PDF - The Word document converted to PDF (via Word) with tagged option unset
  4. ScreenShot1.jpg - A screen shot showing the Word Save As, PDF Option to control tagging
  5. 3 page colour - Sports_Club_Fsheet.pdf - A PDF from which we will make a stamp from page 1.
  6. Doc1_Tagged_Stamped.PDF - Doc1_Tagged.PDF with page 2 stamped from stamp created in 5)
  7. Doc1_Untagged_Stamped.PDF - Doc1_Untagged.PDF with page 2 stamped from stamp created in 5)
  8. ScreenShot2.jpg - A screen shot showing Adobe Reader, Read Out Loud function.

The stamping code to do this is…
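Imports Aspose.Pdf

' Open the tagged target document and the document that supplies the stamp
Dim loToDoc As New Document("Doc1_Tagged.PDF")
Dim loFromDoc As New Document("3 page colour - Sports_Club_Fsheet.PDF")
' Build a page stamp from page 1 of the source and apply it to page 2 of the target
Dim loPageStamp As PdfPageStamp = New PdfPageStamp(loFromDoc.Pages(1))
loToDoc.Pages(2).AddStamp(loPageStamp)
loToDoc.Save("Doc1_Tagged_Stamped.PDF")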
…similar code for Untagged
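
As a rough sketch only, the same steps for both variants could be wrapped in one helper so the tagged and untagged results can be produced side by side. The module name StampBothVariants and the helper StampPage are hypothetical names introduced for illustration; the file names are the attachments listed above, and the Aspose.PDF calls are the ones already used in the snippet.

```vbnet
Imports Aspose.Pdf

Module StampBothVariants
    ' Hypothetical helper: stamps page 1 of the stamp source onto page 2 of the target.
    Sub StampPage(targetPath As String, stampSourcePath As String, outputPath As String)
        Using loToDoc As New Document(targetPath), loFromDoc As New Document(stampSourcePath)
            ' Create the stamp from the first page of the source document
            Dim loPageStamp As New PdfPageStamp(loFromDoc.Pages(1))
            ' Apply it to the (blank) second page of the target and save a copy
            loToDoc.Pages(2).AddStamp(loPageStamp)
            loToDoc.Save(outputPath)
        End Using
    End Sub

    Sub Main()
        Const stampSource As String = "3 page colour - Sports_Club_Fsheet.pdf"
        ' Same stamp applied to the tagged and the untagged conversion of Doc1.Docx
        StampPage("Doc1_Tagged.PDF", stampSource, "Doc1_Tagged_Stamped.PDF")
        StampPage("Doc1_Untagged.PDF", stampSource, "Doc1_Untagged_Stamped.PDF")
    End Sub
End Module
```

The only difference between the two calls is the target file, so any difference in screen reader behaviour comes from the tagging option used when Word produced the PDF.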

The screen reader results for each PDF (numbered as in the attachment list above) are:
2) “Hello World” is read
3) “Hello World” is read
5) The contents are read
6) “Hello World” is read, but none of the stamped content is read
7) “Hello World” is read and the stamped content is read

It is desired that for 6) (the tagged document) the stamped content is read as well as the original document content.

Cheers
Tim
Files.zip (836.5 KB)

@twf

Thanks for sharing further details and sample files.

We have logged an investigation ticket as PDFNET-49475 to cover your particular requirements. We will look further into its details and keep you posted on the status of its resolution. Please allow us some time.

We are sorry for the inconvenience.

This issue is taking a long time to progress. Could stamped content in tagged PDFs be made accessible soon, please?

@twf

The issue has already been escalated to the highest priority and is currently under investigation. We are afraid that this new feature has not been implemented yet. Nevertheless, we have recorded your concerns and will surely consider them during ticket resolution. We will notify you as soon as significant progress is made towards resolving the issue. Please allow us some time.

We are sorry for the inconvenience.

The issues you found earlier (filed as PDFNET-49475) have been fixed in this update.

Many thanks, I have tested this and it is now behaving.
Tim