Modifying PDF file with tagged data

Hello,
I have a few questions regarding PDF tagging; maybe you could help me.
We are working with the Java versions of aspose.words 23.8 and aspose.pdf 22.2.

  1. We have a PDF document that was created with the aspose.words lib with setExportDocumentStructure set to false, so no tagging exists in this file. Is there a way to convert this PDF using the aspose.pdf lib so that auto-tags are created? As far as I have tested, I was not able to do that. If this is possible, would it be an improvement over simply setting setExportDocumentStructure to true when creating the PDF document with aspose.words?

  2. We have a PDF created with aspose.words (auto-tagged or untagged). Is there a good way to edit this file using the aspose.pdf lib and add, modify, or remove tags on already existing content, without adding or removing content? As far as I can see, tags in this lib can only be created by adding new content; the existing tag structure cannot be updated.

Thank you.

@ANDREA.FARRIS

In order to convert a PDF into a tagged PDF, you can use the code snippet below:

Document document = new Document(dataDir + "input.pdf");
document.convert(dataDir + "taggedpdf.pdf", PdfFormat.PDF_UA_1, ConvertErrorAction.Delete);

To work further with tagged PDFs, please check the respective documentation section, and feel free to let us know if you cannot find the desired functionality.

I have tried your code snippet on the provided example.pdf, which is not tagged, and the created result.pdf seems to be an invalid PDF file: I cannot open it.
example.pdf (13.0 KB)

result.pdf (883 Bytes)
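
A quick way to narrow down a report like this is to check whether the output starts with the `%PDF-` signature at all. The sketch below uses only the Java standard library (Java 11+ for `readNBytes`); the class name `PdfHeaderCheck` and its method names are made up for illustration and are not part of any Aspose API. One thing worth checking, stated here as an assumption that this thread does not confirm: the Aspose.PDF API reference describes the first parameter of `Document.convert` as a log file name, in which case `taggedpdf.pdf` would receive the conversion log and the converted document would still need a separate `save` call. A missing signature on an 883-byte file would be consistent with that, but would not prove it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper (not an Aspose class): checks for the PDF file signature.
final class PdfHeaderCheck {
    // A PDF file is expected to begin with the ASCII bytes "%PDF-".
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // True if the given leading bytes start with "%PDF-".
    // This is a strict check on the very first bytes; lenient readers may
    // also accept a header that appears slightly later in the file.
    static boolean hasPdfHeader(byte[] head) {
        if (head == null || head.length < PDF_MAGIC.length) {
            return false;
        }
        return Arrays.equals(Arrays.copyOf(head, PDF_MAGIC.length), PDF_MAGIC);
    }

    // Reads only the first few bytes of the file and checks the signature.
    static boolean fileHasPdfHeader(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return hasPdfHeader(in.readNBytes(PDF_MAGIC.length)); // Java 11+
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the attachment name used in this thread.
        Path file = Paths.get(args.length > 0 ? args[0] : "result.pdf");
        if (!Files.isRegularFile(file)) {
            System.out.println(file + ": file not found");
            return;
        }
        System.out.println(file + ": size=" + Files.size(file)
                + " bytes, PDF signature present=" + fileHasPdfHeader(file));
    }
}
```

If the signature is absent, opening the file in a text editor shows what was actually written, which is useful detail to attach to the issue report.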

@ANDREA.FARRIS

First of all, we are sorry that there was a mistake in the code snippet shared earlier. The PDF format was incorrect (we have corrected it now). We tested the scenario in our environment using the latest version of the API and noticed that the generated output was indeed invalid/damaged. Therefore, an issue has been logged as PDFJAVA-43261 in our issue management system to investigate this case further. We will look into its details and keep you posted on the status of its rectification. Please be patient and spare us some time.

We are sorry for the inconvenience.