Create a PDF/A-2a

Hi, I developed some code to create a PDF file from scratch. I use objects such as:
TextSegment
FloatingBox
TextFragment
Line
Image
Everything renders correctly in the PDF. Now I would like to modify the code to make the PDF tagged: as I insert these elements into the PDF, I would also like to insert them into the tag structure.
How can I do this?

Example:

            FloatingBox floatingBox = new FloatingBox();
            floatingBox.setLeft(x);
            floatingBox.setTop(y);
            floatingBox.setWidth(width);
            floatingBox.setHeight(height);

            com.aspose.pdf.MarginInfo margin = new com.aspose.pdf.MarginInfo(leftPadding, bottomPadding, rightPadding, topPadding);
            floatingBox.setPadding(margin);
            floatingBox.setVerticalAlignment(verticalAlignment);
            floatingBox.setHorizontalAlignment(horizontalAlignment);

            com.aspose.pdf.TextFragment fragmentColl = new com.aspose.pdf.TextFragment();
            TextSegmentCollection sc = fragmentColl.getSegments();
            if (lstTextSegment != null && lstTextSegment.size() > 0) {
                for (int i = 0; i < lstTextSegment.size(); i++) {
                    com.aspose.pdf.TextSegment segment = lstTextSegment.get(i);
                    sc.add(segment);
                }
                floatingBox.getParagraphs().add(fragmentColl);
            }
            currentAsposePage.getParagraphs().add(floatingBox);

Where and how should I change this piece of code to mark the graphic elements as ARTIFACT and tag the text, and then link each element inserted in the PDF to the tag structure?

Thanks.

@Davide_C

You can simply convert the generated PDF file into a tagged PDF using the code snippet below:

Document document = new Document(dataDir + "input.pdf");
document.convert(dataDir + "taggedpdf.xml", PdfFormat.PDF_UA_1, ConvertErrorAction.Delete);
document.save(dataDir + "PDFwithTagged.pdf");

Thanks. So do I have to save the file, reopen it, and convert it? When you have thousands of files to produce, this step takes too long. Isn't it possible to tag the content while creating the PDF? Also, if I wanted to create custom tags, how would I do it?

I need to fix the problems reported in the log file. I get these errors; how do I solve them?
XObject object not tagged
Text object not tagged
Path object not tagged

How do I specify the connection between the structural element ParagraphElement and the content elements FloatingBox, TextFragment, and TextSegment?

@Davide_C

You can save directly as a tagged PDF once you are done adding content to it. You do not need to save and then reopen the file to carry out the conversion.

Have you checked the documentation article(s) below for code samples? They contain guidelines on how to create a tagged PDF document from scratch:

Hi, I have this very simple code:


// Use the current time in UTC for the document dates
Calendar now = Calendar.getInstance();
now.setTimeZone(TimeZone.getTimeZone("UTC"));

document = new com.aspose.pdf.Document();

// Document metadata
document.getInfo().setTitle("title");
document.getInfo().setAuthor("author");
document.getInfo().setSubject("subject");
document.getInfo().setKeywords("keywords");
document.getInfo().setCreator("creator");
document.getInfo().setCreationDate(now.getTime());
document.getInfo().setModDate(now.getTime());
document.setDisplayDocTitle(true);

// Title and language of the tagged content
document.getTaggedContent().setTitle("Documento PDF Esempio");
document.getTaggedContent().setLanguage("it-IT");

Page page = document.getPages().add();

String text = "TESTO";

TextFragment textFragment = new TextFragment(text);
com.aspose.pdf.Font font = FontRepository.findFont("Arial");
textFragment.setText(text);

page.getParagraphs().add(textFragment);

// Validate against PDF/UA-1, then convert; backslashes in the Windows paths must be escaped
document.validate("c:\\butta\\logtaggedpdf.xml", PdfFormat.PDF_UA_1);

document.convert("c:\\butta\\taggedpdf.xml", PdfFormat.PDF_UA_1, ConvertErrorAction.Delete);

document.save(os);


The document cannot be converted to PDF_UA_1.

This is the list of errors:

	<General>
		<Problem Severity="Error" Clause="7.1" ObjectID="" Page="1" Convertable="False" Code="7.1:1.1(14.8)">Text object not tagged</Problem>
		<Problem Severity="Need manual check" Clause="7.1" ObjectID="" Page="" Convertable="False" Code="7.1:5">Color contrast</Problem>
	</General>
	<Text>
		<Problem Severity="Need manual check" Clause="7.2" ObjectID="" Page="" Convertable="False" Code="7.2:1">Logical Reading Order</Problem>
	</Text>
	<Fonts>
		<Problem Severity="Warning" Clause="7.21.4.2" ObjectID="16" Page="1" Convertable="True" Code="7.21.4.2">CIDSet is missing or incomplete for font '16'</Problem>
	</Fonts>

How do I solve "Text object not tagged"?

@Davide_C

Please apply the following changes to the code snippet and see if it helps:

page.getParagraphs().add(textFragment);
document.processParagraphs();
document.convert("c:\\butta\\taggedpdf.xml", PdfFormat.PDF_UA_1, ConvertErrorAction.Delete);
document.save(os);

The above code should help you obtain a tagged PDF as output. If the issue persists, please share the output generated on your system along with the log file, i.e. taggedpdf.xml. We will then log an investigation ticket in our issue tracking system and share the ID with you.
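Before sharing the log, it can help to list which reported problems are errors that the conversion cannot fix automatically. The following is only a minimal sketch using the JDK's built-in XML parser, not part of the Aspose API. It assumes the log has a single root element (called `Log` here purely for illustration) wrapping sections such as `General`, and that each `Problem` element carries the `Severity`, `Convertable`, and `Code` attributes seen in the log excerpt earlier in this thread. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: summarises a validation log shaped like the excerpt in this thread.
public class ValidationLogSummary {

    // Returns "Code: message" for each Problem with Severity="Error" and Convertable="False".
    public static List<String> blockingProblems(String logXml) throws Exception {
        Document dom = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(logXml)));
        NodeList problems = dom.getElementsByTagName("Problem");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < problems.getLength(); i++) {
            Element problem = (Element) problems.item(i);
            boolean isError = "Error".equals(problem.getAttribute("Severity"));
            boolean convertible = "True".equalsIgnoreCase(problem.getAttribute("Convertable"));
            if (isError && !convertible) {
                result.add(problem.getAttribute("Code") + ": " + problem.getTextContent().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Assumed wrapper element <Log>; the real root element name may differ.
        String sample = "<Log><General>"
                + "<Problem Severity='Error' Convertable='False' Code='7.1:1.1(14.8)'>Text object not tagged</Problem>"
                + "<Problem Severity='Need manual check' Convertable='False' Code='7.1:5'>Color contrast</Problem>"
                + "</General></Log>";
        // Prints: 7.1:1.1(14.8): Text object not tagged
        blockingProblems(sample).forEach(System.out::println);
    }
}
```

Entries with the severity "Need manual check" are skipped on purpose, since they call for human review rather than a code change.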

Hi, I applied the changes, but it still does not generate the tagged PDF I expected.
I have attached the PDF file and the XML log.

test2.pdf (84,5 KB)

taggedpdf.7z (816 Byte)

I await news on the resolution. Thanks.

@Davide_C

We are checking it and will get back to you shortly.

@Davide_C

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFJAVA-45040

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Hi, I don't understand why such a simple example needs a ticket in production. Has this procedure for creating tagged PDFs ever been tested? Excuse me, but have you run the same test on your PC to see if it works? How many developers use this library, which is also a paid product, and no one has ever had this problem?

@Davide_C

Creating a tagged PDF from scratch is functionality that is currently being worked on, and we are fixing issues related to this feature in the API. We also tested your code sample in our environment and likewise noticed that the output PDF was not tagged.

In addition, the documentation for this feature has unfortunately not been updated yet; we are planning to update the code samples in the official API documentation as well.

The ticket has been logged to investigate this behavior of the API internally and determine the correct approach to achieve your requirements. We understand your concerns and value them. They have been recorded to escalate the ticket internally, and as soon as we have findings and resolve the ticket, we will share them with you. We apologize for the inconvenience caused.

Hello,
I would like to point out that accessibility in PDF production is becoming a very important issue for all public sector institutions. The European Accessibility Act will come into effect on 28 June 2025 and will mark a significant shift in the EU market.
Failing to comply with the law could result in penalties, including fines, for European public-sector organisations.

Is Aspose ready to support us in this challenge?

@fabio.parise

Yes, we have been working on it; we began this implementation when it was first expected to become a standard. Implementing and improving the feature is an ongoing process that we intend to continue alongside fixing other reported issues.

In addition, the ticket has been updated with all of the concerns you have shared with us. We will certainly consider them, and as soon as we have updates to share, we will inform you in this thread. Please give us some time.

We apologize for the inconvenience caused.