How to fix Title, Primary language, and Tagged PDF Meta Data of PDF output for Accessibility Check

aclaudio · October 23, 2020, 12:53am

I would just like to take the HTML-to-PDF result and then auto-tag it. At the moment, since there is no tagged content, I can’t tag those elements programmatically.

asad.ali · October 23, 2020, 5:21pm

@aclaudio

That is similar to converting the PDF result into a tagged PDF, which we have already tried. We converted the document directly to PDF/UA, but the image was not tagged for alternate text in the output. We then tried to tag the images through ITaggedContent, but the API did not find any StructureElement in the PDF.

using Aspose.Pdf;
using Aspose.Pdf.LogicalStructure;
using Aspose.Pdf.Tagged;

// dataDir is the working folder path, defined elsewhere.
// Load the HTML with a Legal page size and custom margins.
HtmlLoadOptions htmlLoadOptions = new HtmlLoadOptions();
htmlLoadOptions.PageInfo.Width = Aspose.Pdf.PageSize.PageLegal.Width;
htmlLoadOptions.PageInfo.Height = Aspose.Pdf.PageSize.PageLegal.Height;
htmlLoadOptions.PageInfo.Margin = new MarginInfo(0, 48, 0, 36);

// Convert the loaded document to PDF/UA-1 and save it.
Document doc = new Document(dataDir + "testing_data.html", htmlLoadOptions);
doc.Convert(dataDir + "validationlog.xml", PdfFormat.PDF_UA_1, ConvertErrorAction.Delete);
doc.Save(dataDir + "pdf209.pdf");

// Reopen the converted PDF and access its tagged content.
doc = new Document(dataDir + "pdf209.pdf");
ITaggedContent taggedContent = doc.TaggedContent;
StructureElement rootElement = taggedContent.RootElement;

// Set title for tagged pdf document
taggedContent.SetTitle("Document with images");

// Try to set alternative text on every figure (no elements are found here).
foreach (FigureElement figureElement in rootElement.FindElements<FigureElement>(true))
{
    // Set Alternative Text for Figure
    figureElement.AlternativeText = "Figure alternative text (technique 2)";
}
doc.Save(dataDir + "taggedimage.pdf");

This scenario has already been logged under the ticket ID PDFNET-48846, which is currently under investigation. We will inform you as soon as the ticket is resolved. However, if there is something we missed or did not understand correctly, please let us know.
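
As a side note on the title and primary-language items from the thread title, here is a minimal, unverified sketch of setting both through ITaggedContent on an existing PDF. It assumes SetTitle and SetLanguage behave on the converted file as they do on documents created as tagged PDFs, which this thread does not confirm. The file names, the dataDir value, and the "en-US" language code are placeholders. It does not address the missing structure elements tracked under PDFNET-48846.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Tagged;

class SetTitleAndLanguageSketch
{
    static void Main()
    {
        // Placeholder working folder; adjust to your environment.
        string dataDir = "./";

        // Open the PDF produced by the HTML conversion step.
        Document doc = new Document(dataDir + "pdf209.pdf");
        ITaggedContent taggedContent = doc.TaggedContent;

        // Document title and primary language for the tagged content.
        taggedContent.SetTitle("Document with images");
        taggedContent.SetLanguage("en-US"); // placeholder language code

        // Save under a new (hypothetical) file name.
        doc.Save(dataDir + "pdf209_with_metadata.pdf");
    }
}
```

Whether this alone clears the Title and Primary language items in a given accessibility checker would need to be verified against the saved file.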

aclaudio · October 23, 2020, 8:17pm

Right, I know the image element could not be tagged. However, this is also true for the rest of the elements, correct? There is no way to tag any of the elements that already exist in the output of the HTML-to-PDF approach I provided examples of?

asad.ali · October 26, 2020, 8:05pm

@aclaudio

Yes, you are right, and all of this information has been logged under the related ticket. We will investigate it from every aspect and will inform you as soon as it is resolved. We highly appreciate your patience and understanding in this regard. Please give us some time.

We apologize for the inconvenience caused.

alok.agarwal1979 · July 14, 2021, 7:13am

Is there any update on PDFNET-48846? Is it implemented in Aspose version 21.5?

asad.ali · July 14, 2021, 4:03pm

@alok.agarwal1979

We are afraid that the earlier logged ticket is not yet resolved. We will surely update this forum thread as soon as the ticket is fully investigated and fixed. Please give us some time.

We are sorry for the inconvenience.

alok.agarwal1979 · July 30, 2021, 4:59am

Hello,
Is there any support for creating a PDF without any accessibility issues from an HTML page using Aspose? I looked through the documentation and the support forum but could not find a satisfactory answer.
Regards,
Alok

asad.ali · July 30, 2021, 5:56pm

@alok.agarwal1979

We are afraid that the required functionality is still under investigation. The related ticket is PDFNET-48846, which has been linked to this thread. As soon as the feature is available, we will post an update in this forum thread. Please be patient and give us some time.

We are sorry for the inconvenience.

oglesbyw · March 11, 2023, 3:03am

Was this ever resolved? Is there any support for creating a PDF without any accessibility issues from an HTML page using Aspose?

asad.ali · March 11, 2023, 10:18am

@oglesbyw

We are sorry to share that the investigation of the earlier logged ticket is not yet completed due to other high-priority and blocker tasks. Nevertheless, the ticket has been escalated to the next level of priority and your concerns have been recorded. We will inform you as soon as we have some news about the ticket resolution ETA. We apologize for the delay and the inconvenience.

bach97 · September 18, 2023, 12:42pm

So we are still not able to convert HTML to Tagged PDF in any way with Aspose?

asad.ali · September 18, 2023, 7:51pm

@bach97

We are afraid that it is not yet implemented. The feature is under consideration, but it is taking more time to implement. As soon as we implement it, we will update you in this forum thread. We apologize for the inconvenience.