Free Support Forum - aspose.com

Incorrect tagging for a table which has vertical headers

Hi,
We have an Aspose.Total license and are using Aspose v21.7.0 to convert HTML to Word and PDF documents.

I have a table that has vertical headers, that is, out of the two columns, all the cells of the first column are TH elements while all the cells of the second column are TD elements.

However, in the first row of the table specifically, both cells are tagged as TH, instead of TH and TD respectively, when the HTML is converted to PDF. The cells in the remaining rows are correctly tagged as TH and TD. Please refer to the screenshots for a better understanding:
Actual.JPG (31.0 KB)
Expected.JPG (30.5 KB)

What I am expecting is:
col1 --------------- col2
Header1 ----------- Data
(tagged as th)-------- (tagged as td)

Header2 ----------- Data
(tagged as th)-------- (tagged as td)

Header3 ----------- Data
(tagged as th)-------- (tagged as td)

And what I am getting instead is:
col1 --------------- col2
Header1 ----------- Data
(tagged as th)-------- (tagged as th)

Header2 ----------- Data
(tagged as th)-------- (tagged as td)

Header3 ----------- Data
(tagged as th)-------- (tagged as td)

TH_TD_Tag_Aspose.zip (488 Bytes)

Please find attached the sample HTML which can be used to reproduce the issue.

As we are validating the generated document for ADA compliance, it is really important to us that the TH and TD tagging follows the expected guidelines. We are using the Adobe Acrobat tool to validate the tags.

@SanjanaG

Can you please share your sample code snippet that you are using to generate PDF and Word files from this HTML? We will test the scenario in our environment and address it accordingly.

@asad.ali Please find the sample HTML attached in this Word file:
Sample_Code.docx (12.4 KB)

@SanjanaG

Thanks for sharing the sample HTML. However, we requested the sample code snippet, in C# or Java, that you are using to convert the HTML into PDF and Word with Aspose.PDF or Aspose.Words.

This can be reproduced in the online converter for a quick test (if that helps): https://products.aspose.app/words/conversion/html-to-word

Here is the sample code you requested, @asad.ali:

// This is an abstract of what we have in our code

var inputStream = /* Stream of the HTML */;
var doc = new Aspose.Words.Document(inputStream, new LoadOptions { Encoding = Encoding.UTF8 });

var filename = /* some PDF file name */;
var stream = new MemoryStream();
// Creates save options of the type that matches the file extension
var saveOptions = SaveOptions.CreateSaveOptions(filename);

doc.Save(stream, saveOptions);
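
For anyone who wants to reproduce this without the attachments, a self-contained sketch along the same lines might look like the following. The inline HTML is an assumed reconstruction of the table described above (it is not the attached sample), the output file name is hypothetical, and `ExportDocumentStructure = true` is an assumption about how the tagged PDF is produced, since the abstract above does not show it.

```csharp
using System.IO;
using System.Text;
using Aspose.Words;
using Aspose.Words.Saving;

class VerticalHeaderRepro
{
    static void Main()
    {
        // Assumed table shape: first column is TH, second column is TD.
        const string html = @"<html><body>
<table border=""1"">
  <tr><th>Header1</th><td>Data</td></tr>
  <tr><th>Header2</th><td>Data</td></tr>
  <tr><th>Header3</th><td>Data</td></tr>
</table>
</body></html>";

        using (var input = new MemoryStream(Encoding.UTF8.GetBytes(html)))
        {
            // Load the HTML from the stream; the format is detected automatically.
            var doc = new Document(input);

            // Assumption: structure tags are exported so the TH/TD tagging
            // can be inspected in Adobe Acrobat.
            var options = new PdfSaveOptions { ExportDocumentStructure = true };

            // Hypothetical output file name.
            doc.Save("vertical-headers.pdf", options);
        }
    }
}
```

Opening the resulting PDF in Acrobat's Tags panel should then show how each cell of the first row is tagged.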

@SanjanaG You have encountered the expected behavior. The first row, or any rows marked as header rows, are tagged as TH. The first column is also tagged as TH. Aspose.Words mimics MS Word behavior here. If you open your HTML document in MS Word and save it as PDF, you will notice the same result.
image.png (12.5 KB)
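
To check which rows are marked as header rows after the HTML is loaded, one option is to inspect the table before saving. This is only a sketch: it assumes the header-row mark corresponds to `RowFormat.HeadingFormat` (the "repeat as header row" setting in MS Word), which the thread does not confirm is what drives the TH tag, and the input file name is hypothetical.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Tables;

class InspectHeaderRows
{
    static void Main()
    {
        // Hypothetical file name; substitute the HTML that reproduces the issue.
        var doc = new Document("sample.html");

        foreach (Table table in doc.GetChildNodes(NodeType.Table, true))
        {
            int rowIndex = 0;
            foreach (Row row in table.Rows)
            {
                // HeadingFormat is true when the row is marked as a header row.
                Console.WriteLine(
                    $"Row {rowIndex}: HeadingFormat = {row.RowFormat.HeadingFormat}");
                rowIndex++;
            }
        }
    }
}
```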