Issue with Unwanted Blank Page in Tagged Table PDF

Hi all,
I am trying to create a PDF that contains a tagged table. I want to write the table by manipulating the page content directly and then associate the tags using BDC operators. Everything works, but I don't understand why the saved document ends up with an extra blank page that contains only the table marker (/Table << /MCID 15 >> BDC EMC).

My code is below:

var document = new Document();
ITaggedContent content = document.TaggedContent;
StructTreeRootElement structTreeRootElement = content.StructTreeRootElement;
StructureElement rootElement = content.RootElement;
rootElement.ClearChilds();

Page p = document.Pages.Insert(1);
p.SetPageSize(595.276, 841.8898);

BDC bdcTable = new BDC("Table", new BDCProperties("en"));
BDC bdcTR1 = new BDC("TR", new BDCProperties("en"));
BDC bdcTH1 = new BDC("TH", new BDCProperties(3, "en"));
BDC bdcTH2 = new BDC("TH", new BDCProperties(4, "en"));
BDC bdcTR2 = new BDC("TR", new BDCProperties("en"));
BDC bdcTD1 = new BDC("TD", new BDCProperties(6, "en"));
BDC bdcTD2 = new BDC("TD", new BDCProperties(7, "en"));
BDC bdcPar1 = new BDC("P", new BDCProperties(8, "en"));

string insidefontName = "";
var fonts = p.Resources.GetFonts(true);
if (fonts.Count == 0)
    fonts.Add(FontRepository.FindFont("ArialMT"), out insidefontName);

p.Contents.Add(new Aspose.Pdf.Operators.GSave());
p.Contents.Add(new Aspose.Pdf.Operators.BT());
p.Contents.Add(bdcTable);
p.Contents.Add(bdcTR1);
p.Contents.Add(bdcTH1);
p.Contents.Add(new Aspose.Pdf.Operators.EMC());
p.Contents.Add(bdcTH2);
p.Contents.Add(new Aspose.Pdf.Operators.EMC());
p.Contents.Add(new Aspose.Pdf.Operators.EMC());
p.Contents.Add(bdcTR2);
p.Contents.Add(bdcTD1);
p.Contents.Add(new Aspose.Pdf.Operators.EMC());
p.Contents.Add(bdcTD2);
p.Contents.Add(bdcPar1);
p.Contents.Add(new Aspose.Pdf.Operators.SetRGBColor(0, 0, 0));
p.Contents.Add(new Aspose.Pdf.Operators.SelectFont(insidefontName, 12));
p.Contents.Add(new Aspose.Pdf.Operators.SetTextMatrix(1, 0, 0, 1, 50, 750));
p.Contents.Add(new Aspose.Pdf.Operators.ShowText("hello hello", fonts[1]));
p.Contents.Add(new Aspose.Pdf.Operators.EMC());
p.Contents.Add(new Aspose.Pdf.Operators.EMC());
p.Contents.Add(new Aspose.Pdf.Operators.EMC());
p.Contents.Add(new Aspose.Pdf.Operators.EMC());
p.Contents.Add(new Aspose.Pdf.Operators.ET());
p.Contents.Add(new Aspose.Pdf.Operators.GRestore());

TableElement table = content.CreateTableElement();
rootElement.AppendChild(table);

TableTHeadElement thead = table.CreateTHead();
var tbody = table.CreateTBody();

TableTRElement row1 = thead.CreateTR();
TableTHElement cellH1 = row1.CreateTH();
cellH1.Tag(bdcTH1);
TableTHElement cellH2 = row1.CreateTH();
cellH2.Tag(bdcTH2);

TableTRElement row2 = tbody.CreateTR();
TableTDElement cellD1 = row2.CreateTD();
cellD1.Tag(bdcTD1);
TableTDElement cellD2 = row2.CreateTD();

var par1 = content.CreateParagraphElement();
cellD2.AppendChild(par1);
par1.Tag(bdcPar1);


document.Save(output);

The result is a two-page PDF. The first page contains the tagged logical table, with a single paragraph in the last cell; the second page is empty apart from the table marker.

Why is the second page created?

@Santo85

The unwanted blank page in your tagged table PDF is likely caused by how the content is structured and added to the document. When you manipulate the content directly with the BDC (begin marked-content sequence, with a property list) and EMC (end marked-content sequence) operators, you need to make sure the logical structure of the document stays consistent.

In your code, all content that belongs to the table must sit between the table's BDC operator and its matching EMC. A mismatch, or content that is not correctly associated with the table structure, can lead to an empty page being created.

Here are a few suggestions to troubleshoot and potentially resolve the issue:

  1. Check the Structure of BDC and EMC: Ensure that every BDC has a corresponding EMC. In your code, you have multiple EMC calls, but make sure they are correctly paired with their respective BDC calls.

  2. Review the Content Addition: When you add content to the page, ensure that it is being added in the correct order and that the logical structure is maintained. For example, if you are adding a table, make sure that all rows and cells are properly defined and that they are not inadvertently causing a page break.

  3. Use the CreateTableElement Method: Instead of manually adding the table structure using BDC and EMC, consider using the CreateTableElement method provided by Aspose. This method automatically handles the logical structure for you, which can help avoid issues with empty pages.

  4. Debugging: You can add some debugging statements to check the state of the document after each major operation. This can help you identify where the unwanted page is being introduced.
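The pairing check from suggestion 1 can be sketched without the library at all. The snippet below is only an illustration: the class and method names (MarkedContentCheck, FirstImbalance) are made up for this post and are not part of Aspose.PDF, and it works on plain operator names rather than on real operator objects. It walks the sequence, counts nesting depth, and reports the first place where BDC/BMC and EMC stop matching.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not an Aspose.PDF API.
public static class MarkedContentCheck
{
    // Returns -1 when every BDC/BMC has a matching EMC.
    // Otherwise returns the zero-based index of the first stray EMC,
    // or operatorNames.Count when a sequence is still open at the end.
    public static int FirstImbalance(IReadOnlyList<string> operatorNames)
    {
        int depth = 0;
        for (int i = 0; i < operatorNames.Count; i++)
        {
            switch (operatorNames[i])
            {
                case "BDC":
                case "BMC":
                    depth++;              // a marked-content sequence opens
                    break;
                case "EMC":
                    if (depth == 0)
                        return i;         // EMC with nothing left to close
                    depth--;
                    break;
            }
        }
        return depth == 0 ? -1 : operatorNames.Count;
    }
}

public static class Program
{
    public static void Main()
    {
        // Operator order copied by hand from the code in the question.
        string[] ops =
        {
            "q", "BT",
            "BDC",                        // Table
            "BDC",                        // TR (header row)
            "BDC", "EMC",                 // TH
            "BDC", "EMC",                 // TH
            "EMC",                        // end of header row
            "BDC",                        // TR (body row)
            "BDC", "EMC",                 // TD
            "BDC",                        // TD
            "BDC",                        // P
            "rg", "Tf", "Tm", "Tj",
            "EMC", "EMC", "EMC", "EMC",   // P, TD, TR, Table
            "ET", "Q"
        };
        Console.WriteLine(MarkedContentCheck.FirstImbalance(ops)); // prints -1
    }
}
```

For the operator order posted in the question this prints -1, meaning the BDC and EMC operators are balanced there, so a pairing mismatch alone would not seem to explain the extra page.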

Here is a simplified example of how to create a tagged table using the CreateTableElement method:

Document document = new Document();
ITaggedContent taggedContent = document.TaggedContent;
StructureElement rootElement = taggedContent.RootElement;

TableElement tableElement = taggedContent.CreateTableElement();
rootElement.AppendChild(tableElement);

TableTHeadElement thead = tableElement.CreateTHead();
TableTBodyElement tbody = tableElement.CreateTBody();

TableTRElement headerRow = thead.CreateTR();
TableTHElement headerCell = headerRow.CreateTH();
headerCell.SetText("Header");

TableTRElement bodyRow = tbody.CreateTR();
TableTDElement bodyCell = bodyRow.CreateTD();
bodyCell.SetText("Content");

document.Save("output.pdf");

This approach ensures that the table is correctly structured and should help eliminate the unwanted blank page issue.

If the problem persists, consider checking the Aspose documentation or community forums for additional insights on handling tagged PDFs and troubleshooting similar issues.

Hi,

The problem remains the same. Do you have any other solutions?
thanks

@Santo85

Have you tried version 24.11? Could you please share the PDF document that this code generates on your end?

Hi,

Yes, I am already testing with version 24.11. Attached you will find the generated PDF. Thank you.
test.pdf (79,3 KB)

@Santo85

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-58893

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.