HTML-to-Tagged-PDF: CSS border/background path objects not tagged as Artifacts — PDF/UA 7.1:1.1 on every page (v26.x)


Bug Report: Path Objects Not Tagged as Artifacts in Tagged PDF Output

Product: Aspose.HTML for .NET + Aspose.PDF for .NET
Versions: Aspose.HTML 26.1.0, Aspose.PDF 26.2.0
Platform: .NET 10, Windows
Severity: PDF/UA-1 FAILURE on every page of every document
PDF/UA Clause: 7.1:1.1 (ISO 14289-1, clause 7.1; ISO 32000-1, Section 14.8)
Error Message: "Path object not tagged"

Summary

When converting HTML to tagged PDF using Aspose.Html.Converters.Converter.ConvertHTML() with IsTaggedPdf = true, decorative vector path operators generated from CSS borders and background fills are not wrapped in BMC("Artifact")/EMC marked-content sequences. This causes a "Path object not tagged" PDF/UA-1 validation error on every page of every converted document.

These paths are purely decorative (CSS border on headings, table cell borders, background-color fills) and should be marked as Artifacts per PDF/UA-1, which requires that all content in a tagged PDF is either part of the structure tree or explicitly marked as an artifact.


Reproduction Steps

Minimal HTML (test_path_bug.html)

<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <style>
    h2 {
      background-color: #D3D3D3;
      border: 1px solid #000;
      padding: 4px 6px;
      font-size: 11pt;
    }
    table {
      width: 100%;
      border-collapse: collapse;
    }
    th, td {
      border: 1px solid #000;
      padding: 4px 5px;
    }
    th { background-color: #D9D9D9; }
  </style>
  <title>Path Bug Reproduction</title>
</head>
<body>
  <h1>Test Document</h1>
  <h2>Section with border and background</h2>
  <table>
    <thead>
      <tr><th scope="col">Column A</th><th scope="col">Column B</th></tr>
    </thead>
    <tbody>
      <tr><td>Value 1</td><td>Value 2</td></tr>
    </tbody>
  </table>
</body>
</html>

C# Code

using var htmlDoc = new Aspose.Html.HTMLDocument("test_path_bug.html");

var options = new Aspose.Html.Saving.PdfSaveOptions();
options.IsTaggedPdf = true;  // Enable tagged/accessible PDF output

Aspose.Html.Converters.Converter.ConvertHTML(htmlDoc, options, "output.pdf");

// Validate
using var pdfDoc = new Aspose.Pdf.Document("output.pdf");
bool valid = pdfDoc.Validate("validation.xml", Aspose.Pdf.PdfFormat.PDF_UA_1);
// valid == false, validation.xml contains "Path object not tagged" on every page

Expected vs Actual

Expected: Decorative path operators (CSS borders, background fills) are wrapped in BMC("Artifact") ... EMC marked-content sequences, so they are excluded from the structure tree and pass PDF/UA-1 validation.
Actual: Path operators (re, m, l, S, f, etc.) appear in the content stream outside any marked-content sequence. The PDF/UA validator reports "Path object not tagged" for every page.

Validation Output

Error on every page:
Severity="Error" Clause="7.1" Code="7.1:1.1(14.8)" — "Path object not tagged"
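
To tally these entries across many converted documents, a small helper along the following lines may be useful. It is only a sketch: ValidationLogSummary and CountByCode are hypothetical names, not part of Aspose.PDF, and the code assumes the log stores Code as an XML attribute, as the excerpt above suggests. It makes no assumption about element names.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper for illustration; not part of any Aspose API.
public static class ValidationLogSummary
{
    // Counts XML elements whose "Code" attribute starts with codePrefix.
    // Assumption: the validation log exposes the error code as an attribute
    // named "Code"; element names are deliberately not relied upon.
    public static int CountByCode(string xml, string codePrefix)
    {
        XDocument log = XDocument.Parse(xml);
        return log.Descendants().Count(element =>
        {
            var code = element.Attribute("Code");
            return code != null
                && code.Value.StartsWith(codePrefix, StringComparison.Ordinal);
        });
    }
}
```

Calling ValidationLogSummary.CountByCode(File.ReadAllText("validation.xml"), "7.1:1.1") would then return the number of matching entries for one document.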

Test Results from Real Documents

Document                                              Pages   "Path not tagged" errors
bret_v2.html (definition lists + 1 table)             12      12 (one per page)
dannie_v2.html (large table with nested sub-tables)   2       2 (one per page)
Minimal repro (heading + table)                       1       1

The error occurs regardless of document complexity. In our tests, any HTML that uses CSS borders or background colors triggered it on every page.


Content Stream Analysis

Inspecting the PDF content stream shows the problem clearly. The path operators that draw CSS borders and background fills sit outside any BDC/BMC/EMC marked-content block:

% === Tagged text content (correct) ===
/P <</MCID 0>> BDC      % Begin marked content (paragraph)
  BT
    /F1 11 Tf
    (Section with border) Tj
  ET
EMC                          % End marked content

% === Decorative border (BUG: no BMC/EMC wrapper) ===
0.502 0.502 0.502 rg        % Set gray fill color
56.7 725.3 680.0 18.5 re    % Draw rectangle (heading background)
f                            % Fill
0 0 0 RG                    % Set black stroke
56.7 725.3 680.0 18.5 re    % Draw rectangle (heading border)
S                            % Stroke
% ^^^ These operators are NOT inside any marked content sequence

What the output should look like:

% Decorative border (correct: wrapped as Artifact)
/Artifact BMC                % Begin marked content tagged as Artifact
  0.502 0.502 0.502 rg
  56.7 725.3 680.0 18.5 re
  f
  0 0 0 RG
  56.7 725.3 680.0 18.5 re
  S
EMC                          % End artifact
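
The check we ran by eye can be approximated in code. The sketch below scans a decoded (uncompressed) content stream and lists the path-painting operators that occur outside any BMC/BDC ... EMC pair. UntaggedPathScanner is a hypothetical name and the tokenizer is deliberately simplistic: it splits on whitespace, strips "%" comments, and does not handle string literals or inline images properly, so treat it as a diagnostic aid rather than a validator.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical diagnostic helper; not part of any Aspose API.
public static class UntaggedPathScanner
{
    // Operators that paint (stroke and/or fill) the current path.
    private static readonly HashSet<string> PaintOps = new HashSet<string>
    {
        "S", "s", "f", "F", "f*", "B", "B*", "b", "b*"
    };

    // Returns, in order, every painting operator found outside marked content.
    public static List<string> FindUntaggedPaints(string contentStream)
    {
        var untagged = new List<string>();
        int depth = 0; // current marked-content nesting level

        foreach (string rawLine in contentStream.Split('\n'))
        {
            // Strip "%" comments. Simplification: assumes "%" never appears
            // inside a string literal.
            int comment = rawLine.IndexOf('%');
            string line = comment >= 0 ? rawLine.Substring(0, comment) : rawLine;

            string[] tokens = line.Split(
                new[] { ' ', '\t', '\r' }, StringSplitOptions.RemoveEmptyEntries);

            foreach (string token in tokens)
            {
                if (token == "BMC" || token == "BDC") depth++;
                else if (token == "EMC") depth = Math.Max(0, depth - 1);
                else if (depth == 0 && PaintOps.Contains(token)) untagged.Add(token);
            }
        }
        return untagged;
    }
}
```

Run against the first listing above, this reports the f and S operators. Run against the wrapped version, it reports nothing.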

CSS Properties That Trigger This Bug

CSS property                         PDF path operation               Tagged?
border: 1px solid #000               re ... S (rectangle + stroke)    No
background-color: #D3D3D3            re ... f (rectangle + fill)      No
border-collapse: collapse on table   m ... l ... S (line segments)    No
border-bottom: 1px solid             m ... l ... S (line + stroke)    No
hr element                           m ... l ... S                    No

Every CSS property that generates a vector drawing operation in the PDF content stream is affected.


Impact

  • 100% failure rate — every HTML document with any CSS borders or backgrounds fails PDF/UA validation
  • Cannot be worked around from code — the path operators are generated internally by Aspose.HTML's rendering engine; there is no API to control artifact tagging during conversion
  • Post-processing is extremely fragile — manually inserting BMC("Artifact")/EMC operators after conversion requires matching Aspose's internal operator ordering, which can change between versions
  • ADA/Section 508 compliance — this is the single remaining blocker for full PDF/UA-1 compliance in our conversion pipeline (all other issues have been resolved via post-processing or HTML author fixes)

Workaround Considered (Not Viable)

We considered post-processing the PDF content stream to wrap orphaned path operators in BMC("Artifact")/EMC blocks. This approach is not viable because:

  1. It requires parsing Aspose's internal operator ordering to distinguish "orphaned" paths from paths that are already inside a marked-content sequence
  2. The operator indices and groupings can change with any Aspose version update
  3. There is no reliable way to distinguish decorative paths (should be artifacts) from meaningful paths (e.g., SVG content that should be in the structure tree)
  4. Inserting operators at wrong positions can corrupt the content stream
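
To make these objections concrete, here is roughly what the naive approach amounts to. This is an illustration only, not something we run in production. NaiveArtifactWrapper is a hypothetical name, and the code assumes a decoded, comment-free content stream with whitespace-separated tokens. It wraps every path it finds outside marked content, which is exactly the weakness described in point 3, and it ignores string literals and inline images (point 4).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical illustration of the rejected workaround; not an Aspose API.
public static class NaiveArtifactWrapper
{
    // Operators that begin a new path (their operands are all numeric).
    private static readonly HashSet<string> StartOps = new HashSet<string> { "m", "re" };

    // Operators that end a path, including "n" (end path without painting).
    private static readonly HashSet<string> EndOps = new HashSet<string>
    {
        "S", "s", "f", "F", "f*", "B", "B*", "b", "b*", "n"
    };

    public static string Wrap(string contentStream)
    {
        var output = new List<string>();
        int depth = 0;        // marked-content nesting level
        bool inPath = false;  // true between a wrapped path start and its end
        int operandStart = 0; // index in output where the pending operands begin

        string[] tokens = contentStream.Split(
            new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);

        foreach (string token in tokens)
        {
            if (token == "BMC" || token == "BDC") depth++;
            else if (token == "EMC") depth = Math.Max(0, depth - 1);

            if (depth == 0 && !inPath && StartOps.Contains(token))
            {
                // Open the artifact before the numeric operands of m / re.
                output.Insert(operandStart, "/Artifact BMC");
                inPath = true;
            }

            output.Add(token);

            if (inPath && EndOps.Contains(token))
            {
                output.Add("EMC");
                inPath = false;
            }

            // Any non-numeric token ends the pending operand list.
            bool isNumber = double.TryParse(
                token, NumberStyles.Float, CultureInfo.InvariantCulture, out _);
            if (!isNumber) operandStart = output.Count;
        }
        return string.Join(" ", output);
    }
}
```

Even this toy version has to guess where a path begins and ends, and it cannot tell a CSS border from a chart line in embedded SVG. That is why we believe the fix belongs in the renderer.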

Environment

Aspose.HTML: 26.1.0 (NuGet)
Aspose.PDF: 26.2.0 (NuGet)
Runtime: .NET 10.0, Windows 10/11 x64
Conversion API: Aspose.Html.Converters.Converter.ConvertHTML() with PdfSaveOptions.IsTaggedPdf = true
Validation API: Aspose.Pdf.Document.Validate(path, PdfFormat.PDF_UA_1)

Request

Please update Aspose.HTML's tagged PDF renderer to wrap all decorative path operators (those generated from CSS borders, background colors, and fills) in BMC("Artifact") ... EMC marked-content sequences. This is required by PDF/UA-1 (ISO 14289-1) clause 7.1, which builds on ISO 32000-1, Section 14.8: all content in a conforming tagged PDF must be either part of the structure tree or marked as an artifact.

Windsor Solutions — February 2026

@ted-1

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): HTMLNET-6957

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.