InsertHtml formatting

Hello,
I have a problem with tables. When I insert HTML with the InsertHtml function and later open the resulting PDF document with a screen reader, thead tags are added automatically. When I then set TableStyleOptions.None, the th tag I inject through InsertHtml is ignored. What should I do so that the th tag is used only when I want it?

@kris12 Could you please describe the problem in more detail and provide simple code that will allow us to reproduce it? It would also be helpful if you attached your current and expected output documents. We will check the issue and provide you with more information.

@alexey.noskov

<table style="border-collapse: collapse; width: 89.8381%; height: 73px" border="1">
	<thead>
		<tr style="height: 18px">
			<td style="width: 20%; height: 18px">
				<h2>asdf</h2>
			</td>
			<td style="width: 14.3646%; height: 18px">
				<h2>asdf</h2>
			</td>
			<td style="width: 25.7653%; height: 18px">
				<h2>asdf</h2>
			</td>
			<td style="width: 20.0015%; height: 18px">
				<h2>asdf</h2>
			</td>
			<td style="width: 11.2551%; height: 18px">
				<h2>&nbsp;</h2>
			</td>
		</tr>
	</thead>
	<tbody>
		<tr style="height: 18px">
			<td style="width: 20%; height: 19px">asdf</td>
			<td style="width: 14.3646%; height: 19px">"</td>
			<td style="width: 25.7653%; height: 19px">
				<table style="border-collapse: collapse; width: 100%" border="1">
					<tbody>
						<tr>
							<td style="width: 50%">ddsads</td>
							<td style="width: 50%">dsadsa</td>
						</tr>
						<tr>
							<td style="width: 50%">dsadsa</td>
							<td style="width: 50%">addas</td>
						</tr>
					</tbody>
				</table>
			</td>
			<td style="width: 20.0015%; height: 19px">4</td>
			<td style="width: 11.2551%; height: 19px">
				<div>
					<div>
						<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed non nibh magna. Nullam id elementum est. Nullam eget lobortis nisi. Aenean lobortis condimentum eleifend. Nulla posuere neque laoreet lacus interdum placerat. Pellentesque hendrerit sem nec odio molestie lacinia. Aenean felis nisl, volutpat sit amet mi non, ullamcorper sagittis tortor. Aliquam sagittis massa ipsum, vel lacinia mi semper quis. Curabitur maximus, nisl a aliquam ornare, nunc justo finibus massa, sed egestas lorem magna sit amet erat. Donec velit nibh, elementum vitae nisl non, sagittis lobortis nibh. Sed erat sem, pulvinar ac tincidunt id, rhoncus id orci. Phasellus sit amet odio vel arcu scelerisque venenatis. Nunc non odio ac enim ullamcorper scelerisque sit amet ac leo. Donec et nisi sed urna cursus pulvinar ut eget massa.</p>
					</div>
				</div>
			</td>
		</tr>
		<tr style="height: 18px">
			<td style="width: 20%; height: 18px">“</td>
			<td style="width: 14.3646%; height: 18px">as</td>
			<td style="width: 25.7653%; height: 18px">ad</td>
			<td style="width: 20.0015%; height: 18px">das</td>
			<td style="width: 11.2551%; height: 18px">&amp;ldquo;</td>
		</tr>
	</tbody>
</table>

I am injecting this HTML.
Here is the code snippet:

Document document = args.Document;
DocumentBuilder builder = new DocumentBuilder(document);
builder.MoveToMergeField(args.DocumentFieldName);
builder.InsertHtml(args.FieldValue.ToString(), HtmlInsertOptions.PreserveBlocks);
foreach (Table table in document.GetChildNodes(NodeType.Table, true))
{
    table.StyleOptions = TableStyleOptions.None;
}

and this is the result:

In this case the first row should have th tags. If I don't set TableStyleOptions, I get th tags in the first cells of all rows and columns. How can I customise this formatting?

@kris12 Please use the following code:

foreach (Table table in doc.GetChildNodes(NodeType.Table, true))
{
    table.StyleOptions = TableStyleOptions.None | TableStyleOptions.FirstRow;
}

@alexey.noskov
Unfortunately, this also sets th in the other tables in the document, where it should not be set.
obraz.png (36,5 KB)

@kris12 Yes, this is expected, since the code loops over all tables in the document. Try the following code if you need to modify only the table that was just inserted:

// Get the last table inserted by DocumentBuilder
Node currentNode = builder.CurrentParagraph;
while (currentNode != null && currentNode.NodeType != NodeType.Table)
    currentNode = currentNode.PreviousSibling;
Table table = (Table)currentNode;
if (table != null)
    table.StyleOptions = TableStyleOptions.None | TableStyleOptions.FirstRow;

@alexey.noskov This code doesn't work. The result is the same as with the table's default settings. I used the document object directly before because changes made through the builder had no effect on the document.

@kris12 I have used the following code for testing:

Document doc = new Document();
DocumentBuilder builder = new DocumentBuilder(doc);
builder.InsertHtml(File.ReadAllText(@"C:\Temp\in.html"));

// Get the last table inserted by DocumentBuilder
Node currentNode = builder.CurrentParagraph;
while (currentNode != null && currentNode.NodeType != NodeType.Table)
    currentNode = currentNode.PreviousSibling;
Table table = (Table)currentNode;
if (table != null)
    table.StyleOptions = TableStyleOptions.None | TableStyleOptions.FirstRow;

doc.Save(@"C:\Temp\out.pdf", new PdfSaveOptions() { Compliance = PdfCompliance.PdfUa1 });

out.pdf (58.8 KB)

@alexey.noskov
I tried two ways (going through the builder has no effect at all).
Here is the first:

Node currentNode = builder.CurrentParagraph;
while (currentNode != null && currentNode.NodeType != NodeType.Table)
    currentNode = currentNode.PreviousSibling;
Table currentTable = (Table)currentNode;
if (currentTable != null)
{
    NodeCollection tables = document.GetChildNodes(NodeType.Table, true);
    var ctText = currentTable.GetText();
    var tableByText = tables.Where(x => x.GetText() == ctText).FirstOrDefault();
    (tableByText as Table).StyleOptions = TableStyleOptions.None | TableStyleOptions.FirstRow;
}

and this is the result:


Only the middle table changes.

And the second one:

foreach (Paragraph paragraph in document.GetChildNodes(NodeType.Paragraph, true))
{
    Node currentNode = paragraph.PreviousSibling;
    //if (currentNode == builder.CurrentParagraph)
    //{
    Table tableaCurrentNode = currentNode as Table;
    if (tableaCurrentNode != null)
    {
        if (args.FieldValue.ToString().Contains("<thead>"))
        {
            tableaCurrentNode.StyleOptions = TableStyleOptions.None | TableStyleOptions.FirstRow;
        }
        else
        {
            tableaCurrentNode.StyleOptions = TableStyleOptions.None;
        }
    }
    //}
}

In the commented-out code I tried to get the appropriate paragraph. Unfortunately, this piece of code changes the entire document, whereas only this one paragraph should be edited.
Do you have any idea how to do this without DocumentBuilder?

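@kris12 The first approach is incorrect because, instead of using currentTable, you use another table found by this code: var tableByText = tables.Where(x => x.GetText() == ctText).FirstOrDefault();. If several tables contain the same text, FirstOrDefault returns the first match in the document, which is not necessarily the table that was just inserted.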
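
One way to avoid both the text lookup and the loop over every table is to record which tables exist before calling InsertHtml and then touch only the ones that appear afterwards. The following is a minimal sketch under stated assumptions, not a confirmed fix for this case: the helper class and method names are hypothetical, a <thead> element in the HTML is assumed to mean that a header row is wanted, and a new table nested inside another new table is assumed to have no header row.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Aspose.Words;
using Aspose.Words.Tables;

// Hypothetical helper for illustration; not part of Aspose.Words.
public static class InsertedTableStyler
{
    public static void InsertHtmlAndStyleTables(DocumentBuilder builder, string html)
    {
        Document doc = builder.Document;

        // GetChildNodes returns a live collection, so copy the tables
        // that exist before the insertion into a set.
        var existing = new HashSet<Node>(
            doc.GetChildNodes(NodeType.Table, true).Cast<Node>());

        builder.InsertHtml(html, HtmlInsertOptions.PreserveBlocks);

        // Assumption: a <thead> element in the HTML means a header row is wanted.
        bool wantsHeader =
            html.IndexOf("<thead", StringComparison.OrdinalIgnoreCase) >= 0;

        // Copy to a list so the loop does not iterate the live collection.
        List<Table> all =
            doc.GetChildNodes(NodeType.Table, true).Cast<Table>().ToList();

        foreach (Table table in all)
        {
            if (existing.Contains(table))
                continue; // the table was already in the document, leave it alone

            // Assumption: a new table nested inside another new table has no header row.
            Node parentTable = table.GetAncestor(NodeType.Table);
            bool nestedInNewTable =
                parentTable != null && !existing.Contains(parentTable);

            table.StyleOptions = (wantsHeader && !nestedInNewTable)
                ? TableStyleOptions.FirstRow
                : TableStyleOptions.None;
        }
    }
}
```

In the merge handler from the earlier snippet, this would be called right after builder.MoveToMergeField(args.DocumentFieldName), passing args.FieldValue.ToString() as the HTML. Tables that were in the document before the call keep their own StyleOptions.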

Could you please create a simple application that will allow us to reproduce the problem? We will check it on our side and provide you more information.

@alexey.noskov Unfortunately, I cannot share more code. My suspicion is that the cause lies in the fact that the file is created in several places in the code. This is the part that takes the HTML from the user, and it uses the Aspose.Words library. The result is then passed on as a stream, and the actual writing is done with the Aspose.PDF library.

@kris12 I am afraid it is difficult to say what the problem is without being able to reproduce it on our side. As shown in my previous reply, the PDF produced by Aspose.Words is correct.