On HTML conversion, background color on heading paragraph is shifted to the right after empty paragraph removal

Dear all,

The code below removes empty paragraphs from the document before HTML conversion:

Private Function isEmptyParagraph(ByVal p As Words.Paragraph) As Boolean
	' A paragraph is empty if it has no child nodes at all, or if it
	' contains only Run (text) nodes whose combined text is blank.
	' If anyCount = runCount, the paragraph contains only Run nodes.
	Dim anyCount As Integer = p.GetChildNodes(Words.NodeType.Any, True).Count
	Dim runCount As Integer = p.GetChildNodes(Words.NodeType.Run, True).Count

	Return anyCount = 0 OrElse
		(runCount = anyCount AndAlso
		p.ToString(Words.SaveFormat.Text).Trim() = vbNullString)
End Function

Dim doc = New Words.Document("C:\Path\To\heading_background.docx")
Dim ps As Words.NodeCollection = doc.GetChildNodes(Words.NodeType.Paragraph, True)
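' GetChildNodes returns a live collection, so iterate over a snapshot
' (ToArray) to avoid skipping nodes while removing them.
For Each p As Words.Paragraph In ps.ToArray()
	If isEmptyParagraph(p) Then
		p.Remove()
	End If
Next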
doc.Save("C:\Path\To\heading_background.html")

As you can see in the attachment, the grey background is shifted to the right. Did I do something wrong?

We currently use Aspose.Words 18.10.

Best regards,
Monir

heading_background.zip (25.7 KB)

@monir.aittahar

You are getting the correct output according to the code example. The background color is set for the paragraph. If you remove the empty paragraph using MS Word and convert DOCX to HTML, you will get the same output.

@tahir.manzoor,

Thanks for your reply. Indeed, Word shifts the background to the right, even when the empty paragraphs aren’t removed. But when it does so, the background starts exactly at the beginning of the paragraph content.

However, Aspose.Words doesn’t shift the background if the empty paragraphs are not removed. It does shift it when they are removed, but in that case the background starts a little further to the right.

In the attachment I just added, you’ll find the following files with the obtained results:

  • heading_background.docx, the original DOCX file: the background covers the whole paragraph, including the heading number
  • heading_background_converted_by_word.htm, converted by Word as is: the background covers only the heading paragraph content
  • heading_background_converted_by_word-without_empty_ps.htm, empty paragraphs removed manually, converted by Word: the background covers only the heading paragraph content
  • heading_background-converted_by_aspose.html, converted by Aspose “as is”: the background covers the whole heading paragraph content
  • heading_background-converted_by_aspose-without_empty_ps.html, empty paragraphs removed programmatically (with the code embedded in my first post) and converted by Aspose: the background starts slightly after the “C” of “Conditions lorem ipsum”.

heading_background-detailed.zip (48.4 KB)

@monir.aittahar

We have tested the scenario and managed to reproduce the issue on our side. We have logged this problem in our issue tracking system as WORDSNET-18637 so that it can be fixed. You will be notified via this forum thread once the issue is resolved.

We apologize for the inconvenience.

@tahir.manzoor,

Thank you very much for your reply.

Best regards,
Monir

The issues you found earlier (filed as WORDSNET-18637) have been fixed in the Aspose.Words for .NET 22.9 update, which is also available on NuGet.