When converting a DOCX to HTML, a page break right before a heading breaks the heading


#1

I tried to convert a document with the code below (Aspose.Words 18.10; we cannot currently test with the latest version because we have not yet renewed the license):

' VB.NET code
Dim doc As Words.Document = New Words.Document(srcPath)
doc.Save(destPath)

This document has a page break right before a heading. In the HTML output, a line break appears inside the heading, right after the heading number:

<h2 style="margin-top:12pt; margin-left:28.8pt; margin-bottom:12pt; text-indent:-28.8pt; page-break-after:avoid; font-size:13pt"><span style="font-family:'Times New Roman'">1.1</span>
<!-- a bunch of span elements and named anchors -->
<br style="page-break-before:always; clear:both">
<!-- a bunch of span elements and named anchors -->
<span style="font-family:'Times New Roman'">ipsum</span></a>
<span style="font-family:'Times New Roman'"> </span></h2>

The DOCX sample and its HTML conversion are attached as a zip file.

Has this already been fixed in a newer version?

pagebreak-header_rendering.zip (30.1 KB)


#2

@monir.aittahar

We have tested the scenario and reproduced the issue on our side. We have logged it in our issue tracking system as WORDSNET-18511. You will be notified via this forum thread once it is resolved.

We apologize for the inconvenience.


#3

@tahir.manzoor,

I have some news about this issue. It looks like page breaks inside a run are causing this bug. For the moment, the workaround I use is reproduced below (based on your page "How to Remove Page and Section Breaks"):

' Strip page break characters from every run in the document.
For Each p As Words.Paragraph In doc.GetChildNodes(Words.NodeType.Paragraph, True)
    For Each r As Words.Run In p.Runs
        If r.GetText.Contains(Words.ControlChar.PageBreak) Then
            r.Text = r.GetText.Replace(Words.ControlChar.PageBreak, String.Empty)
        End If
    Next
Next
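
The workaround above removes every page break in the document, so pagination is lost. A possible refinement, shown only as a sketch, is to touch heading paragraphs alone and to move the break to the paragraph's "page break before" setting. The helper name MovePageBreaksOutOfHeadings is hypothetical and not part of Aspose.Words. I have not verified this against the attached sample, and it assumes the break sits inside the heading paragraph itself:

```vb
Imports Aspose.Words

Module PageBreakWorkaround
    ' Hypothetical helper (not an Aspose.Words API): for heading paragraphs only,
    ' removes page break characters from the runs and, if any were found,
    ' sets PageBreakBefore on the paragraph so the heading still starts a new page.
    Sub MovePageBreaksOutOfHeadings(doc As Document)
        For Each p As Paragraph In doc.GetChildNodes(NodeType.Paragraph, True)
            ' Leave non-heading paragraphs untouched.
            If Not p.ParagraphFormat.IsHeading Then Continue For

            Dim hadBreak As Boolean = False
            For Each r As Run In p.Runs
                If r.Text.Contains(ControlChar.PageBreak) Then
                    r.Text = r.Text.Replace(ControlChar.PageBreak, String.Empty)
                    hadBreak = True
                End If
            Next

            ' Assumption: a paragraph-level break is an acceptable substitute
            ' for the inline break that was removed.
            If hadBreak Then p.ParagraphFormat.PageBreakBefore = True
        Next
    End Sub
End Module
```

It would be called as MovePageBreaksOutOfHeadings(doc) before doc.Save(destPath).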

#4

@monir.aittahar

It is good to hear that you have found a workaround for this issue. We will inform you via this forum thread once the issue is resolved.


#5

The issue you found earlier (filed as WORDSNET-18511) has been fixed in the Aspose.Words for .NET 19.8 and Aspose.Words for Java 19.8 updates.


#6

Dear all,

Thank you, I confirm this is fixed.

Best Regards.