Word to HTML Conversion Stuck

While converting this document (see attached) from Word to HTML, the conversion appears to get stuck in an infinite loop.

We use a custom saving strategy that looks like:

// Convert to HTML using HtmlFixedSaveOptions
var options = new HtmlFixedSaveOptions
{
  CssClassNamesPrefix = "pre-",
  PageIndex = 1,
  PageCount = document.PageCount,
  ShowPageBorder = false,
  PageSavingCallback = new DataFlowPageSavingCallback(DataflowPipelineUnit),
  ResourceSavingCallback = new DataFlowResourceSavingCallback(AnotherPipelineUnit),
  ExportEmbeddedFonts = false,
  ExportEmbeddedCss = true,
  FontFormat = ExportFontFormat.Ttf
};

using (var ms = Stream.Null)
{
  document.Save(ms, options);
}

A subsequent call to document.GetPagesMetadata() reveals that there are 21 pages (which is true).
The PageSavingCallback is only called 20 times, indicating that processing is stuck for one of the pages.

After testing multiple changes, it seems that we’ve narrowed it down to the Table of Contents.
Putting an empty paragraph (hitting enter) between the section break and the Table of Contents allows the document to be converted successfully.

The working theory is that it is some weird interaction with Section Breaks and the Table of Contents.

I’ve attached a minimal version of the document that still has the issue.
Aspose Minimal Reproduction.zip (104.4 KB)

@EHailey

We converted the shared document to HtmlFixed using the latest version, Aspose.Words for .NET 19.10, and could not reproduce the issue. Please upgrade to Aspose.Words for .NET 19.10.

@tahir.manzoor
After attempting to reproduce what you may have been seeing, I believe that I have found the root cause.

This document has 2 pages (verified with document.PageCount).
The PageSavingCallback is called by Aspose 2 times (once for each page).
After calling document.Save(), document.PageCount is updated to 3 pages.

Console Output

PAGES BEFORE SAVE: 2
Saving page 1 // PageSavingCallback
Saving page 2 // PageSavingCallback
PAGES AFTER SAVE: 3

Our code checks for completion by comparing the number of saved pages to document.PageCount.
Since there are only 2 pages saved, 2 will never equal 3 and we get an infinite loop.

Why is document.PageCount set to 3 after saving when there are only 2 pages to save?
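
For illustration, here is a minimal sketch of a completion check that would not depend on the value document.PageCount reports after saving. It assumes the callback fires once per page counted before the save, which matches the console output above but is not confirmed for other documents. PageCompletionTracker and its members are hypothetical names, not part of Aspose.Words, and the numbers 2 and 3 are hard-coded from the console output above.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not an Aspose.Words type): tracks saved pages against
// a page count that was captured BEFORE document.Save() was called.
public sealed class PageCompletionTracker
{
    private readonly int _expectedPages;
    private int _savedPages;

    public PageCompletionTracker(int expectedPages)
    {
        if (expectedPages < 0)
            throw new ArgumentOutOfRangeException(nameof(expectedPages));
        _expectedPages = expectedPages;
    }

    // Would be called from the page saving callback, once per saved page.
    public void OnPageSaved() => Interlocked.Increment(ref _savedPages);

    public int SavedPages => Volatile.Read(ref _savedPages);

    // Complete once every page counted before the save has been reported.
    public bool IsComplete => SavedPages >= _expectedPages;
}

public static class Program
{
    public static void Main()
    {
        int pagesBeforeSave = 2; // stand-in for document.PageCount read before Save
        var tracker = new PageCompletionTracker(pagesBeforeSave);

        // Simulate the two callback invocations seen in the console output.
        tracker.OnPageSaved();
        tracker.OnPageSaved();

        int pagesAfterSave = 3; // value document.PageCount reported after Save

        // Comparing against the post-save count never succeeds.
        Console.WriteLine(tracker.SavedPages == pagesAfterSave); // prints "False"

        // Comparing against the pre-save snapshot does.
        Console.WriteLine(tracker.IsComplete); // prints "True"
    }
}
```

This only works around the symptom in our pipeline. It does not explain why PageCount changes during the save.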

@EHailey

Please create a standalone console application (source code without compilation errors) that reproduces the problem and attach it here for testing. We will investigate the issue on our side and provide you with more information.

Hello @tahir.manzoor,

I have added a link to a console application that demonstrates the issue.

Since there are only two pages produced, we would expect that the PageCount would be set to 2, but it is instead set to 3 after the save function is called.

Dropbox link: AsposeTest.zip
It was too big to attach.

@EHailey

Please call Document.UpdatePageLayout method as shown below to get the correct output.

using (var ms = Stream.Null)
{
    document.Save(ms, options);
}
document.UpdatePageLayout();
Console.WriteLine($"PAGES AFTER SAVE: {document.PageCount}");

@tahir.manzoor

I have applied the recommended change and while it did work for this specific document, it is not a reliable fix as it did not work for a different document.

When running the second document, the output is:
Pages before saving: 43
Pages after saving: 44
Pages after calling UpdatePageLayout: 44

@EthanHailey

Could you please attach the input Word document that shows this issue? We will investigate the issue on our side and provide you with more information.

@tahir.manzoor

I have managed to create a redacted version of the document that demonstrates the bug even when the suggested fix is applied.

Dropbox link: AsposeTestv2.zip

Edit: You will need to provide a license, since the document is 40+ pages.

@EHailey

We have tested the scenario and reproduced the issue on our side. We have logged it in our issue tracking system as WORDSNET-19395. You will be notified via this forum thread once it is resolved.

We apologize for the inconvenience.

The issue you found earlier (filed as WORDSNET-19395) has been fixed in the Aspose.Words for .NET 22.2 update, which is also available on NuGet.