Converting DOCX to PDF produces wrong numbers in HTML ordered lists

I have a simple Word document that contains two ordered lists (which were added as HTML altchunks). Each of these lists contains two list items, numbered 1. and 2. respectively.

When converting this Word document to PDF, the PDF ends up containing the wrong numbers in the second list. Instead of being numbered 1. and 2., the items in the second list are numbered 3. and 4.

It appears that Aspose Words is not resetting the number to 1 when it encounters a new ordered list. Instead, it seems to just continue the sequence of numbers as if the list items (across both lists) are all part of one big list.

Just for reference, below is the actual HTML that was embedded into the Word document for each of the two ordered lists.

<html>
  <body>
    <ol>
      <li>aaaaaaaaaaaa</li>
      <li>bbbbbbbbbbbbb</li>
    </ol>
  </body>
</html>
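
For anyone who wants to rebuild a comparable test file without Word, here is a minimal Python sketch of how two HTML altchunks can be packaged into a DOCX. It is not the tool that produced the attached sample.docx, and it has not been checked against Aspose.Words. The function name build_docx, the part names (chunk1.html, chunk2.html) and the relationship ids are illustrative assumptions; the namespaces and the aFChunk relationship type are the standard Open XML ones.

```python
import io
import zipfile

# Standard Open XML namespaces.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
XML_DECL = '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'

# The HTML from the post, collapsed onto one line.
LIST_HTML = (
    "<html><body><ol><li>aaaaaaaaaaaa</li>"
    "<li>bbbbbbbbbbbbb</li></ol></body></html>"
)

CONTENT_TYPES = (
    XML_DECL
    + f'<Types xmlns="{CT_NS}">'
    + '<Default Extension="rels" ContentType='
    + '"application/vnd.openxmlformats-package.relationships+xml"/>'
    + '<Default Extension="xml" ContentType="application/xml"/>'
    + '<Default Extension="html" ContentType="text/html"/>'
    + '<Override PartName="/word/document.xml" ContentType="application/'
    + 'vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>'
    + "</Types>"
)

ROOT_RELS = (
    XML_DECL
    + f'<Relationships xmlns="{PKG_RELS_NS}">'
    + f'<Relationship Id="rId1" Type="{R_NS}/officeDocument" '
    + 'Target="word/document.xml"/>'
    + "</Relationships>"
)


def build_docx(chunks):
    """Return the bytes of a DOCX whose body is one altChunk per HTML string."""
    rels, body = [], []
    for i in range(1, len(chunks) + 1):
        # Each altChunk points at its HTML part through an aFChunk relationship.
        rels.append(
            f'<Relationship Id="chunk{i}" Type="{R_NS}/aFChunk" '
            f'Target="chunk{i}.html"/>'
        )
        # An empty paragraph after each chunk keeps the chunks separated.
        body.append(f'<w:altChunk r:id="chunk{i}"/><w:p/>')

    document = (
        XML_DECL
        + f'<w:document xmlns:w="{W_NS}" xmlns:r="{R_NS}"><w:body>'
        + "".join(body)
        + "<w:sectPr/></w:body></w:document>"
    )
    doc_rels = (
        XML_DECL
        + f'<Relationships xmlns="{PKG_RELS_NS}">'
        + "".join(rels)
        + "</Relationships>"
    )

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        z.writestr("[Content_Types].xml", CONTENT_TYPES)
        z.writestr("_rels/.rels", ROOT_RELS)
        z.writestr("word/document.xml", document)
        z.writestr("word/_rels/document.xml.rels", doc_rels)
        for i, html in enumerate(chunks, start=1):
            z.writestr(f"word/chunk{i}.html", html)
    return buf.getvalue()
```

Calling build_docx([LIST_HTML, LIST_HTML]) and writing the returned bytes to a .docx file should give a document with two separate ordered-list altchunks, which is the shape of input described here.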

Attached is the input file (DOCX), as well as the output file (PDF) which was generated using Aspose Words v22.3.0 in a .NET 5.0 console application.

sample.docx (10.9 KB)
sample.pdf (15.5 KB)

@pkozul I have managed to reproduce your issue on my side. I have logged it as WORDSNET-23604 in our defect tracking system. We will keep you informed and let you know once it is resolved.

Hi @Konstantin.Kornilov

Thank you for logging this defect.

Just wanted to add some more information, after doing some more tests. It appears that all lists in the document (whether they are <ul> or <ol>) base themselves on the very first list.

For example, if the document contains a bulleted list (<ul>), and then an ordered list (<ol>) below it, then when converting to PDF, the second list ends up also being a bulleted list.

And the same in reverse. If the document contains an ordered list (<ol>), and then a bulleted list (<ul>) below it, then when converting to PDF, the second list ends up also being an ordered list, and its numbers continue on from the sequence of numbers in the first list.

@pkozul Thank you for the additional details. I have added your analysis to the defect description.

@pkozul Could you please attach the sample documents that demonstrate the behavior you described in your last post, so we can make sure we test your scenario properly?

Hi @alexey.noskov

Attached are the files (Word document, and converted PDF file).

You can see the Word document contains a numbered list, followed by a bulleted list.

In the converted PDF file, the first list is correct (items numbered 1. and 2.), but the second list does not contain bulleted items. Instead, its items are numbered 3. and 4.

Cheers,
Petar

mixed_lists.pdf (17.9 KB)
mixed_lists.docx (10.9 KB)

@pkozul Thank you for the additional information. The files have been added to the defect.

The issues you found earlier (filed as WORDSNET-23604) have been fixed in the Aspose.Words for .NET 22.5 update, which is also available on NuGet.

Hi again,

Thanks for the fix. Looks like there are scenarios where this issue is still not resolved (in the latest version).

I have attached an example Word document that contains a simple table with the following three HTML altchunks embedded in separate rows.

<html><body><ol><li>One</li><li>Two</li></ol></body></html>
<html><body><p>Both lists should have items numbered 1. and 2.</p></body></html>
<html><body><ol><li>One</li><li>Two</li></ol></body></html>

When converting to PDF, the ordered list in the third row ends up with items being numbered 3. and 4. instead of starting at 1. and 2.

One interesting thing I have observed is that removing the second altchunk results in the second numbered list ending up with the correct numbers, so it looks like the Aspose conversion to PDF doesn’t handle HTML list-based altchunks that are separated by simple paragraph-based HTML altchunks.
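
To state the expected result precisely, here is a small Python sketch that computes the labels each altchunk should get when every chunk is treated as its own HTML document. It only models the expected output and says nothing about how Aspose.Words works internally. The names ListLabeler and expected_labels are made up for this illustration, and the sketch ignores HTML attributes such as start.

```python
from html.parser import HTMLParser


class ListLabeler(HTMLParser):
    """Collect the label each <li> should get within one HTML chunk."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one [tag, counter] entry per currently open list
        self.labels = []  # labels in document order

    def handle_starttag(self, tag, attrs):
        if tag in ("ol", "ul"):
            # Every new list starts counting from zero again.
            self.stack.append([tag, 0])
        elif tag == "li" and self.stack:
            current = self.stack[-1]
            current[1] += 1
            self.labels.append(f"{current[1]}." if current[0] == "ol" else "•")

    def handle_endtag(self, tag):
        if tag in ("ol", "ul") and self.stack:
            self.stack.pop()


def expected_labels(chunks):
    """Label every chunk independently, with a fresh parser per chunk."""
    result = []
    for html in chunks:
        parser = ListLabeler()
        parser.feed(html)
        parser.close()
        result.append(parser.labels)
    return result


# The three altchunks from the table rows described above.
CHUNKS = [
    "<html><body><ol><li>One</li><li>Two</li></ol></body></html>",
    "<html><body><p>Both lists should have items numbered 1. and 2.</p></body></html>",
    "<html><body><ol><li>One</li><li>Two</li></ol></body></html>",
]
```

For these chunks, expected_labels(CHUNKS) gives 1. and 2. for the first row, no labels for the paragraph row, and 1. and 2. again for the third row, whereas the converted PDF shows 3. and 4. in the third row.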

Attached is the Word document, along with the converted PDF document.

numbered_lists.docx (37.5 KB)
numbered_lists.pdf (37.1 KB)

Can you please have a look?

Thanks,
Petar

@pkozul Thank you for the additional information. I have managed to reproduce the problem on my side. For the sake of correction, it has been logged as WORDSNET-23844. We will let you know once it is resolved or we have additional information for you.

The issues you found earlier (filed as WORDSNET-23844) have been fixed in the Aspose.Words for .NET 22.6 update, which is also available on NuGet.