Convert HTML to DOCX in C# .NET

Hi, I'm using the Aspose.HTML for .NET package to convert HTML output to DOCX. I have noticed the issues below. The sample files are attached.

  1. Paragraphs are not converted to free-flowing Word text; instead, the sentences are split into multiple text boxes with no apparent pattern. After conversion, the document should be editable in the same way as the sentences and paragraphs of a normal Word file.
  2. If page margins are added, the table and images are not positioned properly.
  3. The table is also not formatted as a Word table; it is built from text boxes instead.

Aspose Word samples.zip (62.0 KB)

@sathsaranim

Please note that HTML documents can have a very complex structure that cannot be represented using simple DOCX text paragraphs. That’s why we use “TextBox” elements, which allow us to preserve the document’s layout. We could implement a new type of layout that uses text paragraphs, but the original HTML document’s structure would then be lost.

Furthermore, regarding the page margins and content positioning issue, we will investigate this case further; it has been logged as HTMLNET-3652 in our issue tracking system. We will let you know as soon as the ticket is resolved. Please be patient and allow us some time.

We are sorry for the inconvenience.

Thanks. Having text boxes for paragraphs is okay as long as each paragraph is a single text box, but in the output the text is split across multiple text boxes, which makes the converted document hard to edit.
Can you please confirm whether the same behavior occurs if I use Aspose.Words, assuming that package supports HTML conversion?

@sathsaranim

We will certainly take these concerns into account while investigating the ticket, and we will let you know once we have feedback to share regarding its resolution.

For your Aspose.Words question, we request that you create a new topic in the Aspose.Words forum category, where you will be assisted appropriately.