Convert Some Pages of DOCX Word Document to HTML format using C# .NET | Split DOCX File by Pages

I want to convert only some pages of a DOCX document to HTML. Is this possible?

@ashishsinghvi,

You can meet this requirement by implementing the following workflow:

  • Split the document into pages and save all (or selected) pages to disk (or memory) as separate documents, using the ‘PageSplitter’, ‘DocumentPageSplitter’, ‘PageNumberFinder’ and ‘SectionSplitter’ classes available in the PageSplitter.cs file.
  • Merge these one-page documents into a single document.
  • Convert the final document to HTML format.
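
The steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: the PageSplitter.cs sample file has been added to the project (its classes are not part of the Aspose.Words assembly, and you may need a using directive for the sample's namespace), and its DocumentPageSplitter class offers GetDocumentOfPage(int) with 1-based page numbers, as in the published sample. The page numbers and file names are placeholders.

```csharp
using Aspose.Words;

class ConvertSelectedPagesToHtml
{
    static void Main()
    {
        Document doc = new Document(@"input.docx");

        // DocumentPageSplitter is assumed to come from the PageSplitter.cs sample file.
        DocumentPageSplitter splitter = new DocumentPageSplitter(doc);

        // Hypothetical selection: keep pages 2 and 3 (1-based in the sample).
        int[] pagesToKeep = { 2, 3 };

        Document merged = null;
        foreach (int page in pagesToKeep)
        {
            // Step 1: extract a single page as its own document.
            Document pageDoc = splitter.GetDocumentOfPage(page);

            // Step 2: merge the one-page documents into a single document.
            if (merged == null)
                merged = pageDoc;
            else
                merged.AppendDocument(pageDoc, ImportFormatMode.KeepSourceFormatting);
        }

        // Step 3: convert the merged document to (flow) HTML.
        merged.Save("selected_pages.html", SaveFormat.Html);
    }
}
```

Because the result is saved through the regular HTML exporter, options such as HtmlSaveOptions.ExportImagesAsBase64 can still be applied by passing an HtmlSaveOptions instance to Save instead of SaveFormat.Html.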

Alternatively, you can save all (or selected) pages of a Word document to HTML Fixed format with the following code:

// Load the source document
Document doc = new Document(@"input.docx");
HtmlFixedSaveOptions options = new HtmlFixedSaveOptions();
// Export one page per output file
options.PageCount = 1;
// Save the first four pages; PageIndex is zero-based
for (int pageIndex = 0; pageIndex < 4; pageIndex++)
{
    options.PageIndex = pageIndex;
    doc.Save("out_" + pageIndex + ".html", options);
}

Aspose.Words.Saving.HtmlSaveOptions options = new Aspose.Words.Saving.HtmlSaveOptions(Aspose.Words.SaveFormat.Html);
options.ExportImagesAsBase64 = true;

I am using this option to export images as Base64. Is there a way to do the same with HtmlFixedSaveOptions?

@ashishsinghvi,

Please use the HtmlFixedSaveOptions.ExportEmbeddedImages property to embed images into the HTML document in Base64 format.
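
For illustration, here is a minimal sketch that combines this property with the page-by-page export shown earlier in the thread. It assumes the same Aspose.Words version as that code (one that exposes PageIndex and PageCount on HtmlFixedSaveOptions); newer releases may select pages differently, for example through a PageSet property, so check the API reference for your version. The file names and the four-page limit are placeholders.

```csharp
using System;
using Aspose.Words;
using Aspose.Words.Saving;

class ExportPagesWithEmbeddedImages
{
    static void Main()
    {
        Document doc = new Document(@"input.docx");

        HtmlFixedSaveOptions options = new HtmlFixedSaveOptions();
        // Embed images into the HTML output in Base64 format instead of writing separate files.
        options.ExportEmbeddedImages = true;
        // Export one page per output file.
        options.PageCount = 1;

        // Do not request more pages than the document actually has.
        int pagesToExport = Math.Min(4, doc.PageCount);
        for (int pageIndex = 0; pageIndex < pagesToExport; pageIndex++)
        {
            options.PageIndex = pageIndex; // zero-based
            doc.Save("out_" + pageIndex + ".html", options);
        }
    }
}
```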