Extract the HTML from a range of pages in a Word document

raghuureddy · June 23, 2014, 4:19pm

Hi,

Is it possible to extract the HTML content of a particular page or page range from a Word document?

If so, could you please share some sample code? I searched but could not find any. Thanks in advance.

Regards,

Raghu

tahir.manzoor · June 24, 2014, 8:36am

Hi Raghu,

Thanks for your inquiry. Sure, you can achieve this using the “PageSplitter” example project. You can find the PageSplitter project in the Aspose.Words for .NET examples repository on GitHub. Please let us know if we can be of any further assistance.

If you still face any issues, please share your input document and the expected output document with us. We will then provide more information along with code.

Document doc = new Document(docName);

// Create and attach the collector to the document before the page layout is built.
LayoutCollector layoutCollector = new LayoutCollector(doc);

// This builds the layout model and collects the necessary information.
doc.UpdatePageLayout();

// Split the nodes in the document into separate pages.
// DocumentPageSplitter is a class from the PageSplitter example project.
DocumentPageSplitter splitter = new DocumentPageSplitter(layoutCollector);

// Extract pages 3 to 5 into a new document and save it.
Document newDoc = splitter.GetDocumentOfPageRange(3, 5);
newDoc.Save(MyDir + "Out.docx");
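
The snippet above saves the extracted pages as a .docx file. Since the question asks for HTML, here is a minimal sketch of how the same page range might be returned as an HTML string instead. It assumes the DocumentPageSplitter class from the PageSplitter example project has been added to your project. The class name PageRangeHtml and the method name Extract are hypothetical names chosen for illustration, and embedding images as Base64 is just one possible choice.

```csharp
using System.IO;
using System.Text;
using Aspose.Words;
using Aspose.Words.Layout;
using Aspose.Words.Saving;

// Hypothetical helper, not part of Aspose.Words or the PageSplitter example.
public static class PageRangeHtml
{
    // Returns the HTML of the given page range as a string.
    // The page numbers are passed straight to GetDocumentOfPageRange,
    // exactly as in the snippet above.
    public static string Extract(string docName, int startPage, int endPage)
    {
        Document doc = new Document(docName);

        // Attach the collector before the page layout is built.
        LayoutCollector layoutCollector = new LayoutCollector(doc);
        doc.UpdatePageLayout();

        // DocumentPageSplitter comes from the PageSplitter example project (assumed to be available).
        DocumentPageSplitter splitter = new DocumentPageSplitter(layoutCollector);
        Document pages = splitter.GetDocumentOfPageRange(startPage, endPage);

        HtmlSaveOptions options = new HtmlSaveOptions(SaveFormat.Html);
        // Assumption: keep images inline so the HTML string stands on its own.
        options.ExportImagesAsBase64 = true;
        // UTF-8 without a byte order mark, so the string has no leading BOM character.
        options.Encoding = new UTF8Encoding(false);

        using (MemoryStream stream = new MemoryStream())
        {
            pages.Save(stream, options);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }
}
```

For example, string html = PageRangeHtml.Extract(docName, 3, 5); would return the HTML for the same pages as the snippet above. If you only need a file on disk, calling newDoc.Save(MyDir + "Out.html", SaveFormat.Html) on the extracted document should be enough.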