Free Support Forum - aspose.com

Extracting Header and Footer Text as HTML

Hi,

I want to extract the header and footer as HTML. I am already able to extract them as plain text using the Range.Text property. Aspose.Words' Document.Save method can write the document, converted to HTML, to a memory stream, which we can later use as a string in our applications.

Is there a way to extract a specific part of a document as HTML?

thanks

Hi

Thank you for your interest in Aspose products. Unfortunately, you can’t extract a specific document section as HTML. However, you can convert the whole document to HTML and then parse it.

Best regards.
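For readers who want to try the "convert the whole document, then parse" route, here is a minimal sketch of the conversion step. The class and method names (HtmlExport, DocumentToHtml) are made up for illustration, and the code assumes the HTML is written as UTF-8; parsing the resulting string is left to whatever HTML parser you prefer.

```csharp
using System.IO;
using System.Text;
using Aspose.Words;

static class HtmlExport
{
    // Hypothetical helper: converts the whole document to a single HTML string.
    public static string DocumentToHtml(string docPath)
    {
        Document doc = new Document(docPath);
        using (MemoryStream ms = new MemoryStream())
        {
            // Write the document to memory as HTML.
            doc.Save(ms, SaveFormat.Html);
            // Assumption: the HTML was written as UTF-8.
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }
}
```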

hi,

I solved it with a workaround: I am now able to extract the header, footer, and main content as HTML strings.

Here's the code for that:

// Requires: using System.IO; using System.Text; using Aspose.Words;
public void asposeReturnHtmlFromDoc(string docPath, ref string mainContent,
    ref string header, ref string footer)
{
    // Open three copies of the document, one for each part to extract
    Document myDoc = new Document(docPath);
    Document myDocHeader = new Document(docPath);
    Document myDocFooter = new Document(docPath);

    // Main content copy: remove the headers and footers of the first section
    myDoc.FirstSection.ClearHeadersFooters();

    // Header copy: remove all body content
    foreach (Section sec in myDocHeader.Sections)
    {
        sec.Body.RemoveAllChildren();
    }
    // Header copy: empty every footer in the first section
    foreach (HeaderFooter myFooter in myDocHeader.FirstSection.HeadersFooters)
    {
        if (!myFooter.IsHeader)
        {
            myFooter.RemoveAllChildren();
        }
    }

    // Footer copy: remove all body content
    foreach (Section sec in myDocFooter.Sections)
    {
        sec.Body.RemoveAllChildren();
    }
    // Footer copy: empty every header in the first section
    foreach (HeaderFooter myHeader in myDocFooter.FirstSection.HeadersFooters)
    {
        if (myHeader.IsHeader)
        {
            myHeader.RemoveAllChildren();
        }
    }

    // Memory streams that receive the HTML for each part
    using (MemoryStream msOut = new MemoryStream())
    using (MemoryStream msHeadOut = new MemoryStream())
    using (MemoryStream msFooterOut = new MemoryStream())
    {
        // Save each copy to its stream as HTML
        myDoc.Save(msOut, SaveFormat.Html);
        myDocHeader.Save(msHeadOut, SaveFormat.Html);
        myDocFooter.Save(msFooterOut, SaveFormat.Html);

        // Decode the HTML; ToArray() returns only the bytes actually written
        Encoding enc = Encoding.UTF8;
        mainContent = enc.GetString(msOut.ToArray());
        header = enc.GetString(msHeadOut.ToArray());
        footer = enc.GetString(msFooterOut.ToArray());
    }
}

I think this will only work for a document with a single header or footer.

thanks
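When a MemoryStream is decoded into a string, as in the method above, it matters which bytes are read. MemoryStream.GetBuffer() returns the stream's whole internal buffer, which can be larger than the data written to it, while ToArray() returns only the written bytes. The sketch below uses only System.IO and System.Text; the class name BufferDemo and the sample HTML are invented for illustration.

```csharp
using System;
using System.IO;
using System.Text;

class BufferDemo
{
    static void Main()
    {
        using (MemoryStream ms = new MemoryStream())
        {
            byte[] html = Encoding.UTF8.GetBytes("<p>header</p>");
            ms.Write(html, 0, html.Length);

            // GetBuffer() exposes the internal buffer, including any unused capacity,
            // so the decoded string may end with '\0' padding characters.
            string fromBuffer = Encoding.UTF8.GetString(ms.GetBuffer());

            // ToArray() copies only the bytes that were actually written.
            string fromArray = Encoding.UTF8.GetString(ms.ToArray());

            // Equivalent without the copy: limit the read to ms.Length.
            string fromRange = Encoding.UTF8.GetString(ms.GetBuffer(), 0, (int)ms.Length);

            Console.WriteLine(fromBuffer.Length >= fromArray.Length);
            Console.WriteLine(fromArray == fromRange);
            Console.WriteLine(fromArray);
        }
    }
}
```

If the extracted HTML strings ever show trailing null characters, this is the first thing to check.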

Hi

You can also try using NodeImporter to import headers and footers. See the following code.

// Open the source document
Document doc = new Document(@"306_100188_medtrans\in.doc");

// Create a new, blank document to receive the header
Document header = new Document();

// Create a node importer that copies nodes from doc into header
NodeImporter importer = new NodeImporter(doc, header, ImportFormatMode.KeepSourceFormatting);

// Import the primary header from the source document, if it has one
HeaderFooter srcHeader = doc.FirstSection.HeadersFooters[HeaderFooterType.HeaderPrimary];
if (srcHeader != null)
{
    header.FirstSection.AppendChild(importer.ImportNode(srcHeader, true));
}

// Save the header as HTML
header.Save(@"306_100188_medtrans\out.html", SaveFormat.Html);

I hope that this will help you.

Best regards.
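The same NodeImporter idea can be wrapped so that it works for any header or footer type and returns a string instead of writing a file. This is only a sketch: the class and method names (HeaderFooterHtml, Extract) are hypothetical, it looks at the first section only, and it assumes the HTML is written as UTF-8.

```csharp
using System.IO;
using System.Text;
using Aspose.Words;

static class HeaderFooterHtml
{
    // Hypothetical helper: returns the HTML for one header or footer of the
    // first section, or null when the document has no header/footer of that type.
    public static string Extract(Document source, HeaderFooterType type)
    {
        HeaderFooter part = source.FirstSection.HeadersFooters[type];
        if (part == null)
        {
            return null;
        }

        // Blank target document that receives the imported node.
        Document target = new Document();
        NodeImporter importer = new NodeImporter(source, target, ImportFormatMode.KeepSourceFormatting);
        target.FirstSection.AppendChild(importer.ImportNode(part, true));

        using (MemoryStream ms = new MemoryStream())
        {
            target.Save(ms, SaveFormat.Html);
            // Assumption: the HTML was written as UTF-8.
            return Encoding.UTF8.GetString(ms.ToArray());
        }
    }
}
```

For example, HeaderFooterHtml.Extract(doc, HeaderFooterType.FooterPrimary) would give the primary footer, with doc being an already loaded Document.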