Converting Document to HTML

Hi,


As per one of my customer's requirements, I need to convert a Word document into HTML with inline SVG. But with Aspose.Words for .NET I am not able to do this at the page level, as there is no API for that. Am I missing something?


Hi Prasanna,


Thanks for your inquiry. You can use the following code to meet this requirement.

Document doc = new Document(MyDir + @"input.docx");

// Save one page per output file
HtmlFixedSaveOptions opts = new HtmlFixedSaveOptions();
opts.PageCount = 1;

for (int i = 0; i < doc.PageCount; i++)
{
    // Zero-based index of the page to save
    opts.PageIndex = i;
    doc.Save(MyDir + @"16.2.0_" + i + ".html", opts);
}


Hope this helps.

Best regards,

Hi Awais,


Thanks for the reply. I tried your code and I am able to get the desired output. Would it be possible to get each page as a single file with the SVG embedded inline?

A few PDFs fail to parse and throw an exception. Can you try the attached PDF?

Also, is there a solution for PDF similar to the one for Aspose.Words for .NET?


Hi Prasanna,


Thanks for your inquiry. As I understand it, you want to export each page of “Deforestation.pdf” to an individual HTML file. I am moving your thread to the Aspose.Pdf forum, where you’ll be guided appropriately.

Best regards,

Hi Prasanna,


Thanks for contacting support.

To accomplish this, please split the PDF file into individual single-page documents and then convert each of them to HTML with all resources embedded. Please take a look at the following code snippet.

[C#]

// Load source PDF file
Document doc = new Document("c:/pdftest/Deforestation.pdf");

foreach (Page page in doc.Pages)
{
    // Create a new single-page document from the current page
    Document tempdoc = new Document();
    tempdoc.Pages.Add(page);

    // Instantiate HTML Save options object
    HtmlSaveOptions newOptions = new HtmlSaveOptions();

    // Enable option to embed all resources inside the HTML
    newOptions.PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml;

    // This is just an optimization for IE and can be omitted
    newOptions.LettersPositioningMethod = HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss;

    newOptions.RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground;
    newOptions.FontSavingMode = HtmlSaveOptions.FontSavingModes.SaveInAllFormats;

    // Output file path
    string outHtmlFile = @"c:\pdftest\Deforestation_Page" + page.Number + ".html";
    tempdoc.Save(outHtmlFile, newOptions);
}