Converting Document to HTML

Hi,


As per one of my customer's requirements, I need to convert a Word document into HTML with inline SVG. But with Aspose.Words for .NET I am not able to do this at the page level, as there is no API for that. Am I missing something?


Hi Prasanna,

Thanks for your inquiry. You can use the following code to meet this requirement. It saves each page of the document to a separate fixed-layout HTML file.

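[C#]

// Requires the Aspose.Words and Aspose.Words.Saving namespaces.
// MyDir is the folder that holds the input and output files.
Document doc = new Document(MyDir + @"input.docx");

// Export one page per Save call
HtmlFixedSaveOptions opts = new HtmlFixedSaveOptions
{
    PageCount = 1
};

for (int i = 0; i < doc.PageCount; i++)
{
    // Zero-based index of the page to export
    opts.PageIndex = i;
    doc.Save(MyDir + $"16.2.0_{i}.html", opts);
}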

Hope this helps.

Best regards,

Hi Awais,


Thanks for the reply. I tried your code and I am able to get the desired output. Would it be possible to get each page as a single file with the SVG embedded inline?

A few PDFs fail to parse and throw an exception. Can you try the attached PDF?

Also, is there a solution for PDF like the one for Aspose.Words for .NET?


Hi Prasanna,


Thanks for your inquiry. As I understand it, you want to export each page of “Deforestation.pdf” to an individual HTML file. I am moving your thread to the Aspose.Pdf forum, where you’ll be guided appropriately.

Best regards,

Hi Prasanna,

Thanks for contacting support.

To accomplish your requirement, please split the PDF file into individual single-page documents and then convert each one to HTML with all resources embedded. Please take a look at the following code snippet.

[C#]

// Requires the Aspose.Pdf namespace (Document, Page, HtmlSaveOptions)
// Load the source PDF
Document doc = new Document("c:/pdftest/Deforestation.pdf");

foreach (Page page in doc.Pages)
{
    // Create a temporary document for each page
    Document tempDoc = new Document();
    tempDoc.Pages.Add(page);

    // Instantiate HTML Save options object
    HtmlSaveOptions newOptions = new HtmlSaveOptions
    {
        // Enable option to embed all resources inside the HTML
        PartsEmbeddingMode = HtmlSaveOptions.PartsEmbeddingModes.EmbedAllIntoHtml,

        // Optimization for IE (optional)
        LettersPositioningMethod = HtmlSaveOptions.LettersPositioningMethods.UseEmUnitsAndCompensationOfRoundingErrorsInCss,

        // Embed raster images as parts of the PNG page background
        RasterImagesSavingMode = HtmlSaveOptions.RasterImagesSavingModes.AsEmbeddedPartsOfPngPageBackground,

        // Save fonts in all formats
        FontSavingMode = HtmlSaveOptions.FontSavingModes.SaveInAllFormats
    };

    // Output file path for each page
    string outHtmlFile = @"c:\pdftest\Deforestation_Page" + page.Number + ".html";

    // Save the page as HTML
    tempDoc.Save(outHtmlFile, newOptions);
}