About HTML to Word and PowerPoint Conversion Capabilities

Hi,

We are currently exploring solutions for converting rich HTML content (including custom bullet points, table styles, images, and layouts) into a Word document that includes a cover page. Our current library requires manual XML modifications, which is cumbersome. Can Aspose help streamline this process? Which product would you recommend?

Additionally, we will soon need to convert HTML content into PowerPoint presentations, maintaining the layout style. Does Aspose offer a solution for this as well?

Thank you for your assistance.

Best regards,

@d.ams You can use Aspose.Words to convert HTML to MS Word. Please see our documentation for more information:
https://docs.aspose.com/words/net/convert-a-document/

Code for conversion is pretty simple:

Document doc = new Document(@"C:\Temp\in.html");
doc.Save(@"C:\Temp\out.docx");

You can also use the DocumentBuilder.InsertHtml method to insert HTML into an existing document.

You can use Aspose.Slides to work with PowerPoint presentations.
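As a rough sketch of the PowerPoint side, assuming Aspose.Slides for .NET is referenced: the slide collection exposes an AddFromHtml method that builds slides from HTML text. The class name, method name, and output path below are placeholders chosen for illustration, and how closely the result follows the HTML layout depends on the content.

```csharp
using Aspose.Slides;
using Aspose.Slides.Export;

public static class HtmlToSlidesSketch
{
    // Hypothetical helper: imports an HTML string into a new presentation
    // and saves it as PPTX at the given path.
    public static void Convert(string html, string outputPath)
    {
        using (Presentation presentation = new Presentation())
        {
            // Adds one or more slides generated from the HTML text.
            presentation.Slides.AddFromHtml(html);
            presentation.Save(outputPath, SaveFormat.Pptx);
        }
    }
}
```

A new Presentation starts with one empty slide, so the imported slides are appended after it.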

Awesome! Thank you very much. I will come back to you if we have other questions.
Best regards,

After a talk with the product team, we have more specific questions.

Currently, we use Pandoc to convert HTML to DOCX. It’s working well. However, we’re struggling to apply specific Word templates with styles.

Our customers need to export the HTML content to their docx template with:

  • a cover page, header, footer,
  • table of contents,
  • paragraph styles, bullet styles, table styles
  • positioned and sized images

We used to concatenate the DOCX converted from HTML with the customer template, but the styles (bullet points and tables) aren’t applied correctly.

So we’re looking for other solutions, like Aspose, and we have several questions:

  • Is it possible to add HTML content to an existing Word template (or Word document)?
  • Does the inserted and converted HTML content correctly apply the template’s default styles for Title, Paragraph, List, Table, Character, Section and Footnote styles?
  • Is it possible, with this HTML content insertion, to maintain a cover page, a table of contents and the template’s headers and footers?
  • Can images also be inserted using this method, while maintaining the sizes, alignments and proportions defined in the HTML?
  • Is it possible to add CSS to the HTML for formatting adjustments, and have it taken into account in the conversion, while still applying the template’s default styles?
  • Which licence do you recommend for on-premises (private cloud) deployment with our customers?
  • What’s your advice: convert the HTML to DOCX first and then apply styles, or is there a better practice?
  • Is your solution “plug-and-play”?

This is very urgent for us. If you could get back to me soon, it would be much appreciated.

Thank you

Best regards,

@d.ams

Yes, you can add HTML content to an existing Word document. For example, you can insert a bookmark in your document where the HTML content should go and then use the DocumentBuilder.InsertHtml method:

Document doc = new Document(@"C:\Temp\in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
// Move to bookmark.
builder.MoveToBookmark("InsertHtmlHere");
// Insert HTML
builder.InsertHtml("<b>This is my cool <i>HTML</i></b>");
doc.Save(@"C:\Temp\out.docx");

You can control how the HTML is inserted. For example, you can specify the HtmlInsertOptions.UseBuilderFormatting option. In this case, the font and paragraph formatting specified in DocumentBuilder is used as the base formatting for text inserted from HTML.
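A minimal sketch of that option, assuming Aspose.Words for .NET is referenced. The helper name, the Calibri font settings, and the fallback to the end of the document are illustrative assumptions, not part of your template.

```csharp
using Aspose.Words;

public static class InsertHtmlWithBuilderFormatting
{
    // Hypothetical helper: inserts HTML at a bookmark (or at the end of the
    // document if the bookmark is missing), using the builder's formatting
    // as the base formatting for the inserted text.
    public static void Insert(Document doc, string bookmarkName, string html)
    {
        DocumentBuilder builder = new DocumentBuilder(doc);

        // MoveToBookmark returns false when the bookmark does not exist.
        if (!builder.MoveToBookmark(bookmarkName))
            builder.MoveToDocumentEnd();

        // Assumed base formatting; replace with your template's values.
        builder.Font.Name = "Calibri";
        builder.Font.Size = 11;

        builder.InsertHtml(html, HtmlInsertOptions.UseBuilderFormatting);
    }
}
```

For example, InsertHtmlWithBuilderFormatting.Insert(doc, "InsertHtmlHere", html) followed by doc.Save(...). Formatting set explicitly in the HTML still overrides the builder's base formatting.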

Sure, see the example provided in the answer to the first question.
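For the table of contents specifically, a sketch under the assumption that the template already contains a TOC field: after inserting the HTML, calling Document.UpdateFields refreshes the fields so that newly imported headings can be picked up. The helper name below is hypothetical, and the bookmark name is the one from the earlier example.

```csharp
using Aspose.Words;

public static class InsertHtmlAndRefreshToc
{
    // Hypothetical helper: inserts HTML into a template document and then
    // updates fields (including any table of contents) in the document.
    public static void InsertAndRefresh(Document doc, string html)
    {
        DocumentBuilder builder = new DocumentBuilder(doc);

        // Insert at the bookmark if present, otherwise at the end.
        if (!builder.MoveToBookmark("InsertHtmlHere"))
            builder.MoveToDocumentEnd();

        builder.InsertHtml(html);

        // Rebuilds field results so the TOC reflects the inserted headings.
        doc.UpdateFields();
    }
}
```

The cover page, headers and footers are untouched because the HTML is inserted into the body at the chosen position.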

Generally yes. Note, however, that the HTML and MS Word document object models are quite different, so it is not always possible to achieve 100% fidelity when converting from one format to the other. In most cases, Aspose.Words mimics MS Word behavior when working with HTML.

The same answer as above.

It would be better to contact our sales team in the Aspose.Purchase forum. My colleagues from the sales team will help you select the right license for your needs.

Aspose.Words is a class library. It does not require any special configuration or additional software to work. So, generally, yes, you can consider Aspose.Words a “plug-and-play” solution.

Thank you for these replies! That’s great; we’re currently testing based on them.

One more question: do you handle layouts such as “columns”? For example, I want my text on the left and an image on the right. In HTML, it’s easy to organize this type of layout, but is it possible to convert this kind of layout to a DOCX file?

I’m part of the product team with “d.ams”.

I need more information about styling when adding HTML to a document.
When I insert HTML with the builder, I expect the HTML to take on the builder’s styles. However, tables don’t use the builder’s default style; they use a bland default style.

PS: I used HtmlInsertOptions.UseBuilderFormatting

@samthink

You can use a table with two columns to achieve this.
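One way to sketch such a layout with DocumentBuilder, assuming Aspose.Words for .NET is referenced: a single-row table with the text in the left cell and the image in the right cell, with borders cleared so it reads as a layout rather than a data table. The helper name and the image path parameter are placeholders.

```csharp
using Aspose.Words;
using Aspose.Words.Tables;

public static class TwoColumnLayoutSketch
{
    // Hypothetical helper: builds a one-row, two-cell table with text on
    // the left and an image on the right, then removes the table borders.
    public static Table InsertTextAndImage(DocumentBuilder builder, string text, string imagePath)
    {
        Table table = builder.StartTable();

        // Left cell: text.
        builder.InsertCell();
        builder.Write(text);

        // Right cell: image loaded from a file.
        builder.InsertCell();
        builder.InsertImage(imagePath);

        builder.EndRow();
        builder.EndTable();

        // Hide the borders so the table acts purely as a layout grid.
        table.ClearBorders();
        return table;
    }
}
```

The same idea applies on the HTML side: authoring the layout as a two-column table in the HTML, rather than with CSS columns or floats, gives the importer a structure that maps directly onto a Word table.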

Could you please provide your sample HTML, template, output, and expected output documents here for our reference? As I mentioned earlier, Aspose.Words is designed to work with MS Word documents. The HTML and MS Word document object models are quite different, so it is not always possible to achieve 100% fidelity when converting from one format to the other. In most cases, Aspose.Words mimics MS Word behavior when working with HTML.

Hi,

This is the output after inserting the HTML fragment into the DOCX. As you can see, the tables don’t have the style of the original document:

1879f215-1513-46a3-bc04-b56874d1cf23(1).docx (131,8 Ko)

This is the main document:

Template Dossier de compétences référence(2).docx (39,3 Ko)

And finally the inserted html:

<html>
  <h1>
    Fiche de poste :
    <span
      data-id="018c68d6-d807-735b-a79a-8bad0ca16f08"
      data-icon="fa-tag"
      data-color="6f3bc4ff"
      data-name="Titre du poste"
      class="attribute"
      >Customer Success Agent</span
    >
  </h1>
  <p>## Poste: Customer Success Agent</p>
  <h3>Compétences Requises</h3>
  <table>
    <thead>
      <tr>
        <th>
          <strong>Compétence</strong>
        </th>
        <th>
          <strong>Niveau</strong>
        </th>
        <th>
          <strong>Description</strong>
        </th>
        <th>
          <strong>Exemple</strong>
        </th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td>
          <em>Gestion de la relation client</em>
        </td>
        <td>
          <strong>Avancé</strong>
        </td>
        <td>Capacité à maintenir des relations positives avec les clients.</td>
        <td>Résolution des soucis clients via téléphone.</td>
      </tr>
      <tr>
        <td>
          <strong>Communication</strong>
        </td>
        <td>Avancé</td>
        <td>Excellentes capacités de communication orale et écrite.</td>
        <td>Répondre aux emails clients de manière claire.</td>
      </tr>
      <tr>
        <td>
          <em>Analyse de données</em>
        </td>
        <td>Moyenne</td>
        <td>Capacité à interpréter et analyser les données clients.</td>
        <td>Analyser les feedbacks clients pour améliorer le service.</td>
      </tr>
      <tr>
        <td>
          <strong>Connaissance du produit</strong>
        </td>
        <td>Avancé</td>
        <td>Bonne compréhension des produits et services de l’entreprise.</td>
        <td>Formation des nouveaux clients sur le produit.</td>
      </tr>
    </tbody>
  </table>
  <h3>Missions</h3>
  <table>
    <thead>
      <tr>
        <th>
          <strong>Mission</strong>
        </th>
        <th>
          <strong>Tâche</strong>
        </th>
        <th>
          <strong>Fréquence</strong>
        </th>
        <th>
          <strong>Objectif</strong>
        </th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td>
          <em>Assister les clients</em>
        </td>
        <td>Répondre aux questions et résoudre les problèmes.</td>
        <td>Quotidienne</td>
        <td>Garantir la satisfaction client.</td>
      </tr>
      <tr>
        <td>
          <strong>Suivi des comptes clients</strong>
        </td>
        <td>Surveiller l’utilisation des produits par les clients.</td>
        <td>Hebdomadaire</td>
        <td>Assurer une utilisation optimale.</td>
      </tr>
      <tr>
        <td>
          <em>Fournir des formations</em>
        </td>
        <td>Développer et dispenser des formations sur le produit.</td>
        <td>Mensuelle</td>
        <td>Faciliter l&#39;auto-suffisance des clients.</td>
      </tr>
      <tr>
        <td>
          <strong>Collecte de feedbacks</strong>
        </td>
        <td>Recueillir et analyser les retours clients.</td>
        <td>Permanente</td>
        <td>Améliorer les offres produits.</td>
      </tr>
    </tbody>
  </table>
  <h3>Qualités Requises</h3>
  <ul>
    <li>
      <strong>Empathie</strong>
    </li>
    <li>
      <em>Patience</em>
    </li>
    <li>
      <strong>Capacité d&#39;adaptation</strong>
    </li>
    <li>
      <em>Polyvalence</em>
    </li>
    <li>
      <strong>Rigueur</strong>
    </li>
  </ul>
  <h3>Soft Skills Souhaités</h3>
  <ol>
    <li>
      <strong>Esprit d&#39;équipe</strong> : Capacité à travailler efficacement
      avec les autres membres de l&#39;équipe.
    </li>
    <li>
      <em>Résilience</em> : Capacité à gérer le stress et à surmonter les
      échecs.
    </li>
    <li>
      <strong>Gestion du temps</strong> : Efficacité dans la priorisation et la
      gestion des tâches multiples.
    </li>
    <li>
      <em>Proactivité</em> : Habileté à anticiper les problèmes et à proposer
      des solutions.
    </li>
  </ol>
  <p>
    Ces éléments donnent un aperçu structuré et détaillé des attentes et besoins
    pour le poste de Customer Success Agent avec un niveau d&#39;expérience
    confirmé de 3 à 5 ans.
  </p>
  <h1>Titre 1</h1>
  <h2>Titre 2</h2>
  <h3>Titre 3</h3>
  <h4>Titre 4</h4>
  <h5>Titre 5</h5>
  <h6>Titre 6</h6>
  <p>
    <strong>Texte en gras</strong>
  </p>
  <p>
    <em>Texte en italique</em>
  </p>
  <p>
    <strong>
      <em>Texte en gras et italique</em>
    </strong>
  </p>
  <p>Bullet liste manuelle :</p>
  <ul>
    <li>
      <p>Bullet 1</p>
    </li>
    <li>
      <p>Bullet 2</p>
    </li>
  </ul>
  <p>Liste numérotée manuelle :</p>
  <ol>
    <li>
      <p>Point 1</p>
    </li>
    <li>
      <p>Point 2</p>
    </li>
  </ol>
  <p>Tableau hors IA</p>
  <table style="minwidth: 508px">
    <colgroup>
      <col />
      <col style="width: 178px" />
      <col style="width: 280px" />
      <col />
    </colgroup>
    <tbody>
      <tr>
        <th colspan="1" rowspan="1">
          <p>Colonne étroite</p>
        </th>
        <th colspan="1" rowspan="1" colwidth="178">
          <p>Colonne moyenne</p>
        </th>
        <th colspan="1" rowspan="1" colwidth="280">
          <p>Colonne large</p>
        </th>
        <th colspan="1" rowspan="1">
          <p>Colonne étroite</p>
        </th>
      </tr>
      <tr>
        <td colspan="1" rowspan="1">
          <div custom-style="">
            <img
              src="https://api.v3.dev.thinkeo.dev/attachments/945fec6b-525f-4e27-9a76-b6df0ec3d6d1"
              width="100%"
            />
          </div>
        </td>
        <td colspan="1" rowspan="1" colwidth="178"></td>
        <td colspan="1" rowspan="1" colwidth="280"></td>
        <td colspan="1" rowspan="1"></td>
      </tr>
      <tr>
        <td colspan="1" rowspan="1"></td>
        <td colspan="1" rowspan="1" colwidth="178">
          <p>Test</p>
        </td>
        <td colspan="1" rowspan="1" colwidth="280"></td>
        <td colspan="1" rowspan="1"></td>
      </tr>
      <tr>
        <td colspan="1" rowspan="1"></td>
        <td colspan="1" rowspan="1" colwidth="178"></td>
        <td colspan="1" rowspan="1" colwidth="280">
          <div custom-style="">
            <img
              src="https://api.v3.dev.thinkeo.dev/attachments/efc74e89-a8fe-44bc-9e26-ae497d32640e"
              width="100%"
            />
          </div>
        </td>
        <td colspan="1" rowspan="1"></td>
      </tr>
      <tr>
        <td colspan="1" rowspan="1"></td>
        <td colspan="1" rowspan="1" colwidth="178"></td>
        <td colspan="1" rowspan="1" colwidth="280"></td>
        <td colspan="1" rowspan="1"></td>
      </tr>
    </tbody>
  </table>
  <p>Tableau sans images mais colonnes ajustées</p>
  <table style="minwidth: 623px">
    <colgroup>
      <col style="width: 126px" />
      <col style="width: 472px" />
      <col />
    </colgroup>
    <tbody>
      <tr>
        <th colspan="1" rowspan="1" colwidth="126">
          <p>Colonne étroite</p>
        </th>
        <th colspan="1" rowspan="1" colwidth="472">
          <p>Colonne très large</p>
        </th>
        <th colspan="1" rowspan="1">
          <p>étroite</p>
        </th>
      </tr>
      <tr>
        <td colspan="1" rowspan="1" colwidth="126"></td>
        <td colspan="1" rowspan="1" colwidth="472">
          <p>
            Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus
            semper nisi felis, sed efficitur turpis cursus id.
          </p>
        </td>
        <td colspan="1" rowspan="1"></td>
      </tr>
      <tr>
        <td colspan="1" rowspan="1" colwidth="126"></td>
        <td colspan="1" rowspan="1" colwidth="472"></td>
        <td colspan="1" rowspan="1"></td>
      </tr>
    </tbody>
  </table>
  <p>Alignement droite 80%</p>
  <div style="text-align: right" custom-style="AlignRight">
    <img
      style="text-align: right"
      src="https://api.v3.dev.thinkeo.dev/attachments/945fec6b-525f-4e27-9a76-b6df0ec3d6d1"
      width="80%"
    />
  </div>
  <p>Alignement centre 60%</p>
  <div style="text-align: center" custom-style="AlignCenter">
    <img
      style="text-align: center"
      src="https://api.v3.dev.thinkeo.dev/attachments/beed1c93-ac92-4964-8cc3-5190c37df370"
      width="60%"
    />
  </div>
  <p>alignement gauche 20%</p>
  <div custom-style="">
    <img
      src="https://api.v3.dev.thinkeo.dev/attachments/45df78cb-d6da-462a-b211-387b97ac44ca"
      width="20%"
    />
  </div>
</html>

Thanks for your time

@samthink You can apply the table style after inserting the HTML. For example, see the following code:

Document doc = new Document(@"C:\Temp\in.docx");
DocumentBuilder builder = new DocumentBuilder(doc);
builder.MoveToDocumentEnd();
builder.InsertHtml(File.ReadAllText(@"C:\Temp\in.html"));

// Apply styles to the tables. 
foreach (Table t in doc.GetChildNodes(NodeType.Table, true))
    t.StyleName = "Tableau par defaut";

doc.Save(@"C:\Temp\out.docx");

out.docx (124.5 KB)

Is this the output you would like to get?

Indeed, we succeeded in applying the styles to the tables, but we cannot reproduce the dynamic sizing of the images. In our case, the images in the DOCX are all the same size. I’m using the Java version; could that be the reason?

@samthink Please try specifying image size in absolute units in your HTML.
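If you want to try this without changing how the HTML is authored, one option is to rewrite percentage widths to pixel values just before insertion. The sketch below uses only the .NET base class library; the helper name and the container width are assumptions, and the regular expression only targets simple width="NN%" attributes on img tags.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class ImageWidthNormalizer
{
    // Matches a percentage width attribute inside an <img ...> tag.
    // Group 1: everything up to the opening quote, group 2: the number,
    // group 3: the closing quote.
    private static readonly Regex PercentWidth = new Regex(
        "(<img\\b[^>]*?\\bwidth=\")(\\d+(?:\\.\\d+)?)%(\")",
        RegexOptions.IgnoreCase);

    // Replaces width="NN%" on images with an absolute pixel width computed
    // from an assumed container width (for example, the usable page width).
    public static string ToAbsoluteWidths(string html, double containerWidthPx)
    {
        return PercentWidth.Replace(html, match =>
        {
            double percent = double.Parse(match.Groups[2].Value, CultureInfo.InvariantCulture);
            int pixels = (int)Math.Round(containerWidthPx * percent / 100.0);
            return match.Groups[1].Value
                + pixels.ToString(CultureInfo.InvariantCulture)
                + match.Groups[3].Value;
        });
    }
}
```

With a container width of 600, an image declared with width="80%" becomes width="480". For images inside table cells the percentage is relative to the cell, so the container width passed in would need to be the cell width rather than the page width.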

In the HTML we shared, we used relative ‘%’ units and your converted DOCX was correct. Is this a special case? Using absolute units would be difficult on our side.

@samthink It is not a special case. This is simply how HTML import works. As I have already mentioned, the HTML and MS Word document object models are different, and it is impossible to convert one into the other without losses.

I understand, but your converter and ours produce different results with the same code (I think) and the same input files. Our output differs in how image sizes are handled: yours is correct and ours is not. Did you modify the HTML to use absolute units?

@samthink No, there were no HTML modifications on my side. I used the code provided above with the latest 24.5 version of Aspose.Words.

Our version was different: I was using 24.1, which I thought was the latest version. My fault; thanks for your time and help. Image sizes are now handled correctly even with relative sizes, as in your output.

Sorry to bother you again,

As you can see in out.docx, we have a specific case where images are placed inside tables. In the main document body, the image size (in %) is handled correctly, but when the image is in a table cell, it doesn’t fit and doesn’t match the HTML.

Is there an option for this ?

Also, I couldn’t find any documentation specifying which HTML tags are handled by InsertHtml.

@samthink I am afraid there is no such option. As mentioned above, there is no way to retain exactly the same formatting as in the browser after inserting HTML into the document.

Unfortunately, there is no such documentation. In most cases, Aspose.Words mimics MS Word behavior when working with HTML.