DOCX to PDF conversion issue with resource folder and inline styles using .NET

Hi,

Below are the settings I tried:

HtmlFixedSaveOptions htmlSaveOptions = new HtmlFixedSaveOptions();
htmlSaveOptions.UseHighQualityRendering = true;
htmlSaveOptions.JpegQuality = 100;
htmlSaveOptions.CssClassNamesPrefix = "pfx_";
htmlSaveOptions.ExportEmbeddedFonts = false;
htmlSaveOptions.ExportEmbeddedCss = false;
htmlSaveOptions.ExportEmbeddedSvg = false;
htmlSaveOptions.ExportEmbeddedImages = false;
htmlSaveOptions.SaveFontFaceCssSeparately = true;
htmlSaveOptions.ColorMode = ColorMode.Normal;

I am trying to achieve the following:

  1. Export all font and alignment styles into an external CSS file.
    Regardless of whether htmlSaveOptions.ExportEmbeddedFonts is set to true or false, the font styles are always written inline.

  2. I would like to have all sizes in ems (em) rather than points (pt).
    I didn’t find a property that I can set to achieve this.

  3. I would like to save the exported resources in a folder with a specific name.
    I tried setting the property htmlSaveOptions.ResourcesFolder = styleFolderName; but the resources are not placed in that folder.

Could you please help with the above issues? Let me know if you need any additional information.

Thanks,
Jay

@jjanapareddy

Please note that HTML and HtmlFixed are two different formats. HtmlFixed is a fixed-page format: it saves the document as HTML using absolutely positioned elements.

Please note that formatting is applied on a few different levels. For example, consider the formatting of simple text. Text in a document is represented by Run elements, and a Run can only be a child of a Paragraph. You can apply formatting:

  1. to Run nodes by using Character Styles, e.g. a Glyph Style.
  2. to the parent of those Run nodes, i.e. a Paragraph node (possibly via paragraph Styles).
  3. directly to Run nodes by using Run attributes (Font). In this case the Run takes its formatting from the Paragraph Style first, then the Glyph Style, and finally the direct formatting.

The direct formatting applied to text is exported as inline styles. To achieve your requirement, please remove the direct formatting from the text and apply the desired style to the paragraph instead.
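
The suggestion above could be sketched roughly as follows. This is a minimal, untested sketch rather than a confirmed fix: it assumes Aspose.Words for .NET is referenced, the file names are placeholders, and it clears direct formatting on every Run, which also discards intentional overrides (and may reset a character style applied to the run). Whether the HtmlFixed output then contains fewer inline styles needs to be checked against your own document.

```csharp
using Aspose.Words;
using Aspose.Words.Saving;

class RemoveDirectFormatting
{
    static void Main()
    {
        // Placeholder input path (assumption, not from the thread).
        Document doc = new Document("input.docx");

        // Reset the font formatting of every Run so the text falls back
        // to the formatting defined by its styles.
        foreach (Run run in doc.GetChildNodes(NodeType.Run, true))
        {
            run.Font.ClearFormatting();
        }

        HtmlFixedSaveOptions options = new HtmlFixedSaveOptions
        {
            // Write CSS to a separate file instead of embedding it.
            ExportEmbeddedCss = false
        };

        // Placeholder output path (assumption).
        doc.Save("output.html", options);
    }
}
```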

Please ZIP and attach your input and expected output documents for our reference. We will then provide you more information on it.

  • Please create a standalone console application ( source code without compilation errors ) that helps us to reproduce your problem on our end and attach it here for testing.
  • Please ZIP and attach your input and problematic output documents.

We will investigate the issue and provide you more information on it.

Hi Tahir,

Please find the uploaded solution. In the folder Conversion Docs you will find the Word and PDF documents converted to HTML.

The HTML created from the PDF is the format I am looking for, and I am trying to achieve the same output when converting the Word document.

The code has all the settings that I used; please look into it.

Let me know if you need any more information.

Thanks,
Jay

Aspose HTML Conversion.zip (6.1 MB)

@jjanapareddy

Please specify the physical folder where resources (images, fonts, CSS) are saved when exporting a document to HTML format. For example, see the following line of code:

htmlSaveOptions.ResourcesFolder = @"c:\temp\html_resources\";
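
For context, a fuller sketch of how that line might fit into a save call is shown below. It is an untested outline under stated assumptions: the input and output paths are placeholders, the embedding options are switched off so that resources are written as separate files, and the `ResourcesFolderAlias` value is an illustrative choice for building relative URIs, not something taken from this thread.

```csharp
using Aspose.Words;
using Aspose.Words.Saving;

class SaveWithResourcesFolder
{
    static void Main()
    {
        // Placeholder input path (assumption).
        Document doc = new Document("input.docx");

        HtmlFixedSaveOptions options = new HtmlFixedSaveOptions
        {
            // Resources are only written to disk when they are not embedded.
            ExportEmbeddedCss = false,
            ExportEmbeddedFonts = false,
            ExportEmbeddedImages = false,
            ExportEmbeddedSvg = false,

            // Physical folder that receives the images, fonts and CSS.
            ResourcesFolder = @"c:\temp\html_resources\",

            // Folder name used in the URIs written into the HTML
            // (illustrative value, relative to the output file below).
            ResourcesFolderAlias = "html_resources"
        };

        // Placeholder output path next to the resources folder (assumption).
        doc.Save(@"c:\temp\output.html", options);
    }
}
```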

We have logged this feature as WORDSNET-20889 in our issue tracking system. You will be notified via this forum thread once this feature is available.

Could you please share some detail on why you need to use the unit em instead of pt?

@jjanapareddy

Please note that “em” units cause positioning issues due to rounding errors in some browsers. So, we decided not to implement this feature.
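
As a rough illustration of how rounding can shift positions, the sketch below converts a few point values to ems with a limited number of decimals and converts them back. It is not a reproduction of any browser's behaviour: the base font size, the two-decimal precision and the sample positions are all hypothetical values chosen for the example.

```csharp
using System;

public static class EmRoundingDemo
{
    // Converts a length in points to ems relative to a base font size,
    // rounded to the given number of decimals.
    public static double PtToEm(double pt, double baseFontPt, int decimals)
    {
        return Math.Round(pt / baseFontPt, decimals);
    }

    public static void Main()
    {
        const double baseFontPt = 11.0;                        // hypothetical base font size
        double[] positionsPt = { 72.0, 100.5, 250.25, 511.7 }; // hypothetical positions

        foreach (double pt in positionsPt)
        {
            double em = PtToEm(pt, baseFontPt, 2);
            double roundTripPt = em * baseFontPt;

            // The drift is the distance an absolutely positioned element
            // would move after the pt -> em -> pt round trip.
            Console.WriteLine(
                $"{pt} pt -> {em} em -> {roundTripPt:F3} pt (drift {roundTripPt - pt:F3} pt)");
        }
    }
}
```

In a fixed-page layout every element is placed absolutely, so even small per-element drifts of this kind can become visible as misaligned text.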