Free Support Forum - aspose.com

Tab identification in generated HTML from word document



Hi,

We are using Aspose.Total (license version 10.0.0.0) to generate a Word document, where the input is also a Word document.

To generate the Word document, we first get the Aspose HTML output of the input Word document using the following code:

string htmlText = string.Empty;
// Folder (relative to the site) where exported images are written.
string lStrImageFolder = FolderPath + "/" + ConfigurationManager.AppSettings["HTMLFileFolderName"].ToString();
string tempDir = Server.MapPath(lStrImageFolder);
if (!Directory.Exists(tempDir))
    Directory.CreateDirectory(tempDir);

Aspose.Words.Saving.HtmlSaveOptions saveOptions = new Aspose.Words.Saving.HtmlSaveOptions();
saveOptions.ImagesFolder = tempDir;
saveOptions.CssStyleSheetType = Aspose.Words.Saving.CssStyleSheetType.Embedded;
saveOptions.SaveFormat = SaveFormat.Html;
saveOptions.ImagesFolderAlias = ConfigurationManager.AppSettings["HTMLFileFolderName"].ToString();

MemoryStream htmlStream = new MemoryStream();
doc.Save(htmlStream, saveOptions);
// ToArray() returns only the bytes written; GetBuffer() also returns unused buffer capacity.
htmlText = Encoding.UTF8.GetString(htmlStream.ToArray());
htmlStream.Close();

In our input document (please find attached) there are some tabs between data items, but the HTML generated by Aspose contains no identifier for those tabs.

Is there any way we can identify a tab in the generated HTML?

Please provide a solution as soon as possible.

Thanks,
Samanvay

Hi

Thanks for your request. As you may know, the HTML format does not directly support tabs, so Aspose.Words exports each tab to HTML as a sequence of whitespace characters. You can find elements like the following in your HTML:

<span style="font-family:Arial; font-size:10pt"> </span>

These spans, which contain only whitespace, are the tabs exported to HTML from MS Word.
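
If it helps, here is a minimal sketch of how such whitespace-only spans could be located in the generated HTML string. Note that TabSpanFinder, CountTabSpans and ReplaceTabSpans are hypothetical helper names used only for illustration; they are not part of Aspose.Words. The sketch also assumes that a tab always arrives as its own span containing nothing but spaces or non-breaking spaces. That is a heuristic: an ordinary run of spaces that sits in its own span would match as well.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Words.
public static class TabSpanFinder
{
    // Matches a <span> element whose content is only spaces, literal
    // non-breaking spaces (U+00A0), or the entities &nbsp; / &#xa0; / &#160;.
    // Assumption: exported tabs look like this; plain space runs may match too.
    private static readonly Regex WhitespaceSpan = new Regex(
        @"<span\b[^>]*>(?:[ \u00A0]|&nbsp;|&#xa0;|&#160;)+</span>",
        RegexOptions.IgnoreCase);

    // Returns how many whitespace-only spans the HTML contains.
    public static int CountTabSpans(string html)
    {
        return WhitespaceSpan.Matches(html).Count;
    }

    // Replaces every whitespace-only span with the caller's marker text.
    public static string ReplaceTabSpans(string html, string marker)
    {
        // A MatchEvaluator is used so that '$' in the marker is taken literally.
        return WhitespaceSpan.Replace(html, m => marker);
    }
}
```

For example, calling TabSpanFinder.ReplaceTabSpans(htmlText, "[TAB]") on the string produced by your code would put a visible marker wherever such a span occurs, which you can then search for.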

Best regards,