Unsupported HTML elements causing issues in Excel export

I provided sample input but forgot to remove the tags on both sides. The issue is not the sample input itself, but how var() is processed.

span style="background-color: var(--color-field-bg); color: var(--color-field-text); ">Test1</span

span style="background-color: var(--color-field-bg); font-color: rgba(0, 0, 0, 0); ">Test</span

@Bhumika.Shah
Which method are you using to import the HTML: new Workbook(html file) or Cell.HtmlString?

sheet.Cells.ImportData

It also looks like rgba(0,0,0,0) and rgba(100,0,0,0) are processed the same way.

@Bhumika.Shah
1. Thanks for the info. The ImportData method calls Cell.HtmlString to set the HTML string value of each cell.
2. We have fixed the issue with 0/1px font sizes. The fix will be released in the next version, 24.4.
3. var() colors in the HTML string: we have reproduced the issue you mentioned when setting an HTML string value on a cell. I have logged it in our internal issue tracking system as CELLSNET-55416.

Sorry for the confusion: in previous posts we assumed you were loading HTML files with new Workbook(...). Now we understand your need. If you have other issues with importing HTML data via Cells.ImportData, you can simply use Cell.HtmlString = "…" to reproduce them.

@Bhumika.Shah

Excel ignores the alpha component (ARGB.A) stored in the file. It seems we should convert the color to RGB first and then apply it to the cell when importing HTML. This has been logged as CELLSNET-55417.
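To make the "convert to RGB first" idea concrete, here is a minimal sketch of flattening an rgba() color by blending it over an opaque background. This is not the Aspose.Cells implementation; the class name RgbaFlattener, the method ToOpaqueHex, and the white default background are all assumptions made for illustration.

```csharp
using System;

// Hypothetical helper, not part of Aspose.Cells.
public static class RgbaFlattener
{
    // Blends an rgba() color over an opaque background and returns "#RRGGBB".
    // Channels are expected in 0..255 and alpha in 0..1.
    // The white default background is an assumption.
    public static string ToOpaqueHex(int r, int g, int b, double alpha,
                                     int bgR = 255, int bgG = 255, int bgB = 255)
    {
        // Clamp alpha so out-of-range input cannot produce invalid channels.
        alpha = Math.Min(1.0, Math.Max(0.0, alpha));

        // Standard "source over" compositing for a single channel.
        int Blend(int fg, int bg) =>
            (int)Math.Round(alpha * fg + (1.0 - alpha) * bg, MidpointRounding.AwayFromZero);

        return $"#{Blend(r, bgR):X2}{Blend(g, bgG):X2}{Blend(b, bgB):X2}";
    }
}
```

With this approach, rgba(0,0,0,0) and rgba(100,0,0,0) both flatten to the background color, because an alpha of 0 means fully transparent. Whether the background should be white or the cell's actual fill color is a design choice this sketch leaves open.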

@Bhumika.Shah
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): CELLSNET-55416, CELLSNET-55417

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Please take note of the rgba behavior mentioned above.

@Bhumika.Shah
Thanks for the reminder. We will soon look into how to preserve or convert ARGB.A when importing an HTML string.

@amjad.sahi

Is there an option to export a DataTable to Excel as plain text, ignoring HTML tags?
Thanks.

@Bhumika.Shah,

You can simply set ImportTableOptions.IsHtmlString to "false" when importing data from a DataTable into an Excel spreadsheet using the Cells.ImportData() method/overload. This way, your HTML will be imported as a plain string/text, tags included. But if you need to import the HTML string without its tags, e.g., an HTML string like:
<i>Aniseed</i> Syrup
to be rendered as:
Aniseed Syrup
this is not possible. You have to extract the text from the tags yourself, either in the DataTable before importing or in the worksheet cells after you have imported the data via Aspose.Cells APIs.
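As a rough illustration of extracting the text yourself, the sketch below strips tags with a regular expression and decodes HTML entities. It uses only .NET base class library calls; the class name HtmlText and the method StripTags are hypothetical, and a regex is only an approximation of real HTML parsing.

```csharp
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of Aspose.Cells.
public static class HtmlText
{
    // Matches anything that looks like a complete tag: "<" ... ">".
    private static readonly Regex Tag = new Regex("<[^>]*>", RegexOptions.Compiled);

    // Returns the text with well-formed tags removed and entities decoded.
    public static string StripTags(string html)
    {
        if (string.IsNullOrEmpty(html)) return html;
        string withoutTags = Tag.Replace(html, string.Empty);
        return WebUtility.HtmlDecode(withoutTags);
    }
}
```

One way to use it would be to run StripTags over each string field of the DataTable and then import with ImportTableOptions.IsHtmlString set to false. Note that a tag with no closing ">" is left untouched by this pattern, so malformed markup needs separate handling.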

@amjad.sahi

Ok. I get that.

  1. Can you please also tell us when you plan to release the version, so we can plan accordingly? We have multiple clients waiting for the bug fix.

  2. Can you please check with the dev team and confirm that this time they are not planning to replace var() with a color that is the same for foreground and background? Removal seems to be the only option.
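Until a fix is available, one possible workaround is to drop var() declarations from inline styles before handing the HTML to the import. The sketch below is an assumption about how that could be done on the caller's side, not a description of what the library does; CssVarCleaner and RemoveVarDeclarations are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical pre-processing helper, not part of Aspose.Cells.
public static class CssVarCleaner
{
    // Matches style="..." or style='...' and captures the quote and the CSS text.
    private static readonly Regex StyleAttr = new Regex(
        "style\\s*=\\s*(?<q>[\"'])(?<css>.*?)\\k<q>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Removes every declaration whose text contains var( from inline styles.
    public static string RemoveVarDeclarations(string html)
    {
        if (string.IsNullOrEmpty(html)) return html;
        return StyleAttr.Replace(html, m =>
        {
            // Split the style into declarations and keep those without var(.
            var kept = m.Groups["css"].Value
                .Split(';')
                .Select(d => d.Trim())
                .Where(d => d.Length > 0 &&
                            d.IndexOf("var(", StringComparison.OrdinalIgnoreCase) < 0);
            string q = m.Groups["q"].Value;
            return "style=" + q + string.Join("; ", kept) + q;
        });
    }
}
```

Because the declarations are removed rather than replaced, foreground and background can never end up with the same substituted color. The split on ";" is naive and would break on values that themselves contain semicolons.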

@Bhumika.Shah,

We plan to release Aspose.Cells for .NET v24.4 next week. However, it is also possible that the new version will be published before the end of this week. We cannot provide a specific date, as releases are published once they are ready.

Your understanding is correct.

@amjad.sahi

Thanks. For certain input, exporting to CSV produced the strange output below for some users, but not all. Can you please check?

Input
<font color=""rgba(0, 0, 0, 0)""><span style=""background-color: var(--color-field-bg); color: var(--color-field-text); font-size: 0px;"">Test</span>

Output
<span class=""ui-provider a b c d e f g h i j k l m n o p q r s t u v w x y z ab ac ae af ag ah ai aj ak"" dir=""ltr""><font color=""rgba(0, 0, 0, 0)""><span style=""background-color: var(--color-field-bg); color: var(--color-field-text); font-size: 0px;"">Test</span>

@Bhumika.Shah,

Are you using Cell.HtmlString attribute to set the HTML or are you using the Cells.ImportData() method to import from some data source (e.g., DataTable, etc.)? We would appreciate it if you could provide/paste your complete (runnable) code that could be executed standalone to reproduce the issue on our end. We will check it soon.

PS: When writing a simulation application, if you are importing from a DataTable, please use .NET APIs (e.g., System.Data) to create a DataTable with field(s) containing HTML tags in the code.
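For reference, a standalone reproduction along these lines might build the DataTable as sketched below. The table layout and the name ReproData are made up for illustration. The Aspose.Cells calls are shown only as comments so the block compiles without the library, and they reflect an assumed usage rather than verified reproduction steps.

```csharp
using System.Data;

// Hypothetical repro helper; table name and columns are illustrative only.
public static class ReproData
{
    // Builds a small DataTable whose string column contains HTML markup.
    public static DataTable BuildHtmlTable()
    {
        var table = new DataTable("Repro");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Description", typeof(string));
        table.Rows.Add(1, "<i>Aniseed</i> Syrup");
        table.Rows.Add(2, "<span style=\"background-color: var(--color-field-bg); color: var(--color-field-text);\">Test1</span>");
        return table;
    }
}

// Assumed Aspose.Cells usage (requires "using Aspose.Cells;"):
//   var workbook = new Workbook();
//   var options = new ImportTableOptions { IsHtmlString = true };
//   workbook.Worksheets[0].Cells.ImportData(ReproData.BuildHtmlTable(), 0, 0, options);
//   workbook.Save("out.csv");
```

Keeping the data creation in code, rather than reading it from a database, lets the sample run on any machine.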

@Bhumika.Shah
Testing the following sample code on the latest version, v24.3, we were able to reproduce an exception: a FormatException occurs when setting Cell.HtmlString.

The sample code is as follows:

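using Aspose.Cells;

// Put the problematic HTML into a single cell, then save as CSV.
Workbook workbook = new Workbook();
Cells cells = workbook.Worksheets[0].Cells;
Cell b6 = cells["B6"];
// Setting HtmlString is where the FormatException occurs on v24.3.
b6.HtmlString = "<font color=\"rgba(0, 0, 0, 0)\"><span style=\"background-color: var(--color-field-bg); color: var(--color-field-text); font-size: 0px;\">Test</span>";

// filePath is the output folder; it is not defined in this snippet.
workbook.Save(filePath + "out.csv");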

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): CELLSNET-55436

Thank you. I also wanted to check whether you see a plain-text export option being useful in the future. For now, the CSV is not readable with both HTML tags and plain text present together. There is also a chance that tags are not opened or closed correctly, which would need to be handled, e.g. <span Test</span

@Bhumika.Shah
Thank you for your feedback. The current issue is due to an error in the parsing of rgba color. After the FormatException is fixed, we will further test it. Once there are updates, we will notify you promptly.

As I understand it, this fix addresses the additional class "a b c d e f …" in the CSV export.

@Bhumika.Shah,

No, we logged a ticket (“CELLSNET-55436”) as we found an exception (“System.FormatException : Input string was not in a correct format”) when setting your HTML string using Cell.HtmlString attribute (see the post for your reference).

For your issue (the additional class "a b c d e f …" in the CSV export), we again request that you provide your complete (runnable) code or a standalone sample application to reproduce the issue on our end, so we can resolve it as well.