Unsupported HTML elements causing issue in excel export

@Bhumika.Shah
Thank you for your feedback. We will evaluate your issue as soon as possible. Once we have any new information, we will share it with you. We will get back to you soon.

1 Like

@Bhumika.Shah ,

The rgba is processed with white as the forecolor, the calculation method is as following:

R = (int)(a * r + 255  * (1 - a));
G = (int)(a * g + 255  * (1 - a));
B = (int)(a * b + 255  * (1 - a));

We will consider using the “background-color” of the style as forcolor in the next version.

I understand equations but did not get what you have said. Can you please explain.

@Bhumika.Shah

Thank you for your feedback. Sorry for any confusion caused to you. By encapsulating the provided test data into an HTML file, manually opening the HTML file using Excel and comparing it with the output results of Aspose.Cells, we can see some text colors are white, which is different from the display results in Excel. Please refer to the attachment. ResultFile.zip (60.7 KB)

The sample code as follows:

Workbook wb = new Workbook(filePath + "a.html");
wb.Save(filePath + "out_net.xlsx");

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): CELLSNET-55501

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

@Bhumika.Shah ,

If a(alpha) of font RGBA color is 0, the text color is transparent, so it is same as background color (White) in browser then according to the above formula (R= 0*r+255*(1-0)), the resulting rgb is (255,255,255).
The text color is same with background in the browser, so you can not see text either.
In order to make the text visible, we are considering ignoring font rgba color if a(alpha) is 0.

BTW, “font-color” is not a valid attribute of <style>, so no matter what its value is, the font color is always black.

1 Like

@Bhumika.Shah ,

We are pleased to inform you that your issue (filed as CELLSNET-55501) has been resolved. The fix will be included in an upcoming release (Aspose.Cells v24.5) that we plan to release in the first half of May 2024. You will be notified when the next version is released.

The issues you have found earlier (filed as CELLSNET-55501) have been fixed in this update. This message was posted using Bugs notification tool by johnson.shi

Can you please try exporting below test data to excel

Test
Test
Test

Test -  Test -&

Output is same as input.

Expected is as below
Test
Test
Test
Test -
Test -&

@Bhumika.Shah
Aspose.Cells follows the rules and specifications of Excel. When importing HTML, we strive to obtain the same results as Excel as much as possible. We have created a sample file based on the provided detailed information, and through testing on the latest version v24.5, we can obtain the same results as Excel. Please refer to the attachment. result.zip (72.7 KB)

The sample code as follows:

Workbook wb = new Workbook(filePath + "a.html");
wb.Save(filePath + "out_net.xlsx");

If you still have any questions, please provide your sample file and test code, and we will check it soon.

I think there are 2 types of code that you use to test samples. Can you please try other way. Also please note I am using Aspose version 24.4 right now. Thanks.

@Bhumika.Shah,

Please zip your input HTML file and output Excel file and attach here. Also, share your sample (runnable) code that you are using? We will check your issue soon.

I am not using HTML as input but datatable. Or maybe you can try string input as earlier. Thanks.

@Bhumika.Shah,

You need to apply wrapping text style/formatting to the cells to make it work. See the following sample code for your reference.
e.g.,
Sample code:

System.Data.DataTable table = new System.Data.DataTable();
table.Columns.Add("Field1");
string[] addrow = {"<div>Test</div><div>Test<br></div>Test<div><br></div>Test -&nbsp;Test -&amp;"};
table.Rows.Add(addrow);

var workbook = new Workbook();
var ws = workbook.Worksheets[0];
ws.Cells.ImportData(table, 0, 0, new ImportTableOptions
{
    IsFieldNameShown = true,
    ConvertNumericData = true,
    IsHtmlString = true


}); 

//Apply wrapping text style to the cell(s).
Style style = workbook.CreateStyle();
style.IsTextWrapped = true;
ws.Cells["A2"].SetStyle(style);

workbook.Save("e:\\test2\\out1.xlsx");

out1.zip (6.0 KB)

Hope, this helps a bit.

I think it was misunderstood. The purpose is not to make them visible. But to have tags processed and removed which were not and appeared as it is. For ex.
div>Test</div
div>Test br></div
Test div> br></div
Test - 
Test -&"

Should be
Test
Test
Test
Test -
Test -&

but instead appeared as it without being processed. Please refer to the attached snapshot.

Please add starting and ending < and > tags.

image.png (3.0 KB)

@Bhumika.Shah
By using the following sample code for testing, we can find that the HTML tags have been successfully parsed. Please refer to the attachment. out_net.zip (6.7 KB)

The sample code as follows:

DataTable table = new DataTable();
table.Columns.Add("Test Field");
string[] dataRow = { "<div>Test</div><div>Test<br></div>Test<div><br></div>Test -&nbsp;Test -&amp;" };
table.Rows.Add(dataRow);

Workbook workbook = new Workbook();
Worksheet sheet = workbook.Worksheets[0];
Cells cells = sheet.Cells;
cells.ImportData(table, 0, 0, new ImportTableOptions
{
    IsFieldNameShown = true,
    IsHtmlString = true
});

//Set wrapping text style to the cell.
Style style = workbook.CreateStyle();
style.IsTextWrapped = true;
cells["A2"].SetStyle(style);

workbook.Save(filePath + "out_net.xlsx");

If you still have any questions, would you like to provide your test code? We will check it soon.

Can you please use test only below input. What is output with/without IsTextWrapped set to true. Please add < > to the tags.
Test div> br></div

@Bhumika.Shah,

I tested using the following sample code and it works fine with your new HTML string (I added “<>” to the tags).
e.g.,
Sample code:

System.Data.DataTable table = new System.Data.DataTable();
table.Columns.Add("Test Field");
string[] dataRow = {"Test<div><br></div>"};
table.Rows.Add(dataRow);

Workbook workbook = new Workbook();
Worksheet sheet = workbook.Worksheets[0];
Cells cells = sheet.Cells;
cells.ImportData(table, 0, 0, new ImportTableOptions
{
    IsFieldNameShown = true,
    IsHtmlString = true
});

//Set wrapping text style to the cell.
Style style = workbook.CreateStyle();
style.IsTextWrapped = true;
cells["A2"].SetStyle(style);

workbook.Save("e:\\test2\\out_new1.xlsx");

Please find attached the output XLSX file for your reference.
out_new1.zip (6.0 KB)

Please note, if you do not set wrapping text (IsTextWrapped) on in code, there would be no line break in the cell.

Sorry I had wrong input previously. div and br tags are working fine. Thanks.

@Bhumika.Shah,

Thank you for confirming.

It’s good to know that DIV and BR tags are working fine on your end. Please feel free to write back to us if you have any further queries or comments.

There was 1 more value noticed “background-color: initial” causing issue. Is it possible that whenever it encounters invalid value for any tags, especially color, background-color and font-size tags, instead of showing error, it considers default value. Please note that below was noticed in client data but only background-color showed issue for now. Thanks.

background-image: initial; background-position: initial; background-size: initial; background-repeat: initial; background-attachment: initial; background-origin: initial; background-clip: initial;