Unsupported HTML elements causing issues in Excel export

@Bhumika.Shah
Aspose.Cells follows the rules and specifications of Excel. When importing HTML, we strive to obtain the same results as Excel as much as possible. We have created a sample file based on the provided detailed information, and through testing on the latest version v24.5, we can obtain the same results as Excel. Please refer to the attachment. result.zip (72.7 KB)

The sample code is as follows:

Workbook wb = new Workbook(filePath + "a.html");
wb.Save(filePath + "out_net.xlsx");

If you still have any questions, please provide your sample file and test code, and we will check it soon.

I think there are two ways you can test the samples. Can you please try the other way? Also, please note that I am currently using Aspose version 24.4. Thanks.

@Bhumika.Shah,

Please zip your input HTML file and output Excel file and attach them here. Also, please share the sample (runnable) code that you are using. We will check your issue soon.

I am not using an HTML file as input but a DataTable. Alternatively, you can try a string input as before. Thanks.

@Bhumika.Shah,

You need to apply text wrapping to the cells to make it work. See the following sample code for your reference:

System.Data.DataTable table = new System.Data.DataTable();
table.Columns.Add("Field1");
string[] addrow = {"<div>Test</div><div>Test<br></div>Test<div><br></div>Test -&nbsp;Test -&amp;"};
table.Rows.Add(addrow);

var workbook = new Workbook();
var ws = workbook.Worksheets[0];
ws.Cells.ImportData(table, 0, 0, new ImportTableOptions
{
    IsFieldNameShown = true,
    ConvertNumericData = true,
    IsHtmlString = true
});

//Apply wrapping text style to the cell(s).
Style style = workbook.CreateStyle();
style.IsTextWrapped = true;
ws.Cells["A2"].SetStyle(style);

workbook.Save("e:\\test2\\out1.xlsx");

out1.zip (6.0 KB)

Hope this helps a bit.

I think my request was misunderstood. The purpose is not to make the tags visible but to have them processed and removed. Instead, they were left unprocessed and appeared as is. For example:
div>Test</div
div>Test br></div
Test div> br></div
Test - 
Test -&"

Should be
Test
Test
Test
Test -
Test -&

but instead it appeared as is, without being processed. Please refer to the attached snapshot.

Please add the opening < and closing > to the tags above.

image.png (3.0 KB)

@Bhumika.Shah
By using the following sample code for testing, we can find that the HTML tags have been successfully parsed. Please refer to the attachment. out_net.zip (6.7 KB)

The sample code is as follows:

DataTable table = new DataTable();
table.Columns.Add("Test Field");
string[] dataRow = { "<div>Test</div><div>Test<br></div>Test<div><br></div>Test -&nbsp;Test -&amp;" };
table.Rows.Add(dataRow);

Workbook workbook = new Workbook();
Worksheet sheet = workbook.Worksheets[0];
Cells cells = sheet.Cells;
cells.ImportData(table, 0, 0, new ImportTableOptions
{
    IsFieldNameShown = true,
    IsHtmlString = true
});

//Set wrapping text style to the cell.
Style style = workbook.CreateStyle();
style.IsTextWrapped = true;
cells["A2"].SetStyle(style);

workbook.Save(filePath + "out_net.xlsx");

If you still have any questions, would you like to provide your test code? We will check it soon.

Can you please test using only the input below? What is the output with and without IsTextWrapped set to true? Please add < > to the tags.
Test div> br></div

@Bhumika.Shah,

I tested using the following sample code, and it works fine with your new HTML string (I added “<>” to the tags):

System.Data.DataTable table = new System.Data.DataTable();
table.Columns.Add("Test Field");
string[] dataRow = {"Test<div><br></div>"};
table.Rows.Add(dataRow);

Workbook workbook = new Workbook();
Worksheet sheet = workbook.Worksheets[0];
Cells cells = sheet.Cells;
cells.ImportData(table, 0, 0, new ImportTableOptions
{
    IsFieldNameShown = true,
    IsHtmlString = true
});

//Set wrapping text style to the cell.
Style style = workbook.CreateStyle();
style.IsTextWrapped = true;
cells["A2"].SetStyle(style);

workbook.Save("e:\\test2\\out_new1.xlsx");

Please find attached the output XLSX file for your reference.
out_new1.zip (6.0 KB)

Please note that if you do not enable text wrapping (IsTextWrapped) in code, there will be no line break in the cell.

Sorry, I had the wrong input previously. The div and br tags are working fine. Thanks.

@Bhumika.Shah,

Thank you for confirming.

It’s good to know that DIV and BR tags are working fine on your end. Please feel free to write back to us if you have any further queries or comments.

One more value, “background-color: initial”, was found to cause an issue. Is it possible that whenever an invalid value is encountered for any style property, especially color, background-color and font-size, a default value is used instead of raising an error? Please note that the values below were found in client data, but only background-color has caused an issue so far. Thanks.

background-image: initial; background-position: initial; background-size: initial; background-repeat: initial; background-attachment: initial; background-origin: initial; background-clip: initial;

Alternatively, can you provide a flag that, when set to true, always suppresses the error and uses the default value for the given property? For example: SuppressErrorWithDefaultTagValue = True

@Bhumika.Shah
At present, the handling of the CSS initial value may throw exceptions in some cases. As a workaround, you can replace initial with specific values.
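A pre-processing step along these lines could be sketched as follows. This is only an illustration, not part of Aspose.Cells: the class name HtmlStyleSanitizer and the method RemoveInitialDeclarations are hypothetical, and it assumes that dropping declarations whose value is initial (rather than substituting a concrete value) is acceptable and is enough to avoid the exception in your version. It uses a simple regular expression on inline style attributes, so it does not handle stylesheets or values that themselves contain semicolons.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not an Aspose.Cells API): cleans inline styles before import.
public static class HtmlStyleSanitizer
{
    // Matches an inline style attribute and captures its quote character and CSS text.
    private static readonly Regex StyleAttribute = new Regex(
        @"\bstyle\s*=\s*(?<q>[""'])(?<css>.*?)\k<q>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Removes every declaration whose value is the CSS keyword "initial".
    public static string RemoveInitialDeclarations(string html)
    {
        if (string.IsNullOrEmpty(html)) return html;

        return StyleAttribute.Replace(html, match =>
        {
            var kept = new List<string>();
            foreach (string declaration in match.Groups["css"].Value.Split(';'))
            {
                string trimmed = declaration.Trim();
                if (trimmed.Length == 0) continue;

                int colon = trimmed.IndexOf(':');
                string value = colon < 0 ? string.Empty : trimmed.Substring(colon + 1).Trim();

                // Skip "property: initial"; keep everything else unchanged.
                if (value.Equals("initial", StringComparison.OrdinalIgnoreCase)) continue;
                kept.Add(trimmed);
            }

            // Drop the whole attribute when no declarations remain.
            if (kept.Count == 0) return string.Empty;

            string quote = match.Groups["q"].Value;
            return "style=" + quote + string.Join("; ", kept) + quote;
        });
    }
}
```

Each HTML string would be passed through RemoveInitialDeclarations before it is added to the DataTable that is imported with IsHtmlString = true. For instance, style="color: red; background-color: initial;" becomes style="color: red".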

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): CELLSNET-55974

You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.

Yes. Actually, a lot of clients have suddenly started facing this issue, and new values are encountered each time. We replace them in the database every time with a workaround script. However, it is not convenient to run the script each time the issue surfaces, and whenever new values appear the client has to contact support again.

Clients copy such HTML-formatted data from external sources, and it happens so frequently that we cannot control it. Our HTML editors do not throw an error, so the data gets entered into the system, but Aspose throws an exception when processing it.

@Bhumika.Shah
Thank you for your feedback. Would you like to provide sample files and test code? We will check them soon.

You tested earlier with our tags that had issues. Can you please test this value with the same code? Thanks.

Currently, only “background-color: initial” is causing an issue among the sample data above; the other initial values are not.

@Bhumika.Shah
Thank you for your feedback. Once there are updates, we will notify you promptly.
