HTML to XLS - CSS style/formatting is ignored in .NET

Hello Team,

With Aspose.Cells for Java, can we load an HTML file and save it as Excel?

>>> We logged the feature as an id: CELLSNET-21710. We may look into it in future versions. Once we have any update about it, we will let you know.
Has this feature been added in the latest version?

Thanks,
Kumar

@kumaraswamy.m,

Thanks for your query.

Aspose.Cells supports reading and manipulating HTML files and saving them to MS Excel file formats (e.g. XLS, XLSX, XLSM, XLSB, etc.). Generally, however, we support MS Excel-oriented HTML files; common HTML files are not completely supported. For example, if you open your HTML file in MS Excel manually and save it as XLS/XLSX, the resulting file is the output you should expect. The same thing can be done via the Aspose.Cells APIs.

If you find any case where MS Excel supports something and Aspose.Cells does not handle it well, please share your HTML file and we will check it soon.

The .html file in the attached html.zip (135.6 KB) relies on .css for styling. However, with the code below, the formatting is lost when converting HTML to Excel. If I open the .html file manually in MS Excel, the formatting is retained. Could we achieve the same result with Aspose.Cells?

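import java.util.UUID;
import com.aspose.cells.HTMLLoadOptions;
import com.aspose.cells.LoadFormat;
import com.aspose.cells.Workbook;

public static void main(String[] args) throws Exception
{
	// Utils is the poster's own helper class (not shown in the thread)
	Utils.setupLicense();
	// Load the HTML file using HTML-specific load options
	HTMLLoadOptions opts = new HTMLLoadOptions(LoadFormat.HTML);
	Workbook workbook = new Workbook("C:\\Users\\kumaraswamy_gowda\\Desktop\\html\\html1.htm", opts);
	// Save the workbook under a unique .xls file name
	String output = "C:\\Users\\kumaraswamy_gowda\\Desktop\\html\\" + "RPE_" + UUID.randomUUID().toString() + ".xls";
	workbook.save(output);
	System.out.println(output);
}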

@kumaraswamy.m,

Thanks for the sample files and code segment.

After an initial test, I was able to reproduce the issue by converting your template HTML to the XLS file format: CSS style/formatting is ignored in the output XLS file. I have logged a ticket with the id “CELLSJAVA-42635” for your issue. We will look into it soon.

Once we have an update on it, we will let you know here.

@Amjad_Sahi Are CSS styles supported at all when importing HTML, or is it just that some styles are not working?

@kumaraswamy.m,

I believe CSS styles are not completely supported when rendering HTML to XLS/XLSX, but we have to evaluate your issue thoroughly. Please give us a little time to look into it.

Once we have any new information, we will share it with you.

@kumaraswamy.m,

This is to inform you that we have now fixed your issue (logged earlier as “CELLSJAVA-42635”). We will soon provide the fixed version after performing QA and incorporating other enhancements and fixes.

That is great to hear. Could you share the output for the above testcase?

@kumaraswamy.m,

Once the fix is available for public use, we will share the Download link here for your testing. Hopefully, it will work for your test case.

@kumaraswamy.m,

Please try our latest version/fix: Aspose.Cells for Java v18.6.4

Your issue should be fixed in it.

Let us know your feedback.

@Amjad_Sahi Thanks for sharing the hotfix. I did some quick testing. The issue is not completely addressed. I see two issues.

  1. The columns have constant width. The column width could have been calculated better (open the HTML from my first post in MS Excel to see what I mean).
  2. The first column data (under “SW SRS Requirement”) is supposed to be bold.

@kumaraswamy.m,

  1. You may try to auto-fit rows and columns before rendering the HTML to the XLS file format. See the sample code for your reference:

     ........
     com.aspose.cells.AutoFitterOptions options = new com.aspose.cells.AutoFitterOptions();
     options.setAutoFitMergedCells(true);
     options.setOnlyAuto(true);
     workbook.getWorksheets().get(0).autoFitColumns(options);
     workbook.getWorksheets().get(0).autoFitRows(options);
     ............

  2. If you open your HTML file in MS Excel manually, you will see that the data in that column is not bold.

On #1, the first column width takes up the entire screen width. This is not the case when the HTML is opened in MS Excel directly.

#2, correct, it is not bold. I think it is due to “hyperlink” style getting applied after font-weight:bold.
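
To illustrate the guess above: when two CSS declarations of equal specificity set the same property, the later one wins. The following is a minimal sketch of that ordering only. The class name, the method name, and the rule contents are hypothetical; they are not taken from the attached .css file, and this is not how Aspose.Cells or MS Excel resolves styles internally.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CascadeOrderSketch {

    // Applies declaration blocks in source order. For the same property,
    // a later block replaces the value set by an earlier block.
    static Map<String, String> resolve(List<Map<String, String>> rulesInOrder) {
        Map<String, String> effective = new LinkedHashMap<>();
        for (Map<String, String> rule : rulesInOrder) {
            effective.putAll(rule);
        }
        return effective;
    }

    public static void main(String[] args) {
        // Hypothetical rules for illustration; assumed, not read from the attachment.
        Map<String, String> cellRule = Collections.singletonMap("font-weight", "bold");
        Map<String, String> linkRule = Collections.singletonMap("font-weight", "normal");

        // The link rule is applied after the cell rule, so its value is the effective one.
        Map<String, String> effective = resolve(Arrays.asList(cellRule, linkRule));
        System.out.println(effective.get("font-weight"));
    }
}
```

If the order of the two rules were reversed, the bold value would be the effective one.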

@kumaraswamy.m,

We could not reproduce the issue where the first column width takes up the entire screen width. The output XLS and XLSX files open fine in Excel 365, and all the column widths are equal and visible on the screen without any scrolling. We have recorded this difference in our database and will work on it soon. We will write back here when any update is available.

@kumaraswamy.m,

We have further investigated the requirement where column widths are not appropriate. Please try the following sample code and provide your feedback.

HTMLLoadOptions opts = new HTMLLoadOptions(LoadFormat.HTML);
opts.setAutoFitColsAndRows(true);
Workbook wb = new Workbook(filePath + "html\\html1.htm", opts);
wb.save(filePath + "out_java.xlsx");
wb.save(filePath + "out_java.xls");

UpdatedView.PNG (115.5 KB)

Thanks. This looks much better. When is the 18.7 release? Probably before the 10th? I’ll test more cases and complex scenarios and let you know.

@kumaraswamy.m,

Generally, we publish official releases (.NET/Java) around the third week of every month, but this schedule is not always final. The release is published when ready. Hopefully Aspose.Cells for Java v18.7 will be published in the third week of July 2018.

conversion_fails.zip (5.5 KB)

For the HTML in the above archive, the HTML to Excel conversion fails with the error below. The CSS value “windowtext” can be found in the .css file within the archive.

I can open this HTML successfully from MS Excel manually.

java.lang.NumberFormatException: For input string: "windowtext"
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1973)
at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:146)
at java.lang.Double.parseDouble(Double.java:552)
at com.aspose.cells.a.c.zp.a(Unknown Source)
at com.aspose.cells.zahp.b(Unknown Source)
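
The top frames of the trace show `Double.parseDouble` receiving the CSS keyword `windowtext`. As an illustration only, the sketch below reproduces that exception with the JDK alone and shows one defensive way to handle a token that may be either a number or a keyword. The class and helper names are hypothetical and say nothing about how Aspose.Cells parses CSS internally.

```java
import java.util.OptionalDouble;

public class CssNumberParseSketch {

    // Hypothetical helper: returns the numeric value of a CSS token,
    // or an empty result when the token is not a number (e.g. "windowtext").
    static OptionalDouble parseNumericToken(String token) {
        try {
            return OptionalDouble.of(Double.parseDouble(token.trim()));
        } catch (NumberFormatException e) {
            return OptionalDouble.empty();
        }
    }

    public static void main(String[] args) {
        // Unguarded parsing of a keyword throws NumberFormatException,
        // the same exception type reported in the trace above.
        try {
            Double.parseDouble("windowtext");
        } catch (NumberFormatException e) {
            System.out.println(e.getMessage());
        }

        // Guarded parsing reports whether the token was numeric instead of throwing.
        System.out.println(parseNumericToken("0.5").isPresent());
        System.out.println(parseNumericToken("windowtext").isPresent());
    }
}
```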

@kumaraswamy.m,

We were able to reproduce the issue where this exception is raised while loading the HTML file into a Workbook, but we need to look into it more. We have logged the issue in our database for investigation and a fix. Once we have some news for you, we will update you in this topic.

This issue has been logged as

CELLSJAVA-42675 - NumberFormatException raised while loading the Html file into Workbook

The issues you found earlier (filed as CELLSJAVA-42635) have been fixed in Aspose.Cells for Java 18.7. Please also see the documentation for your reference: Installation|Documentation