HTML Table To excel convert issue


#1

Hi Team ,

We are trying to convert HTML Table into excel , using below code.
There is no error in conversion but excel output is distorted and/or some table information is lost. e.g italic font.

You can replicate Issue using below :

Java Version : 1.8.0_51

Note : We have license for below version.

<dependency>
<groupId>com.aspose</groupId>
<artifactId>aspose-cells</artifactId>
<version>18.4</version>
</dependency>

Java Code :

import com.aspose.cells.HTMLLoadOptions;
import com.aspose.cells.License;
import com.aspose.cells.LoadFormat;
import com.aspose.cells.Workbook;
import org.springframework.core.io.ClassPathResource;
import java.io.ByteArrayInputStream;

public class Test
{
public static void main(String args[]) throws Exception
{
String reportHTML = " <table width=“100%” style=" font-style:italic; font-family: Calibri; font-size: 10pt; border-spacing: 1px; border: 1px solid blue;">\n" +
“<tr>\n” +
" <td>\n" +
“<table width=“100%” style=“font-style:italic; font-family: Calibri; font-size: 10pt; border-spacing: 1px; border: 1px solid blue;”>\n” +
" <thead style=“font-family: Calibri; font-size: 10pt; border-spacing: 1px; border: 1px solid red;”>\n" +
" <tr>\n" +
" <th style=“border: 1px solid rgb(191, 191, 191);”>Table Header 1 </th>\n" +
" <th style=“border: 1px solid rgb(191, 191, 191);”>Table Header 2 </th>\n" +
" </tr>\n" +
" </thead>\n" +
" <tbody>\n" +
" <tr>\n" +
" <td style=“border: 1px solid rgb(191, 191, 191);”>Table cell 1</td>\n" +
" <td style=“border: 1px solid rgb(191, 191, 191);”>Table cell 2</td>\n" +
" </tr>\n" +
" <tr>\n" +
" <td style=“border: 1px solid rgb(191, 191, 191);”>Table cell 3 </td>\n" +
" <td style=“border: 1px solid rgb(191, 191, 191);”>Table cell 4</td>\n" +
" </tr> \n" +
" </tbody>\n" +
“</table>\n” +
" </td>\n" +
" <td>\n" +
“<table width=“100%” style=“font-style:normal; font-family: Calibri; font-size: 10pt; border-spacing: 1px; border: 1px solid blue;”>\n” +
" <thead style=“font-family: Calibri; font-size: 10pt; border-spacing: 1px; border: 1px solid red;”>\n" +
" <tr>\n" +
" <th style=“border: 1px solid rgb(191, 191, 191);”>Table Header 11 </th>\n" +
" <th style=“border: 1px solid rgb(191, 191, 191);”>Table Header 22 </th>\n" +
" </tr>\n" +
" </thead>\n" +
" <tbody>\n" +
" <tr>\n" +
" <td style=“border: 1px solid rgb(191, 191, 191);”>Table cell 11</td>\n" +
" <td style=“border: 1px solid rgb(191, 191, 191);”>Table cell 22</td>\n" +
" </tr>\n" +
" <tr>\n" +
" <td style=“border: 1px solid rgb(191, 191, 191);”>Table cell 33 </td>\n" +
" <td style=“border: 1px solid rgb(191, 191, 191);”>Table cell 44</td>\n" +
" </tr> \n" +
" </tbody>\n" +
“</table>\n” +
" </td>\n" +
“</tr>\n” +
"</table> ";

License license = new License();
license.setLicense(new ClassPathResource(“Aspose.Cell.lic”).getInputStream());

HTMLLoadOptions loadOptions = new HTMLLoadOptions(LoadFormat.HTML);
loadOptions.setSupportDivTag(true);

ByteArrayInputStream bais = new ByteArrayInputStream(reportHTML.getBytes());
Workbook workbook = new Workbook(bais, loadOptions);
workbook.getWorksheets().get(0).autoFitColumns();
workbook.save(“D:\Downloads\output.xlsx”);

}

}

Kindly revert in case any more information is required.


#2

@ashish29190,

Please try our latest version/fix: Aspose.Cells for Java v19.8.x. I have tested using the following sample code and the output file looks ok to me:
e.g
Sample code:

String reportHTML = " <table width=\"100%\" style=\" font-style:italic; font-family: Calibri; font-size: 10pt; border-spacing: 1px; border: 1px solid blue;\">\n" +
				"<tr>\n" +
				" <td>\n" +
				"<table width=\"100%\" style=\"font-style:italic; font-family: Calibri; font-size: 10pt; border-spacing: 1px; border: 1px solid blue;\">\n" +
				" <thead style=\"font-family: Calibri; font-size: 10pt; border-spacing: 1px; border: 1px solid red;\">\n" +
				" <tr>\n" +
				" <th style=\"border: 1px solid rgb(191, 191, 191);\">Table Header 1 </th>\n" +
				" <th style=\"border: 1px solid rgb(191, 191, 191);\">Table Header 2 </th>\n" +
				" </tr>\n" +
				" </thead>\n" +
				" <tbody>\n" +
				" <tr>\n" +
				" <td style=\"border: 1px solid rgb(191, 191, 191);\">Table cell 1</td>\n" +
				" <td style=\"border: 1px solid rgb(191, 191, 191);\">Table cell 2</td>\n" +
				" </tr>\n" +
				" <tr>\n" +
				" <td style=\"border: 1px solid rgb(191, 191, 191);\">Table cell 3 </td>\n" +
				" <td style=\"border: 1px solid rgb(191, 191, 191);\">Table cell 4</td>\n" +
				" </tr> \n" +
				" </tbody>\n" +
				"</table>\n" +
				" </td>\n" +
				" <td>\n" +
				"<table width=\"100%\" style=\" font-style:normal; font-family: Calibri; font-size: 10pt; border-spacing: 1px; border: 1px solid blue;\">\n" +
				" <thead style=\"font-family: Calibri; font-size: 10pt; border-spacing: 1px; border: 1px solid red;\">\n" +
				" <tr>\n" +
				" <th style=\"border: 1px solid rgb(191, 191, 191);\">Table Header 11 </th>\n" +
				" <th style=\"border: 1px solid rgb(191, 191, 191);\">Table Header 22 </th>\n" +
				" </tr>\n" +
				" </thead>\n" +
				" <tbody>\n" +
				" <tr>\n" +
				" <td style=\"border: 1px solid rgb(191, 191, 191);\">Table cell 11</td>\n" +
				" <td style=\"border: 1px solid rgb(191, 191, 191);\">Table cell 22</td>\n" +
				" </tr>\n" +
				" <tr>\n" +
				" <td style=\"border: 1px solid rgb(191, 191, 191);\">Table cell 33 </td>\n" +
				" <td style=\"border: 1px solid rgb(191, 191, 191);\">Table cell 44</td>\n" +
				" </tr> \n" +
				" </tbody>\n" +
				"</table>\n" +
				" </td>\n" +
				"</tr>\n" +
				"</table> ";

				HTMLLoadOptions loadOptions = new HTMLLoadOptions(LoadFormat.HTML);
				loadOptions.setSupportDivTag(true);
				loadOptions.setAutoFitColsAndRows(true);

				ByteArrayInputStream bais = new ByteArrayInputStream(reportHTML.getBytes());
				Workbook workbook = new Workbook(bais, loadOptions);
				workbook.getWorksheets().get(0).autoFitColumns();
				workbook.save("f:\\files\\out1.xlsx"); 

Please find attached the output file for your reference. If you still find any issue using latest version/fix v19.8.x, kindly provide your expected file, we will check it soon.
files1.zip (7.0 KB)


#3

Thanks for quick response.

I have tested your code but still the output is not as expected.

italic font style is lost + table is not drawn properly in excel out.

Kindly find the attached excel , showing expected output & aspose cell out.

Book1.zip (8.5 KB)


#4

@ashish29190,

Thanks for the sample file.

We confirmed the italic font style is lost and we can log a ticket for it. But regarding other issue (“table is not drawn properly”), we have some other findings. Please note Aspose.Cells follows MS Excel standards and specifications when parsing or rendering HTML. Please create a common HTML file manually based on your devised HTML strings (you may write your HTML string into notepad and save to HTML file). Now open the HTML file into MS Excel manually and see the display/view for your formatted table. Aspose.Cells mimics rather should mimic the same view. If you find any issue/difference (while comparing it with Aspose.Cells’ results) other than italic font style, kindly provide your common HTML file and output results (by MS Excel and Aspose.Cells), we will check it soon.


#5

Thanks for confirming italic font is not supported.

As per your suggestion we have created attached html file ( we have used same html string , which we have shared in earlier example code )

when attached file is opened in ms excel , output is as expected but aspose cell is failing.

Request you to please look further into it.Noname.zip (531 Bytes)


#6

@ashish29190,

Thanks for providing us sample HTML file.

I was able to reproduce the issue as you mentioned by using your original sample code. I found that italic font style is lost and table is not rendered properly in HTML to Excel rendering. I also tested using your provided HTML file based on the HTML strings you shared earlier. I tried to open the file and save to XLSX file format but I got the same results:
e.g
Sample code:

HTMLLoadOptions loadOptions = new HTMLLoadOptions(LoadFormat.HTML);
        loadOptions.setSupportDivTag(true);
        loadOptions.setAutoFitColsAndRows(true);

        Workbook workbook = new Workbook("f:\\files\\Noname.html", loadOptions);
        workbook.getWorksheets().get(0).autoFitColumns();
        workbook.save("f:\\files\\out1.xlsx");

I have logged a ticket with an id “CELLSJAVA-43005” for your issue. We will look into it soon.

Once we have an update on it, we will let you know.