Excel to HTML string conversion in Java using Aspose.Cells

Hi Team,

I needed to convert Excel Sheet Table Data to Html String and use that string to save in the DB. There can be multiple sheets.

I know that we can convert Excel sheet with table data to HTML file using this code.

HtmlSaveOptions options = new HtmlSaveOptions();
options.setExportHeadings(true);
workbook.save(outputDir + “PrintHeadings_out.html”, options);

But I don’t want to save the file, instead I want to convert the table data to html string and save it in the DB.

Can you please help?

Sample Excel.zip (6.3 KB)

@ashish29190,

There is no direct way to accomplish your task. You have to use your own code to save to database/data source. Please note, Aspose.Cells for Java provides Cell.getHtmlString attribute which you may use to replace existing string by html string. See the sample code segment for your reference:
e.g.
Sample code:

Workbook book = new Workbook("f:\\files\\Sample Excel.xlsx");
        Worksheet sheet = book.getWorksheets().get(0);
        Range range = sheet.getCells().getMaxDisplayRange();//You may also create your desired range (in the worksheet) using, e.g sheet.getCells().createRange("A2", "B3");
        Iterator rangeIterator = range.iterator();
        while(rangeIterator.hasNext())
        {
            Cell cell = (Cell)rangeIterator.next();
            cell.putValue(cell.getHtmlString());
            //your code goes here.
        }

            //your code goes here.

Next you may export the data in the sheet(s) to fill an array or data table, see the document on data exporting options for your reference.

Now you may use your own Java code to insert/fill this data (in array or data table) to database/data source for your needs.

Hi @Amjad_Sahi,

Thanks for your reply.

I do have my logic to save the html string to DB.

But I am still not clear about how I am going to build a table. What I was looking for is an HTML string that will look alike the table in the excel.

I tried your code and saw HTML string getting created for each cell, but it is not anything related to HTML <TR> and <TD>.

So what I understand is that, I need to create a code where a table string will be created, and every cell’s html string needs to be appended to table string within TR and TD tags to form a complete table.

Please let me know if I am right.

Thanks,
Ashish M

@ashish29190,
You may try the following sample code which converts a table from an Excel file into Html and saves it to a file. This information can be used to save the table as string in the database. Give it a try and share the feedback.

// Instantiating a Workbook object
Workbook workbook = new Workbook();

// Obtaining the reference of the newly added worksheet by passing its sheet index
Worksheet worksheet = workbook.getWorksheets().get(0);

// Adding sample values to cells
worksheet.getCells().get("A1").putValue(50);
worksheet.getCells().get("A2").putValue(100);
worksheet.getCells().get("A3").putValue(150);
worksheet.getCells().get("B1").putValue(60);
worksheet.getCells().get("B2").putValue(32);
worksheet.getCells().get("B3").putValue(50);

// Set the print area.
worksheet.getPageSetup().setPrintArea("A1:B3");

// Initialize HtmlSaveOptions
HtmlSaveOptions options = new HtmlSaveOptions();

// Set flag to export print area only
options.setExportPrintAreaOnly(true);
options.setExportGridLines(true);

//Save to HTML format
workbook.save("output.html", options);

Hi @ahsaniqbalsidiqui,

Thanks for your reply.

I have used your idea and converted excel file to html file to get html information. I have used following code as per my need:

HtmlSaveOptions options = new HtmlSaveOptions(SaveFormat.HTML); options.setExportHiddenWorksheet(false); options.setExportImagesAsBase64(true); options.setExcludeUnusedStyles(true); options.setExportComments(false); options.setExportDocumentProperties(false); options.setExportWorkbookProperties(false); options.setExportWorksheetProperties(false); options.setDisableDownlevelRevealedComments(true);

The problem here is that the output html file generated contains commented code in the style tag. Though that part is commented, it works and attaches its style properties to the elements. Can we do anything in the aspose to change this thing.

PFA for the html file generated and Source Excel File.

Summary

aspose issue.zip (38.2 KB)

Note : The Html file is the file which contains html code that I got after reading the converted HTML file.

Regards.
Ashish M

@ashish29190,
I can not find the commented code about style tag in output.html file.
Could you please check the attachment created with Aspose.Cells for Java 20.12 and let us know your feedback.
asposeissue.zip (7.4 KB)

@ashish29190,
We have further investigated the issue and would like to share that there is no other option available in Aspose.Cells which can be used to trim the undesired text from the Html file. However it is quite easy to extract pure table contents from the html file as follows:

	private static void Cells_Html() throws Exception
	{
		// Instantiating a Workbook object
		Workbook workbook = new Workbook();

		// Obtaining the reference of the newly added worksheet by passing its sheet index
		Worksheet worksheet = workbook.getWorksheets().get(0);

		// Adding sample values to cells
		worksheet.getCells().get("A1").putValue(50);
		worksheet.getCells().get("A2").putValue(100);
		worksheet.getCells().get("A3").putValue(150);
		worksheet.getCells().get("B1").putValue(60);
		worksheet.getCells().get("B2").putValue(32);
		worksheet.getCells().get("B3").putValue(50);

		// Set the print area.
		worksheet.getPageSetup().setPrintArea("A1:B3");

		HtmlSaveOptions options = new HtmlSaveOptions(SaveFormat.HTML); 
		options.setExportHiddenWorksheet(false); 
		options.setExportImagesAsBase64(true);
		options.setExcludeUnusedStyles(true); 
		options.setExportComments(false); 
		options.setExportDocumentProperties(false); 
		options.setExportWorkbookProperties(false); 
		options.setExportWorksheetProperties(false); 
		options.setDisableDownlevelRevealedComments(true); 

		//Save to HTML format
		workbook.save("output.html", options);
		String data=readFile("output_files\\sheet001.htm",Charset.defaultCharset());
		String result = "<table"+ StringUtils.substringBetween(data, "<table", "</table>")+ "</table>";
		System.out.println(result);
	}
	static String readFile(String path, Charset encoding) throws IOException
	{
	  byte[] encoded = Files.readAllBytes(Paths.get(path));
	  return new String(encoded, encoding);
	}

Here is the program output:

<table border='0' cellpadding='0' cellspacing='0' width='128' style='border-collapse: 
 collapse;table-layout:fixed;width:96pt'>
 <col width='64' span='2' style='width:48pt'>
 <tr height='17' style='mso-height-source:userset;height:12.75pt' id='r0'>
<td height='17' width='64' align='right' x:num="50" style='text-align:right;height:12.75pt;width:48pt;'><a name="Print_Area" >50</a></td>
<td width='64' align='right' x:num="60" style='text-align:right;width:48pt;'>60</td>
 </tr>
 <tr height='17' style='mso-height-source:userset;height:12.75pt' id='r1'>
<td height='17' align='right' x:num="100" style='text-align:right;height:12.75pt;'>100</td>
<td align='right' x:num="32" style='text-align:right;'>32</td>
 </tr>
 <tr height='17' style='mso-height-source:userset;height:12.75pt' id='r2'>
<td height='17' align='right' x:num="150" style='text-align:right;height:12.75pt;'>150</td>
<td align='right' x:num="50" style='text-align:right;'>50</td>
 </tr>
 <tr height='0' style='display:none'>
  <td width='64' style='width:48pt'></td>
  <td width='64' style='width:48pt'></td>
 </tr>
</table>