How to get HTML of a Cell & Range of Cells Entirely in Memory

How do you get the HTML representation, with all formatting (i.e. borders, background, font settings, etc.), from a specific cell and a range of cells (named range would be better), without saving as another workbook or file.

I would like to keep this as an in memory operation. All search results I have found on the forum seem to require saving the workbook. A Java example would be preferable.

@DanielAbc123,

Thanks for your query.

Well, you may try to use Cell.getHtmlString() to retreive the HTML representation of cell with its value for your requirements. There is no such attribute for Range or Name objects in the APIs, so you have to traverse/loop though the range cells and use the suggested attribute for individual cells for the purpose.

Cell.getHtmlString() does not return all properties (i.e. formatting like borders font settings, background, foreground, rowspan, colspan, etc.) for the cell. How can I achieve this?

Can a feature be added to keep this in memory for a range?

How about also adding a feature for saving the workbook HTML to a stream or something?

@DanielAbc123,

You may try following sample code which exports a range of cells to Html which is later saved to memory. Please give it a try and provide your feedback.

// Load the Excel file.
Workbook wb = new Workbook("sample.xlsx");

// Access the sheet
Worksheet ws = wb.getWorksheets().get(0);

// Set the print area.
ws.getPageSetup().setPrintArea("D2:M20");

// Initialize HtmlSaveOptions
HtmlSaveOptions options = new HtmlSaveOptions();

// Set flag to export print area only
options.setExportPrintAreaOnly(true);

// Save the workbook in memory stream
ByteArrayOutputStream baout = new ByteArrayOutputStream();
wb.save(baout, options);

The sample code you sent is working great! Thanks.

I have one more question. If cell D2 referenced above in the sample code is a named range called “MyFancyCell”, how can I get that named range in the HTML for that TD (i.e. as an ID or Name of “MyFancyCell”)? Is there some setting or function built into Aspose.Cells that allows that named range to come out into the HTML? If not, can you think of another way to get that out into the HTML?

@DanielAbc123,

You may try following sample code which exports a named range to Html. Please give it a try and provide your feedback.

/**
* Generates HTML from the given named range within the given Excel workbook and
* saves the HTML as a file.
*
* @param xlFile full path to the Excel file.
* @param xlRange String representing the named range to be converted to HTML
* @param outFile full path of output file.
*/
private static void getHtml(String xlFile, String xlRange, String outFile) {

try {
Workbook wb = new Workbook(xlFile);
com.aspose.cells.NameCollection names = wb.getWorksheets().getNames();
Name name = names.get(xlRange);

Range rng = name.getRange();
Worksheet sheet = rng.getWorksheet();
Cells cells = sheet.getCells();
int firstCol = rng.getFirstColumn();
int firstRow = rng.getFirstRow();
int maxCol = cells.getMaxDisplayRange().getColumnCount();
int maxRow = cells.getMaxDisplayRange().getRowCount();

// hide all the rows and columns that are not part of the range
if (firstCol > 0) {
cells.hideColumns(0, firstCol - 1);
}
if (firstCol + rng.getColumnCount() < maxCol) {
cells.hideColumns(firstCol + rng.getColumnCount(), maxCol);
}
if (firstRow > 0) {
cells.hideRows(0, firstRow - 1);
}
if (firstRow + rng.getRowCount() < maxRow) {
cells.hideRows(firstRow + rng.getRowCount(), maxRow);
}

wb.getWorksheets().setActiveSheetIndex(sheet.getIndex());

// export the worksheet
com.aspose.cells.HtmlSaveOptions options = new com.aspose.cells.HtmlSaveOptions(SaveFormat.Html);
options.setEncoding(com.aspose.cells.Encoding.getUTF8());
options.setHtmlCrossStringType(com.aspose.cells.HtmlCrossType.CROSS);
options.setPresentationPreference(true);
options.setExportHiddenWorksheet(false);
options.setExportActiveWorksheetOnly(true);
options.setExportImagesAsBase64(true); // aovids the temp folder
options.setCreateDirectory(false);
options.setExpImageToTempDir(false);
options.setHiddenColDisplayType(com.aspose.cells.HtmlHiddenColDisplayType.REMOVE);
options.setHiddenRowDisplayType(com.aspose.cells.HtmlHiddenRowDisplayType.REMOVE);

wb.save(outFile, options);

} catch (Exception ex) {
System.out.println("Unexpected exception: " + ex.getMessage());
ex.printStackTrace();
}
}

The above code is awesome but not exactly what I was looking for. Let me try and explain it better.

In the HTML that is output, is there a way to get the name of named cells that are within the named range and/or their addresses exported to the HTML as an ID or HTML 5 data-* attribute of the TD’s and TABLE? I would think there might be some HTMLSaveOptions settings or another feature that would allow me to get this in the HTML.

Notice the data-* attributes below. If this is not available, can you make this a feature pretty quickly, or can you think of another way that I could do this?

<table data-range-address='A1:J20' data-named-range='MyTableAsHtml'>
    <tr height='21' style='mso-height-source:userset;height:15.75pt' id='r0'>
        <td colspan='10' height='18' class='EPM_x21' width='974' style='height:13.5pt;' data-range-address='A1' data-named-range='MyFancyCell'>Fancy Cell Value</td>
    </tr>
    ...
</table>

Let me know if you need more information.

@DanielAbc123,

Aspose.Cells follows MS Excel standards and specifications when parsing or rendering HTMLs. I am not sure if MS Excel gives different results than what Aspose.Cells gives for exporting HTML. If you think MS Excel works other way and can render as per your desired results using specific options, let us know with details and steps involved, sample files (input and output) and some screenshots to highlight it, we will check it on how to do it via Aspose.Cells APIs.