Embedded Chart exported to HTML is bounded to a cell

Hello,

We need to import an Excel worksheet into a Word document. As prescribed (in this post), we are first exporting the worksheet to HTML and then importing the HTML into the Word document using the DocumentBuilder class. However, when the worksheet contains an embedded chart, the chart is shrunk down and bounded to a single cell and the resulting table in the Word document looks bad.

You can observe this behavior using the attached Excel file ("EmbeddedChart.xlsx") and the following code (which runs on the latest versions of Aspose.Cells and Aspose.Words):

import java.io.ByteArrayOutputStream;
import java.io.File;

import com.aspose.cells.HtmlSaveOptions;
import com.aspose.cells.SaveFormat;
import com.aspose.cells.Workbook;
import com.aspose.cells.Worksheet;
import com.aspose.words.Document;
import com.aspose.words.DocumentBuilder;
import com.aspose.words.Paragraph;

private static void exportExcel2WordTest() {
    try {
        Workbook wb = new Workbook("EmbeddedChart.xlsx");
        Worksheet sheet = wb.getWorksheets().get("Sheet1");
        wb.getWorksheets().setActiveSheetIndex(sheet.getIndex());

        // export the worksheet
        HtmlSaveOptions options = new HtmlSaveOptions(SaveFormat.HTML);
        options.setExportActiveWorksheetOnly(true);

        String html = null;
        ByteArrayOutputStream byOut = new ByteArrayOutputStream();
        try {
            wb.save(byOut, options);
            html = new String(byOut.toByteArray());
            // TODO: there seems to be a bug in Aspose where some extra bytes are
            // prepended prior to the first '<' in the HTML so we remove it here:
            html = html.substring(html.indexOf('<'));

            // remove the temporary directory which is always generated
            // TODO: This is returning null for some reason
            String tempDir = options.getAttachedFilesDirectory();
            if (tempDir != null) {
                File dir = new File(tempDir);
                if (dir.exists()) {
                    // Note: File.delete() only removes a directory that is empty
                    dir.delete();
                }
            }
        } finally {
            byOut.close();
        }

        Document wdDoc = new Document();
        DocumentBuilder builder = new DocumentBuilder(wdDoc);
        Paragraph para = wdDoc.getLastSection().getBody().getLastParagraph();
        builder.moveTo(para);
        builder.startBookmark("Xl_Content");
        // TODO: writeln needed for the html to be added within the bookmark,
        // but also adds an empty line
        builder.writeln();
        builder.insertHtml(html, false);
        builder.endBookmark("Xl_Content");
        wdDoc.save("Excel2Doc.docx");

    } catch (Exception ex) {
        System.out.println("Unexpected EXCEPTION: " + ex.getMessage());
        ex.printStackTrace();
    }
    System.out.println("End of exportExcel2WordTest method");
}

If you compile and run the above code (and download the attached “EmbeddedChart.xlsx” file), you should see that although the embedded chart from the worksheet does get imported, it is shrunk and bounded to a single cell producing unexpected results in the Word document.

Additionally (and less importantly), I have some follow-up questions related to the TODO items in the code above:
  1. The HTML generated from the Excel file always seems to have extra bytes before the opening "<" character. I remove them manually and they don't seem to affect the imported content, but I wonder why they exist.
  2. The export of the worksheet to HTML always generates a folder. For example, the code above produces a new folder named "EmbeddedChart_files". While I understand the need for the folder, I would like to remove it after the import to Word is complete. Is there any way to avoid having the folder generated? If not, how can I determine the folder name from Aspose so that I can remove it manually?
  3. I want to put the worksheet content within a bookmark. However, it seems necessary to add an empty line (via "builder.writeln()") in order to accomplish this. This results in an extra line at the end of the imported content, which I don't need. Is there any way to insert the HTML without having to insert the extra line?
Thanks in advance for your help.
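
On question 1, one plausible explanation (an assumption; nothing in this thread confirms it) is that the output starts with a UTF-8 byte order mark (bytes EF BB BF), which new String(byte[]) then decodes with the platform default charset. A minimal sketch of decoding explicitly as UTF-8 and skipping such a mark follows; the class and method names are hypothetical and no Aspose API is involved:

```java
import java.nio.charset.StandardCharsets;

class BomStrip {
    // Decodes UTF-8 bytes, skipping a leading byte order mark (EF BB BF) if present.
    static String decodeUtf8SkippingBom(byte[] bytes) {
        int offset = 0;
        if (bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF) {
            offset = 3;
        }
        return new String(bytes, offset, bytes.length - offset, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the bytes written to the ByteArrayOutputStream.
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 'h', 't', 'm', 'l', '>'};
        System.out.println(decodeUtf8SkippingBom(withBom));
    }
}
```

If the leading bytes turn out to be something other than a byte order mark, the substring workaround in the code above remains the safer choice.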

Hi,

Thanks for your posting and using Aspose.Cells.

Please download and use the latest version, Aspose.Cells for Java 8.5.1. It should fix your issue.

I have tested this issue with the following sample code and it generates correct html without any chart issue. I have attached the output html and screenshot showing the html output for your reference.

Java


// Backslashes in a Java string literal must be escaped
String filePath = "F:\\Shak-Data-RW\\Downloads\\EmbeddedChart.xlsx";

Workbook wb = new Workbook(filePath);
Worksheet sheet = wb.getWorksheets().get("Sheet1");
wb.getWorksheets().setActiveSheetIndex(sheet.getIndex());

// export the worksheet
HtmlSaveOptions options = new HtmlSaveOptions(SaveFormat.HTML);
options.setExportActiveWorksheetOnly(true);

wb.save(filePath + ".out.htm", options);

Hi Shakeel,

Thank you for your prompt response and feedback.

The issue is not really with the HTML file produced by the code you provided. The issue is how the exported HTML (produced by Aspose.Cells) is imported into a Word document by Aspose.Words (via the DocumentBuilder "insertHtml" API).

So the screenshot you provided is what I would expect to see within the Word document. However, if you compile and run the code I provided above and open the resulting Word file ("Excel2Doc.docx"), you should see that the chart is shrunk and scaled to fit within a single cell of the generated Word table.

Note that I did try using Aspose.Cells for Java 8.5.1 (and Aspose.Words for Java 15.6.0) and the results were the same. If you could please try to reproduce the issue using the code I provided, I think you will agree with me that the output in the Word file is not similar to the output from the HTML file that you attached.

Thanks again.

Hi,


Thank you for writing back.

We have executed the complete scenario, that is, exporting the worksheet to HTML with Aspose.Cells for Java 8.5.1 and embedding it into a Word document using Aspose.Words for Java 15.6.0. We have noticed that the HTML generated by the Aspose.Cells for Java API renders the chart correctly when viewed in any browser; however, when the same HTML is embedded in a Word document, the chart is squeezed. Please check the attached resultant files generated with the code provided at the end of this post.

In order to investigate the matter from Aspose.Words perspective, we have moved this thread to Aspose.Total support forum. We will shortly get back to you with more updates in this regard.

Java

Workbook wb = new Workbook("D:/EmbeddedChart.xlsx");
Worksheet sheet = wb.getWorksheets().get("Sheet1");
wb.getWorksheets().setActiveSheetIndex(sheet.getIndex());
HtmlSaveOptions options = new HtmlSaveOptions(SaveFormat.HTML);
options.setExportActiveWorksheetOnly(true);
// embed images in the HTML so that no "_files" folder is generated
options.setExportImagesAsBase64(true);

String html = null;
ByteArrayOutputStream byOut = new ByteArrayOutputStream();
wb.save(byOut, options);
html = new String(byOut.toByteArray());
byOut.close();

// write the intermediate HTML to disk for inspection
BufferedWriter bw = new BufferedWriter(new FileWriter("D:/html.html"));
bw.write(html);
bw.close();

Document wdDoc = new Document();
DocumentBuilder builder = new DocumentBuilder(wdDoc);
Paragraph para = wdDoc.getLastSection().getBody().getLastParagraph();
builder.moveTo(para);
builder.startBookmark("Xl_Content");
builder.writeln();
builder.insertHtml(html, false);
builder.endBookmark("Xl_Content");
wdDoc.save("D:/Excel2Doc.docx");
oraspose:

Additionally (and less importantly), there are some follow up questions I have related to the TODO items in the code above:
  1. The HTML generated from the Excel file always seems to have extra bytes before the opening "<" character. I remove them manually and they don't seem to affect the imported content, but I wonder why they exist.
  2. The export of the worksheet to HTML always generates a folder. For example, the code above produces a new folder named "EmbeddedChart_files". While I understand the need for the folder, I would like to remove it after the import to Word is complete. Is there any way to avoid having the folder generated? If not, how can I determine the folder name from Aspose so that I can remove it manually?

Regarding the two points above, please find the answers below.

  1. I have attached the HTML generated with the latest version, Aspose.Cells for Java 8.5.1, and I am not able to see any extra characters before the opening "<". Could you please check the html.html file in the previously attached archive?
  2. The spreadsheet to HTML conversion process creates the folder to save the images and the child HTML files for the worksheets. As you are currently converting only the active worksheet, the folder will contain only the images (of charts and shapes). You can avoid the folder generation by calling HtmlSaveOptions.setExportImagesAsBase64(true), as demonstrated in the code snippet shared earlier.
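
To illustrate the idea behind point 2: with Base64 embedding, the image bytes travel inside the HTML itself as a data URI, so there is no separate image file and no folder to hold it. The sketch below shows that general technique with the standard library only. The class and method names are hypothetical, and the exact markup Aspose.Cells emits is not shown in this thread, so treat the tag format as an assumption:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

class DataUriSketch {
    // Builds an <img> tag whose src carries the image bytes inline as a Base64 data URI.
    static String inlineImageTag(byte[] imageBytes, String mimeType) {
        String encoded = Base64.getEncoder().encodeToString(imageBytes);
        return "<img src=\"data:" + mimeType + ";base64," + encoded + "\">";
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for a rendered chart image; not a real PNG.
        byte[] fakeImage = "chart".getBytes(StandardCharsets.US_ASCII);
        System.out.println(inlineImageTag(fakeImage, "image/png"));
    }
}
```

Because the whole document is then a single string, it can be passed to insertHtml without any files on disk.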

Thank you Babar!

I’m glad that you were able to reproduce the issue and I appreciate the additional details you provided. I will wait for the good people at Aspose Total to provide additional feedback.

In the meantime, you are correct regarding the additional two points. I will make the appropriate changes on my end.

Thank you again for your investigation and feedback.
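
For cases where Base64 embedding is not an option and the generated folder still has to be removed by hand (question 2 above), note that File.delete() only removes an empty directory. A minimal sketch of a recursive delete using java.nio follows; the class and method names are hypothetical, and the temporary directory merely stands in for a folder such as "EmbeddedChart_files":

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FolderCleanup {
    // Deletes a directory and everything under it; does nothing if the path is absent.
    static void deleteRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        List<Path> ordered;
        try (Stream<Path> paths = Files.walk(dir)) {
            // Reverse order puts children before their parent directories.
            ordered = paths.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : ordered) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the generated image folder.
        Path dir = Files.createTempDirectory("EmbeddedChart_files");
        Files.write(dir.resolve("image001.png"), new byte[] {1, 2, 3});
        deleteRecursively(dir);
        System.out.println("exists after cleanup: " + Files.exists(dir));
    }
}
```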

Hi,

I want to put the worksheet content within a bookmark. However, it seems necessary to add an empty line (via "builder.writeln()") in order to accomplish this. This results in an extra line at the end of the imported content, which I don't need. Is there any way to insert the HTML without having to insert the extra line?

It is not necessary to add an empty line at the start of a bookmark for normal text or HTML. However, your worksheet content starts with a table, and you cannot start a table and a bookmark on the same line. If you try to do the same in MS Word, the bookmark start will be moved inside the table.

There must be a paragraph (a blank line) where the bookmark can start before the table begins, so this is expected behavior.

Please check the last example at https://docs.aspose.com/words/java/working-with-bookmarks/ for more details on how to bookmark a table.

Best Regards,

Hi Muhammad,

Thank you for your feedback and explanation. I was able to observe the behavior you describe and agree with your assessment.

By the way, do you know whether there have been any updates on the primary issue of this thread, the "chart shrinkage" that occurs when the embedded chart from Excel is imported into Word?

Thank you and regards.

Hi,

The chart shrinkage issue has been logged in our issue tracking system as WORDSNET-12210. We will be able to share more details once our product team completes their analysis. We will keep you updated on this issue in this thread.

Best Regards,

The issues you have found earlier (filed as WORDSNET-12210) have been fixed in this .NET update and this Java update.

