Xlsx to html conversion issue

I am using Aspose cell java API to convert XLSX file to HTML

<dependency>
  <groupId>com.aspose</groupId>
  <artifactId>aspose-cells</artifactId>
  <version>21.6</version>
</dependency>

I tried this by two different ways.

First Method (Failing)

File file = new File("path/to/file/TestAspose.xlsx");
ByteArrayInputStream in = new ByteArrayInputStream(FileUtils.readFileToByteArray(file));
ByteArrayOutputStream out = new ByteArrayOutputStream();

Workbook workbook = new Workbook(in);
HtmlSaveOptions options = new HtmlSaveOptions();
options.setPresentationPreference(true);
workbook.save(out, options);

FileOutputStream fout = new FileOutputStream("TestAspose.html");
fout.write(out.toByteArray());
fout.flush();

This is what I can see when I open the converted file(.html) open in the chrome browser.

Second Method (Succeeded)

File file = new File("path/to/file/TestAspose.xlsx");
ByteArrayInputStream in = new ByteArrayInputStream(FileUtils.readFileToByteArray(file));
ByteArrayOutputStream out = new ByteArrayOutputStream();

Workbook workbook = new Workbook(in);
HtmlSaveOptions options = new HtmlSaveOptions();
options.setPresentationPreference(true);
workbook.save("TestAsposeWorking.html", options);

Both cases can convert the file without any error but first way the converted file not loading properly and in second case can load the generated html in the browser without any issue. But I need to use first way save method to do the task. Here I have attached the different of these generated files as a screenshot.


(in image Second Method Result Left & First Method Result Right)

Appreciate if anyone can explain what is the issue in my code in first way.

OS : Ubuntu 20.4
Java Version : openjdk version “1.8.0_282”
Maven Version : Apache Maven 3.6.3

@hasithwijerathna,
We are working on your sample XLSX files and will share our feedback soon.

Thank you @ahsaniqbalsidiqui :slight_smile:

@hasithwijerathna,
We have observed this issue and logged it inn our database for further investigation. We will write back here once any update is ready for sharing.

This issue is logged as:
CELLSJAVA-43533 - XLSX to Html creation issue in Ubuntu

1 Like

Hi @ahsaniqbalsidiqui, Is there any temporary solution until your are fixing it or can you guys suggest any older version this is working.

@hasithwijerathna,
We will update you once it is investigated and some details are ready to be shared. Please spare us some time.

@ahsaniqbalsidiqui Ok thanks for the update.

@hasithwijerathna,

We evaluated your issue further. This is not an issue with the APIs. When you export a file to HTML, it contains multiple files. If you write to a stream, an error would occur. You can export each sheet page in the file separately, and then combine them as needed. According to the example file, you can use the following code to fix your issue:
e.g.
Sample code:

File file = new File("path/to/file/TestAspose.xlsx");
ByteArrayInputStream in = new ByteArrayInputStream(FileUtils.readFileToByteArray(file));
ByteArrayOutputStream out = new ByteArrayOutputStream();

Workbook workbook = new Workbook(in);
HtmlSaveOptions options = new HtmlSaveOptions();
options.setPresentationPreference(true);

//add this line to export current active worksheet only.
options.setExportActiveWorksheetOnly(true);

workbook.save(out, options);

FileOutputStream fout = new FileOutputStream("TestAspose.html");
fout.write(out.toByteArray());
fout.flush();

Please try it and let us know your feedback.

@hasithwijerathna,

Moreover, you can also implement the interface “IStreamProvider” to get the correct results:
The sample code is as follows:
e.g.
Sample code:

String filePath = "D:\\JAVA43533\\";
String outputPath = filePath + "output\\";

ByteArrayOutputStream out = new ByteArrayOutputStream();
Workbook workbook = new Workbook(filePath + "TestAspose.xlsx");

HtmlSaveOptions saveOptions = new HtmlSaveOptions();
HtmlSaveOptions options = new HtmlSaveOptions();
options.setPresentationPreference(true);        
saveOptions.setStreamProvider(new ExportStreamProvider(outputPath));        

FileOutputStream stream = new FileOutputStream(filePath + "out.html");
workbook.save(stream, saveOptions);

public class ExportStreamProvider implements IStreamProvider
{
    private String    outputDir;

    public ExportStreamProvider(String dir)
    {
        outputDir = dir;
    }

    @Override
    public void closeStream(StreamProviderOptions options) throws Exception
    {
        if (options != null && options.getStream() != null)
        {
            options.getStream().close();
        }
    }

    @Override
    public void initStream(StreamProviderOptions options) throws Exception
    {
        File file = new File(outputDir);
        if (!file.exists() && !file.isDirectory())
        {
            file.mkdirs();
        }
        String defaultPath = options.getDefaultPath();
        String path = outputDir + defaultPath.substring(defaultPath.lastIndexOf("/") + 1);
        options.setCustomPath(path);
        options.setStream(new FileOutputStream(path));
    }
}

Hope, this helps a bit.

@Amjad_Sahi I tried this also and it worked but In here I think we cannot get converted all sheets if multiple sheets available in the provided xlsx.

@hasithwijerathna,

Yes, if you use the code segment (in the post), only active sheet would be converted to HTML. If a workbook contains multiple sheets, you got to activate each sheet (one by one) to generate separate HTMLs and then finally combine the HTMLs to single HTML.

Alternatively, you may try to use the code segment shared here.

Let us know if you still have any issue, we will check it soon.

@Amjad_Sahi

I can see several files generated when calling
workbook.save(tmpFileName, options);

root_folder.JPG (12.2 KB)
inside the folder i can see like this
inside_folder.JPG (17.1 KB)

but is there a way to get each of those files inside the folder as a output stream (like ByteArrayOutputStream)

if i call like below
ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream(); workbook.save(byteArrayOutputStream, SaveFormat.HTML);

I am getting only the html file as a stream and then without other file I cannot open this file (as I mentioned in the very first comment.)

Do you need to render single HTML file per worksheet? Please see the code segment in the post for your reference. Please share your sample code (runnable) and template file(s). Please zip the files prior attaching here.

I am not sure about your requirements. Could you elaborate and give us more details and samples for reference.

Ok I got your point in the previous comment regarding the active sheet…

If I generate each sheet separately, is there a way to combine those file in to a single file like tab separated html with your library.

like below
image.png (5.8 KB)

@hasithwijerathna,
1, Could you post a template file and your excepted file here? We will check it ASAP?
We can try to write all data to a single table , but we do not how to layout these sheets.
2, Now you can merge all workshes to one worksheet with Range.Copy() method, then remove other worksheets.

@Override
public byte[] convert(Source source) throws ConversionException {

UUID uuid = UUID.randomUUID();
String tmpFileName = String.format("%s.html", uuid);
Path filePath = Paths.get(tmpFileName);
String tmpFolder = String.format("%s_files", uuid);
try (
    ByteArrayInputStream in = new ByteArrayInputStream(source.getStream())) {

    Workbook workbook = new Workbook(in);

    HtmlSaveOptions options = new HtmlSaveOptions();
    options.setPresentationPreference(true);

    workbook.save(tmpFileName, options);
    log.info("Temp file crated with name {}, and related files inside the {} folder", tmpFileName, tmpFolder);

    return Files.readAllBytes(filePath);
} catch (Exception e) {
    throw new ConversionException("Error when converting document", e);
} finally {
    try {
        Files.delete(filePath);
        log.info("File {} Deleted", tmpFileName);
        FileUtils.deleteDirectory(new File(tmpFolder));
        log.info("All created temp files deleted : {}, {}", tmpFileName, tmpFolder);
    } catch (IOException e) {
        log.error("Delete error : {}", e.getMessage());
    }
}

}

Above is the current implementation and we are going to avoid the IO part here.

Currently we are uploading the generated tmpFolder(with files) to our aws S3 bucket. if we have a way to get a single stream for the whole content of the xlsx(Excel file with multiple work sheets) file we can upload it to s3 without an IO operation.

1). Template File
Students.zip (10.1 KB)
Expected File as a stream
expectedFile.zip (8.7 KB)

2). We actually don’t want to merge work sheet to one. We need a Html with multiple tabs for each sheet.

if there is a way to join for a meeting I can demo it for you

@hasithwijerathna,

  1. We think you may try to save your Excel workbook to .mht file format. This way there will be single file (having all sheet tabs in it) for your requirements. See the document on how to save to MHTL file format for reference.

  2. Do you need to get something like: Developer Guide|Documentation ?

We are sorry but we do not provide support via net meetings or phone. The best way to get support is via forums (that you are using).

@hasithwijerathna,

We have recorded your requirements against your existing ticket “CELLSJAVA-43533” into our database for thorough investigations. We will look into it soon. Once we have an update on it, we will let you know.

@Amjad_Sahi Thank you for the update, I tried your M_HTML conversion thing even in early you mentioned but the thing is it cannot open in the browser directly.

here is the file I converted
Students.zip (6.1 KB)
I was unable to open it in a browser with all the content. but in Internet Explorer only showing the first work sheet.

@hasithwijerathna,

Thanks for your feedback.

Since we have recorded your requirements (as per our previous post in the thread) now, so, let us evaluate it and we will try to support your needs.

Once we have an update on it, we will let you know here.

1 Like