Aspose.Cells throws exception converting valid Excel to Word

Hello.

I’m trying to convert a very simple Excel file to Word. The file is valid and opens without any issue in Microsoft Excel, but whenever I try to convert it with Aspose (using version 23.1 of both aspose-cells and aspose-words) I get the following exception:

com.aspose.words.FileCorruptedException: The document appears to be corrupted and cannot be loaded.

I’ve traced this to the single cell in the SQL sheet that contains the text of an SQL statement. Please find the Excel file attached: SqlResultTest.zip (6.8 KB)

Here is the code I use to convert the excel (and append it to a word document):

private void appendExcelDocument(Document wordDocument, MultipartFile file) {
        try (ByteArrayOutputStream conversionStream = new ByteArrayOutputStream()){
            // Load the Excel file
            var workbook = new Workbook(new ByteArrayInputStream(file.getBytes()));
            // For each sheet set up the page orientation and the cells size
            for (int i = 0; i < workbook.getWorksheets().getCount(); i++ )
            {
                workbook.getWorksheets().get(i).getPageSetup().setOrientation(PageOrientationType.LANDSCAPE);
                workbook.getWorksheets().get(i).getPageSetup().setFitToPagesWide(1);
                workbook.getWorksheets().get(i).getPageSetup().setFitToPagesTall(0);
            }
            // Create the output stream to save the Excel conversion to word
            workbook.save(conversionStream, com.aspose.cells.SaveFormat.DOCX);

            var docAttachment = new Document(new ByteArrayInputStream(conversionStream.toByteArray()));
            // Don't copy the footers/headers from the target document in case they are already defined
            disableHeadersLink(docAttachment);
            // Remove any "empty" paragraphs the document may have in the end
            removeLastParagraph(docAttachment);

            List<Table> tables = Arrays.stream(docAttachment.getChildNodes(NodeType.TABLE, true).toArray())
                    .filter(Table.class::isInstance)
                    .map(Table.class::cast)
                    .collect(Collectors.toList());

            for (Table table: tables) {
                table.autoFit(AutoFitBehavior.AUTO_FIT_TO_WINDOW);
            }

            wordDocument.appendDocument(docAttachment, ImportFormatMode.KEEP_SOURCE_FORMATTING);
        } catch (IOException e) {
            // Pass the original exception as the cause so the stack trace is preserved
            throw new RuntimeException("Could not read content from file " + file.getName(), e);
        } catch (Exception e) {
            throw new RuntimeException("Could not create Aspose Excel document from file " + file.getName(), e);
        }
    }

private void disableHeadersLink(Document wordDocument) {
    // Disable headers/footers inheriting.
    wordDocument.getFirstSection().getHeadersFooters().linkToPrevious(false);
}

private void removeLastParagraph(Document wordDocument) {
    // Get the last paragraph in the document.
    Paragraph lastPara = wordDocument.getLastSection().getBody().getLastParagraph();
    // Remove last run in the paragraph if it contains page break.
    if (lastPara != null && lastPara.getRuns().getCount() > 0)
    {
        Run lastRun = lastPara.getRuns().get(lastPara.getRuns().getCount() - 1);
        if (lastRun.getText().equals(ControlChar.PAGE_BREAK))
            lastRun.remove();
    }
}

Thank you.

@m1tnick,

We were able to reproduce the issue by converting your template Excel file to DOCX. The output DOCX is corrupt, and MS Word itself shows error messages when opening it. We used the following sample code to produce the corrupt DOCX file:

        Workbook workbook = new Workbook("f:\\files\\SqlResultTest.xlsx");
        // For each sheet set up the page orientation and the cells size
        for (int i = 0; i < workbook.getWorksheets().getCount(); i++ )
        {
            workbook.getWorksheets().get(i).getPageSetup().setOrientation(PageOrientationType.LANDSCAPE);
            workbook.getWorksheets().get(i).getPageSetup().setFitToPagesWide(1);
            workbook.getWorksheets().get(i).getPageSetup().setFitToPagesTall(0);
        }

        workbook.save("f:\\files\\SqlResultTest.docx", com.aspose.cells.SaveFormat.DOCX);

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): CELLSJAVA-45150

You can obtain Paid Support services if you need support on a priority basis, along with the direct access to our Paid Support management team.

@m1tnick
The cell value is too long and overflows across too many columns. When the sheet is fitted to one page wide, the resulting table exceeds the column limit of a table in Word. So if a page contains many columns, please do not set fit to one page wide. We are also studying how to store many columns in the Word file.

Hello @simon.zhao, thank you for the reply.

I don’t think the length of the cell value is the cause of the issue. If you edit that specific cell in the Excel file I attached in the original post and add more content to it, the conversion succeeds. Actually, you don’t even need to add more content: just double-click the cell (which makes the cell “expand” in height) and then save the file. The conversion also succeeds.

@m1tnick
1. The value extends to cell CX2. When exporting the worksheet fitted to one page wide, we have to create a table with 102 columns in Word, and the maximum number of columns in a Word table is 63, so the file is corrupted. Please also check the print preview with fit to one page wide: you can only see a black line.
We will look into how to avoid a corrupted file if a page contains too many columns.
2. If you double-click the cell, the cell value is wrapped, so the maximum column is 1, and then it works.
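
For reference, the column arithmetic in point 1 can be checked with a small standalone sketch. It is plain Java and does not call any Aspose API; the class name WordColumnLimitCheck and its methods are hypothetical names chosen for illustration, and the limit of 63 columns is taken from the reply above.

```java
import java.util.Locale;

/** Illustrative sketch only: checks a column span against Word's table column limit. */
final class WordColumnLimitCheck {

    // Maximum number of columns in a Word table, as stated in the reply above.
    static final int WORD_MAX_TABLE_COLUMNS = 63;

    /** Converts Excel column letters (e.g. "A", "Z", "AA", "CX") to a 1-based column index. */
    static int columnIndex(String letters) {
        if (letters == null || letters.isEmpty()) {
            throw new IllegalArgumentException("Column name must not be empty");
        }
        int index = 0;
        for (char c : letters.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c < 'A' || c > 'Z') {
                throw new IllegalArgumentException("Not a column name: " + letters);
            }
            // Column names are a bijective base-26 number: A=1 ... Z=26, AA=27, ...
            index = index * 26 + (c - 'A' + 1);
        }
        return index;
    }

    /** Returns true if a table spanning columns A..lastColumn stays within the Word limit. */
    static boolean fitsInWordTable(String lastColumn) {
        return columnIndex(lastColumn) <= WORD_MAX_TABLE_COLUMNS;
    }

    public static void main(String[] args) {
        System.out.println(columnIndex("CX"));     // prints 102
        System.out.println(fitsInWordTable("CX")); // prints false
        System.out.println(fitsInWordTable("A"));  // prints true
    }
}
```

With the overflowing text reaching column CX the span is 102 columns, which is over the limit; once the cell is wrapped the content stays in column A and fits.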