HTML-to-Excel Conversion using Aspose.Cells or any other Aspose Product(s)

We have a requirement to read some legacy reports pushed out in HTML and convert them to Excel. I don't see this feature available in Aspose.Cells, but was wondering if you could offer any guidance to accomplish this task.

We're open to purchasing other Aspose products to assist with the conversion if required. Obviously, we need to preserve the formatting of the text in the HTML.

Thanks,

Ed

Hi,

Please try the latest version, Aspose.Cells for .NET v7.0.3.4, to export your HTML to Excel documents.

If some of your reports fail to convert, please post them here and we will check whether they can be supported.

Also, you could try Microsoft Office Automation to convert your reports from HTML to Excel documents.

Thanks for a quick reply. By way of background, our goal is to take HTML reports from a legacy system and use Aspose.Cells to convert those reports to Excel. We DO NOT want to use Microsoft Office to avoid license costs as these conversions must take place on a server farm.

We conducted a test with 7.0.3.4 as you suggested. The newer version does a better job. However, there were some issues. Attached is a zip file with two files: (1) the input HTML file (7.html) and (2) the resulting output produced by Aspose.Cells (Output7.html.xlsx).

As you can see, the table values made it to Excel. However:

  1. The underlining on the column headers was removed.
  2. The number formats were removed.
  3. The HTML header values are not included in the Excel file.

Are the missing items a result of bugs in Aspose.Cells, unsupported features or was there something wrong with our test?

Hi,

Thanks for your files.

We will look into whether your HTML report can be converted to an XLSX report correctly. We have logged this issue in our database and will update you as soon as possible.

This issue has been logged as CELLSNET-40183.

I noticed the original post in this thread indicates that issue CELLSNET-40183 has been Resolved. I didn't get notified, please confirm that this problem is in fact resolved.

Secondly, the CELLSNET-40183 hyperlink takes me to a log-in page for a Nanjing JIRA bug tracking system. Is this just for Aspose personnel or do clients have access to the system?

Thank you,

Ed

Hi,


The issue should be resolved in the latest version, e.g. v7.1.0:
http://www.aspose.com/community/files/51/.net-components/aspose.cells-for-.net/entry355234.aspx

Please try it and let us know the result.

Also, our issue tracking system is internal, so you cannot access it. Please ask us here for the status of, or updates on, your issues.

Thank you.

Thank you for the quick reply to this issue. As you suggested, I ran a test with the latest 7.1 version. The test demonstrated good progress. You guys did address the three explicit issues I had identified earlier. However, I believe your conversion logic still requires a little tweaking. For example:

  1. The numeric columns in the spreadsheet (C, D & E) lost their formatting. They were comma delimited in the HTML. The cells should be of Number type.
  2. The percentage column (F) did not convert as a Percentage type.

With respect to item 1, the conversion should respect the prevailing culture setting of the thread performing the work. It should recognize numeric values even if the number is delimited by something other than a comma (e.g. some cultures use periods as thousand-delimiters). For the record, I didn't test this but want to make sure it's covered on your end.
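To illustrate the culture point in item 1, here is a minimal sketch that uses only the standard .NET `System.Globalization` API, not Aspose.Cells itself. The helper name `TryParseCell` and the sample strings are hypothetical, and this is not a claim about how Aspose.Cells parses HTML cells internally. It only shows that the same digits need different cultures to be read as the same number.

```csharp
using System;
using System.Globalization;

public static class CultureNumberDemo
{
    // Hypothetical helper: tries to read a cell's text as a number,
    // honoring the thousands and decimal separators of the given culture.
    public static bool TryParseCell(string text, CultureInfo culture, out double value)
    {
        return double.TryParse(text.Trim(), NumberStyles.Number, culture, out value);
    }

    public static void Main()
    {
        // en-US writes 1,234.56 while de-DE writes 1.234,56 for the same value.
        bool okUs = TryParseCell("1,234.56", CultureInfo.GetCultureInfo("en-US"), out double us);
        bool okDe = TryParseCell("1.234,56", CultureInfo.GetCultureInfo("de-DE"), out double de);

        // Print with the invariant culture so the output does not depend on the machine.
        Console.WriteLine(okUs + " " + us.ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(okDe + " " + de.ToString(CultureInfo.InvariantCulture));
    }
}
```

Passing `CultureInfo.CurrentCulture` as the `culture` argument would follow the culture of the thread doing the work, which is the behavior requested above.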

For your convenience, I am attaching the test HTML input file and XLSX output file from my test.

Again, I appreciate the improvements made by 7.1 and look forward to you guys addressing the aforementioned issues.

Kindly confirm that the above will be issues you will look into. If you feel these are not valid issues please let me know.

Thanks,

Ed

Hi,

Thanks for your feedback, for testing the issue with the latest version, and for providing the needed files.

We have logged the provided information in our database. Once the issue is resolved or we have some update for you, we will let you know asap.

I noticed the CELLSNET-40183 issue is flagged as RESOLVED. Does Aspose still plan to look into, and hopefully resolve, the remaining issues reported (see previous post)? If so, why is the issue flagged as Resolved? I have a pending project waiting on the ability of Aspose.Cells to accurately convert HTML reports to Excel.

Thanks,

Ed

Hi,

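We ran the code below using the latest version, Aspose.Cells for .NET v7.1.0.1:

C#

using Aspose.Cells;

// Load the HTML report into a workbook.
string filePath = @"F:\7.html";
Workbook workbook = new Workbook(filePath);

// Save the workbook as an Excel (.xls) file next to the input.
workbook.Save(filePath + ".net.xls");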

The output file is attached, and it appears to be OK.

If there are still problems, please circle them in red in a screenshot.


Judging from the output it looks like you addressed the formatting issues I previously raised.

The only item I noticed is the "Total:" literal on column "A", row "11". On the spreadsheet it's normal font even though in the source HTML it's bold. I didn't pick up on this in my last review and neither did your developers. Kindly fix this bug. Nevertheless, this is forward progress.

The v7.1.0.1 link doesn't work for me. I presume this is because it is not yet publicly available. I can wait for the next release, especially if you can address the aforementioned font issue.

Thanks for your help.

- Ed

Hi,

Please download and try the latest version, Aspose.Cells for .NET v7.1.0.2, and let us know your feedback.

Please circle the problems, including the font issue, in red in a screenshot. You can use MS Paint for this purpose.

It will help us sort out the problems.

Thanks for providing access to 7.1.0.2. I ran a test. Unfortunately, the field in question is still not bold in the spreadsheet, as it is in the original source HTML file.

Attached is a zip file containing (1) the original HTML, (2) a JPG with the field circled in red, and (3) the XLSX output file annotated to again highlight the erroneous field.

Let me know if I can be of any further assistance to resolve the problem.

Thanks,

Ed

Hi,

Thanks for your files.

We found the cause of the bold font issue and fixed it yesterday. The fix will be included in the next version, which we will provide to you soon.

That's great news. I appreciate all your help. The tool is at a point that it's definitely usable for us.

FYI, there's one other cosmetic issue that was brought to my attention yesterday by a fellow team member. The Aspose.Cells HTML-to-Excel conversion apparently ignores the HTML table cell alignment attribute (e.g. align="right"). In the sample report I gave you in my previous post, all of the column headings have explicit align values, but they're not aligned accordingly in the Excel output. This won't be a show stopper for us, but that added fidelity and attention to detail just makes the product that much better.

I look forward to the next release with the stated bug fixes.

Kind regards,

Ed

Hi,

Thanks for your posting.

We have also added the issue mentioned by you against the issue id: CELLSNET-40183

We will look into it and update you asap.

Hi,

We have fixed this issue. Please download: Aspose.Cells for .NET v7.1.0.4

I tested v7.1.0.4. You nailed it!! All previously reported issues have been fixed.

Thank you for your patience and attention to these issues.

Cheers,

Ed

Hi,

We have fixed these issues. You can also download and use the latest version: Aspose.Cells for .NET v7.1.0.6

The issues you have found earlier (filed as CELLSNET-40183) have been fixed in this update.

