Errors in Open Method for FileFormatType.CSV Import

The importer for CSV has some bugs I noticed:

1) The length is guessed wrong in some cases leading to truncated data
2) The double quote escape sequence doesn’t work correctly
3) The data import should remove trailing and leading spaces in a field
4) Data type is inconsistent causing data loss of preceding zeros (corrupts zip codes)

The following test file is a valid CSV and can be used to validate an import:

John,Doe,120 jefferson st.,Riverside, NJ, 08075
Jack,McGinnis,220 hobo Av.,Phila, PA,09119
"John ““Da Man””",Repici,120 Jefferson St.,Riverside, NJ,08075
Stephen,Tyler,“7452 Terrace ““At the Plaza”” road”,SomeTown,SD, 91234
,Blankman,SomeTown, SD, 00298
"Joan ““the bone””, Anne",Jet,“9th, at Terrace plc”,Desert City,CO,00123

Can you verify these bugs and create a fix?

Also, here are some useful references regarding CSV. I plan to use your object for a large project and CSV handling needs to be bulletproof or we need to write a set of classes for CSV imports.

http://overlord.evolvable.com/ExcelExport/CSVRules.txt

http://overlord.evolvable.com/ExcelExport/CSV%20Gotchas.txt


Also, kudos for your support and fix turn around. You guys are really fast!

Aaron, Evolvable Corp.


Hi Aaron,

Thanks for your report. I will fix this problem within one week. Could that serve your need?

Yes, thanks much for your response on a weekend! It will be nice to use just Aspose to handle all the inbound formats on our project. Aspose.Excel looks like an excellent object, and I look forward to using it over the next few months.

Hi Aaron,

Please try this attached fix. The first two problems are fix. But about the following two issue:

3) The data import should remove trailing and leading spaces in a field
4) Data type is inconsistent causing data loss of preceding zeros (corrupts zip codes)

I have different thoughts. When MS Excel opens your csv file, it doesn’t trim trailing and leading spaces. And it converts your zip code into numeric data which causes to lost the preceding zeros.

Aspose.Excel is a spreadsheet creating engine and it should following the MS Excel ways to processing CSV file. Otherwise, it will cause confusion. Thanks for your understanding.

You can try to remove the extra spaces and add double quotes to your zip codes with your own code.

I understand "Aspose.Excel is a spreadsheet creating engine and it should follow the MS Excel ways to processing CSV file".

Indeed opening a .csv file with Aspose open method, we are getting the same result as if the file had been opened with Excel. The data of the .csv file are imported by lines and each line is pasted in the first cell of every line.

Under Excel we are able to nicely dispatch the data per column by selecting the first column and by going in the menu Data > Convert and specifying comma as the delimeter and specifying that every values are contained within " ". Could you please tell me what would be the most appropriate way to simulate that behavior using Aspose.Cells methods ?

Regards, Renaud

Hi Renaud,

Thanks for considering Aspose.

Under Excel we are able to nicely dispatch the data per column by going in the menu Data > Convert and specifying comma as the delimeter and specifying that every values are contained within " "..........

I am not sure about your need. I don't find the menu option i.e., Data > Convert and specifying comma as the delimeter in MS Excel. Could you elaborate your query and give us some more details for your requirement. If possible create your desired spreadsheet in MS Excel and post it here. So, that we may able to understand what you really want.

Thank you.

Hello,

Sorry for not being precise enough with the menu titles, i was just trying to remember the english version of Excel.

As for an example to illustrate what happen when you open a .csv file with Excel (all values of the first line goes in the first cell of the first row...) and how using the "convert" option can dispatch the data correctly (one value per cell instead), the example provided in the first message of this post will illustrates this very well.

I'm pretty sure you can find the option i'm talking about and you should by now be able to tell me what would be the most appropriate way to do that using Aspose.Cells, if not let me know and i will come back to you with the exact Excel menu titles in english.

Thanks in advance,
Regards, Renaud

Hi,

I am still unable to find this "Convert" option for .csv file in MS Excel. Could you give us more details, possibly you may install English Version of MS Excel on your machine and check to tell the option, you may also try to create a sample test project to show the issue and create an excel spreadsheet to implement your requirement, zip it and post it here. We will check and try to implement this using Aspose.Cells API.

Thank you.

Hello,


Since I agree illustration through examples will be most understandable, please find hereby a zip file containing an example of a .csv file (MyExample_Source.csv) that I consider opening with the appropriate Workbook.Open Method (Stream, FileFormatType.CSV) and the target workbook I wish to obtain (MyExample_Target.xls).


Hope it helps for you to tell me the most appropriate and simple way to get there,
Regards, Renaud


PS: concerning convert option I was simply talking about : http://office.microsoft.com/en-us/excel/HP052573861033.aspx

Nevermind, i figured out while playing around with exportDataTableAsString method that, except removing leading zeros or space from fields Excel behavior, .csv data file were well imported in each column of the workbook.

Sorry for the misunderstanding,
Sincerely, Renaud