Converting excel files with .xls or .xlsx extension to .csv add extra lines at the end

When converting an excel file with a .xls or .xlsx extension to csv using Apose.Cell version 7.5.2.0, the converted file has extra lines at the end of the last row.


Sample code:

string source = @“c:\book1.xls”;
string dest = @“c:\book1.csv”;

Workbook excel = new Workbook(source);
excel.Save(dest);
/* This does not work either
Workbook excel = new Workbook();
excel.Open(source);
excel.Save(dest, FileFormatType.CSV);
*/
Content of input file has:
A,B,C,D,E
1,XXXXX,XXXXX,XXXXX,XXXXX
2,XXXXX,XXXXX,XXXXX,XXXXX
3,XXXXX,XXXXX,XXXXX,XXXXX
4,XXXXX,XXXXX,XXXXX,XXXXX
5,XXXXX,XXXXX,XXXXX,XXXXX
6,XXXXX,XXXXX,XXXXX,XXXXX
7,XXXXX,XXXXX,XXXXX,XXXXX
8,XXXXX,XXXXX,XXXXX,XXXXX
9,XXXXX,XXXXX,XXXXX,XXXXX
10,XXXXX,XXXXX,XXXXX,XXXXX
11,XXXXX,XXXXX,XXXXX,XXXXX
12,XXXXX,XXXXX,XXXXX,XXXXX
13,XXXXX,XXXXX,XXXXX,XXXXX
14,XXXXX,XXXXX,XXXXX,XXXXX
15,XXXXX,XXXXX,XXXXX,XXXXX
16,XXXXX,XXXXX,XXXXX,XXXXX
17,XXXXX,XXXXX,XXXXX,XXXXX
18,XXXXX,XXXXX,XXXXX,XXXXX
19,XXXXX,XXXXX,XXXXX,XXXXX
20,XXXXX,XXXXX,XXXXX,XXXXX
21,XXXXX,XXXXX,XXXXX,XXXXX
22,XXXXX,XXXXX,XXXXX,XXXXX
23,XXXXX,XXXXX,XXXXX,XXXXX
24,XXXXX,XXXXX,XXXXX,XXXXX
25,XXXXX,XXXXX,XXXXX,XXXXX
26,XXXXX,XXXXX,XXXXX,XXXXX
27,XXXXX,XXXXX,XXXXX,XXXXX
Content of csv file:
A,B,C,D,E
1,XXXXX,XXXXX,XXXXX,XXXXX
2,XXXXX,XXXXX,XXXXX,XXXXX
3,XXXXX,XXXXX,XXXXX,XXXXX
4,XXXXX,XXXXX,XXXXX,XXXXX
5,XXXXX,XXXXX,XXXXX,XXXXX
6,XXXXX,XXXXX,XXXXX,XXXXX
7,XXXXX,XXXXX,XXXXX,XXXXX
8,XXXXX,XXXXX,XXXXX,XXXXX
9,XXXXX,XXXXX,XXXXX,XXXXX
10,XXXXX,XXXXX,XXXXX,XXXXX
11,XXXXX,XXXXX,XXXXX,XXXXX
12,XXXXX,XXXXX,XXXXX,XXXXX
13,XXXXX,XXXXX,XXXXX,XXXXX
14,XXXXX,XXXXX,XXXXX,XXXXX
15,XXXXX,XXXXX,XXXXX,XXXXX
16,XXXXX,XXXXX,XXXXX,XXXXX
17,XXXXX,XXXXX,XXXXX,XXXXX
18,XXXXX,XXXXX,XXXXX,XXXXX
19,XXXXX,XXXXX,XXXXX,XXXXX
20,XXXXX,XXXXX,XXXXX,XXXXX
21,XXXXX,XXXXX,XXXXX,XXXXX
22,XXXXX,XXXXX,XXXXX,XXXXX
23,XXXXX,XXXXX,XXXXX,XXXXX
24,XXXXX,XXXXX,XXXXX,XXXXX
25,XXXXX,XXXXX,XXXXX,XXXXX
26,XXXXX,XXXXX,XXXXX,XXXXX
27,XXXXX,XXXXX,XXXXX,XXXXX
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
I can’t use the DeleteBlankRow() function because if there is an empty row between rows that have data, I don’t want to remove it from the converted file.  For example:
25,XXXXX,XXXXX,XXXXX,XXXXX 
27,XXXXX,XXXXX,XXXXX,XXXXX 
Should return 
25,XXXXX,XXXXX,XXXXX,XXXXX 
27,XXXXX,XXXXX,XXXXX,XXXXX 
Instead of
25,XXXXX,XXXXX,XXXXX,XXXXX 27,XXXXX,XXXXX,XXXXX,XXXXX 
Please advise.
Thanks

Hi,

Thanks for using Aspose.Cells.

You can start iterating from the max row and then check if the current row is blank or not. If it is blank, you should delete the row and if is not blank, you should stop iterating. Finally, you can convert your file into csv format.

Please see the below code for your reference. It works fine as per your needs.

C#


string path = @“F:\Shak-Data-RW\Downloads\book1.xls”;


Workbook workbook = new Workbook(path);


Worksheet worksheet = workbook.Worksheets[0];


int maxR = worksheet.Cells.MaxRow;


for (int idx = maxR; idx > 0; idx–)

{

//if the row is not blank then exit the loop

if (worksheet.Cells.Rows[idx].IsBlank != true)

break;


//Remove the blank row

worksheet.Cells.DeleteRow(idx);

}


workbook.Save(path + “.out.csv”);

So is this more of a work around for this problem because I think this is a bug in your product and should be fixed. However if you don’t have an immediate fix for this problem will there be an performance issue such as consuming a lot of memory or having to use processing time to go and delete rows when there are many more rows then what’s in my sample file?


Please advise.

Hi,

Thanks for your posting and using Aspose.Cells.

It is not a bug of DeleteBlankRows(), it’s purpose is to delete rows that are blanks regardless of their position.

Now, there is a better way to solve this problem. Please use MaxRow and MaxDataRow properties. When there are blank rows at the end, then MaxRow will be greater than MaxDataRow, you can then delete the rows by calculating the difference between them.

Please see the following code for your reference.

C#


string path = @“F:\Shak-Data-RW\Downloads\book1.xls”;


Workbook workbook = new Workbook(path);


Worksheet worksheet = workbook.Worksheets[0];


int maxR = worksheet.Cells.MaxRow;

int maxDR = worksheet.Cells.MaxDataRow;


if (maxR > maxDR)

worksheet.Cells.DeleteRows(maxDR + 1, maxR - maxDR);


workbook.Save(path + “.out.csv”);

I think you have missed my point. My point is that the Apose.Cell component is appending extra rows to the end of the output file during the conversion when the input file doesn’t contain them. Please look at the original input file book1.xls and see that there are 28 rows and the converted output file book1.csv has 100 rows. Where did all these extra rows come from?

Hi,

Thanks for your feedback and using Aspose.Cells.

Actually, your xls file contains extra blank rows and when it is converted to csv either with MS-Excel or Aspose.Cells, these extra rows appear in the output csv file. You can manually convert your xls file into csv by MS-Excel and see this behavior. Aspose.Cells and MS-Excel both behaves same so it is not a bug of Aspose.Cells.

If you’re saying there are empty rows, it’s defnitely not visible to the eyes. Additionally, I have tested what you are describing through your com dll, the really old version dating 2005 versus the Apose.Cell and they behave differently for my input file. The really old version works well and doesn’t add the extra rows. Can you explain the difference? Both the new and old library test was to convert the .xls to .csv.

Hi,

Thanks for your posting and using Aspose.Cells.

Yes, these blank rows are not visible. These rows might contain styles or other settings but these are blanks, therefore these blank cells are different than non-existent cells.

Please save your xls file into csv file manually using MS-Excel, you will find these blank rows in the form of , as you have shown in the first post. MS-Excel and Aspose.Cells behave same. It could be a bug of older version that if it treats blank rows as non-existent rows.

Hi I am using below code to covert the file from xls to csv

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
[DelimitedRecord("|")]
using Aspose.Cells;

namespace SampleProject
{
class Program
{
static void Main(string[] args)
{
// Directory path for input and output files.
string dirPath = “PATH”;

        // Load input XLS file inside the Aspose.Cells workbook object.
        Aspose.Cells.Workbook workbook = new Workbook(dirPath + "file.xls");

        // Save the workbook in output CSV format.
        workbook.Save(dirPath + "file.csv", SaveFormat.CSV);


    }
}

}

file is getting converted but not in proper format one line data is coming in two lines

Can you please help

@gauti0710,

Thanks for the code segment.

Please zip your template file “file.xls” and attach it here, we will check it soon.

I can’t share file as that is having confidential information.

There is a column in which address is mentioned from that it is going to other line

@gauti0710,

We are very sorry but without a template Excel file, we cannot evaluate your issue and hence cannot help precisely. We recommend you to kindly create a new file with some dummy data and provide us here. Alternatively, you may change your confidential data with some dummy data and re-save the file to provide us.

PFA and help

XLStocsv_Dummy.zip (12.9 KB)

@gauti0710,
I have tested the scenario but could not observe any issue. Could you please highlight the text in the attached CSV (which is generated using the latest version Aspose.Cells for .NET 20.7.x) where issue can be observed for our reference? I have checked the address column but it is not broken anywhere.

XLStocsv_Dummy (2).zip (1.9 KB)

Hello,

After TowerA it is going in to other line , PFA screenshot and i also want to put a blank column in front means comma and i want | delimited file Do you know how i will get this in my code.
image.png (7.9 KB)

I think problem is double quote
“Sambhavan park,TowerA
Bhorivali east Marg, Rahul nagar
IN/Badlapur 76534.”

If i am copying the data of the column i am getting in double quotes

Is aspose can remove that?

@gauti0710,
If you convert your XLS to CSV using MS Excel, you will get the same output as generated by Aspose.Cells. It is obvious as your source file contains text in this format in the address column.

However you may try the following logic where I have just updated contents of one cell for your reference. You may write your own logic to implement the same on the address column to get the address in single line. If it does not fulfill your requirement, please perform your desired task using Excel ONLY and share the steps with us along with the output file. We will try to provide you assistance to get same output using Aspose.Cells.

Aspose.Cells.Workbook workbook = new Workbook(dirPath + "XLStocsv_Dummy.XLS");
//Process text in a sample cell say P13
var StringValue = workbook.Worksheets[0].Cells["P13"].StringValue;
StringValue = StringValue.Replace('\n', ' ');
workbook.Worksheets[0].Cells["P13"].Value = StringValue;
// Save the workbook in output CSV format.
workbook.Save(dirPath + "XLStocsv_Dummy.csv", Aspose.Cells.SaveFormat.CSV);

After running this code when you open the output CSV file in Excel, you will see that cell P13 in CSV contains appropriate text.

Thanks It is working fine for P13 if want to do for whole p column what should i do?

@gauti0710,
You may perform this task in a loop as follows:

Aspose.Cells.Workbook workbook = new Workbook(dirPath + "XLStocsv_Dummy.XLS");
for (int iRow = 12; iRow < 68; iRow++)
{
    var StringValue = workbook.Worksheets[0].Cells[iRow, 15].StringValue;
    StringValue = StringValue.Replace('\n', ' ');
    workbook.Worksheets[0].Cells[iRow,15].Value = StringValue;
}
            
// Save the workbook in output CSV format.
workbook.Save(dirPath + "XLStocsv_Dummy.csv", Aspose.Cells.SaveFormat.CSV);

Hi ahsan,

It is working fine Thanks.

Last question i want to put | as delimiter how can i put in the code.