Suppressing Duplicate Rows

How can we suppress duplicate rows of data when importing from Datatable to ExcelSheet using Aspose.Net

Hi,


Well, Aspose.Cells only provides means (APIs) to import data (which can be redundant or duplicated) from different data sources including data table, arrays, list etc., it does not provide any such feature to find out it there is any duplicate rows or records in the table, you have to use your own code or statements to accomplish your task. You may filter your records while retrieving it into data table, so Aspose.Cells could import data from the underlying datatable to your Excel worksheets.

By the way, I think you may use Distinct keyword in SQL Statement while retrieving data from your data source, see the topic for your reference:
http://stackoverflow.com/questions/1641718/how-to-select-unique-records-by-sql

Thank you.

Hi,

As discussed earlier, Aspose.Cells for .NET does not provide any means to suppress/remove duplicate data rows while importing the data from a DataTable. Therefore, you may have to write your own custom routines to remove the duplication from the DataTable before importing it to worksheet cells.

Please check the below provided code snippet that uses a Hashtable and an ArrayList for this purpose.

C#


public static DataTable RemoveDuplicateRows(DataTable dTable, string colName)
{
Hashtable hTable = new Hashtable();
ArrayList duplicateList = new ArrayList();
//Add list of all the unique item value to Hashtable, which stores combination of key, value pair.
//And add duplicate item value in ArrayList.
foreach (DataRow drow in dTable.Rows)
{
if (hTable.Contains(drow[colName]))
duplicateList.Add(drow);
else
hTable.Add(drow[colName], string.Empty);
}
//Removing duplicate items from DataTable.
foreach (DataRow dRow in duplicateList)
dTable.Rows.Remove(dRow);
//DataTable which contains unique records will be returned.
return dTable;
}

In case you have further questions or concerns, please feel free to write back any time.

Hi,

My intention is to ask if Aspose.cells, provide any functionality,using which we will be able to retain only first record from a set of duplicate records.

ex.

record1

record1

record1

record2

record2

It would become

record1

. (empty rows for duplicate records)

.

record2

.(empty row for duplicate records)

Hi there,

Thank you for writing back.

You can achieve your requirement by modifying the previously provided code snippet as follow,

  1. Sort the DataTable on a given column so that all duplicate rows are aligned
  2. Identify and store the duplicate rows
  3. Iterate over the list of duplicate entries
  • Remove the duplicate row
  • Insert an empty row
C#

public static DataTable SupressDuplicateRows(DataTable dTable, string colName)
{
Hashtable hTable = new Hashtable();
ArrayList duplicateList = new ArrayList();
//Sort the DataTable on a given Column
//So that all duplicate rows are aligned
dTable.DefaultView.Sort = colName + " ASC";
//Store the sorted DataTable
dTable = dTable.DefaultView.ToTable();
//Add duplicate item value in arraylist
foreach (DataRow drow in dTable.Rows)
{
if (hTable.Contains(drow[colName]))
duplicateList.Add(drow);
else
hTable.Add(drow[colName], string.Empty);
}

//Supressing duplicate items from datatable
foreach (DataRow dRow in duplicateList)
{
//Get the index of duplicate row
int index = dTable.Rows.IndexOf(dRow);
//Remove the row
dTable.Rows.Remove(dRow);
//Insert an empty row at same index
dTable.Rows.InsertAt(dTable.NewRow(), index);
}
return dTable;
}

I have tested the method on my end, and it works as expected. Please let us know if you need further assistance in this regard.