Filter data while reading excel

I am using LightCellsDataHandler for reading a large excel sheet.

Now, I have some filter to be applied to the excel sheet. So, when I read the data using LightCellsDataHandler, it should give me only the filtered data.

Any way to achieve this? And how can we find the possible operators applicable for the column?

In other words,

  1. Open the sheet.
  2. Apply the filters.
  3. Read the filtered data.

Any sample program to achieve this?

Hi,


Thanks for your posting and using Aspose.Cells.

In order to look into your issue, please provide us your sample code, sample excel files and the screenshots that illustrate your requirements. We will investigate this issue and in case, it is a New Feature, we will add it in our database to support it in our future versions.

However, you can filter data in normal excel Aspose.Cells APIs using the following sample code.

I have attached the source excel file used in this code and output pdf generated by it for your reference. As you can see, in output pdf, chart is not present.

Java
//Specify the load options and filter the data
//We do not want to load charts
LoadOptions options = new LoadOptions();
options.setLoadFilter(new LoadFilter(LoadDataFilterOptions.ALL & ~LoadDataFilterOptions.CHART));

//Load the workbook with specified load options
Workbook workbook = new Workbook(dirPath + “sample.xlsx”, options);

//Save the workbook in output format
workbook.save(dirPath + “LoadExcelFileWithoutChart_out.pdf”, SaveFormat.PDF);


C#
//Specify the load options and filter the data
//We do not want to load charts
LoadOptions options = new LoadOptions();
options.LoadFilter = new LoadFilter(LoadDataFilterOptions.All & ~LoadDataFilterOptions.Chart);

//Load the workbook with specified load options
Workbook workbook = new Workbook(“sample.xlsx”, options);

//Save the workbook in output format

workbook.Save(“LoadExcelFileWithoutChart_out.pdf”, SaveFormat.Pdf);

Attached an excel sheet. The third column has filter: show rows having value greater than 500. When we open the excel in MS excel, there is only one such row which is visible after applying the filter.

If we read the excel file using LightCellsHandler API, it does NOT honour the filter applied. It reads all the data - reads ten rows.

Through aspose cells java library, is there any way to read the just one row as we can see in the UI?

Hi,


Thanks for your posting and clarifying the issue as well as considering Aspose.Cells.

Please spare us some time. We will look into your issue and update you asap.

Any updates?

Hi,


Thanks for your posting and using Aspose.Cells.

You can achieve your requirements i.e. Reading the filtered data easily with Aspose.Cells API. We will provide you a sample code if it fits your requirements. However, with light cells API, we need more time to look into this issue.

Hi,

Thanks for using Aspose.Cells.

Please see the following sample code which achieves your requirements in Normal mode. Please check the source excel file used in this code and output excel file generated by it. The code first applies the auto filter on source excel file first worksheet i.e. Sheet1, then it copies the filtered rows into newly added second worksheet i.e. Dest and saves the output excel file.

If you open the output excel file and check the Dest sheet, you will see the filtered rows which were obtained after applying filter on Sheet1

Please note, in order to obtained all the filtered rows, we have used PasteOptions.setOnlyVisibleCells(true) method.

Java

Workbook wb = new Workbook(dirPath + “source.xlsx”);

Worksheet dstSheet = wb.getWorksheets().add(“Dest”);

Worksheet ws = wb.getWorksheets().get(0);

ws.getAutoFilter().setRange(“A1:H1”);

//ws.getAutoFilter().addFilter(2, “1”);

ws.getAutoFilter().addFilter(2, “4”);

//ws.getAutoFilter().addFilter(2, “6”);

ws.getAutoFilter().addFilter(2, “8”);

ws.getAutoFilter().refresh();

Range rng = ws.getCells().createRange(“A1”, “J11”);

Range dstRng = dstSheet.getCells().createRange(“A1”, “J11”);

PasteOptions opts = new PasteOptions();

opts.setOnlyVisibleCells(true);

opts.setPasteType(PasteType.VALUES);

dstRng.copy(rng, opts);

wb.save(dirPath + “output.xlsx”);

Hi,


Thanks for using Aspose.Cells.

We have also logged your original requirements in our database for evaluation. We will look into it and let you know if it is possible to achieve your goals using LightCellsDataHandler and what are the limitations if any. Once, there is some news for you, we will let you know asap.

This issue has been logged as

  • CELLSJAVA-42119 - Read the filtered data using LightCellsDataHandler

Hi,


Thanks for your posting and using Aspose.Cells.

We have already provided you a solution for Normal mode. Please check this post for your reference.

The reply below is related to only LightCells mode.

We cannot apply the filter defined in template file while loading it with LightCells. However, you can implement the filter by yourself in the implementation of LightCellsDataHandler.

Below is an example to show the usage:

Java
LoadOptions opts = new LoadOptions();
LightCellsDataHandlerPrepareFilter ph = new LightCellsDataHandlerPrepareFilter();
opts.setLightCellsDataHandler(ph);
Workbook wb = new Workbook(“sample.xlsx”, opts); //gather rows by filter
opts.setLightCellsDataHandler(new LightCellsDataHandlerFilter(ph.getKeptRows()));
wb = new Workbook(“sample.xlsx”, opts); //load rows that matches filter

//-------------------------
//-------------------------

class LightCellsDataHandlerFilter implements LightCellsDataHandler
{
private int[][] keptRows;
private int[] currentRows;
private int nextPos = 0;
private boolean noKeep = false;
public LightCellsDataHandlerFilter(int[][] keptRows)
{
this.keptRows = keptRows;
}
public boolean startSheet(Worksheet sheet)
{
currentRows = keptRows[sheet.getIndex()];
if(currentRows == null || currentRows.length < 1)
{
return false;
}
nextPos = 0;
noKeep = false;
return true;
}

public boolean startRow(int rowIndex)
{
    if(noKeep)
    {
        return false;
    }
    if(rowIndex < currentRows[nextPos])
    {
        return false;
    }
    nextPos++;
    if(nextPos == currentRows.length)
    {
        noKeep = true;
        return false;
    }
    return true;
}

public boolean processRow(Row row)
{
    return true;
}

public boolean startCell(int column)
{
    return true;
}

public boolean processCell(Cell cell)
{
    return true;
}

}
class LightCellsDataHandlerPrepareFilter implements LightCellsDataHandler
{
private boolean keepRow = false;
private int prevSheetIndex = -1;
private int prevRowIndex = -1;
private ArrayList alRows;
private int[][] keptRows;

public int[][] getKeptRows()
{
    if(prevSheetIndex < 0)
    {
        return keptRows;
    }
    if(keepRow)
    {
        alRows.add(prevRowIndex);
        keepRow = false;
    }
    int[] rows = new int[alRows.size()];
    for(int i=0; i<rows.length; i++)
    {
        rows[i] = alRows.get(i);
    }
    keptRows[prevSheetIndex] = rows;
    prevSheetIndex = -1;
    return keptRows;
}
public boolean startSheet(Worksheet sheet)
{
    if(prevSheetIndex < 0)
    {
        keptRows = new int[sheet.getWorkbook().getWorksheets().getCount()][];
    }
    else
    {
        if(keepRow)
        {
            alRows.add(prevRowIndex);
        }
        int[] rows = new int[alRows.size()];
        for(int i=0; i<rows.length; i++)
        {
            rows[i] = alRows.get(i);
        }
        keptRows[prevSheetIndex] = rows;
    }
    alRows = new ArrayList<Integer>();
    prevSheetIndex = sheet.getIndex();
    return true;
}

public boolean startRow(int rowIndex)
{
    keepRow = true;
    return true;
}

public boolean processRow(Row row)
{
    return true;
}

public boolean startCell(int column)
{
    return keepRow;
}

public boolean processCell(Cell cell)
{
    if(cell.getColumn() == 2)
    {
        int type = cell.getType();
        if((type== CellValueType.IS_NUMERIC || type == CellValueType.IS_DATE_TIME))
        {
            double dv = cell.getDoubleValue();
            if(dv > 600.0 && dv < 620.0)
            {
                return false;
            }
        }
        keepRow = false;
    }
    return false;
}
}