Conditional formatting speed

Hi.

Is there some way to improve perfomance of processing a book with conditional formatting?
We have the large book with many conditional formatting.
Standard processing of book with same size (without conditional formatting) spent about 3sec
But when we tried to use cell.getDisplayStyle(), that considers conditional formatting, processing of same book increase up to 15 minutes.

Best regards. Alexey

@makarovalv
Would you like to provide sample file and test code? We will check it soon.

@makarovalv

For some kinds of format conditions, using read-only cache( StartAccessCache(AccessCacheOptions)) may improve the performance. Code example:

            workbook.StartAccessCache(AccessCacheOptions.All);
            //access display styles for cells
            workbook.CloseAccessCache(AccessCacheOptions.All);

Please try the option to check whether it can give you better performance. If not, please provide us the sample file and code so we will investigate whether we can provide solution for better performance.

Hello.

I didn’t find such an example. There is another example, but it also shows that conditional formatting has significant impact.

@Test
	void test() throws Exception {
		Workbook workbook = new Workbook("E://with_conditional.xlsx");
		workbook.startAccessCache(AccessCacheOptions.ALL);
		Cells cells = workbook.getWorksheets().get(0).getCells();
		long start = System.nanoTime();
		StringBuilder sb = new StringBuilder();
		for (int i = 0; i < cells.getMaxDataRow(); i++) {
			for (int j = 0; j < cells.getMaxDataColumn(); j++) {
				sb.append(" ").append(cells.get(i,j).getDisplayStyle().getForegroundArgbColor());
			}
			sb.append("\n");
		}
		workbook.closeAccessCache(AccessCacheOptions.ALL);
		System.out.println(java.util.concurrent.TimeUnit.MILLISECONDS.convert((System.nanoTime() - start), TimeUnit.NANOSECONDS));
		System.out.println(sb.length());
	}

example.zip (6.2 MB)

@makarovalv
By using sample files and code for testing, we can find performance issues caused by conditional formatting.

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): CELLSJAVA-45847

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Use read only cache to access display styles for cell which may increase processing speed.

@Jiyasharma
Thank you for your feedback. After testing, use read only cache to access display styles for cell which does not significantly improve processing speed. We will carefully research and test performance issues. Hopefully we could figure it out soon. Once there is an update, we will notify you immediately.

@Jiyasharma

There are large amount of format condition with formula like:
=AND(AND(DATE(YEAR($BS20),MONTH($BS20),DAY($BS20))=DATE(YEAR(CN$18),MONTH(CN$18),DAY(CN$18)),DATE(YEAR($BS20),MONTH($BS20),DAY($BS20))=DATE(YEAR(BR$20),MONTH(BR$20),DAY(BR$20))),OR($M20=“fixed”,$M20=“conflict”))

When such kind of conditions are applied to large range of cells, the formula will be calculated one by one for every cell(the base cell changes, then the formula changes too and so has to be calculated separately, and cache mechanism also has no effect). Obviously, calculating such kind of complicated formula so many times with different data requires a lot of time and we are afraid we have no better solution to improve the performance for calculation.

However, we think you may optimize the formulas for the conditional formatting to get better performance. For example,
to compare the date built from one cell, you can use INT function and compare the result, then the original formula:
DATE(YEAR($BS20),MONTH($BS20),DAY($BS20))=DATE(YEAR(CN$18),MONTH(CN$18),DAY(CN$18))
can be changed to
INT($BS20)=INT(CN$18)

to compare the time, if there is no requirement for high precision, then the original formula:
TIME(HOUR($BS20),MINUTE($BS20),SECOND($BS20))=TIME(0,0,0)
can be changed to
INT($BS20)=CN$18

and
TIME(HOUR($BS20),MINUTE($BS20),SECOND($BS20))<>TIME(0,0,0)
can be changed to
INT($BS20)<>CN$18

By our simple test, after changing those formulas in this the time cost can be reduced to no more than half of the original one.

Ok it sounds good.
Thank you.

Thank you for the detailed explanation and suggestions for optimizing the conditional formatting formulas. We’ll review and implement the proposed changes to improve processing speed. Your insights are greatly appreciated.
Thanks and Regards.
Jiya Sharma

@Jiyasharma
You are welcome. And if you have any questions when applying those changes, or have any other issues, please let us know and we will try our best to provide assistance.

Thank you very much sure I’ll let you know.

@Jiyasharma,

You are welcome.