Poor performance when Copying Worksheets between Workbooks


#1

I have noticed very poor performance when copying worksheets between workbooks.

I am creating multiple worksheets in individual workbooks. I add each workbook to a generic queue before processing it. I use multiple threads to populate those worksheets. When all of the work is done, I use a single (main) thread to combine all of my worksheets into a single workbook.

Sample Code:

private Queue<Workbook> queueWorkBook = new Queue<Workbook>();

queueWorkBook.Enqueue(newWorkBook);
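Since Queue<T> is not thread-safe on its own, the enqueue is guarded when worker threads touch the shared queue. A minimal sketch (the lock object and helper method names are just placeholders):

```csharp
// Sketch: serializing access to the shared queue across worker threads.
private readonly Queue<Workbook> queueWorkBook = new Queue<Workbook>();
private readonly object queueLock = new object();   // placeholder name

private void EnqueueWorkBook(Workbook newWorkBook)
{
    lock (queueLock)
    {
        queueWorkBook.Enqueue(newWorkBook);
    }
}
```

(A ConcurrentQueue<Workbook> would serve the same purpose without an explicit lock.)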

I then combine all of those workbooks (and their sheets) back into a single workbook before saving it.

Sample Code:

Workbook dequeuedWorkBook = queueWorkBook.Dequeue();

int intWorkSheet = workBookToBeSaved.Worksheets.Add();

workBookToBeSaved.Worksheets[intWorkSheet].Copy(dequeuedWorkBook.Worksheets[0]);

workBookToBeSaved.Worksheets[intWorkSheet].Name = dequeuedWorkBook.Worksheets[0].Name;
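Put together, the combine step looks roughly like this. This is a sketch built from the fragments above; `workBookToBeSaved` is the target workbook and `outputPath` is a placeholder:

```csharp
// Sketch of the combine loop, assuming Worksheets.Add() returns the
// index of the newly added sheet (as in the fragments above).
while (queueWorkBook.Count > 0)
{
    Workbook dequeuedWorkBook = queueWorkBook.Dequeue();

    int intWorkSheet = workBookToBeSaved.Worksheets.Add();
    workBookToBeSaved.Worksheets[intWorkSheet].Copy(dequeuedWorkBook.Worksheets[0]);
    workBookToBeSaved.Worksheets[intWorkSheet].Name = dequeuedWorkBook.Worksheets[0].Name;
}

workBookToBeSaved.Save(outputPath);   // outputPath is a placeholder
```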

What surprises me most about this process is that I was able to create and populate 113 workbooks and worksheets in under six minutes. However, it took over 18 minutes to combine those sheets back into a single workbook. This is true even when I create and populate each of the worksheets on a single thread, so the difference in performance has nothing to do with the fact that I am using multiple threads. It still takes over three times as long to combine the worksheets into a workbook.

What would you suggest is my performance problem and how can I correct this behavior?

Thanks James


#2

Hi James,

What's the size of the resulting combined workbook?

When you combine small files into a not-too-large file, the execution time is approximately linear in the file size. For example, if creating a 20K file takes 0.1 second, creating a 2MB file may take 11~12 seconds.

However, if you combine large files into a huge file, the time increases sharply. For example, if creating a 2MB file takes 10 seconds, creating a 200MB file may take 4000 seconds.

And when more files are combined, the execution time also increases sharply with the number of files. For example, if combining 1 file takes 1 second, combining 100 files may take 150~200 seconds.

(The numbers above are only meant to describe this scenario; they should not be used to estimate actual performance.)

This performance degradation is caused by:

1. If the file size is huge, it consumes more memory, which causes more rounds of memory allocation and re-allocation. More time is also spent on searching, combining, and other actions.

2. When more files are combined, much more time is spent on combining data and formulas, updating references, and other actions.


#3

Hi Laurence and thanks for the reply.

The saved workbook (Excel file) is over 50 MB. It is built from about 110 worksheets (and workbooks). I am only doing this because I want to use multiple threads to create my worksheets, and that requires me to use individual workbooks in order to be thread-safe.

Is there a better solution to this problem?

James


#4

This workbook is too large. Why do you create such a large file? Is it possible to divide it into smaller files?


#5

This is a sales tracking report. I cannot do anything about the size because the information is what it is.

The report is being divided into multiple tabs because each tab represents a different salesperson, and there are 100+ salespeople being reported on in this report. Sales management does not want to receive 100+ different reports; they only want to see one report, tabbed by individual salesperson.

I will assume this means there is no more efficient way to do this other than to create smaller files. Is that correct?

James


#6

Yes. Another way is to try to make individual reports smaller.
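If the report can be split, the same combine loop could flush its output to a new file every N sheets instead of building one 50 MB file. A rough sketch only; the threshold and file names are placeholders, not part of the API:

```csharp
// Hypothetical: start a new output workbook every 20 sheets
// so that no single file grows too large.
int sheetsPerFile = 20;   // placeholder, tune to the target file size
int fileIndex = 0;
Workbook target = new Workbook();

while (queueWorkBook.Count > 0)
{
    Workbook source = queueWorkBook.Dequeue();

    int idx = target.Worksheets.Add();
    target.Worksheets[idx].Copy(source.Worksheets[0]);
    target.Worksheets[idx].Name = source.Worksheets[0].Name;

    // Flush when the threshold is reached or the queue is empty.
    if (target.Worksheets.Count >= sheetsPerFile || queueWorkBook.Count == 0)
    {
        target.Save("SalesReport_" + fileIndex + ".xlsx");  // placeholder name
        fileIndex++;
        target = new Workbook();
    }
}
```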