Convert Large Excel Files to PDF

I wanted to get some recommendation on the best approach to convert Microsoft Excel files to a PDF. Using the basic Aspose function I can easily convert a small Excel file to PDF. When I say “small” I mean an Excel file that will return a less then 20 page PDF.

Excel excel = new Excel();
excel.Open(_SourceDir + "tempsource.xls");
excel.Save(_SourceDir + "tempEXCEL.xml", FileFormatType.AsposePdf);

Pdf pdf = new Pdf();
pdf.BindXML(_SourceDir + "tempEXCEL.xml", null);
pdf.Save(_DestinationDir + "outputofxls.pdf");

What about “huge” Excel files? When I say “huge” I mean an Excel file that will return 5000 page PDF? In the industry I work in, this is a very command thing.

I’m not sure if the code above works when trying to make a large PDF file. When calling “pdf.save” it sites there and accumulates memory and I have no way of checking progress.

Does Aspose fire any events that would indicate it is still working on the file?

Does anyone know of a better solution for this; other than a print driver?

Many thanks,

Ryan

Hi Ryan,

I tried to convert an Excel file to a pdf file with 800 pages. It runs about 1 minute to convert xls file to Aspose.Pdf xml file and 7 minues to convert this xml file to pdf file. The output xml file is about 70MB.

So I think Aspose.Pdf may consume a large amount of memory which stuck the program when processing a huge xml file.

Generally I suggest you to split your xls file to smaller ones.

I moved your question to Aspose.Pdf forum. Our Pdf team may have some suggestion on this conversion.

Hi Ryan,

Please refer to section Using XML of Writing PDF directly.

Currently Aspose.Pdf is not efficient to process single large table. If you can split the large table into more small tables you can get better performance. We will optimize the performance for table in the future version.

Thank you for your assistance. I have some additional questions I would greatly appreciate your assistance answering:

How do you recommend splitting up tables? I know this is possible using Microsoft Com, but can it be done with Aspose?

Does it help Aspose if you save each page into an individual PDF files; instead of saving to a multi page PDF?

Does Aspose fire any events to indicate the progress of creating a PDF?

Many thanks,

Ryan

Thank you for this information. I have a component that converts formatted text to a PDF file. I changed it to use the method that you suggested and it seems to be running much better. Before I was seeing problems using XML and Aspose.PDF after 3000 files, now I am not having any problems.

Do you have an additional information on how to use this method with Excel files?

Many thanks,

Ryan

Dear Ryan,

Using Aspose.Pdf, you can split the table before the Save() method. Here is an example to split a 1000 rows table into two 500 rows tables:

//the table to be split
Table table1 = ...;

//create a new table and add it into the section
Table newTable = table1.Clone() as Table;
pdf1.Sections[0].Paragraphs.Add(newTable);

//copy rows
newTable.Rows = new Rows();
for(int i = 500; i < 1000; i++)
newTable.Rows.Add(table1.RowsIdea [I]);

//remove rows in original table
for(int j = 999; j > 500; j--)
table1.Rows.RemoveAt(j);

How do you convert Aspose.Cells.Workbook.Worksheets to Table so I can iterate through it?

Thanks,

Ryan

Below is an example of what I am trying to do.

I would like to find a way to save a WorkSheet at a time – instead of the entire Workbook – to a single PDF file. My hope is, doing this will make saves more stable when dealing with large excel files.

Can someone please look at the comment at the end of my code and let me know how to accomplish this?

Many thanks,

Ryan

CODE:

string xlsFile = @"readme.xls";

string pdfFile = xlsFile + ".pdf";

Workbook objDoc = new Workbook();

objDoc.Open(xlsFile);

/* PDF we are going to write to */

FileStream fs = new FileStream(pdfFile, FileMode.Create);

Pdf pdf = new Pdf(fs);

foreach (Worksheet objWorksheet in objDoc.Worksheets)

{

/*

* Create Slip Sheet

* Before each worksheet is saved, I want a page to display information on the sheet

*/

StringBuilder sheetText = new StringBuilder();

Aspose.Pdf.Section sheetSection = pdf.Sections.Add();

Aspose.Pdf.Section workSheetSection = pdf.Sections.Add();

sheetText.Append("NAME: " + objWorksheet.Name);

foreach(Comment objComment in objWorksheet.Comments)

{

sheetText.Append("\r\n\r\n");

sheetText.Append("COMMENT:\r\n");

sheetText.Append(objComment.Note);

}

Text t = new Text(sheetText.ToString());

/* This Works */

sheetSection.AddParagraph(t);

/* Add Worksheet */

/*

* Plase Help Here

* Need to populate workSheetSection with objWorksheet

*

* workSheetSection.AddParagraph(???)

*

*/

}

pdf.Close();

fs.Close();

fs.Dispose();

Dear Ryan,

Thank you for considering Aspose.

When converting xls to pdf, Aspose.Cells generates xml file that can be loaded by Aspose.Pdf. It seems you want to add Aspose.Cells objects into Aspose.Pdf. That is not supported.

You can get the table like the following:

pdf.BindXML(xmlFile, null); //xmlFile is produced by Aspose.Cells
Table firstTable = pdf.Sections[0].Paragraphs[0] as Table;

I will ask the Aspose.Cells team to give some advise on how to convert worksheet to pdf.

To convert a single worksheet into pdf, you can try this:

Workbook wb = new Workbook();

wb.Open(largeXlsFile);

for(int i = 0; i < wb.Worksheets.Count; i ++)

{

Workbook singleSheet = new Workbook();

singleSheet.Worksheets[0].Copy(wb.WorksheetsIdea [I]);

MemoryStream stream = new MemoryStream();

singeSheet.Save(stream, FileFormatType.AsposePdf);

//Convert xml file as your wish here

.....

}

This seems to work well, however there are some Tabs this does not work with.

I have an example, but I can not attach it.

If you send me your email I will send it to you.