Converting a 100 MB CSV file to PDF fails

Converting a 100 MB XLS file to PDF works, but converting a 100 MB CSV file to PDF fails.

I then tried converting the 100 MB CSV file to XLS first and converting that XLS file to PDF, but the conversion of this XLS file to PDF fails as well.

Hi Tommy,


Thank you for contacting Aspose support.

Please provide more details, such as the error message or stack trace you get when the conversion fails, and the code you are using for the conversion. If possible, please share an executable sample application along with the input CSV file for our testing. Please note that you may not be able to upload such a large file to the community server, so please zip the file, upload it to a free file hosting service such as Dropbox, and share the link here.
// Requires: using System; using System.IO; using Aspose.Cells;
public Stream ConvertCsvToPdf()
{
    var outputStream = new MemoryStream();
    try
    {
        using (var inputStream = new FileInfo(@"D:\TestAspose\CreateByEmmaYin0.csv").OpenRead())
        using (var middleStream = new MemoryStream())
        // 'stream' was not declared in the original post; assumed to be an intermediate MemoryStream.
        using (var stream = new MemoryStream())
        {
            // Copy the CSV into memory and load it as a workbook.
            inputStream.CopyTo(middleStream);
            middleStream.Seek(0, SeekOrigin.Begin);
            var options = new LoadOptions(LoadFormat.CSV);
            var workbook = new Workbook(middleStream, options);

            // Save the workbook as XLS into the intermediate stream, then reload it.
            workbook.Save(stream, SaveFormat.Excel97To2003);
            stream.Seek(0, SeekOrigin.Begin);
            var newOptions = new LoadOptions(LoadFormat.Excel97To2003);
            var excel = new Workbook(stream, newOptions);

            // Also write the intermediate XLS to disk.
            using (var fileStream = File.Create(@"D:\csvtoxls.xls"))
            {
                stream.Seek(0, SeekOrigin.Begin);
                stream.CopyTo(fileStream);
            }

            // Convert the reloaded XLS workbook to PDF.
            var opts = new PdfSaveOptions();
            opts.OnePagePerSheet = true;
            excel.Save(outputStream, opts);
            outputStream.Seek(0, SeekOrigin.Begin);
        }
    }
    catch (Exception e)
    {
        // Keep the original exception as InnerException so the stack trace is preserved.
        throw new Exception(String.Format("An error occurred when convert cells to pdf. Exception :{0}", e), e);
    }
    return outputStream;
}



The code above converts the CSV file to XLS and then converts the XLS to PDF.

You can change the code to convert the XLS file to PDF directly.

Hi Tommy,


Thank you for the sample CSV. However, we also need the exception message that you are getting on your side. Moreover, looking at your code, it seems you are loading the CSV into a Workbook object, saving it in XLS format, and then reading it back in to convert the result to PDF. Could you please explain your ultimate goal so that we can optimize the code for you?
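If the intermediate XLS file is not actually needed, a direct conversion could look roughly like the sketch below. It is only an illustration under assumptions: the class name and both file paths are hypothetical, and the `LoadFormat.CSV` enum member is spelled as it is in the code posted in this thread (other Aspose.Cells versions may name it differently). Loading from and saving to file paths avoids the extra in-memory copies of the input and output, but it is not claimed to resolve the failure by itself.

```csharp
using Aspose.Cells;

class DirectCsvToPdf
{
    static void Main()
    {
        // Hypothetical paths, for illustration only.
        string csvPath = @"D:\TestAspose\input.csv";
        string pdfPath = @"D:\TestAspose\output.pdf";

        // Load the CSV straight from disk instead of copying it into a MemoryStream first.
        var loadOptions = new LoadOptions(LoadFormat.CSV);
        var workbook = new Workbook(csvPath, loadOptions);

        // Same PDF option as in the code posted above.
        var pdfOptions = new PdfSaveOptions();
        pdfOptions.OnePagePerSheet = true;

        // Save directly to a file: no intermediate XLS and no in-memory output buffer.
        workbook.Save(pdfPath, pdfOptions);
    }
}
```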

Hi again,


This is to update you that I have simplified your code as follows and executed it against the latest revision, Aspose.Cells for .NET 8.8.0.3. I was able to observe an OutOfMemoryException (stack trace at the bottom of this post) in the Workbook.Save method while saving to PDF format. Please note that the XLS file is saved correctly. To investigate the matter further, I have logged it as CELLSNET-44438.

Please provide the stack trace from your end so that we can match it with the one we have observed.

C#

MemoryStream inStream = new MemoryStream();
MemoryStream outStream = new MemoryStream();
FileStream file = new FileStream(dir + "CreateByEmmaYin0.csv", FileMode.Open);
file.CopyTo(inStream);
var book = new Workbook(inStream, new LoadOptions(LoadFormat.CSV) { MemorySetting = MemorySetting.MemoryPreference });
book.Save(outStream, SaveFormat.Excel97To2003);
outStream.WriteTo(new FileStream(dir + "output.xls", FileMode.Create));
outStream = new MemoryStream();
book.Save(outStream, SaveFormat.Pdf);
outStream.WriteTo(new FileStream(dir + "output.pdf", FileMode.Create));

Stack Trace

at . (Stream , PdfSaveOptions )
at . (Stream , SaveOptions )
at Aspose.Cells.Workbook.Save(Stream stream, SaveOptions saveOptions)
at Aspose.Cells.Workbook.Save(Stream stream, SaveFormat saveFormat)
at ACells.Program.Main(String[] args) in C:\Users\Babar\Documents\Visual Studio 2015\Projects\ACells\ACells\Program.cs:line 29
at System.AppDomain._nExecuteAssembly(RuntimeAssembly assembly, String[] args)
at System.AppDomain.ExecuteAssembly(String assemblyFile, Evidence assemblySecurity, String[] args)
at Microsoft.VisualStudio.HostingProcess.HostProc.RunUsersAssembly()
at System.Threading.ThreadHelper.ThreadStart_Context(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()
MemoryStream inStream = new MemoryStream();
MemoryStream outStream = new MemoryStream();
FileStream file = new FileStream(dir + "CreateByEmmaYin0.csv", FileMode.Open);
file.CopyTo(inStream);
var book = new Workbook(inStream, new LoadOptions(LoadFormat.CSV) { MemorySetting = MemorySetting.MemoryPreference });
book.Save(outStream, SaveFormat.Excel97To2003);
outStream.WriteTo(new FileStream(dir + "output.xls", FileMode.Create));
file = new FileStream(dir + "output.xls", FileMode.Open);
inStream = new MemoryStream();
file.CopyTo(inStream);
book = new Workbook(inStream, new LoadOptions(LoadFormat.Excel97To2003) { MemorySetting = MemorySetting.MemoryPreference });
outStream = new MemoryStream();
book.Save(outStream, SaveFormat.Pdf);
outStream.WriteTo(new FileStream(dir + "output.pdf", FileMode.Create));


Converting a 100 MB XLS file to PDF works, but converting a 100 MB CSV file to PDF fails with an OutOfMemoryException.
So I thought I could convert the 100 MB CSV to XLS first and then convert that XLS file to PDF, but this also throws an OutOfMemoryException.
By contrast, converting a 100 MB XLS file to PDF directly works.

Does that make sense?

Hi Tommy,


I have already reproduced the OutOfMemoryException while converting the CSV you provided to PDF format, and I have logged it for further investigation under the aforementioned ticket. I just want to make sure that the stack trace is the same on both ends, which is why I asked you to share the complete stack trace from your test results.
{"An error occurred when convert cells to pdf. Exception :Aspose.Cells.CellsException: Exception of type 'System.OutOfMemoryException' was thrown.\r\n
at \u0006 .\u0002(Stream \u0002, PdfSaveOptions \u0003)\r\n
at \u0006 .\u0002(Stream \u0002, SaveOptions \u0003)\r\n
at Aspose.Cells.Workbook.Save(Stream stream, SaveOptions saveOptions)\r\n
at vCloud.Util.CellsToPdfConvertorBase.Convert(Stream inputStream)
in D:\\WorkCopy\\code\\vCloud\\vCloud.Util.Aspose\\FileFormat\\Convertor\\PdfConvertorBase\\CellsToPdfConvertorBase.cs:line 33"}

Above is my exception.

Hi again,


Thank you for the stack trace. I have attached it to the aforementioned ticket for product team’s review. We will require some time to thoroughly investigate the matter and get back with updates in this regard. As soon as we receive any news, we will post here for your kind reference.

Hi,


Please try our latest version/fix: Aspose.Cells for .NET v17.3.3.

Aspose.Cells for .NET v17.3.3 (.NET 2.0)
Aspose.Cells for .NET v17.3.3 (.NET 4.0)
(Note: please choose the fix that matches your underlying .NET Framework version.)

Your issue should be fixed in it.
Sample code:

LoadOptions loadOptions = new LoadOptions(LoadFormat.CSV) { MemorySetting = MemorySetting.MemoryPreference };
Workbook wb = new Workbook("srcFile.csv", loadOptions);

// The rows contain very long text. This setting lets memory be released more quickly.
wb.Worksheets[0].PageSetup.Order = PrintOrderType.OverThenDown;

wb.Save("outFile.pdf", SaveFormat.Pdf);

Let us know your feedback.

Thank you.

The issues you have found earlier (filed as CELLSNET-44438) have been fixed in Aspose.Cells for .NET 17.4.0.

