Free Support Forum - aspose.com

Concatenate/Insert/Append workaround?

Hi,

We are trying to merge multiple pages into one large document; the page count varies between 20 and 100. The pages can be simple text pages or pages filled with images.

The job that is causing problems has 96 single pages (2 MB to 4 MB each) with a combined total of 275 MB. While we are merging the pages, memory usage goes above 800 MB. When we are lucky, the process finishes. If not, the process throws an OutOfMemoryException (and sometimes I get a NullReferenceException in Insert/Append; I even got a 0-byte file without any exception on my workstation while testing).
When we merge in batches of 10 pages, we have no trouble. But when we then merge the batches to get the document we want, it's still a no go :-(

We tried the MemoryManagement class posted in this forum, but that only takes effect after Pdf.Kit has finished. We are having trouble while Pdf.Kit is running.
We noticed this with Pdf.Kit 3.5.0; we updated to 4.0.0 and still have the same issue.

Is there a workaround that allows us to finish this task with Pdf.Kit? At the moment we can use Aspose for our smaller products, but not for the bigger documents. This is becoming a serious problem for us.

Sincerely,
Fred


PS: Is it possible to flush data to disk while inserting/appending, to reduce the memory used during this process?

Hi Freddy,

Thank you very much for considering Aspose.

The extensive memory consumption issues with Append and Concatenate methods are already logged as PDFKITNET-13206 and PDFKITNET-13208 in our issue tracking system. Our team is looking to improve the process and reduce memory consumption. We’ll update you once we enhance the process.

Regarding your other issues, the NullReferenceException and the 0-byte output file, these might not be caused by the same memory consumption issue. Can you please identify the PDF files causing these particular problems and share them with us? We’ll investigate the issue at our end and update you accordingly.

We’re sorry for the inconvenience.
Regards,

Hello,

Thank you for your quick response. I will send you a PM with a link to the pdf files.

Sincerely,
Fred

Hi Freddy,

Thank you for sharing the files. I have downloaded the sample files. We’ll investigate the issue at our end and update you accordingly.

If you have any other questions, please do let us know.
Regards,

Hi Freddy,

I have tested the files you shared earlier, but I couldn’t reproduce the NullReferenceException or the 0-byte output file. Memory and processor utilization is high, however, which is already logged.

The tests, run on individual files, on pairs of files, and on all the files at once, didn’t show this problem at my end, so these issues are probably a by-product of the extensive memory utilization at your end. I hope that once the resource utilization is improved, these issues will be resolved as well.

Nevertheless, if you can share a few more details about your environment, we’ll try to reproduce these issues using your particular scenario. Can you please share your system specifications, OS details, .NET Framework version, etc.?

We’re sorry for the inconvenience.
Regards,

Hello,

Thank you for your reply.

Regarding the NullReferenceException and the 0-byte output files: I think they are caused by the OutOfMemory issue. When I test with all the files, I get the NullReferenceException in Aspose.Pdf.Kit. When I reduce the file count, I can reproduce the OutOfMemoryException (and sometimes a 0-byte output file).

The server running the application/website is Windows Server 2008 Standard SP1, 32-bit, with 4 GB RAM, running in VMware. The CPU is a quad Xeon (E5420) at 2.5 GHz.
We use SQL Server 2008 Express, which uses about 1 GB RAM. We also have two sites running on the same server; the IIS worker process uses about 300 MB RAM, and we use the .NET Framework 3.5.

All PDF manipulation (generating and merging pages) is done in an application, and this is where we have trouble merging large PDF pages.

If you need more information, please let me know.

Sincerely,
Freddy

Hi Freddy,

I have reproduced the NullReferenceException at my end on Windows Server 2008 with .NET 3.5. I have logged this issue as PDFKITNET-13481 in our issue tracking system. Our team will look into this and you’ll be updated via this forum thread once this is resolved.

However, I couldn’t reproduce the 0-byte output, because the NullReferenceException mentioned above occurs first. We’ll need to look into this once the high memory consumption and NullReferenceException issues are resolved and we’re able to produce the file successfully.

We’re sorry for the inconvenience.
Regards,



Hi Freddy,

I have also reproduced the OutOfMemory exception with Concatenate and Insert methods and logged these issues as PDFKITNET-13488 and PDFKITNET-13489 respectively. We’ll update you once these issues are resolved.

If you have any further questions, please do let us know.
Regards,

Hello Shahzad,

Is there any news on the described issues?

If there is no solution yet, is there any way that you can provide a workaround for inserting pages into large documents? (we just need a reliable function to insert pages, we don’t care about how it is done or how much time the function takes to complete)

Currently we are stuck on the insert issue and the orders we need to process are getting more urgent.

Feel free to PM me if that is more appropriate.

Sincerely,
Fred

Hi Freddy,

I have contacted our development team to get their opinion regarding this situation. You’ll be updated once the team shares some idea.

We appreciate your patience.
Regards,

Hi Freddy,

Our team is working on this issue, and we’ll try to provide you with the fix in the next monthly release, due at the end of February 2010. I’m afraid I can’t share any workaround at the moment.

We’re very sorry for the inconvenience.
Regards,

I too am very interested in seeing the OutOfMemoryException resolved.

Last year I posted that I was having the same issue in this thread:
http://www.aspose.com/community/forums/2/106175/append-or-concatenate-to-make-a-very-large-pdf/showthread.aspx

We are still having this issue today and this is a major problem for us. Seeing this resolved in the very near future will be GREATLY appreciated.

Hi Eric,

We’re going to publish the latest version today. Please try it at your end, once it is published, and share if it works for you or not.

We’re sorry for the inconvenience and look forward to helping you out.
Regards,

Sounds good. Any idea what time the update will be published?

Thanks

Hi Eric,

We’re working on the latest release and hopefully it’ll be available in a few hours.

Regards,

The issues you have found earlier (filed as 13208;13488) have been fixed in this update.


This message was posted using Notification2Forum from Downloads module by aspose.notifier.

I have downloaded the updated DLL and unfortunately I am still having the same failure when trying to concatenate PDF files together. The program I have takes a record that can have a random number of attachments on it; our program takes the attachments and merges them together into one file. The record I have been struggling with lately has 155 files that need to be merged together. It currently fails on the 12th document. (I have other records that merge more than 12 files with no problems, so I am guessing it’s the contents of these files that is causing the OutOfMemoryException.) In this particular case the page count gets to 198 pages and then fails when trying to add just one additional page.

I do see that the memory usage drops down after each concatenation but the problem is the memory spikes VERY high while each concatenation action is being performed (the memory consumption gets higher and higher with each call as the master copy grows in size)

Hi Eric,

In order to optimize memory utilization and avoid any memory spikes, you’ll have to modify the code according to the new mechanism introduced in Aspose.Pdf.Kit for .NET 4.2.0. Please see the following code snippet:


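// Build the array of input files (sample file names)
string[] files = new string[10001];
files[0] = "AP1099.pdf";
for (int i = 1; i < 10001; i++)
    files[i] = "AP10992.pdf";

PdfFileEditor editor = new PdfFileEditor();

// Remember the current strategy, then switch to the memory-optimized one
FileProcessingStrategy strategy = Settings.Strategy;
Settings.Strategy = FileProcessingStrategy.OptimizeMemoryUsage;

// Concatenate the whole array in a single call
editor.Concatenate(files, "result.pdf");

// Restore the previous strategy
Settings.Strategy = strategy;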

In this code, instead of using the memory streams we have used files. So, if you want to concatenate a large number of files then please make an array of all these files and use the code as suggested above. You can also see that we have used OptimizeMemoryUsage option for the optimal use of the memory. Please note that this option can only be effective if you use it with an array of files, whereas if you concatenate individual files in the loop then it won’t help. Also, it is not effective in evaluation mode.
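If the Concatenate call throws, the snippet above never reaches the line that restores the previous strategy. A minimal sketch of one way to guard against that, using only standard .NET delegates: `ScopedSetting` and `Run` are hypothetical names for illustration, not part of Aspose.Pdf.Kit, and the commented usage assumes the `Settings.Strategy` and `Concatenate` calls behave as shown in this thread.

```csharp
using System;

static class ScopedSetting
{
    // Hypothetical helper (not an Aspose API): applies a temporary value,
    // runs the body, and restores the previous value even if the body throws.
    public static TResult Run<TValue, TResult>(
        Func<TValue> get, Action<TValue> set, TValue temporary, Func<TResult> body)
    {
        TValue previous = get();
        set(temporary);
        try
        {
            return body();
        }
        finally
        {
            set(previous);
        }
    }
}

// Possible usage with the calls from this thread (assumed, untested here):
//
// bool ok = ScopedSetting.Run<FileProcessingStrategy, bool>(
//     () => Settings.Strategy,
//     s => Settings.Strategy = s,
//     FileProcessingStrategy.OptimizeMemoryUsage,
//     () => editor.Concatenate(files, "result.pdf"));
```

A plain try/finally around the Concatenate call would do the same job; the helper only keeps the save and restore steps in one place.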

Please try to modify your code as suggested above and see if it helps in your scenario. We would really appreciate it if you shared the results with us.

We’re sorry for the inconvenience and look forward to helping you out.
Regards,





I have tried to get the above approach to work; however, I get an empty file.

Not sure what is wrong?
I know the path to my PDFs is correct


string[] _filesArray;
int i;
Aspose.Pdf.Kit.PdfFileEditor pdfEditor;
Aspose.Pdf.Kit.FileProcessingStrategy strategy;
try
{
    i = 0;
    _filesArray = new string[files.Count];
    foreach (AttachedFile f in files)
    {
        _filesArray[i] = TempDir + f.FileID + ".pdf";
        i++;
        f.isMergedToMaster = true;
    }

    pdfEditor = new Aspose.Pdf.Kit.PdfFileEditor();
    strategy = Aspose.Pdf.Kit.Settings.Strategy;
    Aspose.Pdf.Kit.Settings.Strategy = Aspose.Pdf.Kit.FileProcessingStrategy.OptimizeMemoryUsage;
    Aspose.Pdf.Kit.Settings.Strategy = strategy;
    pdfEditor.Concatenate(_filesArray, TempDir + "result.pdf");

Additional info…

It appears that the call to pdfEditor.Concatenate is returning false; it returns false even if I comment out the strategy lines.
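Since Concatenate returns false with or without the strategy lines, one thing that could be ruled out first is the input list itself. The sketch below uses only System.IO and lists any path that is missing or zero bytes long; it makes no assumption about why Pdf.Kit returns false, and `InputCheck` and `FindUnusableFiles` are hypothetical names. Separately, the snippet as posted assigns the saved strategy back before calling Concatenate, so the call runs under the original strategy rather than OptimizeMemoryUsage.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class InputCheck
{
    // Hypothetical helper: returns, in input order, every path that is
    // null/empty, does not exist, or points to a zero-byte file.
    public static List<string> FindUnusableFiles(IEnumerable<string> paths)
    {
        List<string> unusable = new List<string>();
        foreach (string path in paths)
        {
            if (string.IsNullOrEmpty(path)
                || !File.Exists(path)
                || new FileInfo(path).Length == 0)
            {
                unusable.Add(path);
            }
        }
        return unusable;
    }
}

// Possible usage before the Concatenate call (variable name taken from the post above):
//
// foreach (string bad in InputCheck.FindUnusableFiles(_filesArray))
//     Console.WriteLine("Missing or empty input: " + bad);
```

If the list comes back empty, the inputs are at least present and non-empty, and the cause lies elsewhere.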