PDF to TIFF memory issue

We have recently begun to use Aspose.PDF for conversion of PDF pages to TIFF. With large files we are experiencing instances where we run out of memory. We have tried disposing of the Aspose.PDF.Document as well as setting the TiffSettings and TiffDevice to nothing (they don’t have dispose options). We also have tried forcing .NET garbage collection after cleaning up the objects. Then we have to recreate them for every page being processed. Is there another way to force the memory to be cleaned up that we are missing? Thanks in advance.

We have been processing one page at a time. I also tried rendering all of the pages at once and ran out of memory the same way. I will attach the file that I am working with and also try other files to make sure that this isn’t related to just this one file. I don’t think it is, though, because as I watch the memory while processing, it simply grows each time any page is processed.

Attach sample

We also tried rendering to a stream instead of writing to a file, then disposing of the stream when we were finished with it. We still have the same problem.

Hi Joseph,


Thanks for contacting support.

I have tested the scenario using Aspose.Pdf for .NET 10.7.0 and, as per my observations, there is a small increase in memory utilization. However, when the application is closed, the memory returns to normal. Can you please share some details regarding your working environment, i.e. Aspose.Pdf for .NET version, operating system version, etc.?

We are sorry for your inconvenience.

We are using the .NET platform.


The memory does release after the application is closed. The problem is we aren’t able to process more than about 20 pages of a typical document before we run out of memory with the process eating a gig or more. Of course it will depend on how much memory is available on the machine that the program is running on. It seems like the memory grows roughly equal to the bitmap that is produced. Our only solution right now would be to call our exe for a very small range of pages and repeat that process. And that adds a lot of overhead.

Thanks in advance.

We have no problem processing one page at a time as long as using the dispose methods will clean up the memory. It is difficult to have to close our exe each time to render a set of pages. It is OK if the memory grows, as long as dispose or a new method releases it…
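The workaround described above (calling the exe for a small range of pages and repeating) could be sketched roughly as follows. This is only an illustration under stated assumptions: the worker executable name, its command-line arguments, and the helper names `PageBatcher`, `Batches`, and `RunBatch` are hypothetical and do not come from the thread or from Aspose.PDF. Each batch runs in its own process, so the operating system reclaims all of its memory when the worker exits.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

static class PageBatcher
{
    // Splits pages 1..pageCount into consecutive inclusive ranges
    // of at most batchSize pages each.
    public static IEnumerable<Tuple<int, int>> Batches(int pageCount, int batchSize)
    {
        if (batchSize < 1) throw new ArgumentOutOfRangeException("batchSize");
        for (int first = 1; first <= pageCount; first += batchSize)
        {
            int last = Math.Min(first + batchSize - 1, pageCount);
            yield return Tuple.Create(first, last);
        }
    }

    // Starts a hypothetical worker exe for one page range and waits for it
    // to exit. The argument layout (pdf path, first page, last page) is an
    // assumption made for this sketch.
    public static int RunBatch(string workerExe, string pdfPath, int first, int last)
    {
        var info = new ProcessStartInfo
        {
            FileName = workerExe,
            Arguments = string.Format("\"{0}\" {1} {2}", pdfPath, first, last),
            UseShellExecute = false,
            CreateNoWindow = true
        };
        using (Process worker = Process.Start(info))
        {
            worker.WaitForExit();
            return worker.ExitCode;
        }
    }
}
```

A caller would loop over `PageBatcher.Batches(pageCount, batchSize)` and pass `Item1` and `Item2` of each range to `RunBatch`. A larger batch size reduces the process start-up overhead mentioned above, at the cost of a higher peak memory per worker.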

Hi Joseph,


Thanks for sharing the details.

The document which you shared earlier has around 23 pages. When using Aspose.Pdf for .NET 10.7.0 in a Visual Studio 2010 application with .NET Framework 4.0, running on Windows 7 (x64) on an Intel Core i5 2.5 GHz machine with 4 GB of RAM, I did not notice any OutOfMemory issue. However, memory utilization increased by 400 MB. Can you please share a sample project so that we can test the scenario in our environment? We are sorry for this inconvenience.

Besides this, you may consider splitting the large PDF file into individual single-page documents.

[C#]

// Open the source document
Document pdfDocument = new Document("input.pdf");
int pageCount = 1;

// Loop through all the pages
foreach (Page pdfPage in pdfDocument.Pages)
{
    // Copy the current page into a new single-page document and save it
    Document newDocument = new Document();
    newDocument.Pages.Add(pdfPage);
    newDocument.Save("page_" + pageCount + ".pdf");
    pageCount++;
}

Sample code would be something like this. The difference is that we are concerned about a leak in the TiffDevice object. I simplified this, but it should be a good example to demonstrate the leak.


Dim _strLicenseFileName As String = "Aspose.Total.lic"
Dim AsposePDFLicense As Aspose.Pdf.License
Dim pdfDocument As Aspose.Pdf.Document
Dim resolution As Aspose.Pdf.Devices.Resolution
Dim tiffSettings As Aspose.Pdf.Devices.TiffSettings
Dim iPage As Integer = 0
Dim tiffDevice As Aspose.Pdf.Devices.TiffDevice
Dim strCurrentImageFilename As String
AsposePDFLicense = New Aspose.Pdf.License
AsposePDFLicense.SetLicense(_strLicenseFileName)
tiffSettings = New Aspose.Pdf.Devices.TiffSettings
tiffSettings.Compression = Aspose.Pdf.Devices.CompressionType.CCITT4
tiffSettings.Depth = Aspose.Pdf.Devices.ColorDepth.Format1bpp
resolution = New Aspose.Pdf.Devices.Resolution(300)
tiffDevice = New Aspose.Pdf.Devices.TiffDevice(resolution, tiffSettings)
pdfDocument = New Aspose.Pdf.Document("input.pdf")
For iPage = 1 To pdfDocument.Pages.Count
    strCurrentImageFilename = "input." & iPage.ToString("000000") & ".tif"
    tiffDevice.Process(pdfDocument, iPage, iPage, strCurrentImageFilename)
Next

With this example going page by page you can see how the memory grows if you step through it.
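To put numbers on the per-page growth instead of watching it in the debugger, a small probe along these lines might help. It is a sketch using only standard .NET calls (`GC.GetTotalMemory`, `Process.PrivateMemorySize64`); the names `MemoryProbe`, `Sample`, `Take`, and `Measure` are hypothetical, and the per-page conversion call would be passed in as the `work` delegate.

```csharp
using System;
using System.Diagnostics;

static class MemoryProbe
{
    public struct Sample
    {
        public long ManagedBytes;  // size of the managed heap
        public long PrivateBytes;  // private memory of the whole process
    }

    // Takes a snapshot after a full blocking collection, so that only
    // objects that are still reachable count towards ManagedBytes.
    public static Sample Take()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        using (Process self = Process.GetCurrentProcess())
        {
            return new Sample
            {
                ManagedBytes = GC.GetTotalMemory(false),
                PrivateBytes = self.PrivateMemorySize64
            };
        }
    }

    // Runs one unit of work (for example, converting a single page) and
    // returns how much memory remained allocated afterwards.
    public static Sample Measure(Action work)
    {
        Sample before = Take();
        work();
        Sample after = Take();
        return new Sample
        {
            ManagedBytes = after.ManagedBytes - before.ManagedBytes,
            PrivateBytes = after.PrivateBytes - before.PrivateBytes
        };
    }
}
```

Logging the result of `Measure` for each page gives a growth figure that can be compared between projects. If `PrivateBytes` keeps growing while `ManagedBytes` stays roughly flat, that may suggest the retained memory is unmanaged rather than on the managed heap, although this alone does not identify the cause.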

Hi Joseph,


Thanks for sharing the details.

We are working on reproducing the issue based on new code snippet and will keep you posted with our findings.

Hi Joseph,


Thanks for your patience.

I have tested the scenario again using the code above with form1.pdf as the input document. As per my observations, when using Visual Studio 2010 with .NET Framework 4.0 running on Windows 7 (x64) on an Intel Core i5 2.5 GHz machine with 4 GB of RAM, memory utilization increased by only 200 MB and I was unable to notice any OutOfMemory issue. As a result, 23 TIFF images are generated.

I assume it has to be something I am doing outside of this code, since it works if I create a fresh project with only this code. I will keep working on it. Thanks for helping, though, since this did at least rule out the library as the source for now.

Hi Joseph,


We are glad to hear that you have been able to identify that the issue is not caused by Aspose.Pdf for .NET. Please continue using our API, and in case you still face any issue or need any further assistance, please feel free to contact us.

Unfortunately I have been unable to resolve the issue so far after dedicating several full days to it. Whenever I call Aspose.PDF from the assembly where I need it, I get a memory leak on the tiffDevice.Process call. If I create a clean WinForm project and call a class module project that hosts Aspose.PDF, I don’t have the issue. But if I call Aspose.PDF from the assembly where I need it, I have the leak for every page processed. I quickly run out of memory.


I am trying to figure out what could be different between a project where it works and one where it doesn’t. The code that calls Aspose.PDF is identical - literally copy and paste. So it has to be something else. And it is not easy to figure out.

I have used memory diagnostic tools. They point to Aspose.PDF as eating big blocks of memory that are not being returned. But why? What is different? This could be several more days of effort to try to figure that out. And I am not sure that I will succeed. I will try for one more day…

What could possibly be loaded such that calling tiffDevice.Process forces it to not return memory? I can tell on the very first page that it is leaking. The new diagnostics in Visual Studio 2015 show it, as do my Telerik tools.
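One way to narrow down "what is different" between the working and the failing host might be to dump the same runtime facts from both and diff the output. The sketch below uses only standard .NET APIs; the class name `EnvironmentSnapshot` and the choice of facts (process bitness, CLR version, GC mode, loaded assemblies) are assumptions about what could matter, not a diagnosis.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime;

static class EnvironmentSnapshot
{
    // Collects a few process-level settings plus the list of loaded
    // assemblies as plain text lines that are easy to diff.
    public static List<string> Capture()
    {
        var lines = new List<string>
        {
            "Is64BitProcess=" + Environment.Is64BitProcess,
            "ClrVersion=" + Environment.Version,
            "ServerGC=" + GCSettings.IsServerGC,
            "GCLatencyMode=" + GCSettings.LatencyMode
        };

        // Sorted by name so the two dumps line up when compared.
        lines.AddRange(AppDomain.CurrentDomain.GetAssemblies()
            .Select(a => a.GetName())
            .OrderBy(n => n.Name, StringComparer.OrdinalIgnoreCase)
            .Select(n => "Assembly=" + n.Name + ", Version=" + n.Version));

        return lines;
    }
}
```

Calling `Capture()` immediately before the conversion in each project and writing the lines out with `System.IO.File.WriteAllLines` gives two text files to compare. Differences such as a 32-bit versus 64-bit process or an extra loaded assembly would be leads to investigate, not proof of the cause.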

Hi Joseph,


Thanks for sharing the details.

From the above description, it appears that Aspose.Pdf for .NET causes the issue when the API is used in a certain project structure/type, because the same code snippet with the same API version in a different project type does not produce any memory leak. Can you please share a sample project where you are noticing the memory leak, so that we can test the scenario in our environment? We are sorry for this inconvenience.

I am working to try to create a sample that I can send you that reproduces it. I am trying to figure out what is different. This may take some time.

Hi Joseph,


Please take your time. Once you have the resource files ready, we will look into this matter further.
