Memory Leak when extracting lots of images from a PDF

Hi all,
I’m having a serious memory problem.
I’m trying to extract images from PDFs with more than 10k pages.
The problem: as the images are extracted, memory is allocated and never freed. Usage explodes within a few minutes; after about 20 pages the process is holding 500 MB. I need to do this on PDFs with 10k+ pages, and as it stands that is impossible.


Any idea how to solve this?

This is the FULL code I am using.

My Aspose.Pdf.Kit version is 4.2.0.0.

//
using System;
using System.Drawing.Imaging;
using System.IO;
using iTextSharp.text.pdf;

namespace PdftoImg
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.ReadKey();
            Console.WriteLine("Starting…");
            string pdfPath = "test.pdf";

            // iTextSharp is used only to read the page count.
            PdfReader reader = new PdfReader(pdfPath);
            int pageCount = reader.NumberOfPages;
            reader.Close();

            Aspose.Pdf.Kit.PdfExtractor extractor = new Aspose.Pdf.Kit.PdfExtractor();
            extractor.BindPdf(pdfPath);
            extractor.StartPage = 1;
            extractor.EndPage = pageCount;
            extractor.ExtractImage();

            // Check HasNextImage before every GetNextImage call,
            // so nothing is requested once no image is left.
            int page = 0;
            while (extractor.HasNextImage())
            {
                page++;
                string fileName = Path.Combine(Path.GetTempPath(), "temp" + page + ".tiff");
                extractor.GetNextImage(fileName, ImageFormat.Tiff);
            }
            Console.WriteLine("Done!");
            Console.ReadKey();
        }
    }
}
//

Hope you can help me!

Hi David,

Please share the problematic (sample) PDF file with us so we can test the issue at our end. We’ll update you accordingly.

We’re sorry for the inconvenience.
Regards,

It’s attached.


This is just a sample with only 24 pages. About halfway through, the process is already at ~500 MB.
But I will be using this with much bigger PDFs.

Hope you can solve it as soon as you can :S
It’s urgent for a project at my company.

Thanks


Hi David,

Can you please try using the following code snippet?


using System;
using System.Runtime.InteropServices;
using Microsoft.Win32;

public class MemoryManagement
{
    [DllImportAttribute("kernel32.dll", EntryPoint = "SetProcessWorkingSetSize", ExactSpelling = true, CharSet = CharSet.Ansi, SetLastError = true)]
    private static extern int SetProcessWorkingSetSize(IntPtr process, int minimumWorkingSetSize, int maximumWorkingSetSize);

    public static void FlushMemory()
    {
        // Force a garbage collection and let pending finalizers run.
        GC.Collect();
        GC.WaitForPendingFinalizers();

        // On Windows NT-based systems, ask the OS to trim the process working set.
        if (Environment.OSVersion.Platform == PlatformID.Win32NT)
        {
            SetProcessWorkingSetSize(System.Diagnostics.Process.GetCurrentProcess().Handle, -1, -1);
        }
    }
}

You can call the FlushMemory method inside the while loop as follows:

while (extractor.HasNextImage())
{
    //other image extraction code

    MemoryManagement.FlushMemory();
}


This frees any unused memory quite well. Please try this at your end and see if it helps. If you still find any issues or have some more questions, please do let us know.
Regards,
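
To check whether the FlushMemory call actually helps, it can be useful to log memory figures from inside the loop. Trimming the working set lowers the number shown in Task Manager, but it does not by itself release allocations, so looking at the managed heap and the working set side by side gives a clearer picture. The following is only a minimal sketch using standard .NET calls; the MemoryProbe class and its Snapshot method are hypothetical names introduced here and are not part of Aspose.Pdf.Kit.

```csharp
using System;
using System.Diagnostics;

public static class MemoryProbe
{
    // Returns a one-line summary of the managed heap size and the
    // process working set, both in megabytes.
    public static string Snapshot(string label)
    {
        // false = do not force a collection, just report the current estimate.
        long managedBytes = GC.GetTotalMemory(false);

        long workingSetBytes;
        using (Process current = Process.GetCurrentProcess())
        {
            workingSetBytes = current.WorkingSet64;
        }

        const long BytesPerMegabyte = 1024 * 1024;
        return string.Format(
            "{0}: managed={1} MB, workingSet={2} MB",
            label,
            managedBytes / BytesPerMegabyte,
            workingSetBytes / BytesPerMegabyte);
    }
}
```

Inside the extraction loop this could be called as Console.WriteLine(MemoryProbe.Snapshot("page " + page)); before and after MemoryManagement.FlushMemory(). If the managed figure keeps climbing even after the flush, the memory is probably still referenced somewhere and trimming alone will not fix it.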

This works great for 32-bit Windows but how about 64-bit Windows?
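
On the 64-bit question: the declaration above passes the two sizes as int, while the native SetProcessWorkingSetSize function takes SIZE_T values, which are pointer-sized. Below is a minimal sketch of a variant that should behave the same in 32-bit and 64-bit processes. The class name MemoryManagement64 and the bool return value are choices made for this sketch, not part of the earlier reply, and it has not been tested together with Aspose.Pdf.Kit.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;

public static class MemoryManagement64
{
    // IntPtr is pointer-sized, so it matches SIZE_T in both
    // 32-bit and 64-bit processes.
    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool SetProcessWorkingSetSize(
        IntPtr process,
        IntPtr minimumWorkingSetSize,
        IntPtr maximumWorkingSetSize);

    // Returns true if the OS accepted the trim request, false if the
    // call failed or the code is not running on Windows.
    public static bool FlushMemory()
    {
        GC.Collect();
        GC.WaitForPendingFinalizers();

        if (Environment.OSVersion.Platform != PlatformID.Win32NT)
        {
            return false;
        }

        // (SIZE_T)-1 for both sizes asks Windows to trim the working set.
        IntPtr trim = new IntPtr(-1);
        using (Process current = Process.GetCurrentProcess())
        {
            return SetProcessWorkingSetSize(current.Handle, trim, trim);
        }
    }
}
```

It is called the same way as before, MemoryManagement64.FlushMemory(); inside the while loop. When it returns false on Windows, Marshal.GetLastWin32Error() gives the error code.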

Hi Frank,

Kindly follow up on this issue in your separate [thread].

Regards,