PDF to JPEG memory issues

Dear,

We are converting PDF documents to JPEG files using .NET Core 3.1. There seems to be a memory leak in the JpegDevice.Process() step.

This is the code used:

using (Stream stream = new MemoryStream(document.FileContent))
{
    // Bind the input PDF
    using (Document pdfDocument = new Document(stream))
    {
        pdfDocument.EnableObjectUnload = true;
        pdfDocument.OptimizeResources(new Aspose.Pdf.Optimization.OptimizationOptions()
        {
            // "LinkDuplcateStreams" is spelled this way in the Aspose.PDF API
            LinkDuplcateStreams = true,
            RemoveUnusedObjects = true,
            RemoveUnusedStreams = true
        });

        Resolution resolution = new Resolution(100);
        // Create the JPEG device: width, height, resolution, quality
        JpegDevice jpegDevice = new JpegDevice(420, 594, resolution, 100);
        jpegDevice.RenderingOptions.UseNewImagingEngine = true;

        int imgNumber = 0;
        foreach (Page page in pdfDocument.Pages)
        {
            // Render each page to its own in-memory JPEG, then upload it
            using (MemoryStream memoryStream = new MemoryStream())
            {
                jpegDevice.Process(page, memoryStream);
                await UploadImagesToBlob(document, memoryStream, imgNumber++.ToString() + ".jpg");
            }
            page.Dispose();
        }
        pdfDocument.FreeMemory();
    }
}

This is our Dockerfile:

FROM mcr.microsoft.com/dotnet/core/sdk:3.1-focal AS base
WORKDIR /app
RUN echo ttf-mscorefonts-installer msttcorefonts/accepted-mscorefonts-eula select true | debconf-set-selections
RUN \
  sed -i -e's/ main/ main contrib non-free/g' /etc/apt/sources.list \
  && apt-get -q update                                              \
  && apt-get install -y --quiet ttf-mscorefonts-installer libfontconfig1 libgdiplus

FROM mcr.microsoft.com/dotnet/core/runtime:3.1-focal AS build
WORKDIR /src
COPY ["***.csproj", "***/"]

RUN dotnet restore "***.csproj"
COPY . .
WORKDIR "***"
RUN dotnet build "***" -c Release -o /app/build

FROM build AS publish
RUN dotnet publish "***.csproj" -c Release -o /app/publish

FROM base AS final
WORKDIR /app
COPY --from=publish /app/publish .
ENTRYPOINT ["dotnet", "***.dll"]

We are using Aspose.PDF version 21.6.0.

Regards

@loumondelaers

Could you please share the source PDF file so that we can try to reproduce the issue on our end?

Hello,

We see the issue with any PDF, but the example we were using is attached.

Regards,
Lou
sample-pdf-with-images.pdf (3.8 MB)

@loumondelaers

Are you working with Azure Storage? Could you please share your OS details and a sample application so that we can reproduce the issue on our end?

Hello,

I have included our Dockerfile for the OS specifications.
We are using Azure Storage to upload the generated images, but the memory grows before that step.
Dockerfile.zip (699 Bytes)

You can easily build your own sample application by using my code and supplying the document contents in the first step:

using (Stream stream = new MemoryStream(document.FileContent))
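
For anyone building a standalone repro, the following is a minimal sketch of that first step, assuming the PDF is read from disk. The `PdfSource` class, the `OpenAsMemoryStream` helper and the default file name are hypothetical stand-ins for `document.FileContent`; they are not part of our application or of Aspose.PDF.

```csharp
using System;
using System.IO;

public static class PdfSource
{
    // Hypothetical helper: loads the whole file into memory, mirroring
    // what document.FileContent (a byte array) provides in the original code.
    public static MemoryStream OpenAsMemoryStream(string path)
    {
        byte[] fileContent = File.ReadAllBytes(path);
        return new MemoryStream(fileContent);
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        // Assumed default: the sample file attached earlier in this thread.
        string path = args.Length > 0 ? args[0] : "sample-pdf-with-images.pdf";

        using (Stream stream = PdfSource.OpenAsMemoryStream(path))
        {
            Console.WriteLine($"Loaded {stream.Length} bytes from {path}");
            // The Aspose.PDF conversion from the first post would start here:
            // using (Document pdfDocument = new Document(stream)) { ... }
        }
    }
}
```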

I do not have a sample application to share due to dependencies with other services.

The memory leak we detected is in an isolated container, though; no other processes are running in that container apart from the Aspose.PDF PDF-to-image process.

I also tried running the same application in a Windows container, and there the memory stays stable.
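
One way to narrow this down is to log the managed heap size next to the process working set after each page. The sketch below is only an illustration: `MemoryProbe` and `Snapshot` are hypothetical names, and the loop body is a placeholder allocation where the per-page `JpegDevice.Process` call would go in a real repro. It uses only `GC.GetTotalMemory` and `Process.WorkingSet64` from the .NET base class library.

```csharp
using System;
using System.Diagnostics;

public static class MemoryProbe
{
    // Managed heap size and process working set at one point in time, in bytes.
    public readonly struct Snapshot
    {
        public Snapshot(long managedBytes, long workingSetBytes)
        {
            ManagedBytes = managedBytes;
            WorkingSetBytes = workingSetBytes;
        }

        public long ManagedBytes { get; }
        public long WorkingSetBytes { get; }
    }

    public static Snapshot Take()
    {
        // Force a full collection first so that garbage awaiting collection
        // is not counted as live managed memory.
        long managed = GC.GetTotalMemory(forceFullCollection: true);
        using (Process current = Process.GetCurrentProcess())
        {
            return new Snapshot(managed, current.WorkingSet64);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        for (int i = 0; i < 10; i++)
        {
            // Placeholder workload; in a repro this would be the per-page
            // JpegDevice.Process(page, memoryStream) call.
            byte[] buffer = new byte[1024 * 1024];
            buffer[0] = 1;

            MemoryProbe.Snapshot now = MemoryProbe.Take();
            Console.WriteLine(
                $"iteration {i}: managed={now.ManagedBytes / 1024} KiB, " +
                $"workingSet={now.WorkingSetBytes / 1024} KiB");
        }
    }
}
```

If the managed figure stays flat while the working set keeps climbing, the growth is probably outside the managed heap, which would be consistent with (though not proof of) native allocations, for example in the libgdiplus package installed by the Dockerfile.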

Regards,
Lou

@loumondelaers

We are checking the issue on our end and will get back to you as soon as possible.

Hello,

Is there any more information on this issue?

Thank you

@loumondelaers

We need to investigate the issue in more detail from the perspective of the Docker environment. For this purpose, an investigation ticket, PDFNET-50263, has been logged in our issue tracking system. We will look into it further and keep you posted on the status of its resolution. Please be patient and allow us some time.

We are sorry for the inconvenience.
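
In the meantime, when comparing the Linux and Windows containers, it may help to record how the .NET garbage collector is configured in each, since the GC flavor and the memory limit the runtime detects can influence how memory usage looks from outside. The following is a minimal sketch under that assumption; `GcDiagnostics` is a hypothetical helper name, and it relies only on `GCSettings` and `GC.GetGCMemoryInfo()`, which are available from .NET Core 3.0 onward.

```csharp
using System;
using System.Runtime;

public static class GcDiagnostics
{
    // Builds a short report of the GC settings and memory figures
    // as seen by the runtime inside the current container.
    public static string Describe()
    {
        GCMemoryInfo info = GC.GetGCMemoryInfo();
        return
            $"Server GC: {GCSettings.IsServerGC}{Environment.NewLine}" +
            $"Latency mode: {GCSettings.LatencyMode}{Environment.NewLine}" +
            $"Processor count: {Environment.ProcessorCount}{Environment.NewLine}" +
            $"Total available memory: {info.TotalAvailableMemoryBytes} bytes{Environment.NewLine}" +
            $"High memory load threshold: {info.HighMemoryLoadThresholdBytes} bytes{Environment.NewLine}" +
            $"Heap size at last GC: {info.HeapSizeBytes} bytes";
    }
}

public static class Program
{
    public static void Main()
    {
        // Log once at startup so both containers can be compared side by side.
        Console.WriteLine(GcDiagnostics.Describe());
    }
}
```

These values describe the managed runtime only; they do not show memory allocated by native libraries.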