Aspose-Words fails when run in Redis Queue

Environment
We have a Docker container running Debian 10 with Redis Queue (rq) installed. We start an rq work-horse to process a Word document and convert all of the WMF images inside the document to PNGs.

A Python script performing the same operations runs without error, but when the same code runs inside rq, the work-horse process dies and returns a code of 6.

Moving job to FailedJobRegistry (Work-horse terminated unexpectedly; waitpid returned 6 (signal 6); )
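
For readers unfamiliar with the message above: the `6` appears to be the raw wait status that rq received for the work-horse. On Linux, a status like this means the process was killed by signal 6, which is SIGABRT and usually indicates that native code called `abort()`. Below is a minimal sketch for decoding such a status with the standard library. The helper name `describe_wait_status` is hypothetical and is not part of rq.

```python
import os
import signal


def describe_wait_status(status: int) -> str:
    """Turn a raw os.waitpid() status into a readable description."""
    if os.WIFSIGNALED(status):
        sig = os.WTERMSIG(status)
        return f"killed by signal {sig} ({signal.Signals(sig).name})"
    if os.WIFEXITED(status):
        return f"exited with code {os.WEXITSTATUS(status)}"
    return f"unrecognised status {status}"


# The value reported by rq in the log line above.
print(describe_wait_status(6))
```

An abort raised inside a native library cannot be caught with a Python `try`/`except`, which would explain why the job dies without a traceback.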

The method that fails is the one performing the actual conversion, specifically the file save. Other libraries perform the same conversion without this error, so I have to assume the problem stems from somewhere inside Aspose-Words. The code doing the conversion looks like this:

aspose_document = aw.DocumentBuilder(aw.Document())
stream.seek(0)
shape = aspose_document.insert_image(stream)
renderer = shape.get_shape_renderer()
options = aw.saving.ImageSaveOptions(aw.SaveFormat.PNG)

with NamedTemporaryFile(suffix='.png', delete=False) as f:
    f.close()
    renderer.save(f.name, options)  # <-- the program crashes here
    stream = open(f.name, 'rb')
    return FileStorage(stream)

@lbraguesdodoc The problem might occur because the same file is accessed simultaneously by several workers. Have you tried saving the file to a stream? Please try modifying the code like this:

import io

aspose_document = aw.DocumentBuilder()
stream.seek(0)
shape = aspose_document.insert_image(stream)
renderer = shape.get_shape_renderer()
options = aw.saving.ImageSaveOptions(aw.SaveFormat.PNG)
# Render into an in-memory buffer instead of a file on disk.
result = io.BytesIO()
renderer.save(result, options)

@alexey.noskov That method causes the same problem on the same line (the renderer.save call). Here is a bit more information from the rusage stats returned by the work-horse. The definitions of these values can be found in the Python 3.12.1 documentation for the resource module (Resource usage information).

13:22:44 Moving job to FailedJobRegistry (Work-horse terminated unexpectedly; waitpid returned 6 (signal 6); )
RETPID: 80
RET_VAL: 6
RUSAGE: resource.struct_rusage(ru_utime=2.418868, ru_stime=0.396039, ru_maxrss=154760, ru_ixrss=0, ru_idrss=0, ru_isrss=0, ru_minflt=27937, ru_majflt=0, ru_nswap=0, ru_inblock=0, ru_oublock=0, ru_msgsnd=0, ru_msgrcv=0, ru_nsignals=0, ru_nvcsw=13317, ru_nivcsw=32)
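
As a reading aid for the figures above, here is a small sketch that converts them into more familiar units. It assumes Linux semantics, where `ru_maxrss` is reported in kilobytes (macOS reports bytes). The function name `summarize_rusage` is hypothetical.

```python
def summarize_rusage(ru_utime: float, ru_stime: float,
                     ru_maxrss_kib: int, ru_majflt: int) -> dict:
    """Condense the most relevant rusage fields into readable units."""
    return {
        # User plus system CPU time consumed by the work-horse.
        "cpu_seconds": round(ru_utime + ru_stime, 3),
        # Peak resident set size, converted from KiB to MiB.
        "peak_rss_mib": round(ru_maxrss_kib / 1024, 1),
        # Page faults that required disk I/O.
        "major_page_faults": ru_majflt,
    }


# Values copied from the RUSAGE line above.
summary = summarize_rusage(2.418868, 0.396039, 154760, 0)
print(summary)
```

Under that assumption the work-horse peaked at roughly 151 MiB with no major page faults. That does not rule out a memory problem, but it does not point to memory exhaustion either.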

@lbraguesdodoc If possible, could you please create a simple application and the required Dockerfile that will allow us to reproduce the problem? This will help us better understand the problem and resolve it.

@alexey.noskov I am working on getting a Docker container together, but I should rephrase the original problem stated in the question. It is actually EMF images, NOT WMF images, that are breaking when run in Redis. I’ve attached a sample Word document which causes the problem.

sampleemf.docx (80.9 KB)

@lbraguesdodoc Thank you for the additional information. I have tested your document in a simple script with the following code and it works fine on my side:

doc = aw.Document("C:\\Temp\\in.docx")
shape = doc.get_child(aw.NodeType.SHAPE, 0, True).as_shape()
renderer = shape.get_shape_renderer()
options = aw.saving.ImageSaveOptions(aw.SaveFormat.PNG)
renderer.save("C:\\Temp\\out.png", options) 

@alexey.noskov Here is a folder containing the code and configs necessary to make a docker container that can recreate the issue. Instructions on how to recreate the issue can be found in the README.md.
aspose-test.zip (159.1 KB)

@lbraguesdodoc Thank you for the additional information.
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): WORDSNET-26471

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

@lbraguesdodoc We have investigated the problem on our side, and it looks like the problem is caused by the libreoffice package. If it is not installed, the problem is not reproducible. Here is the modified Dockerfile:

FROM python:3.7.5 AS base
ENV PYTHONUNBUFFERED=1
ARG application_version
RUN apt-get update && apt-get install -y libxml2 libxml2-dev \
    libxmlsec1 libxmlsec1-dev pkg-config libxmlsec1-openssl && \
    pip install --upgrade pip
RUN apt-get update && apt-get install -y libgdiplus
RUN mkdir -p /opt/dodoc/log \
                /opt/dodoc/run \
                /workspace\
                /workspace/src
COPY ./build/dev.pip /dev.pip
COPY ./build/Aspose.Words.Python.NET.lic /opt
COPY ./samples/sampleemf.docx /workspace/src
RUN pip install -r /dev.pip

COPY . /workspace/
ENV PYTHONPATH "${PYTHONPATH}:/authsettings"
WORKDIR /workspace/src

Also, for Aspose.Words to work correctly, we recommend installing libgdiplus:

RUN apt install -y libgdiplus

And optionally fonts:

RUN apt install -y wget
RUN wget http://ftp.de.debian.org/debian/pool/contrib/m/msttcorefonts/ttf-mscorefonts-installer_3.8_all.deb
RUN apt install -y ./ttf-mscorefonts-installer_3.8_all.deb

@alexey.noskov Removing that library seems to have resolved part of the issue, but I think there are still problems. Even with the small docx file that contains only 2 images, I still sometimes get job failures. When I expand the test to a larger file with more images, I get failures fairly consistently after a few successful conversions.

I have updated the docker container using your suggestions, as well as including a larger file which is still causing failures for me.

aspose-test.zip (1.5 MB)

Here is an example of the output I’m seeing:

aspose-test-background-1  | 18:06:50 background: convert.test_convert() (1f0cd660-11a2-4f37-9888-9bd8eed8406b)
aspose-test-background-1  | Working on document ../samples/sampleemf.docx
aspose-test-background-1  | Shape: Picture 1
aspose-test-background-1  | About to save
aspose-test-background-1  | NAME = Picture 1 ---- BOTTOM = 393.75 ---- LEFT = 0.0 ----
aspose-test-background-1  | 0
aspose-test-background-1  | Shape: Picture 1
aspose-test-background-1  | About to save
aspose-test-background-1  | NAME = Picture 1 ---- BOTTOM = 234.0 ---- LEFT = 0.0 ----
aspose-test-background-1  | 1
aspose-test-background-1  | Working on document ../samples/GraphPadSmartObjects.docx
aspose-test-background-1  | Shape:
aspose-test-background-1  | About to save
aspose-test-background-1  | NAME =  ---- BOTTOM = 390.5806 ---- LEFT = 0.0 ----
aspose-test-background-1  | 0
aspose-test-background-1  | Shape: Picture 5
aspose-test-background-1  | About to save
aspose-test-background-1  | NAME = Picture 5 ---- BOTTOM = 166.97133858267716 ---- LEFT = 0.0 ----
aspose-test-background-1  | 1
aspose-test-background-1  | Shape:
aspose-test-background-1  | About to save
aspose-test-background-1  | NAME =  ---- BOTTOM = 545.02595 ---- LEFT = 0.0 ----
aspose-test-background-1  | 2
aspose-test-background-1  | Shape:
aspose-test-background-1  | About to save
aspose-test-background-1  | NAME =  ---- BOTTOM = 539.25 ---- LEFT = 0.0 ----
aspose-test-background-1  | 3
aspose-test-background-1  | Shape:
aspose-test-background-1  | About to save
aspose-test-background-1  | NAME =  ---- BOTTOM = 394.7065 ---- LEFT = 0.0 ----
aspose-test-background-1  | 4
aspose-test-background-1  | Shape: TextBox 6
aspose-test-background-1  | 18:06:53 Moving job to FailedJobRegistry (Work-horse terminated unexpectedly; waitpid returned 6 (signal 6); )
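
The log above shows the work-horse dying partway through a document, so the abort takes the whole job with it. One way to narrow down which shape triggers it is to run each conversion in its own child interpreter and inspect how that child ended. The sketch below is only an illustration of that idea and is not taken from the attached project. `run_isolated` is a hypothetical helper, and the `os.abort()` call stands in for a conversion that aborts in native code.

```python
import signal
import subprocess
import sys


def run_isolated(code: str, timeout: float = 60.0):
    """Run a Python snippet in a fresh interpreter and report how it ended."""
    proc = subprocess.run(
        # -X faulthandler makes the child dump a Python traceback to stderr
        # on fatal signals such as SIGABRT.
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # A negative return code means the child was killed by that signal.
        sig = signal.Signals(-proc.returncode)
        return f"killed by {sig.name}", proc.stderr
    return f"exit code {proc.returncode}", proc.stderr


# Stand-in for a conversion that aborts inside a native library.
status, stderr_text = run_isolated("import os; os.abort()")
print(status)
```

In a real test the snippet passed to `run_isolated` would load the document and render a single shape, so that a failure can be attributed to one specific image while the parent process survives.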

@lbraguesdodoc Thank you for the additional information. We will investigate the problem further and provide you with more information.

@lbraguesdodoc We have completed our analysis of the issue.

Redis:

  1. Configuring Redis with a timeout and increased memory did not yield any results.
  2. Building a Redis image with the vm.overcommit_memory = 1 option enabled also did not yield any results.
  3. Different Docker configurations did not solve the issue either.

What we managed to find out:

  1. The behavior of your code depends on the presence of the libreoffice package. If it is present, nothing works in Redis. If it is not present, the code in Redis starts running and processes the first document.
  2. Your code works without Redis on its image.
  3. On macOS in a Docker container (without libreoffice), it doesn’t even process the first document.
  4. On Windows in a Docker container (without libreoffice), it usually crashes on the 28th iteration of the second document. However, we managed to run the queue twice without errors and convert everything successfully.
  5. If the documents are re-saved via MS Word, everything works.

Summary:
We have every reason to believe that the problem is inside Redis. There may be two possible causes:

  1. Its settings - timeouts, memory, etc.
  2. When the source documents are processed in SkiaSharp or libgdiplus, and perhaps in other libraries (don’t forget about libreoffice), certain processing methods are called that Redis doesn’t like. After the documents are re-saved, the attributes are different, and it is possible that the problematic methods are not called.

The Aspose.Words code is working. We recommend contacting Redis support or using the re-saved documents.

The issues you found earlier (filed as WORDSNET-26471) have been fixed in the Aspose.Words for .NET 24.3 update, which is also available on NuGet.