Aspose.Words for Python not working in the python:3.11 Docker image

Hi there,

We’re evaluating a move from LibreOffice to Aspose.Words for Python. Unfortunately, we’re running into problems right away. We’re using the official python:3.11 Docker image.

Dockerfile

FROM python:3.11

WORKDIR /home
COPY some.docx ./
COPY program.py ./

RUN pip install aspose-words

CMD ["python", "program.py"]

program.py

import aspose.words as aw

aw.Document("some.docx").save("some.pdf")

Building the image and running it fails with the following error message:

Process terminated. Couldn't find a valid ICU package installed on the system. Set the configuration flag System.Globalization.Invariant to true if you want to run with no globalization support.
   at System.Environment.FailFast(System.String)
   at System.Globalization.GlobalizationMode.GetGlobalizationInvariantMode()
   at System.Globalization.GlobalizationMode..cctor()
   at System.Globalization.CultureData.CreateCultureWithInvariantData()
   at System.Globalization.CultureData.get_Invariant()
   at System.Globalization.CultureInfo..cctor()
   at System.StringComparer..cctor()
   at System.StringComparer.get_OrdinalIgnoreCase()
   at System.Text.EncodingTable..cctor()
   at System.Text.EncodingTable.GetCodePageFromName(System.String)
   at System.Text.CodePagesEncodingProvider.GetEncoding(System.String)
   at System.Text.EncodingProvider.GetEncodingFromProvider(System.String)
   at System.Text.Encoding.GetEncoding(System.String)
   at Aspose.WrpGen.Interop.GenericMarshaler..cctor()
   at Aspose.WrpGen.Interop.GenericMarshaler.ToString(Aspose.WrpGen.Interop.PyStringArg*)
   at WrpNs_Aspose.WrpNs_Words.WrpCs_Document_44597C31.ctor_001_Document(Aspose.WrpGen.Interop.PyStringArg*)

According to what I found online, this error can be resolved by adding

ENV DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1

to the Dockerfile, but I don’t understand the consequences of this env variable. Running the modified image, the error changes to:

No usable version of libssl was found

And I have not found a way to install libssl1 on the python:3.11 image.

That means that Aspose.Words for Python is not usable via the official python docker image. Can anyone help with a workaround?
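The libssl message suggests that the runtime bundled with the package could not load an OpenSSL version it supports. A minimal sketch for seeing what the container actually provides, using only the Python standard library; the helper name `openssl_report` is made up for illustration, and the assumption that the image ships OpenSSL 3 rather than 1.1 should be checked against the printed output:

```python
import ctypes.util
import ssl


def openssl_report():
    """Collect basic facts about the OpenSSL libraries visible in this environment."""
    return {
        # The OpenSSL build that the Python interpreter itself was linked against.
        "python_links_against": ssl.OPENSSL_VERSION,
        # The libssl file name the system linker would resolve, or None if none is found.
        "system_libssl": ctypes.util.find_library("ssl"),
    }


if __name__ == "__main__":
    for key, value in openssl_report().items():
        print(f"{key}: {value}")
```

If the output only mentions a 3.x library, that would be consistent with the runtime looking for a 1.x version that is not installed.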

@tjmaul You can install libssl1 using the following command:

wget http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.0g-2ubuntu4_amd64.deb \
&& sudo dpkg -i libssl1.1_1.1.0g-2ubuntu4_amd64.deb

Here is a working Dockerfile:

FROM python:3.11 AS base

ENV DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1
RUN apt-get update && apt-get install -y libgdiplus

RUN wget http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.0g-2ubuntu4_amd64.deb 
RUN dpkg -i libssl1.1_1.1.0g-2ubuntu4_amd64.deb

RUN pip install aspose-words

# Copy function code
COPY app.py ./

ENTRYPOINT ["python3", "app.py"]

Thanks for the quick response, @alexey.noskov. I’ll try that first thing in the morning. Can you say something about the consequences of setting ENV DOTNET_SYSTEM_GLOBALIZATION_INVARIANT=1?

@tjmaul Please see the following article to learn about this option:
https://github.com/dotnet/runtime/blob/main/docs/design/features/globalization-invariant-mode.md

The relevant part for you is the following:

APP behavior with and without the invariant config switch

If the invariant config switch is not set or it is set false

  • The framework will depend on the OS for the globalization support.
  • On Linux, if the ICU package is not installed, the application will fail to start.

So to make the application work, you need to either install the ICU package or enable this option.

@alexey.noskov

After entering a shell inside the container, the command apt list | grep icu produces the following output:

gophernicus/stable 3.1.1-3+b1 amd64
icu-devtools/stable,now 72.1-3 amd64 [installed,automatic]
icu-doc/stable 72.1-3 all
libghc-text-icu-dev/stable 0.7.1.0-1+b2 amd64
libghc-text-icu-doc/stable 0.7.1.0-1 all
libghc-text-icu-prof/stable 0.7.1.0-1+b2 amd64
libharfbuzz-icu0/stable 6.0.0+dfsg-3 amd64
libicu-dev/stable,now 72.1-3 amd64 [installed,automatic]
libicu4j-4.4-java/stable 4.4.2.2-4 all
libicu4j-java/stable 72.1-1 all
libicu72/stable,now 72.1-3 amd64 [installed,automatic]
libploticus0-dev/stable 2.42-5 amd64
libploticus0/stable 2.42-5 amd64
php-symfony-polyfill-intl-icu/stable 1.27.0-2 all
ploticus/stable 2.42-5 amd64
postgresql-15-icu-ext/stable 1.6.2-4+b2 amd64
python3-icu/stable 2.10.2-1+b3 amd64
r-cran-reticulate/stable 1.28+dfsg-1 amd64
yaz-icu/stable 5.34.0-1 amd64

So it seems libicu72 is installed. Is Aspose.Words maybe expecting a different version?
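To answer the version question from inside the container without relying on apt, a minimal sketch that scans the usual library directories for versioned ICU files. The directory list is an assumption based on Debian/Ubuntu conventions for amd64, and `installed_lib_versions` is a hypothetical helper name:

```python
import re
from pathlib import Path

# Typical shared-library locations on Debian/Ubuntu amd64 (an assumption;
# adjust for other distributions or architectures).
DEFAULT_LIB_DIRS = (
    "/usr/lib/x86_64-linux-gnu",
    "/lib/x86_64-linux-gnu",
    "/usr/lib",
    "/usr/local/lib",
)


def installed_lib_versions(stem, lib_dirs=DEFAULT_LIB_DIRS):
    """Return sorted major versions of files named <stem>.so.<major>[.<minor>...]."""
    pattern = re.compile(re.escape(stem) + r"\.so\.(\d+)")
    versions = set()
    for directory in lib_dirs:
        path = Path(directory)
        if not path.is_dir():
            continue  # skip locations that do not exist in this image
        for entry in path.iterdir():
            match = pattern.match(entry.name)
            if match:
                versions.add(int(match.group(1)))
    return sorted(versions)


if __name__ == "__main__":
    print("ICU major versions found:", installed_lib_versions("libicuuc"))
```

Running this before and after changing the image shows which ICU major versions the runtime could pick up.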

Also, I read the information about globalization invariant mode, but I still don’t understand what exactly the consequences are. If I, for example, convert a DOCX to PDF or TXT, will this work flawlessly for non-ASCII text?

I’d rather have the library fully working before continuing my research.

EDIT: After fiddling around some more, I can already tell that saving a German Word document to TXT fails with

RuntimeError: Proxy error(CultureNotFoundException): Culture is not supported. (Parameter 'culture')
1033 (0x0409) is an invalid culture identifier.
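The identifier in this message is a Windows locale identifier (LCID): 1033, or 0x0409, is the LCID for en-US. The error appears consistent with invariant mode, where the runtime only knows the invariant culture, although that reading is an assumption rather than something confirmed in this thread. A small sketch that decodes the number into its documented bit fields (`split_lcid` is a hypothetical helper name):

```python
def split_lcid(lcid):
    """Split a Windows locale identifier (LCID) into its bit fields."""
    lang_id = lcid & 0xFFFF        # low 16 bits: language identifier
    sort_id = (lcid >> 16) & 0xF   # bits 16-19: sort order identifier
    primary = lang_id & 0x3FF      # low 10 bits of the language ID: primary language
    sub = lang_id >> 10            # high 6 bits of the language ID: sublanguage
    return {"lang_id": lang_id, "sort_id": sort_id, "primary": primary, "sub": sub}


fields = split_lcid(1033)
# Primary language 0x09 is English and sublanguage 1 is the US variant.
print(hex(fields["lang_id"]), hex(fields["primary"]), fields["sub"])  # prints: 0x409 0x9 1
```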

@tjmaul We will investigate the problem and get back to you soon.

@tjmaul

Could you please attach the document that causes the problem upon conversion and provide the code you use for testing? I have tested with a dummy document and cannot reproduce the problem on my side.

Hi @alexey.noskov ,

I created a repository showcasing the issue:

https://github.com/fi4sk0/aw-python311-docker

The file in question is GESSI-Exit-Muster-SPA.docx, which is a common legal document. Please note that it does work on my local machine (Ubuntu 22.04). It just doesn’t work in the Docker container.

To execute the test, a valid license file Aspose.WordsforPythonvia.NET.lic is needed and has to be placed in the root directory of the repo.

@tjmaul Thank you for the additional information. I managed to reproduce the problem on my side. It is a strange problem. It looks like .NET does not work with the newer version of ICU. I tried installing an older version and the code works. Here is an updated Dockerfile which works with your document:

FROM python:3.11 AS base

RUN apt-get update && apt-get install -y libgdiplus
# Install libssl1
RUN wget http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.0g-2ubuntu4_amd64.deb 
RUN dpkg -i libssl1.1_1.1.0g-2ubuntu4_amd64.deb
# install an older version of ICU
RUN wget http://archive.ubuntu.com/ubuntu/pool/main/i/icu/libicu70_70.1-2_amd64.deb
RUN dpkg -i libicu70_70.1-2_amd64.deb

RUN pip install aspose-words

# Copy function code
COPY app.py ./

ENTRYPOINT ["python3", "app.py"]

Could you please try it on your side and let us know if it works?
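When trying an image like this, it can help to confirm that the conversion really produced a PDF and not an empty or truncated file. A minimal sketch, assuming the output is written to some.pdf as in the original program.py; `looks_like_pdf` is a hypothetical helper that only checks the file header, not the document contents:

```python
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the file exists and starts with the PDF header bytes."""
    file = Path(path)
    if not file.is_file():
        return False
    with file.open("rb") as handle:
        # Every PDF file begins with the marker "%PDF-" followed by a version.
        return handle.read(5) == b"%PDF-"


if __name__ == "__main__":
    print("some.pdf looks like a PDF:", looks_like_pdf("some.pdf"))
```

A header check is deliberately cheap. It does not prove that non-ASCII text was rendered correctly, which still needs a visual check of the output.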

hey @alexey.noskov,
I just tried and it works. It also explains why it works on my local Ubuntu desktop install, because the installed version there is libicu70.

It seems that the GitHub issue ".Net 8 not installable on Ubuntu 24.04" (Issue #39907, dotnet/sdk) is related to that. They also mention libicu72 as the most recent libicu that works.

Please note that I’m not interested in running the latest libicu. I just want it to work in general and not be dragged into this rabbit hole :slight_smile:

Thank you, @alexey.noskov, for your help! I guess this workaround is good enough for now, and I assume this problem will resolve itself with newer versions of both dotnet and Aspose.Words. It might be worth communicating this problem to the developers, though.

@tjmaul It is great that the workaround works for you. Please feel free to ask in case of any other issues.
I have also logged an issue with the above analysis, and we will investigate further whether we can improve this somehow.
