@David_Matin I have tested with your input document and still the problem is not reproducible on my side. The provided PDF document is successfully converted to Markdown.
@alexey.noskov Could this be related to parsing the file after I split it? I first use aspose-pdf (24.3.0) to split the PDF file into several PDFs by page number, and then convert those PDFs to Markdown.
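One way to check whether a particular split part triggers the failure is to convert each part in its own Python process, so that a crash in the native runtime cannot take down the whole batch. The following is only a minimal sketch: `convert_isolated` and `CONVERT_SNIPPET` are hypothetical names, and the Aspose.Words calls inside the snippet are assumed to behave like the ones in the `app.py` example from this thread.

```python
import subprocess
import sys
from pathlib import Path

# Script executed in a child process for each file (assumption: aspose-words is installed there).
CONVERT_SNIPPET = """
import sys
import aspose.words as aw
doc = aw.Document(sys.argv[1])
doc.save(sys.argv[2], aw.saving.MarkdownSaveOptions())
"""


def convert_isolated(pdf_paths, out_dir, snippet=CONVERT_SNIPPET):
    """Convert each PDF in a separate process; return {path: stderr} for the ones that failed."""
    failures = {}
    for pdf in map(Path, pdf_paths):
        target = Path(out_dir) / (pdf.stem + ".md")
        result = subprocess.run(
            [sys.executable, "-c", snippet, str(pdf), str(target)],
            capture_output=True,
            text=True,
        )
        # A non-zero exit code covers both Python exceptions and native crashes.
        if result.returncode != 0:
            failures[str(pdf)] = result.stderr.strip()
    return failures
```

Calling `convert_isolated(["part_1.pdf", "part_2.pdf"], "/temp")` (hypothetical file names) would return an empty dictionary if every part converts, or the captured error output keyed by the failing file.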
@David_Matin Could you please attach the PDF document that is passed to Aspose.Words and causes the problem? We will test with it and let you know the result.
This is the file:
https://drive.google.com/file/d/1tVqPTe_CE0-GNuGS5sh-9ErIRm_586RZ/view?usp=sharing
@David_Matin Thank you for the additional information. Unfortunately, the problem is still not reproducible on my side. Please try with the Dockerfile provided above and let us know if the problem is reproducible on your side.
@alexey.noskov Our Linux version is CentOS Linux release 7.9.2009 (Core). I reproduced the problem on a machine running this version: same code and file, same error message.
@David_Matin I tested with CentOS and still the problem is not reproducible on my side.
Dockerfile:
FROM centos/python-38-centos7
USER root
# Install ICU package.
RUN yum -y install icu
WORKDIR /usr/app/src
# Copy function code
COPY app.py ./
RUN pip install aspose-words
ENTRYPOINT [ "python3", "app.py"]
app.py:
import aspose.words as aw

# Load the PDF document.
doc = aw.Document("/temp/in.pdf")
save_options = aw.saving.MarkdownSaveOptions()
save_options.image_resolution = 300
doc.save("/temp/out.md", save_options)
This time I ran it on a Linux server, not in k8s.
@alexey.noskov After updating the version, a new error was reported:
Unhandled exception. System.TypeInitializationException: The type initializer for 'SkiaSharp.SKObject' threw an exception.
---> System.DllNotFoundException: Unable to load shared library 'libSkiaSharp' or one of its dependencies. In order to help diagnose loading problems, consider setting the LD_DEBUG environment variable: liblibSkiaSharp: cannot open shared object file: No such file or directory
at SkiaSharp.SkiaApi.sk_version_get_milestone()
at SkiaSharp.SkiaSharpVersion.get_Native()
at SkiaSharp.SkiaSharpVersion.CheckNativeLibraryCompatible(Boolean throwIfIncompatible)
at SkiaSharp.SKObject..cctor()
--- End of inner exception stack trace ---
at SkiaSharp.SKObject.DeregisterHandle(IntPtr handle, SKObject instance)
at SkiaSharp.SKObject.set_Handle(IntPtr value)
at SkiaSharp.SKNativeObject.Dispose(Boolean disposing)
at SkiaSharp.SKObject.Dispose(Boolean disposing)
at SkiaSharp.SKBitmap.Dispose(Boolean disposing)
at SkiaSharp.SKNativeObject.Finalize()
@David_Matin Do you use any other Aspose products? If so, could you please specify which and their versions.
The latest version of Aspose.Words for Python uses SkiaSharp 3.116.1. Probably another product you are using relies on an older version, and this causes the problem.
@David_Matin Thank you for the additional information. This does not look like the reason for the problem on your side, since Aspose.PDF does not use SkiaSharp. Also, I tested with both Aspose.Words and Aspose.PDF in one project and the problem is still not reproducible.
If possible, could you please try testing in a clean environment on your side? For example, in a Docker container as described above.
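When comparing a clean container with a failing environment, it can help to record which native libraries the dynamic loader can locate in each. The sketch below uses only the Python standard library. The names in `CANDIDATE_LIBS` are an assumption for illustration (ICU appears in the Dockerfile in this thread; the others are guesses) and not a list of requirements published by Aspose.

```python
import ctypes.util
import platform

# Assumed candidates only; adjust to whatever the loader error actually mentions.
CANDIDATE_LIBS = ["icuuc", "fontconfig", "freetype"]


def native_report(names=CANDIDATE_LIBS):
    """Return the libc version and the resolved name of each library (None if not found)."""
    report = {"libc": " ".join(platform.libc_ver()).strip()}
    for name in names:
        report[name] = ctypes.util.find_library(name)
    return report


if __name__ == "__main__":
    for key, value in native_report().items():
        print(f"{key}: {value}")
```

Running it in both environments and comparing the output gives a quick view of what differs before digging further with `LD_DEBUG`, as the error message suggests.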
@alexey.noskov Thank you. I checked our k8s environment:
cat /etc/os-release
PRETTY_NAME="Debian GNU/Linux 10 (buster)"
NAME="Debian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"
uname -a
Linux 56956d5b85-vjz9k 5.10.134-16.3.al8.x86_64 #1 SMP Tue Mar 26 18:54:05 CST 2024 x86_64 GNU/Linux
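For reports like the one above, the same details can be collected from inside the Python process that runs the conversion. This is a minimal sketch; `parse_os_release` is a hypothetical helper written for illustration, and it avoids `platform.freedesktop_os_release()` because that function is only available from Python 3.10.

```python
import platform
import shlex
from pathlib import Path


def parse_os_release(text):
    """Parse KEY=value lines in os-release format into a dictionary."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        parts = shlex.split(raw)  # removes surrounding quotes, if any
        info[key] = parts[0] if parts else ""
    return info


if __name__ == "__main__":
    path = Path("/etc/os-release")
    if path.exists():
        info = parse_os_release(path.read_text(encoding="utf-8"))
        print(info.get("PRETTY_NAME"), "| kernel", platform.release())
```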
@David_Matin Thank you for additional information. I have managed to reproduce the problem with Debian GNU/Linux 10 (buster).
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): WORDSPYTHON-86
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.
@David_Matin Unfortunately, the issue is not resolved yet. We will keep you informed and let you know once it is fixed.
@David_Matin This issue is not reproducible with the latest 25.6 version of Aspose.Words for Python. Please try using the latest version and let us know if the problem still persists on your side.
@alexey.noskov Hello, the error 'Unhandled exception. System.TypeInitializationException: The type initializer for 'SkiaSharp.SKObject' threw an exception.' looks fixed. But I found one PDF whose content is garbled after conversion to Markdown using aspose-words 25.6.0.
PDF:TBSA-金字塔原理:思考、寫作、解決問題的邏輯方法-2011.4.6.pdf (2.3 MB)
Markdown: https://drive.google.com/file/d/1-OUEhBAU_yrbq_5MPsdmN2C82a6Nbft7/view?usp=sharing
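To spot such files in a batch, a rough heuristic is to measure how much of the generated Markdown consists of replacement, private-use, unassigned, or control characters. This is only a sketch under the assumption that the garbling shows up as such characters; text that is wrong but made of ordinary characters will not be detected. `suspicious_ratio` is a hypothetical helper.

```python
import unicodedata

# Unicode categories treated as suspicious: private use, unassigned, control.
SUSPICIOUS_CATEGORIES = {"Co", "Cn", "Cc"}


def suspicious_ratio(text):
    """Return the share of non-whitespace characters that look like extraction debris."""
    visible = [ch for ch in text if not ch.isspace()]
    if not visible:
        return 0.0
    bad = sum(
        1
        for ch in visible
        if ch == "\ufffd" or unicodedata.category(ch) in SUSPICIOUS_CATEGORIES
    )
    return bad / len(visible)
```

A file could then be flagged for manual review when, for example, `suspicious_ratio(markdown_text)` exceeds a threshold chosen for the corpus; any specific threshold would be a judgment call.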
@David_Matin
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): WORDSNET-28366
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.