Not able to extract text from PDF

nneelamjjain1 · December 3, 2025, 6:27am

I have installed Aspose.PDF on Debian (bookworm). Please refer to the attached PDF. The text of pages 35 and 36 is not extracted.
bad2.pdf (144.7 KB)

asad.ali · December 3, 2025, 5:34pm

@nneelamjjain1

Are you using the latest API version? Could you please share the code snippet you used to extract the pages and describe any error you noticed? We will test the scenario in our environment and address it accordingly.

nneelamjjain1 · December 3, 2025, 6:06pm

I installed it on Debian with pip install aspose-pdf. As I mentioned earlier, there is no error. The issue is missing text: for example, the text of page 35 is missing.
Library installed:
“Name: aspose-pdf
Version: 25.11.0
Summary: Aspose.PDF for Python via .NET is a PDF Processing library to perform document management can easily be used to generate, modify, convert, render, secure and print documents without using Adobe Acrobat.”
The code I used:
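import aspose.pdf as apdf

# `path` points to the input PDF
doc = apdf.Document(str(path))
# A single absorber collects the text of every page in the document
absorber = apdf.text.TextAbsorber()
doc.pages.accept(absorber)
extracted = absorber.text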
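
For anyone trying to reproduce this, a per-page variant of the snippet above may help show which pages come back empty. This is only a sketch, not a fix. It assumes that the page collection is 1-based, that `len(doc.pages)` returns the page count, and that a single page accepts a `TextAbsorber` via `accept`. `extract_text_per_page` and `find_empty_pages` are hypothetical helper names introduced here for illustration.

```python
from typing import Dict, List


def extract_text_per_page(pdf_path: str) -> Dict[int, str]:
    """Return {page_number: extracted_text}, using a fresh TextAbsorber per page."""
    # Imported lazily so find_empty_pages stays usable without aspose-pdf installed.
    import aspose.pdf as apdf

    doc = apdf.Document(pdf_path)
    texts: Dict[int, str] = {}
    # Assumption: pages are indexed from 1 and len() gives the page count.
    for number in range(1, len(doc.pages) + 1):
        absorber = apdf.text.TextAbsorber()
        doc.pages[number].accept(absorber)
        texts[number] = absorber.text or ""
    return texts


def find_empty_pages(texts: Dict[int, str]) -> List[int]:
    """Hypothetical helper: page numbers whose extracted text is blank."""
    return sorted(n for n, t in texts.items() if not t.strip())
```

Calling `find_empty_pages(extract_text_per_page("bad2.pdf"))` would list the pages for which nothing was extracted, which could then be compared against pages 35 and 36.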

asad.ali · December 4, 2025, 6:14pm

@nneelamjjain1

We are checking it and will get back to you soon.