Unable to call the Document constructor passing a stream

I am unable to call the Document constructor with a stream as the parameter.
I get the error message “Specified method is not supported.”
I am using .NET 6.
Please help and advise.
Below is the code:

// stream is of type Stream, from a PDF downloaded from a website
public static string ExtractText(Stream stream)
{
    var extractedText = "";
    using (var pdfDocument = new Document(stream)) // => throws an exception with the message "Specified method is not supported."
    {
        var textAbsorber = new TextAbsorber();
        textAbsorber.ExtractionOptions = new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Pure);
        textAbsorber.ExtractionOptions.ScaleFactor = 0.4;
        pdfDocument.Pages.Accept(textAbsorber);
        extractedText = textAbsorber.Text;
    }
    return extractedText;
}

@hermangouw
Please attach the document with which this happens.

AsposePDFNet-1.png (185.6 KB)

AsposePDFNet-2.png (36.3 KB)

@sergei.shibanov
Attached are the screenshots of Visual Studio 2022 when the error occurred!

@hermangouw
With the document, I can check whether the issue is reproducible and create a task for the development team, who can then look for the cause.
Screenshots will not give enough information about the cause, even if you add a stack trace. A stack trace shows where the exception occurred, but most often we also need to know what happened beforehand and why.
That’s why I’m asking for the document. I assume you are concerned about disclosing some data? Can you remove the sensitive data from the document on your side and check whether the error still occurs? If it does, please attach the resulting document.

@sergei.shibanov
The PDF document is perfectly fine.
If I use the constructor which reads the PDF document from the hard disk, it works fine.
Attached is the document.

BC201010564.pdf (42.1 KB)

@hermangouw
What version of the library are you using?
When I try to reproduce the issue with the code below, running on .NET 6 with library version 23.9, it works without exceptions.

var extractedText = "";
using (var inputStream = File.OpenRead(dataDir + "BC201010564.pdf"))
{
    using (var pdfDocument = new Document(inputStream)) // the line reported to throw "Specified method is not supported."; no exception here
    {
        var textAbsorber = new TextAbsorber();
        textAbsorber.ExtractionOptions = new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Pure);
        textAbsorber.ExtractionOptions.ScaleFactor = 0.4;
        pdfDocument.Pages.Accept(textAbsorber);
        extractedText = textAbsorber.Text;
    }
}

I am using v23.9.0

The stream was created as follows:
public Stream GetDocument(string documentLink, string filename)
{
    var endpoint = this.baseServiceURL + documentLink;

    var docReq = WebRequest.Create(endpoint) as HttpWebRequest;
    docReq.Accept = "application/pdf";
    docReq.Method = "GET";
    docReq.ContentType = "application/x-www-form-urlencoded";
    docReq.KeepAlive = false;
    docReq.Headers.Add("Authorization", $"Bearer {this.token}");

    var docResp = docReq.GetResponse();
    var stream = docResp.GetResponseStream();

    return stream;
}

FYI, stream is of type Stream NOT FileStream.
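
A note on the stream type: the stream returned by GetResponseStream() is a network stream, and such streams normally report CanSeek as false. On a non-seekable stream, Seek, Length and setting Position throw NotSupportedException, whose default message is “Specified method is not supported.” It is an assumption here, not something confirmed in this thread, that the Document constructor needs to seek in its input. Under that assumption, a minimal sketch that copies the response into memory (nothing is written to disk) could look like this; StreamBuffering and ToSeekable are hypothetical names, not part of Aspose.PDF:

```csharp
using System;
using System.IO;

public static class StreamBuffering
{
    // Hypothetical helper: returns a stream that supports seeking.
    // Assumption: the consumer (for example, a PDF loader) needs Seek/Length/Position.
    public static Stream ToSeekable(Stream source)
    {
        if (source.CanSeek)
        {
            // FileStream and MemoryStream are already seekable; reuse them as-is.
            return source;
        }

        // Network response streams are forward-only, so copy them into memory first.
        var buffer = new MemoryStream();
        source.CopyTo(buffer);
        buffer.Position = 0; // rewind so the consumer starts at the first byte
        return buffer;
    }
}
```

With this helper, ExtractText could pass StreamBuffering.ToSeekable(stream) to new Document(...) instead of the raw response stream. The caller still owns the original stream and should dispose of it after the copy.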

@hermangouw

Sorry, I missed your comment.
In this case, it is unclear whether this is a library error or an invalid stream. The best option is to download the PDF document and save it locally, then read it into a stream and pass that stream to the Document constructor.
Does this happen only with BC201010564.pdf?
To check whether the file is being read correctly from the web, you can compare the resulting stream with a previously saved local file.
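
One way to make that comparison is to hash both sources and compare the digests. The sketch below is only an illustration; StreamComparison, Sha256Hex and SameContent are hypothetical names. Note that hashing reads the stream to its end, so for a forward-only web stream it should be done on a separate download or on a buffered copy.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class StreamComparison
{
    // Reads the stream from its current position to the end and returns the SHA-256 digest as hex.
    public static string Sha256Hex(Stream stream)
    {
        using var sha = SHA256.Create();
        return Convert.ToHexString(sha.ComputeHash(stream));
    }

    // Returns true when the downloaded stream and the local file contain the same bytes.
    // Consumes the downloaded stream.
    public static bool SameContent(Stream downloaded, string localPath)
    {
        using var local = File.OpenRead(localPath);
        return Sha256Hex(downloaded) == Sha256Hex(local);
    }
}
```

If the digests differ, the download itself is altering or truncating the data; if they match, the bytes are fine and the difference lies in how the stream is handed to the library.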

@sergei.shibanov

Sorry for the late reply; I wasn’t feeling well for the last two days.
This is what I am trying to achieve, to be deployed as an AWS Lambda function:

  1. Get the content of a PDF from our internal API as a Stream
  2. Pass the Stream to Aspose PDF Document object.
  3. Extract the text from the Document object
  4. Extract the properties from the Document object

I DO NOT want to store the PDF locally as it is on AWS.

The PDF I sent you was produced from the Stream from our internal API so the Stream is a valid one.

This is one case (which the constructor handles without an exception). Just for fun, I slightly changed the code that I posted above, and there is still no exception.

var extractedText = "";
Stream aStream = File.OpenRead(dataDir + "BC201010564.pdf");
MemoryStream mStream = new MemoryStream(File.ReadAllBytes(dataDir + "BC201010564.pdf"));
using (var pdfDocument = new Document(mStream)) // the line reported to throw "Specified method is not supported."; no exception here
{
    var textAbsorber = new TextAbsorber();
    textAbsorber.ExtractionOptions = new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.Pure);
    textAbsorber.ExtractionOptions.ScaleFactor = 0.4;
    pdfDocument.Pages.Accept(textAbsorber);
    extractedText = textAbsorber.Text;
}

This does not rule out data being read incorrectly in some other cases, or the library failing in some cases. To catch such a situation, add debugging code so that if the constructor throws an exception, the data from the stream is written to a file from which the problem can then be reproduced.
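
A minimal sketch of such debugging code follows. FailureCapture, LoadOrDump and the dump path are hypothetical, and the loader is passed in as a delegate so the helper does not depend on any particular library. Because the input is buffered before loading, the loader receives a seekable MemoryStream; if the exception disappears with this wrapper in place, that in itself suggests the original failure was related to the stream type rather than to the PDF data.

```csharp
using System;
using System.IO;

public static class FailureCapture
{
    // Buffers the input, runs the loader on the buffered copy, and writes the
    // bytes to dumpPath if the loader throws, so the failure can be reproduced later.
    public static T LoadOrDump<T>(Stream source, Func<Stream, T> load, string dumpPath)
    {
        var buffer = new MemoryStream();
        source.CopyTo(buffer);
        buffer.Position = 0;

        try
        {
            // The buffer is intentionally not disposed: the loaded object may keep reading from it.
            return load(buffer);
        }
        catch (Exception)
        {
            File.WriteAllBytes(dumpPath, buffer.ToArray());
            throw; // rethrow so the caller still sees the original exception
        }
    }
}
```

A call could look like FailureCapture.LoadOrDump(stream, s => new Document(s), "failed-input.pdf"), with the dumped file attached to the report when an exception occurs.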

We need a reproducible case. Can you provide the missing parts of the code fragment given earlier (the parts that the development environment flags as errors)?
image.png (20.7 KB)