Support non-seekable streams in Aspose.Pdf

Many streams do not support seeking, i.e. CanSeek returns false and Seek() throws an exception.

For example, when reading a stream from a database.

In these cases the only workaround is to read the entire stream into a MemoryStream. The problem with this is that the entire byte array needs to be read into memory. For large files this severely affects the performance of an app and has a negative impact on the GC.
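
For reference, the workaround described above could look roughly like the following. This is a minimal sketch, not Aspose code: the class name SeekableStreams and the method name EnsureSeekable are hypothetical, and the helper uses only standard System.IO calls. It buffers the stream only when CanSeek is false.

```csharp
using System.IO;

// Hypothetical helper (not part of Aspose.Pdf): returns a stream that supports seeking.
static class SeekableStreams
{
    public static Stream EnsureSeekable(Stream source)
    {
        // Seekable streams can be passed through untouched.
        if (source.CanSeek)
        {
            return source;
        }

        // Forward-only stream: copy everything into memory.
        // This is the step that costs memory for large files.
        var buffer = new MemoryStream();
        source.CopyTo(buffer);
        buffer.Position = 0; // rewind so the consumer reads from the start
        return buffer;
    }
}
```

The result can then be passed to the Document(Stream) constructor used in the test later in this thread. The caller is still responsible for disposing both the original stream and the returned one.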

Can support for non-seekable streams be added?

@simoncropp

Thanks for contacting support.

Would you kindly share a sample PDF document along with a code snippet with which we can observe the performance issue and the need for non-seekable streams? We will investigate it in our environment and address it accordingly.

    [Test]
    public void Run()
    {
        using (var inputStream = File.OpenRead(docPath))
        {
            var pdfDocument = new Document(new NonSeekableStream(inputStream));
        }
    }

// Wraps any stream and reports it as forward-only, to reproduce the issue.
class NonSeekableStream : Stream
{
    Stream inner;

    public NonSeekableStream(Stream inner)
    {
        this.inner = inner;
    }

    public override void Flush()
    {
        inner.Flush();
    }

    // NotSupportedException is what the Stream contract specifies for unsupported operations.
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new System.NotSupportedException();
    }

    public override void SetLength(long value)
    {
        throw new System.NotSupportedException();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return inner.Read(buffer, offset, count);
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new System.NotSupportedException();
    }

    public override bool CanRead => inner.CanRead;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => inner.Length;

    public override long Position
    {
        get => inner.Position;
        set => throw new System.NotSupportedException();
    }
}
@simoncropp

A feature request has been logged in our issue tracking system as PDFNET-47805. We will further investigate the feasibility of required support and keep you posted with the status of its availability. Please be patient and spare us some time.

We are sorry for the inconvenience.

Any update on this? It currently breaks for SqlSequentialStream.

Note the same bug now also occurs in the Words API.

@simoncropp

We are afraid that the earlier logged ticket has not yet been resolved. We will surely inform you via this forum thread once we have some updates regarding feature implementation.

We request you please create a post in Aspose.Words forum category where your concerns will be addressed accordingly.

Really? Given this bug has been open for 3 years, I doubt it.

@simoncropp

Please accept our humble apology for the delay. The earlier logged ticket is more like an enhancement. Also, every Aspose API is different and possesses its own limitations and restrictions when implementing any feature or enhancement. This feature would not necessarily face the same implementation restrictions in Aspose.Words that it is facing in Aspose.PDF.

Nevertheless, the ticket priority has already been raised and your comments have been attached with it as well. We will try to escalate the investigation and will surely inform you once we have some updates. We again apologize for the inconvenience faced.

This is not an enhancement. Many streams are forward only. The most common one is SqlSequentialStream. Aspose not respecting Stream.CanSeek is a bug. You should add non-seekable streams as a test to every product in your suite.

@simoncropp

We have taken your feedback into account and will certainly consider it during ticket resolution.

@simoncropp

Unfortunately, non-seekable streams contradict the logic of PDF documents. At the end of a document, there is a cross-reference table that contains information allowing random access to indirect PDF objects within the file. Without seeking in a stream, random access becomes impossible. Therefore, supporting non-seekable streams would require a major architecture redesign of the library. The only workaround would be to copy the non-seekable stream into a memory stream or, to reduce memory consumption, into a temporary file stream.
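
The temporary file variant of that workaround might be sketched as follows. This is an illustration under assumptions rather than library code: the names TempFileBuffering and ToTempFile are hypothetical, and only standard System.IO calls are used. The file is opened with FileOptions.DeleteOnClose so that it is removed when the returned stream is disposed.

```csharp
using System.IO;

// Hypothetical helper (not part of Aspose.Pdf): spools a forward-only stream to disk.
static class TempFileBuffering
{
    public static FileStream ToTempFile(Stream source)
    {
        // GetTempFileName creates an empty file and returns its path.
        var path = Path.GetTempFileName();
        var temp = new FileStream(
            path,
            FileMode.Open,
            FileAccess.ReadWrite,
            FileShare.None,
            81920, // buffer size in bytes
            FileOptions.DeleteOnClose);
        try
        {
            // Streams the data to disk in chunks instead of holding it all in memory.
            source.CopyTo(temp);
            temp.Position = 0; // rewind so the consumer reads from the start
            return temp;
        }
        catch
        {
            // Disposing also deletes the temporary file.
            temp.Dispose();
            throw;
        }
    }
}
```

The returned FileStream is seekable, so it can be handed to the Document(Stream) constructor shown earlier in the thread. This trades memory pressure for disk I/O, which tends to be the better deal for large files.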