Free Support Forum - aspose.com

Exception of type 'System.OutOfMemoryException' was thrown when uploading a document


#1

Hello,

Attempting to upload the file (removed) causes an Exception of type ‘System.OutOfMemoryException’ to be thrown. Stack trace below.

Exception of type ‘System.OutOfMemoryException’ was thrown.
at System.IO.MemoryStream.set_Capacity(Int32 value)
at System.IO.MemoryStream.EnsureCapacity(Int32 value)
at System.IO.MemoryStream.Write(Byte[] buffer, Int32 offset, Int32 count)
at System.IO.StreamWriter.Flush(Boolean flushStream, Boolean flushEncoder)
at System.IO.StreamWriter.Write(Char[] buffer, Int32 index, Int32 count)
at System.IO.TextWriter.WriteLine(String value)
at .(Stream , Encoding )
at .()
at …ctor(List`1 , Rectangle , TextExtractionOptions )
at .(TextExtractionOptions )
at Aspose.Pdf.Text.TextAbsorber.( , Boolean )
at Aspose.Pdf.Text.TextAbsorber.Visit(Page page)
at Aspose.Pdf.PageCollection.Accept(TextAbsorber visitor)
at CMS.BusinessLayer.ContentFileManager.ExtractPdfContent(Content fileContent, CmsFile oFile) in C:\Dev\Master\IrmsWeb\src\Cms\CMS.BusinessLayer\Content\ContentFileManager.vb:line 886

A similar instance has been reported in…

Is there any resolution to this issue?

Thank you,
Krassimir


#2

@kmanol

Thanks for contacting support.

We tested the scenario using the following code snippet with Aspose.PDF for .NET 18.12 and were unable to reproduce the exception.

Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document("D:\\Enoxaparin MDV 451429A.pdf");
TextAbsorber ta = new TextAbsorber();
pdfDocument.Pages.Accept(ta);

Would you please share the complete code snippet you are using when the issue occurs? Please also share your environment details so that we can test the scenario in our environment and address it accordingly. We tested in the following environment: Windows 10 EN x64, Console App x64 Debug Mode, .NET Core 2.1, Visual Studio 2017 Community Edition, with 8GB of RAM installed.


#3

@asad.ali

Thank you for your quick response! Here is the requested information:

Complete Code Snippet

Private Shared Sub ExtractPdfContent(ByVal fileContent As Content, ByVal oFile As CmsFile)
    Dim inFile As String = Nothing
    inFile = oFile.FilePath & fileContent.FileName
    Dim impersonateUser As OBA.Core.Security.SecureAccessUser = OBA.Core.Security.SecureAccessUser.GetSecureAccessUser()

    Using New OBA.Core.Security.Impersonator(impersonateUser.UserName, impersonateUser.Domain, impersonateUser.Password)
        Try
            'open document
            Dim doc As New Aspose.Pdf.Document(inFile)

            'create TextAbsorber object to extract text
            Dim textAbsorber As New TextAbsorber()

            'accept the absorber for all the pages
            doc.Pages.Accept(textAbsorber)

            'get the extracted text
            Dim extractedText As String = textAbsorber.Text

            fileContent.Text = extractedText
            fileContent.IsCustomDocumentText = False
        Catch ex As System.IO.IOException
            If ex.Message.StartsWith("Wrong text extracting, please check your pdf") Then
                If ContentManager.AllowDocumentTextEditForContent() Then
                    SessionFeedback.SetFeedback(Resource.Resource.ID_PROBLEMEXTRACTINGTEXT, SessionFeedback.FeebackMode.Information)
                Else
                    SessionFeedback.SetFeedback(Resource.Resource.ID_FILETEXTNOTEXTRACTED, SessionFeedback.FeebackMode.Information)
                End If
            End If
        Catch ex As Exception
            Throw
        End Try
    End Using
End Sub

Environment Details

Windows 10 EN x64
ASP.NET x64 Debug Mode
Microsoft .NET Framework 4.7.1
Visual Studio 2017 Professional Edition
12GB of RAM installed.


#4

@kmanol

Thanks for sharing the requested information.

We tested the scenario again in a configuration similar to the one you shared and were not able to replicate the issue. Please note that we need to replicate the issue on our side in order to address it. Would you please share a sample application that reproduces the issue? We will test it in our environment and address it accordingly.


#5

@asad.ali

We have identified that the Accept method call is responsible for the large performance impact experienced. Upon further investigation, we received mixed results in replicating the timeout, which seemingly depended on where we executed the code (local vs server). Nonetheless, even in success, the performance we received was undesirable.

Also, we found another related thread which indicated that this issue has recently been addressed?

That said, do you have any suggestions on how we can improve the performance of the code snippet provided?

Thank you,
Krassimir


#6

@kmanol

Thanks for getting back to us.

The issue in the post you linked was related to huge operator collections on particular pages of the document. We have already improved TextAbsorber to handle this kind of large document. In case it helps, you can use TextFormattingMode.MemorySaving in TextExtractionOptions when initializing TextAbsorber. It is almost the same as ‘Raw’ mode but works slightly faster and uses less memory.

Please initialize TextAbsorber as follows:

TextAbsorber absorber = new TextAbsorber(new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.MemorySaving));

You can further reduce memory consumption by processing the document page by page and manually calling Dispose on each page object once it has been processed.

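TextAbsorber absorber = new TextAbsorber(new TextExtractionOptions(TextExtractionOptions.TextFormattingMode.MemorySaving));
// The using block disposes the document when it exits, so no separate doc.Dispose() call is needed.
using (Aspose.Pdf.Document doc = new Aspose.Pdf.Document(myDir + "input.pdf"))
{
    foreach (Page page in doc.Pages)
    {
        // Extract the text of this page, then release the page's resources.
        page.Accept(absorber);
        page.Dispose();
    }
}
// Read the extracted text collected by the absorber.
string text = absorber.Text;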
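
Since the code shared earlier in this thread is VB.NET, the same per-page approach might look roughly like the sketch below. It is an untested translation of the C# snippet above, not a definitive implementation: the module name PerPageExtraction and the helper ExtractText are hypothetical names chosen for illustration, and inFile stands for the full path that the original routine builds.

```vbnet
Imports Aspose.Pdf
Imports Aspose.Pdf.Text

Module PerPageExtraction
    ' Hypothetical helper (name is an assumption): returns the text of the PDF at inFile.
    Function ExtractText(ByVal inFile As String) As String
        ' MemorySaving mode, as suggested above, to reduce memory use during extraction.
        Dim options As New TextExtractionOptions(TextExtractionOptions.TextFormattingMode.MemorySaving)
        Dim absorber As New TextAbsorber(options)

        ' The Using block disposes the document when it exits.
        Using doc As New Document(inFile)
            For Each page As Page In doc.Pages
                ' Extract the text of this page, then release the page's resources.
                page.Accept(absorber)
                page.Dispose()
            Next
        End Using

        Return absorber.Text
    End Function
End Module
```

In the routine from post #3, the returned string would take the place of the value currently assigned to extractedText. In an ASP.NET project the type may need to be written as Aspose.Pdf.Page to avoid a clash with System.Web.UI.Page.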

If you still face any issue, please let us know and we will help you further.


#7

@asad.ali

Thank you for this response!

I have implemented the suggested changes and can confirm that performance has improved (and the System.OutOfMemoryException is no longer thrown).

Best regards,
Krassimir


#8

@kmanol

Thanks for your feedback.

It is good to know that your issue has been resolved by implementing the suggested approach. Please keep using our API, and if you face any issue, feel free to contact us.