Getting error while reading large MBox file using Aspose.Email for .NET

Hello,

I am getting the error below while reading a large MBox file using Aspose.Email. The file is about 4 GB in size.

System.OutOfMemoryException: Array dimensions exceeded supported range.
at System.Collections.Generic.List`1.set_Capacity(Int32 value)
at System.Collections.Generic.List`1.AddWithResize(T item)
at Aspose.Email.Storage.Mbox.MboxrdStorageReader.GetTotalItemsCount()

@J_Z
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): EMAILNET-41586

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.

Hi @J_Z ,
We have investigated and confirmed that Aspose.Email works correctly with large files (over 6 GB). The issue appears to be with your specific file, most likely due to a missing line break. If you share your file via Dropbox, Google Drive, or another file-sharing service, we will analyze it and provide a solution.

Hello,

Thanks for your prompt reply.
Actually it is a client file so we can’t share it.

I think the issue is that the counter variable in the code is an Int32 and its value may be exceeding that range for this file. Alternatively, too much data may be held in memory just to count the items.
Can you investigate which other scenarios could cause the GetTotalItemsCount() method to run out of memory?

Thanks,
J

Hi @J_Z

  1. As we wrote above, the problem is not with the Int32 type but with the contents of your file.
  2. Could you check your file type using our utility from the Aspose.Email API like this:
            Aspose.Email.FileFormatInfo formatInfo = Aspose.Email.Tools.FileFormatUtil.DetectFileFormat("largefile.mbox");
            Aspose.Email.FileFormatType type = formatInfo.FileFormatType;
            Assert.AreEqual(FileFormatType.Mbox, type);
  3. The problem is probably in the line feed characters (they are either non-standard or missing). Unfortunately, we cannot help without the file, or at least a part of it (possibly without sensitive information).

Hello @alexander.pavlov ,

I tried this API and got FileFormatType.Mbox for this file.

Thanks,
J

Hi @J_Z
Could you please run this code to determine the longest line in your file and send us that value?

        public static long GetMaxLineLength(string filePath)
        {
            long maxLength = 0;
            long currentLength = 0;

            using (var fs = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read, bufferSize: 8192, options: FileOptions.SequentialScan))
            {
                int b;
                int? prevByte = null;
                while ((b = fs.ReadByte()) != -1)
                {
                    if (b == '\n')
                    {
                        // In a "\r\n" pair the line was already closed at '\r', so skip the '\n'.
                        if (prevByte != '\r')
                        {
                            maxLength = Math.Max(maxLength, currentLength);
                            currentLength = 0;
                        }
                    }
                    else if (b == '\r')
                    {
                        maxLength = Math.Max(maxLength, currentLength);
                        currentLength = 0;
                    }
                    else
                    {
                        currentLength++;
                    }

                    prevByte = b;
                }

                maxLength = Math.Max(maxLength, currentLength);

                return maxLength;
            }
        }

Hello,

maxLength = 3180331038 is what I got for this.

Let me know if you need more details.

Thanks,
J
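
For context, the reported value can be compared with the Int32 range. The following is only a minimal sketch: it assumes (the thread does not confirm this) that the reader collects the bytes of one line in a List<T>, whose capacity is an Int32. The class and method names are hypothetical and are not part of Aspose.Email.

```csharp
using System;

public static class LineLengthCheck
{
    // Longest line length (in bytes) reported in this thread.
    public const long ReportedMaxLineLength = 3180331038L;

    // True when the byte count is larger than any Int32 index or capacity.
    public static bool ExceedsInt32Range(long byteCount) => byteCount > int.MaxValue;

    // How many bytes lie beyond the Int32 limit (0 when the count fits).
    public static long BytesBeyondInt32(long byteCount) =>
        Math.Max(0L, byteCount - int.MaxValue);
}
```

LineLengthCheck.ExceedsInt32Range(LineLengthCheck.ReportedMaxLineLength) is true, because 3180331038 is greater than Int32.MaxValue (2147483647). Under the assumption above, a single line of that size could not fit in one list regardless of available memory.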

Hi @J_Z ,
Thank you for the update. We will review your case, but without a sample file, we cannot guarantee an accurate result.

Hello,

Thanks for the update.
I can provide a portion of the file if you need it.
I can’t share the entire file because of client confidentiality.

You can let me know which portion you need.

Thanks,
J

Hi @J_Z
We need the portion of the file located between two From header lines in mbox format, such as:
From - Fri Feb 25 12:20:04 2011 (dates are just examples).

This section must contain the longest line (maxLength = 3180331038) that we identified previously. It can be located by refining the code above.
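
One possible refinement of GetMaxLineLength is sketched here. It records the byte offset at which the longest line starts, in addition to its length. The class name MboxLineScanner and the method FindLongestLine are hypothetical helpers written for this illustration; they are not part of the Aspose.Email API.

```csharp
using System;
using System.IO;

public static class MboxLineScanner
{
    // Returns the byte offset and byte length of the longest line in the file.
    // A line ends at "\n", "\r", or "\r\n"; terminators are not counted.
    public static (long Offset, long Length) FindLongestLine(string filePath)
    {
        long bestOffset = 0, bestLength = 0;
        long lineStart = 0, currentLength = 0, position = 0;

        using (var fs = new FileStream(filePath, FileMode.Open, FileAccess.Read,
                   FileShare.Read, bufferSize: 1 << 16, options: FileOptions.SequentialScan))
        {
            int b;
            while ((b = fs.ReadByte()) != -1)
            {
                position++; // number of bytes consumed so far
                if (b == '\n' || b == '\r')
                {
                    // For "\r\n" the '\n' arrives with currentLength == 0,
                    // so it only moves lineStart past the terminator.
                    if (currentLength > bestLength)
                    {
                        bestLength = currentLength;
                        bestOffset = lineStart;
                    }
                    currentLength = 0;
                    lineStart = position;
                }
                else
                {
                    currentLength++;
                }
            }
        }

        // The last line may have no terminator at all.
        if (currentLength > bestLength)
        {
            bestLength = currentLength;
            bestOffset = lineStart;
        }
        return (bestOffset, bestLength);
    }
}
```

Calling MboxLineScanner.FindLongestLine("largefile.mbox") returns both values as a tuple. The offset shows whether the long line sits in the middle of the file or at its end, which helps decide which portion to cut out.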

In a text editor, it looks approximately like this:


From - Fri Feb 25 12:20:04 2011
Received: from WIN-2BKCUEDGIIG ([127.0.0.1]) by agco.cataligent.net with Microsoft SMTPSVC(7.5.7600.16601);
Thu, 11 Nov 2010 07:19:28 -0600
MIME-Version: 1.0
From: noreply@agco.cataligent.net
To: info@cataligent.com
Reply-To: feedback@cataligent.com
Date: 11 Nov 2010 07:19:28 -0600
Subject: =?us-ascii?Q?Approval_request_for_=2C_A2_2010-11-09=2C_=2C_Beauvais=2C_?=
Content-Type: multipart/mixed;
boundary="--_=_NextPart1_63c36585-9b28-4bc2-b856-448d05068dca"
Return-Path: noreply@agco.cataligent.net
Message-ID: WIN-2BKCUEDGIIGbC8w000005d5@agco.cataligent.net
X-OriginalArrivalTime: 11 Nov 2010 13:19:28.0744 (UTC) FILETIME=[11BCF680:01CB81A3]

----_=_NextPart1_63c36585-9b28-4bc2-b856-448d05068dca
Content-Type: text/html; charset=us-ascii
Content-Transfer-Encoding: quoted-printable

<html=20style=3D'font-family:verdana;font-size:10=2E0pt'><meta=20http=
-equiv=3DContent-Type=20content=3D"text/html;=20charset=3Dwindows-1252"></h=
ead><body=20lang=3DDE=20link=3Dblue=20vlink=3Dpurple=20>

<p=20><span=20=
style=3D'font-size:10=2E0pt'>The=20approval=20process=20for=20project=20A2=
=202010-11-09=20has=20been=20started=2E=20Please=20view=20the=20attachment=
=20for=20further=20information=2E

<p=20><span=20=20style=3D'font-=
size:10=2E0pt;color:black;'>Reviewer=20Decision:

<p=20><span=20st=
yle=3D'font-size:10=2E0pt'><a=20href=3D"mailto:49859300@agco=2Ecataligent=
=2Enet?subject=3DConditionally=20Approved:%20%5bRunning%20Stage:%202%5d%20C=
ER%20approval%20request%20for%20A2%202010-11-09,%20,%20&body=3DReviewer=
%20Comments:">Conditionally=20Approved

<p=20><sp=
an=20style=3D'font-size:10=2E0pt'><a=20href=3D"mailto:49859300@agco=2Ecatal=
igent=2Enet?subject=3DApproved:%20%5bRunning%20Stage:%202%5d%20CER%20approv=
al%20request%20for%20A2%202010-11-09,%20,%20&body=3DReviewer%20Comments=
:">Approved

<p=20><span=20style=3D'font-size:10=
=2E0pt'><a=20href=3D"mailto:49859300@agco=2Ecataligent=2Enet?subject=3DDeni=
ed:%20%5bRunning%20Stage:%202%5d%20CER%20approval%20request%20for%20A2%2020=
10-11-09,%20,%20&body=3DReviewer%20Comments:">Denied</=
span>

<p=20><span=20style=3D'font-size:10=2E0pt'><a=20href=3D"mailto:498=
59300@agco=2Ecataligent=2Enet?subject=3DClarification:%20%5bRunning%20Stage=
:%202%5d%20CER%20approval%20request%20for%20A2%202010-11-09,%20,%20&bod=
y=3DReviewer%20Comments:">Clarification=20needed</p=

----_=_NextPart1_63c36585-9b28-4bc2-b856-448d05068dca
Content-Type: multipart/mixed;
boundary="--_=_NextPart2_83908fe7-7763-4961-b7f7-9f501eb2b1c7"

----_=_NextPart2_83908fe7-7763-4961-b7f7-9f501eb2b1c7
Content-Type: application/octet-stream; name="Milestone Review.ppt"
Content-Transfer-Encoding: base64

0M8R4KGxGuEAAAAAAAAAAAAAAAAAAAAAPgADAP7/CQAGAAAAAAAAAAAAAAARAAAALggAAAAAAAAAEAAALAgAAAEAAAD+////AAAAAAAAAACAAAAAAAEAAIABA…content of line with maxLength…
----_=_NextPart2_83908fe7-7763-4961-b7f7-9f501eb2b1c7--

----_=_NextPart1_63c36585-9b28-4bc2-b856-448d05068dca--

From - Fri Feb 25 12:20:05 2011
X-Mozilla-Status: 0001
X-Mozilla-Status2: 10000000
X-Mozilla-Keys:
X-Real-To: test.network@ngs.ru
Return-Path: alexander.melnikov@aspose.com
Received: from [172.16.0.3] (HELO imx-fe.ngs.ru)
by mx8.intranet.ru (CommuniGate Pro SMTP 4.3.11)
with ESMTP id 126968249 for test.network@ngs.ru; Fri, 25 Feb 2011 11:48:45 +0600


…content of next message…

Hello @alexander.pavlov ,

I found that the last line of the file is what was causing the issue; it is far too large.
I am not able to load this file in any viewer or tool, so I can’t provide the line or investigate further.
I think the issue is probably in the file itself, so it’s fine to close this ticket now.

Thanks for all the help.
Cheers
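
When no viewer can open the file, a small part of it can still be read directly. The following is a minimal sketch, assuming the byte offset of the problematic line is already known (for example, from a scan that records where each line starts). MboxPeek and ReadPreview are hypothetical names used only for this illustration; they are not part of the Aspose.Email API.

```csharp
using System;
using System.IO;
using System.Text;

public static class MboxPeek
{
    // Reads up to maxBytes starting at the given byte offset and returns them as text.
    // Bytes above 127 are shown as '?' because the data is decoded as ASCII.
    public static string ReadPreview(string filePath, long offset, int maxBytes = 256)
    {
        using (var fs = new FileStream(filePath, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            fs.Seek(offset, SeekOrigin.Begin);

            var buffer = new byte[maxBytes];
            int total = 0;
            int read;
            // Read may return fewer bytes than requested, so loop until done.
            while (total < maxBytes &&
                   (read = fs.Read(buffer, total, maxBytes - total)) > 0)
            {
                total += read;
            }
            return Encoding.ASCII.GetString(buffer, 0, total);
        }
    }
}
```

Only maxBytes bytes are held in memory, so the size of the line itself does not matter. The first few hundred bytes are usually enough to tell whether the line is base64 data, binary content, or something else.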

Hello @J_Z,

Thank you for the update! Since you’re okay with closing the ticket, we’ll consider it resolved.