Performance issue with ExtractMessage on large PST files

Hi


We need to check each message and its attachments in a PST file and modify it if needed. Our current approach: extract every message, modify it if needed, then create a new PST and add the messages to it with AddMessages.

Because the PST size limit is 50 GB, we need to handle PST files up to 50 GB. But for large PST files, ExtractMessage is slow.

I checked the threads in this forum. There is a bug, EMAILNET-33470, and it is marked “Won’t Fix”.

And the thread “How to Improve performance?” mentions:
Support for adding messages in parallel to the PST is not available at present. For improved performance, it is generally advised to keep the PST size as large as up to 5 GB based on our previous experience while working with the API. This provides improved performance for adding as well as extracting messages from the PST

Is 5 GB still the recommendation for better performance? If so, what do you suggest for PST files larger than 5 GB: split them into smaller PST files and merge them back? If we do it this way, will the final PST file be the same as the original one? And do you have perf stats for this?

Thanks
Li

Hi Li,


Thank you for writing to the Aspose Support team.

Indeed, the size of the PST affects the extraction of messages from the PST file. EMAILNET-33470 refers to extracting messages from a PST file in parallel, which is not supported at the moment, nor do we have plans to support it in the near future.

We ran various tests on our side with our test files, after which we suggested the 5 GB size to users. Smaller PSTs may respond more quickly, though we have not tested this. Some issues with splitting and merging PSTs were reported against our last release and are pending resolution in an upcoming version of the API. Until then, a re-merged PST may be affected by these issues and may not be identical to the original. We are sorry, but we do not have any performance stats for this scenario.
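As a rough illustration of the splitting idea discussed above, the sketch below plans how messages could be grouped into chunks that each stay under a size limit. It does not call the Aspose.Email API. `plan_chunks` and its parameters are hypothetical names, the 5 GB default is only the guideline quoted in this thread, and summed message sizes are an approximation of the resulting PST size, which also includes format overhead.

```python
from typing import Iterable, List

GIB = 1024 ** 3
DEFAULT_LIMIT = 5 * GIB  # the ~5 GB guideline quoted in this thread (assumption)


def plan_chunks(message_sizes: Iterable[int], limit: int = DEFAULT_LIMIT) -> List[List[int]]:
    """Group message indices, in original order, into size-bounded chunks.

    Each chunk's summed size stays within `limit`, except that a single
    message larger than `limit` is placed in a chunk of its own.
    """
    chunks: List[List[int]] = []
    current: List[int] = []
    current_size = 0
    for index, size in enumerate(message_sizes):
        # Close the current chunk if adding this message would exceed the limit.
        if current and current_size + size > limit:
            chunks.append(current)
            current, current_size = [], 0
        current.append(index)
        current_size += size
    if current:
        chunks.append(current)
    return chunks
```

Keeping messages in their original order makes it easier to compare a re-merged result against the source, although it does not address the split/merge issues mentioned above.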

Hi


Thanks for your reply.
Apart from extracting messages in parallel, do you have any other plans for performance improvement? If we purchase Enhanced Support, will that help with this request?

Another thing: for a large PST file (30 GB), the generated PST file cannot be opened by Outlook. But after extracting and packaging the generated PST file again, it can be opened by Outlook.
I searched this forum and found the thread “Corrupted PST file” (post #13 by curtisyamada), which mentions a bug, EMAILJAVA-33542. Has this bug been fixed? Is there a similar bug in the .NET version of Aspose.Email?

Thanks
Li

Hi Li,


Since the issue is marked as Won’t Fix, purchasing Enhanced Support will be of little benefit in this case. This status means that implementing the feature would require design changes extensive enough that they cannot be made at present, so the issue will not be fixed.

EMAILJAVA-33542 concerns a large number of messages (100000+) being added to a PST, after which the PST doesn’t open in MS Outlook. However, this is not a bug in the Aspose.Email API but a restriction in MS Outlook on the maximum number of items per folder, which is 65535.
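To make the per-folder restriction concrete, here is a minimal sketch that divides a list of items into groups small enough for one folder each. It is independent of the Aspose.Email API. `split_for_folders` is a hypothetical helper, and the 65535 figure is simply the limit stated above, treated here as an assumption.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")

# Per-folder item limit as stated in this reply (assumption, not verified here).
MAX_ITEMS_PER_FOLDER = 65535


def split_for_folders(items: Sequence[T], max_items: int = MAX_ITEMS_PER_FOLDER) -> List[Sequence[T]]:
    """Split `items` into consecutive groups of at most `max_items` each."""
    if max_items <= 0:
        raise ValueError("max_items must be positive")
    return [items[start:start + max_items] for start in range(0, len(items), max_items)]
```

For example, 100000 messages would be planned as two groups of 65535 and 34465 items, each destined for its own folder.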

In your case, could you please tell us whether the original PST file was generated by the Aspose.Email API? If so, we will need the messages that were used to generate this large PST file (though this may take a very long time), so that we can reproduce the issue on our side and assist you further.

Hi


For the ExtractMessage perf issue, would bulk extraction of messages improve performance? There are bulk add/delete operations for messages that perform much better than adding/deleting messages one by one, but there is no bulk extract.

Thanks
Li

Hi Li,


Extraction of messages from a PST file in parallel/bulk is not supported at the moment. The feature is logged in our issue tracking system, but there is no near-future plan to implement it. Please feel free to write to us if you have any other queries related to our API.
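Since bulk add is reported in this thread to be faster than adding messages one by one, while extraction has to happen one message at a time, one possible pattern is to buffer extracted messages into fixed-size batches before each bulk add. The sketch below shows only the batching step. `batched` is a hypothetical helper, the extraction and bulk-add calls are deliberately left out, and a suitable batch size would need to be found by measurement.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of up to `batch_size` items from `items`.

    The input is consumed lazily, so only one batch is held in memory at a time.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Because the input is consumed lazily, a generator that extracts and modifies one message at a time can be passed in directly, which keeps memory use bounded by the batch size even for very large PST files.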