Explanation of the block size parameter of the Java PersonalStorage#create method

Hello

My company uses Aspose.Email. Could someone please explain the block size parameter of the Java PersonalStorage#create method, and how we should determine what value to set for it?

Thanks

@gksingh.01

Could you please share some more details about your requirement? We will then be able to provide more information.

Hi

We are creating a PST from MSG files using the Aspose PersonalStorage object. We need to provide a stream, as opposed to a file location, because we need to do additional processing on the stream before we write to a temp PST file. The method PersonalStorage#create(stream, fileFormatVersion) seems to use a MemoryStream internally, which causes a lot of performance degradation. We therefore tried the method PersonalStorage#create(stream, blockSize, fileFormatVersion), as it does not seem to use the MemoryStream internally and its performance is acceptable. However, we are unsure what value to set for the block size param. The Aspose.Email for Java 20.6 Release Notes documentation says at the bottom:

Create PST with size more than 2Gb using OutputStream

The user can optimize PST internal cache using new PersonalStorage API method:

blockSize - The optimal block size to expand cache buffer(in bytes)

However, this is not very helpful to us. Could you please elaborate and advise on how to determine what value to set for the block size param?

Thanks

@gksingh.01

Please note that data is written to the cache in small blocks of varying size and position. The block size is not related to the size of messages. A message is not written to its own separate block; parts of messages can be in one block.

The blockSize value is used to dynamically expand the cache. By changing the block size, we can avoid the overhead of a large number of blocks.

Please check the following code example. Hope this helps you.

import com.aspose.email.*;
import java.io.*;

String mboxFileName = "c:/temp/input.mbox";
String pstFileName = "c:/temp/output.pst";
PersonalStorage pst = null;

try {
    FileOutputStream fos = new FileOutputStream(pstFileName);
    // blockSize is 1 GB (1,073,741,824 bytes, which still fits in an int)
    pst = PersonalStorage.create(fos, 1 * 1024 * 1024 * 1024, 0);
    FolderInfo folder = pst.getRootFolder().addSubFolder("myInbox");
    // Alternative: open an existing PST and folder instead of creating new ones
    //pst = PersonalStorage.fromFile(pstFileName);
    //FolderInfo folder = pst.getRootFolder().getSubFolder("myInbox");
    InputStream stream = new BufferedInputStream(new FileInputStream(mboxFileName));
    MboxrdStorageReader reader = new MboxrdStorageReader(stream, true);
    // Start reading messages
    MailMessage message = reader.readNextMessage();
    // Read all messages in a loop
    int cnt = 0;
    while (message != null) {
        // Add each message 5000 times to grow the PST;
        // with a bound of 700 instead of 5000, a 1 GB PST is generated smoothly
        for (int i = 0; i < 5000; i++) {
            folder.addMessage(MapiMessage.fromMailMessage(message));
            cnt++;
            System.out.println(cnt);
        }
        message = reader.readNextMessage();
    }
    // Release the reader, the input stream, and the PST
    //fos.close();
    reader.dispose();
    stream.close();
    pst.dispose();
} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}

Hi

Thanks for your reply.

Could you please clarify which cache you are referring to and what it is used for? What is the default size of the cache? Also, does this mean that if we expect to create a large PST, we should use a large block size so that we don't end up expanding the cache many times? Does expanding the cache have a high performance overhead?

@gksingh.01

The original size of the cache is 0. For example, suppose the first data we write (message property data, not a whole message) has a size of 100 KB. With a blockSize of 10 KB, 10 blocks will be added.

If we set blockSize to 1 GB, a single block of 1 GB will be added for a PST smaller than 1 GB.
We need to set the block size based on the estimated PST size.
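
To make the arithmetic above concrete, here is a minimal sketch in plain Java. It does not call Aspose.Email at all. The class BlockSizeEstimator and its methods blocksNeeded and chooseBlockSize are hypothetical helpers written for illustration, and the 1 MB lower bound is an arbitrary assumption, not a documented recommendation. The sketch also assumes that blockSize is passed as an int, as in the create call in the example earlier in this thread, so the chosen value is capped at Integer.MAX_VALUE.

```java
public class BlockSizeEstimator {

    /**
     * Number of blocks that must be added to hold dataBytes
     * when the cache grows by blockSize bytes at a time.
     */
    static long blocksNeeded(long dataBytes, int blockSize) {
        if (dataBytes < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("dataBytes must be >= 0 and blockSize > 0");
        }
        // Ceiling division without floating point
        return (dataBytes + blockSize - 1) / blockSize;
    }

    /**
     * Hypothetical rule of thumb: use the estimated PST size as the block size,
     * clamped to [1 MB, Integer.MAX_VALUE]. The 1 MB floor is an assumption.
     */
    static int chooseBlockSize(long estimatedPstBytes) {
        final long floor = 1024L * 1024L;
        long clamped = Math.max(floor, Math.min(estimatedPstBytes, (long) Integer.MAX_VALUE));
        return (int) clamped;
    }

    public static void main(String[] args) {
        // The case from the reply above: 100 KB of data with 10 KB blocks
        System.out.println(blocksNeeded(100L * 1024, 10 * 1024));

        // A PST estimated at 500 MB: pick a block size, then count the blocks
        long estimate = 500L * 1024 * 1024;
        int blockSize = chooseBlockSize(estimate);
        System.out.println(blockSize + " bytes -> " + blocksNeeded(estimate, blockSize) + " block(s)");
    }
}
```

The value returned by chooseBlockSize could then be passed as the blockSize argument of PersonalStorage#create(stream, blockSize, fileFormatVersion). Whether a single large block is actually the best choice for your workload is something to confirm by measuring, since a larger block also means a larger buffer being allocated.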