Free Support Forum - aspose.com

Poor performance adding text to an existing document

I’ve been working on a procedure that assembles and manipulates existing documents to produce a new document. Part of this work consists of converting a .doc file (merged with data via MailMerge.Execute()); part consists of adding existing PDF files.

I was quite satisfied with the overall procedure, as I can generate 80-100 PDF files per minute, each consisting of 4 to 8 pages. Then I was asked to add text to the final document. As an example, I have to fill a preprinted form with custom variable data; for some documents, there are about 100 values to add to the final PDF, mainly short text (1 to 6 words). In this case, I can hardly output 3 files per minute, which is not an acceptable rate.

After experimenting with different approaches, I finally found that most of the time is spent in the AppendText method of class Aspose.Pdf.Text.TextBuilder. Almost every call to this method takes 30 to 100 ms or longer; other factors (for example, the font properties set for the text fragment) influence the time required. So it’s quite easy to see that adding 100 custom strings to an existing PDF requires, in the worst case, about 10 s just for the calls to AppendText.

Running the program on different (i.e. faster) hardware does not produce noticeable improvements: compared with a single-core, 3 GHz, 2 GB PC, a dual-core, 3 GHz PC with 4 GB of RAM runs AppendText only about 15% faster. And I can’t see any further improvement when running on a 4-core, 3 GHz PC with 8 GB of RAM.

The code used to reproduce this problem is as simple as


using Aspose.Pdf;
using Aspose.Pdf.Text;

var document = new Document();
document.Pages.Add();

// Add 100 short text fragments to the first page, one AppendText call each.
var textCount = 100;
for (var i = 0; i < textCount; i++)
{
    var tf = new TextFragment("This is line " + i.ToString());
    // Shift each fragment diagonally so the lines do not land on the same spot.
    tf.Position.XIndent = tf.Position.YIndent = 100 + 3 * i;
    var builder = new TextBuilder(document.Pages[1]);
    builder.AppendText(tf);
}
document.Save(@"result.pdf");
document.Dispose();


So the final question is:
- is there anything else I can try to get any meaningful improvements?
- if not, is there a faster method to add text to a pdf document?

Last note: in my latest tests I’ve noticed that I can get quite an improvement if I group several calls to AppendText into a single call, adding the values to write to the .Segments collection of one text fragment. This way, I apparently get a 2.5x to 3x improvement. However, I would have to rewrite part of the application to “buffer” calls to AppendText this way.
Is this a better way to handle this problem?
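
A minimal sketch of that grouping, for illustration only. It assumes that each TextSegment added to the fragment keeps the Position assigned to it, and that one AppendText call then writes all segments; neither assumption has been verified against Aspose.Pdf for .NET 7.5.0. The output file name is hypothetical.

```csharp
using Aspose.Pdf;
using Aspose.Pdf.Text;

var document = new Document();
document.Pages.Add();

// One fragment collects every value; the first value is the fragment's own text.
var fragment = new TextFragment("This is line 0");
fragment.Position = new Position(100, 100);

var textCount = 100;
for (var i = 1; i < textCount; i++)
{
    var segment = new TextSegment("This is line " + i);
    // Assumption: a segment positioned this way is drawn at its own coordinates.
    segment.Position = new Position(100 + 3 * i, 100 + 3 * i);
    fragment.Segments.Add(segment);
}

// A single builder and a single AppendText call for the whole batch.
var builder = new TextBuilder(document.Pages[1]);
builder.AppendText(fragment);

document.Save("result_batched.pdf"); // hypothetical output name
document.Dispose();
```

Compared with the loop above, this replaces 100 AppendText calls with one, which is where the measured 2.5x to 3x gain would have to come from.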

TIA

Hi Riccardo,


Thanks for your interest in our products, and sorry for the late reply.

I have tested the scenario with the code snippet you shared, using Aspose.Pdf for .NET 7.5.0 on a Windows 7 x64 machine with an Intel .40 Ghz processor and 8 GB of RAM. In my tests, the process took around 6 seconds to add 100 instances of the TextFragment object. Could you please share a sample application that can help us identify the performance issue? We apologize for the inconvenience.
codewarior:
the process took around 6 seconds to add 100 instances of TextFragment object.


Well, I think this is quite similar to my observation: it works out to about 10 documents per minute.
When the starting document has some non-trivial pages (let's say 6 to 10 pages, with different font faces and sizes), you will easily get at most 5 documents per minute. This is not an acceptable output rate for our requirements: while it's true that not all jobs have such a complex document as output, when such a job occurs, all other processing is simply blocked for hours. While the conversion features (Aspose.Words) perform really well, the PDF manipulation routines seem to be many steps behind, at least with respect to performance.

On a different project, we're using another PDF library: doing the same operations with that product, we can add the same 100 strings (with font settings) in less than 1 second, running on the same PC. However, since this project is entirely based on Aspose (Words & Barcode), we'd rather not add another third-party dependency.

Any suggestions?

Hi Riccardo,


Thanks for sharing additional information.

I have logged this issue as PDFNEWNET-34609 in our issue tracking system for investigation. We will look further into the details of this scenario and do our best to optimize it. As soon as we have made some progress, we will be happy to update you on the status of the fix. Please be patient and allow us a little time. We are sorry for the inconvenience.