Translating text to PDF slow and hangs CPU

The problem described below started showing up on our server yesterday. I just upgraded to the latest Aspose DLLs for .NET (the PDF DLL is at version 7.6) and the issue remains.

Our code simply takes text files and turns them into PDFs. I'm doing this with the code below (largely pulled from one of your online examples). I know that your online example says that text-to-PDF rendering can be slow and that your recommendation is to read text line-by-line as opposed to generating one big paragraph of text. Our bottleneck is not in preparing the document, but rather in writing it to our MemoryStream (the exact line of code is marked below). For good measure I also tried a StringReader and created the document line-by-line, and got the same results.

Converting just 100k of text takes 10 seconds. 200k takes 19, and 300k takes 45. Those processing times are far too slow for us to be able to use this with the variety of file sizes that occur in the real world (a file can easily be several MB). In one of our live environments, we had several multi-MB files queued up and the process totally locked up our server.

Thanks for your assistance.


CODE:
string text = SecurityUtil.GenerateRandomText(1024 * 300, true, true);

Gen.Pdf doc = new Gen.Pdf();
Gen.Section section = doc.Sections.Add();
Gen.Text genText = new Gen.Text(text);
section.Paragraphs.Add(genText);

// Load to stream
using (System.IO.MemoryStream output = new System.IO.MemoryStream())
{
    doc.Save(output); // HANGING HERE
    output.Position = 0;
}

Hi Colin,


Thanks for using our products.

Can you please try the approach described in "Writing PDF directly"?

Furthermore, when working with large text files, please try the approach described in the second half of the page "How to Convert a text file to PDF".

If you are still not satisfied with the performance, please share some sample files so that we can test the scenario at our end. We are sorry for this inconvenience.

I tried the “writing pdf directly” method and still had the same problems. Please keep in mind that I’m only talking about small files here, so memory shouldn’t be an issue. On the second part of “how to convert a text file to PDF,” I mentioned in my first post that I tried that method – however, again the files are so small that memory really shouldn’t be playing a role here. One other thing, I’m in a situation where temp files are not desired, so I need to make this work with memory streams.


I tested a 500k file and it took almost 2 minutes to convert. 100k of text is taking about 10 seconds.


CODE:
System.Diagnostics.Stopwatch sw = System.Diagnostics.Stopwatch.StartNew();
byte[] bytes = null;
using (System.IO.MemoryStream output = new System.IO.MemoryStream())
{
    string text = System.IO.File.ReadAllText(@"\TestFile-100k.txt");

    Gen.Pdf doc = new Gen.Pdf(output);
    Gen.Section section = doc.Sections.Add();
    Gen.Text genText = new Gen.Text(text);
    section.Paragraphs.Add(genText);

    // Load to stream
    doc.Close();
    output.Position = 0;
    bytes = output.ToArray();
}

sw.Stop();
Console.WriteLine(bytes.Length + " bytes: " + sw.Elapsed.ToString());

Hi Colin,


Thanks for sharing the details.

I have tested the scenario and am able to reproduce the problem. I have logged it in our issue tracking system as PDFNEWNET-34786 so that it can be corrected. We will investigate this issue in detail and will keep you updated on the status of a correction.

We apologize for the inconvenience.

Hi Colin,


Thanks for your patience.

I am pleased to share that the issue reported earlier has been resolved, and the fix will be included in the upcoming release of Aspose.Pdf for .NET 8.3.0, which is planned for release within the current week. Furthermore, please try the following DOM approach from the Aspose.Pdf namespace to get better performance.

[C#]

// Requires: using Aspose.Pdf; using Aspose.Pdf.Text;
string inFile = "c:/pdftest/TestFile-500k.txt";
string outFile = "c:/pdftest/TestFile-500k_4930.pdf";

Document doc = new Document();
Page page = doc.Pages.Add();

// Read the file line by line until the end of the file is reached
using (System.IO.TextReader objReader = new System.IO.StreamReader(inFile))
{
    string line;
    // ReadLine returns null at end of file, so an empty file adds no paragraphs
    while ((line = objReader.ReadLine()) != null)
    {
        // Create a new text fragment for the line and add it to the page's paragraphs collection
        page.Paragraphs.Add(new TextFragment(line));
    }
}

doc.Save(outFile);
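
Since the original poster mentioned that temp files are not desired, the same line-by-line feed could presumably be driven from text that is already in memory. The following is only a minimal sketch: `LineFeeder` and `FeedLines` are hypothetical names, not part of Aspose.Pdf, and the helper is deliberately library-independent so that the reading logic can be checked on its own.

```csharp
using System;
using System.IO;

static class LineFeeder
{
    // Hypothetical helper (not an Aspose API): splits in-memory text into lines
    // and hands each line to a caller-supplied callback.
    // Returns the number of lines that were passed to the callback.
    public static int FeedLines(string text, Action<string> addParagraph)
    {
        int count = 0;
        using (var reader = new StringReader(text))
        {
            string line;
            // ReadLine returns null at end of input, so empty text yields no calls.
            while ((line = reader.ReadLine()) != null)
            {
                addParagraph(line);
                count++;
            }
        }
        return count;
    }
}
```

With the DOM approach above, the callback would be something like `line => page.Paragraphs.Add(new TextFragment(line))`, and the result could then be written with `doc.Save(output)` on a `System.IO.MemoryStream` instead of a file path, assuming the stream overload of `Save` is available in the version in use.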

The issues you have found earlier (filed as PDFNEWNET-34786) have been fixed in Aspose.Pdf for .NET 8.3.0.

