We are loading a specific PDF and then converting it to a bitmap. Until a recent release of Aspose (v24), this took 1 min 45 sec in the PDFConverter.HasNextImage() call. After that release it is down to 29 sec. This, however, is still way too slow for what we would expect. Is there a better way to do what we are trying to do? Can this performance be improved? We can go to a simple website and have this file converted in seconds!
Thanks.
I’ve linked a sample project with the test file that has the problems.
We tested in our environment using version 24.4 and the DOM approach for conversion, with the code snippet below:
using (Aspose.Pdf.Document thePdfDocument = new Aspose.Pdf.Document(dataDir + "FDFTDA(11981919)_-_TICKETS_-_DocID_17016051.pdf"))
{
    System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
    sw.Start();
    BmpDevice bmpDevice = new BmpDevice();
    using (MemoryStream pageBitmapMemoryStream = new MemoryStream())
    {
        bmpDevice.Process(thePdfDocument.Pages[1], pageBitmapMemoryStream);
    }
    sw.Stop();
    Console.WriteLine("Total seconds taken : " + sw.Elapsed.TotalSeconds);
}
The first execution took 2 seconds. Every subsequent run took 1.29-1.35 seconds in our environment. Can you please try the above approach and let us know whether it still takes longer for you?
I tried the alternate approach, but I had to edit it to make it loop on all the pages, since we’re interested in converting all the pages at once:
Public Sub ConvertPDFtoImageDo()
    Using thePdfDocument As New Aspose.Pdf.Document("FDFTDA(11981919)_-_TICKETS_-_DocID_17016051.pdf")
        Dim sw As New Stopwatch()
        sw.Start()
        Dim BmpDevice As New BmpDevice()
        For Each page As Aspose.Pdf.Page In thePdfDocument.Pages
            Using pageBitmapMemoryStream As New MemoryStream()
                BmpDevice.Process(page, pageBitmapMemoryStream)
            End Using
        Next
        sw.Stop()
        Console.WriteLine("Total seconds taken : " & sw.Elapsed.TotalSeconds)
    End Using
End Sub
And I compared the time with our current approach for three runs:
#1
GetNextImage total time: 32.315
Convert DOM based Total seconds taken : 28.8354831
#2
GetNextImage total time: 30.619
Convert DOM based Total seconds taken : 28.908888
#3
GetNextImage total time: 31.004
Convert DOM based Total seconds taken : 28.7635003
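Since both approaches land at roughly the same total, it may help to see whether the time is spread evenly across pages or concentrated in a few of them. The following is a minimal sketch in C#, not a verified fix: it reuses the Document, BmpDevice and Process calls from the snippets above and times each page separately. The class name, the command-line argument and the 96 dpi value are assumptions made for illustration, and whether a lower resolution actually shortens the conversion of this file has not been measured here.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using Aspose.Pdf;
using Aspose.Pdf.Devices;

// Hypothetical helper: times the conversion of each page on its own.
class PerPageTiming
{
    static void Main(string[] args)
    {
        // Assumption: the PDF path is passed as the first command-line argument.
        string pdfPath = args[0];

        // Assumption: 96 dpi is acceptable for the output; pass nothing to
        // BmpDevice() to keep the library default instead.
        var device = new BmpDevice(new Resolution(96));

        using (var document = new Document(pdfPath))
        {
            var total = Stopwatch.StartNew();

            // Aspose.PDF page collections are 1-based, as in Pages[1] above.
            for (int pageNumber = 1; pageNumber <= document.Pages.Count; pageNumber++)
            {
                var perPage = Stopwatch.StartNew();
                using (var stream = new MemoryStream())
                {
                    device.Process(document.Pages[pageNumber], stream);
                }
                Console.WriteLine($"Page {pageNumber}: {perPage.Elapsed.TotalSeconds:F3} s");
            }

            Console.WriteLine($"Total: {total.Elapsed.TotalSeconds:F3} s");
        }
    }
}
```

If one or two pages account for most of the total, sharing those page numbers with support would narrow the investigation.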
Would you kindly share the results you used to get with the older version of the API that you were using? Please share the following information so that we can proceed accordingly:
API version producing expected results
System information e.g. OS Name and Version, RAM, Processor, etc.
I compared the two versions of Aspose.PDF again, using both our approach and the one you proposed. These are the results on my machine, averaged over 3 runs:
Aspose 20.10
GetNextImage (our current approach) total time: 35.6372941
Dom Based Convert Time: 33.6049738
Aspose 24.4
GetNextImage (our current approach) total time: 30.8609131
Dom Based Convert Time: 28.6527697
Machine specs:
Device name: XXXX
Processor: 12th Gen Intel(R) Core™ i5-12400 2.50 GHz
Installed RAM: 32.0 GB (31.8 GB usable)
Device ID: XXXX
Product ID: XXXX
System type: 64-bit operating system, x64-based processor
OS:
Edition: Windows 11 Pro
Version: 22H2
Installed on: 7/16/2023
OS build: 22621.3447
Experience: Windows Feature Experience Pack 1000.22688.1000.0
We are using VB.NET with .NET Framework 4.8 for this project.
It looks like the latest version performs better with both the Facades and DOM approaches. However, your results do not match what we observed in our environment, which has the following specifications:
Windows 11 22H3 Pro x64-bit
16 GB RAM
Console Application - C# - .NET 4.8
Are you sure that no other code routine is being executed during your testing? Are you testing in a separate console application?
There is a big improvement on the latest version compared to our current version 20.10, but still not quite what we’re after.
And there is only a marginal improvement when using the DOM approach as I mentioned earlier.
Sorry for the delayed response. One last thing before we log an investigation ticket for this case: can you please share the conversion time you expect from the API?
We would like to see a similar performance to this in-browser tool.
This PDF file has only 1 image component per page (and nothing else, I presume). We think it shouldn't take that much time just to extract the already existing image for the conversion.
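The idea of pulling out the existing image instead of rendering the page could be sketched roughly as follows. This is an illustration under assumptions, not a confirmed workaround: it relies on the page's Resources.Images collection and on XImage.Save(Stream), which writes the image as JPEG rather than BMP. The class name, the command-line arguments and the output file naming are hypothetical. The extracted image is the raw embedded resource, so it may differ from a rendered page in size, rotation or cropping.

```csharp
using System;
using Aspose.Pdf;

// Hypothetical helper: saves every embedded image of every page to a folder.
class ExtractEmbeddedImages
{
    static void Main(string[] args)
    {
        // Assumptions: args[0] is the PDF path, args[1] is the output folder.
        string pdfPath = args[0];
        string outputDir = args[1];
        System.IO.Directory.CreateDirectory(outputDir);

        using (var document = new Document(pdfPath))
        {
            // Aspose.PDF page collections are 1-based.
            for (int pageNumber = 1; pageNumber <= document.Pages.Count; pageNumber++)
            {
                int imageIndex = 0;
                foreach (XImage image in document.Pages[pageNumber].Resources.Images)
                {
                    imageIndex++;
                    string outputPath = System.IO.Path.Combine(
                        outputDir, $"page{pageNumber}_image{imageIndex}.jpg");

                    using (var output = System.IO.File.Create(outputPath))
                    {
                        // Writes the embedded image data; no page rendering happens here.
                        image.Save(output);
                    }
                }
            }
        }
    }
}
```

Timing this against the BmpDevice loop on the same file would show whether the cost is in page rendering or in decoding the image itself.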
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFNET-57152
You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.