PDF to DOCX performance in .NET

I need to convert a large number of PDFs to .docx as a routine process. The code below converts the files, but it seems slow (2 hours and counting for 500+ documents). Are there options I can tweak to speed it up?

//Convert (.pdf -> .docx)
using Aspose.Pdf.Document pdfDoc = new Aspose.Pdf.Document(filename);
Aspose.Pdf.DocSaveOptions saveOptions = new Aspose.Pdf.DocSaveOptions();
saveOptions.Format = Aspose.Pdf.DocSaveOptions.DocFormat.DocX;

PDFToWordFeedback pdfToWordFeedback = new PDFToWordFeedback(filename);
saveOptions.CustomProgressHandler = pdfToWordFeedback.PDFtoWordProgress;
saveOptions.WarningHandler = pdfToWordFeedback;

pdfDoc.Save(newFilename, saveOptions);

Any suggestions would be greatly appreciated.

@aspears

Would you please share your environment details, i.e. application type, OS name and version, etc.? We will try to create an example in our environment and share our feedback with you.

The application will be a console application written in C# on .NET 8. It will run on a Windows 10 machine (Windows 11 eventually).

The gist of the application is that it loops through a folder of zip files and finds any PDFs in them. It then converts each PDF into a Word document and rezips the files. Later, the same process is used to convert the Word documents back to PDF. The program is already in use, but we are currently converting PDFs to PostScript, which is less than ideal, so we are looking for better options.
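For reference, the loop described above could look roughly like the sketch below. It is not the actual application code: the class name `ZipPdfWalker`, the method `ConvertPdfsInZips`, the working-directory layout, and the choice to keep the original PDF entry in the archive are all assumptions made for illustration. The conversion itself is passed in as a delegate so the sketch depends only on `System.IO.Compression`.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class ZipPdfWalker
{
    // Hypothetical helper (not from the original post): for every .zip in zipFolder,
    // extracts each .pdf entry to workDir, calls the supplied converter, and adds
    // the resulting .docx back into the same archive. Returns the number converted.
    public static int ConvertPdfsInZips(string zipFolder, string workDir,
                                        Action<string, string> convert)
    {
        Directory.CreateDirectory(workDir);
        int converted = 0;

        foreach (string zipPath in Directory.EnumerateFiles(zipFolder, "*.zip"))
        {
            using ZipArchive archive = ZipFile.Open(zipPath, ZipArchiveMode.Update);

            // Snapshot the PDF entries first: the archive is modified inside the loop.
            var pdfEntries = archive.Entries
                .Where(e => e.Name.EndsWith(".pdf", StringComparison.OrdinalIgnoreCase))
                .ToList();

            foreach (ZipArchiveEntry entry in pdfEntries)
            {
                // Unique temp names avoid collisions between same-named entries.
                string stem = Path.Combine(workDir, Guid.NewGuid().ToString("N"));
                string pdfPath = stem + ".pdf";
                string docxPath = stem + ".docx";
                try
                {
                    entry.ExtractToFile(pdfPath, overwrite: true);
                    convert(pdfPath, docxPath);

                    // Assumption: the .docx is stored next to the PDF inside the zip.
                    archive.CreateEntryFromFile(
                        docxPath, Path.ChangeExtension(entry.FullName, ".docx"));
                    converted++;
                }
                finally
                {
                    // File.Delete does not throw if the file was never created.
                    File.Delete(pdfPath);
                    File.Delete(docxPath);
                }
            }
        }
        return converted;
    }
}
```

In this sketch the `convert` delegate would wrap the `Aspose.Pdf.Document` load and `Save` calls from the first post. Conversions run one at a time, so the delegate is also a convenient place to time each document (for example with `System.Diagnostics.Stopwatch`) and see whether a few large files account for most of the two hours.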

Thanks!

@aspears

Thanks for sharing the details. Please allow us some time to prepare an example. Creating a sample application can take a little time, and if we do not manage to get better performance in the process, we will create a ticket in our issue tracking system to address the performance issue. We will share more details with you soon.