We’re using Aspose.PDF (20.9) to convert a scanned PDF to .docx. Because it is a scan, the PDF is simply a vehicle for storing images, one image per page.
The one thing I noticed was that the scanned PDF contains pages rotated by 270 degrees.
When we save the PDF as a .docx, we occasionally find that pages are rendered as big red Xes. Sometimes, when we repeat the attempt, the same document renders correctly.
The code couldn’t be simpler - VB.NET - I’ve left out everything irrelevant such as getting the licences and so on… This is literally all we do:
oPDF = New Aspose.Pdf.Document(sToFileName)
oPDF.Save(sToFileName & ".docx", Aspose.Pdf.SaveFormat.DocX)
The biggest problem is that when this “goes wrong” and the document is rendered with red Xes instead of the correct images from the scanned PDF, no error is raised. So we have no way of knowing that it went wrong until afterwards when we look at the end product. This has already led to some very unhappy customers.
Let me know if you need any example PDFs.