Free Support Forum - aspose.com

Dynamically determine whether a PDF was originally a DOC or DOCX when converting PDF to DOCX/DOC

I am trying to determine whether the PDF I am converting was originally a DOC or a DOCX file. If the PDF was created from a DOC file, I want to pass DocSaveOptions.DocFormat.Doc to the converter. If it was created from a DOCX file, I want to pass DocSaveOptions.DocFormat.DocX.

How can I determine whether the PDF I am passing to Aspose was originally a DOC, a DOCX, or some other format?

I need this because I have generally passed DocSaveOptions.DocFormat.DocX to my converter. However, some PDFs that were created from DOC files do not convert to DOCX correctly: the output shows only bullets. So how can I determine what the PDF file was originally?

Sovern SampleResume.pdf (54.5 KB)
I have attached the resume that is failing the PDF to DOCX conversion. Is there a fix for this?

My call is very standard:

 // Convert the loaded PDF (tmpDoc) to DOCX and write the result to tmpStream.
 tmpDoc.Save(tmpStream, new Aspose.Pdf.DocSaveOptions()
 {
     Format = Aspose.Pdf.DocSaveOptions.DocFormat.DocX,
     Mode = Aspose.Pdf.DocSaveOptions.RecognitionMode.Textbox
 });

There is no way to detect whether a PDF was originally a DOC or a DOCX file. As a workaround, you can write that information into a bookmark or field while converting the Word document to PDF, and then retrieve it with the Aspose.Pdf API.

We have tested your source PDF with the latest version, 17.7, of the Aspose.Pdf for .NET API, and both output Word files (DOC and DOCX) look fine. This is the Zip file of the output Word documents: OutputWordDocuments.zip (94.5 KB)
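
A minimal sketch of recording the original format at Word-to-PDF time and reading it back later. It stores the marker in a custom document-info entry rather than in a bookmark or field, which is a substitution made here for brevity. The class SourceFormatMarker, its methods, and the "SourceFormat" key are hypothetical names, and the sketch assumes that Document.Info (DocumentInfo) can be used as an IDictionary<string, string> that accepts custom keys, so please verify that against the Aspose.Pdf version you use.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SourceFormatMarker
{
    // Hypothetical custom metadata key; not a standard PDF document-info field.
    private const string Key = "SourceFormat";

    // Call once, right after the Word-to-PDF conversion, while the original
    // file name is still known. Writes e.g. "doc" or "docx" into the PDF metadata.
    public static void Stamp(string pdfPath, string stampedPdfPath, string originalWordPath)
    {
        var pdf = new Aspose.Pdf.Document(pdfPath);
        // Assumption: DocumentInfo accepts custom keys via IDictionary<string, string>.
        IDictionary<string, string> info = pdf.Info;
        info[Key] = Path.GetExtension(originalWordPath).TrimStart('.').ToLowerInvariant();
        pdf.Save(stampedPdfPath);
    }

    // Pure decision logic: "doc" selects DOC; anything else, including a
    // missing marker, falls back to DOCX.
    public static Aspose.Pdf.DocSaveOptions.DocFormat FromMarker(string marker)
    {
        return string.Equals(marker, "doc", StringComparison.OrdinalIgnoreCase)
            ? Aspose.Pdf.DocSaveOptions.DocFormat.Doc
            : Aspose.Pdf.DocSaveOptions.DocFormat.DocX;
    }

    // Read the marker back just before the PDF-to-Word conversion.
    public static Aspose.Pdf.DocSaveOptions.DocFormat Choose(Aspose.Pdf.Document pdf)
    {
        IDictionary<string, string> info = pdf.Info;
        string marker;
        return FromMarker(info.TryGetValue(Key, out marker) ? marker : null);
    }
}
```

With this in place, the save call could set Format = SourceFormatMarker.Choose(tmpDoc) instead of a fixed value. This only helps for PDFs that were stamped by your own Word-to-PDF step; any other PDF carries no marker and falls back to DOCX.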

Please let us know if you need any further assistance or have more questions.

Best Regards,
Imran Rafique