I am using Aspose.PDF to convert PDF files to DOCX with the following code:
Document pdfDocument = new Document(new FileStream(dataDir + "input.pdf", FileMode.Open));
DocSaveOptions saveOptions = new DocSaveOptions();
saveOptions.Format = DocSaveOptions.DocFormat.DocX;
saveOptions.RecognizeBullets = true;
saveOptions.Mode = DocSaveOptions.RecognitionMode.Textbox;
saveOptions.RelativeHorizontalProximity = 2.5f;
pdfDocument.Save(dataDir + "\\OutputDocx\\" + "output.docx", saveOptions);
The following issues were identified while converting PDF files to DOCX:
- Bullet points are converted to special symbols.
- Some text content is converted to images.
- For some PDF files containing PowerPoint slides, slide colors are changed and text is missing.
- For PDF files with mixed content (text and scanned), the converted files have incomplete text.
I have attached the PDF files and the converted RTF files for two of the cases mentioned above. Please help me resolve these issues.