Free Support Forum - aspose.com

Conversion from PDF to DOCX with some problems

Hi,
I have a question about the conversion from PDF to DOCX.
Sometimes the conversion is not perfect. As an example, I have attached a PDF and the corresponding conversion (in a zip file) produced by your tool.

These are my settings:

DocSaveOptions saveOptions = new DocSaveOptions();
saveOptions.MaxDistanceBetweenTextLines = -1f;
saveOptions.Mode = DocSaveOptions.RecognitionMode.Flow;
saveOptions.Format = DocSaveOptions.DocFormat.DocX;
saveOptions.RelativeHorizontalProximity = 2.5f;
saveOptions.RecognizeBullets = true;

// this.Pdf is an Aspose.Pdf.Document
this.Pdf.Save(path, saveOptions);

Is there a way to get a better conversion (some options), or is this a bug?

There are other problems with my conversion. When I try to edit the resulting file in Word, the margins of a text line are not always fitted to the width of the phrase; sometimes they start before the line and/or end after it.
In those cases, when I try to edit the file, the text formatting breaks.

Is there a way to get better formatting (some options), or is this a bug?

Thank you.

page72.pdf (119.0 KB)
page72.zip (11.8 KB)

@alessioabb,

I have worked with the source file you shared, using the following sample code, and was unable to observe the issue. Please try the following sample code; it should resolve your issue.

Aspose.Pdf.License license = new Aspose.Pdf.License();
license.SetLicense(dir + "Aspose.Total.Product.Family.lic");
Document doc = new Document(dir + "72.pdf");
DocSaveOptions saveOptions = new DocSaveOptions();
saveOptions.Format = DocSaveOptions.DocFormat.DocX;
doc.Save(dir + "pagenew.DOCX", saveOptions);

For the other problems you describe, please share a sample project or the source file along with the generated result so that we can investigate further and help you out.

pagenew.zip (11.8 KB)

To reproduce the problem, you need to use the option saveOptions.Mode = DocSaveOptions.RecognitionMode.Flow.
This option is important because, without it, the text in the resulting DOCX is not linked, and it is impossible to edit the text in Word.
So we need the look and feel we get without RecognitionMode.Flow, but with the text linked as it is when we use it.
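
A minimal sketch of the comparison described above, assuming Aspose.Pdf for .NET is referenced and that page72.pdf sits in the working directory. The input and output file names are placeholders, not taken from the attachments. It saves the same PDF once per recognition mode (Textbox, which is the mode used when Mode is not set, and Flow) so that the two DOCX files can be opened side by side:

```csharp
using Aspose.Pdf;

class CompareRecognitionModes
{
    static void Main()
    {
        // Placeholder input name; adjust to the actual PDF location.
        const string input = "page72.pdf";

        // The two modes discussed in this thread.
        var modes = new[]
        {
            DocSaveOptions.RecognitionMode.Textbox,
            DocSaveOptions.RecognitionMode.Flow
        };

        foreach (DocSaveOptions.RecognitionMode mode in modes)
        {
            // Load a fresh copy of the document for each conversion.
            Document doc = new Document(input);

            DocSaveOptions options = new DocSaveOptions
            {
                Format = DocSaveOptions.DocFormat.DocX,
                Mode = mode
            };

            // Writes page72-Textbox.docx and page72-Flow.docx.
            doc.Save("page72-" + mode + ".docx", options);
        }
    }
}
```

Comparing the two outputs should make it easier to show which layout differences come from Flow mode itself rather than from the other options.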

@alessioabb,

I have observed your issue and would like to inform you that I have created a ticket with ID PDFNET-47676 in our issue tracking system to investigate and resolve it as soon as possible. We will share good news with you soon.

Thank you.
Please remember that the problems are:

  • The generated DOCX has a different look and feel
  • The margins of the text lines are sometimes not fitted to the line width
  • There is a carriage return at the end of each text line

@alessioabb,

Thank you for the information. We have noted this in our ticket and will share good news with you soon.