Convert PDF to DOCX in C# .NET

Hi,
I am trying to convert a PDF document that includes tables to DOCX format.
But when I run the conversion, the values in the tables get merged, and the output does not look exactly like the PDF file.

Code:

    // Requires: using System.IO; using Aspose.Pdf;
    DocSaveOptions saveOption = new DocSaveOptions();
    saveOption.Mode = DocSaveOptions.RecognitionMode.Flow;
    saveOption.Format = DocSaveOptions.DocFormat.DocX;
    saveOption.RelativeHorizontalProximity = 2.5f;
    //saveOption.AddReturnToLineEnd = true;
    saveOption.RecognizeBullets = true;

    Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(filePath);

    pdfDocument.Save(Path.Combine(path, "pdfToWord", fileName + ".docx"), saveOption);

My issue is with the tables: they are not properly aligned in the output document.

KBtemp1-Credit Union.pdf (253.3 KB)

Vaccine Data (2).pdf (701.4 KB)

sample document

Please help me find a solution for this.

whitepaper.pdf (335.7 KB)

@pooja.jayan

We tested the code snippet below with Aspose.PDF for .NET 22.1 to convert PDF to DOCX and found that two of your files converted correctly. The outputs are attached for your reference. Please check them and let us know if you notice any issues:

Document pdfDocument = new Document(dataDir + @"Vaccine Data (2).pdf");

DocSaveOptions saveOptions = new DocSaveOptions();
saveOptions.Format = DocSaveOptions.DocFormat.DocX;
saveOptions.Mode = DocSaveOptions.RecognitionMode.EnhancedFlow;
saveOptions.RelativeHorizontalProximity = 2.5f;
saveOptions.RecognizeBullets = true;
pdfDocument.Save(dataDir + @"Vaccine Data (2).docx", saveOptions);

KBtemp1-Credit Union.docx (119.3 KB)
Vaccine Data (2).docx (1.8 MB)
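
Because the recognition mode can change how tables and layout are reconstructed, it may help to convert the same file with each mode and compare the outputs side by side. The following is a minimal sketch, assuming Aspose.PDF for .NET is referenced in the project; the class name, the default input file name, and the output folder are hypothetical placeholders, and the option values are simply the ones used in this thread.

```csharp
using System;
using System.IO;
using Aspose.Pdf;

class CompareRecognitionModes
{
    static void Main(string[] args)
    {
        // Hypothetical defaults: pass your own PDF path and output folder as arguments.
        string inputPath = args.Length > 0 ? args[0] : "input.pdf";
        string outputDir = args.Length > 1 ? args[1] : "pdfToWord";
        Directory.CreateDirectory(outputDir);

        // The three recognition modes exposed by DocSaveOptions.
        var modes = new[]
        {
            DocSaveOptions.RecognitionMode.Textbox,
            DocSaveOptions.RecognitionMode.Flow,
            DocSaveOptions.RecognitionMode.EnhancedFlow
        };

        foreach (var mode in modes)
        {
            // Reload the PDF for every mode so each conversion starts from a fresh document.
            using (var pdfDocument = new Document(inputPath))
            {
                var saveOptions = new DocSaveOptions
                {
                    Format = DocSaveOptions.DocFormat.DocX,
                    Mode = mode,
                    RelativeHorizontalProximity = 2.5f, // value taken from this thread
                    RecognizeBullets = true
                };

                // One output file per mode, e.g. "input_Flow.docx".
                string outputName = Path.GetFileNameWithoutExtension(inputPath) + "_" + mode + ".docx";
                string outputPath = Path.Combine(outputDir, outputName);

                pdfDocument.Save(outputPath, saveOptions);
                Console.WriteLine("Saved " + outputPath);
            }
        }
    }
}
```

Opening the resulting files next to each other shows which mode, if any, keeps the tables closest to the original PDF. This only helps pick the best available mode for a given document; it is not a fix for the conversion issues themselves.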

We did notice that the file whitepaper.pdf did not convert correctly, so we have logged an issue as PDFNET-51347 in our issue tracking system. We will look into it further and let you know once it is resolved. Please be patient and give us some time.

We are sorry for the inconvenience.

Hi,
Thank you for your response.
KBtemp1-Credit Union.docx (119.3 KB)

In the converted DOCX file you shared, the text content looks fine, but the table is distorted:
table.PNG (13.7 KB)

whereas in the PDF file it looks like this:
original.PNG (58.7 KB)

Also, the index portion of the Vaccine Data (2).docx (1.8 MB) document is not converted properly. There are alignment issues there as well.

I tried changing the recognition mode to 'Flow'. With that, the index pages are converted properly, but the tables still have issues.

Please have a look at this also.

Thank you

@pooja.jayan

We have logged the following tickets for these issues in our issue tracking system:

  • PDFNET-51356 KBtemp1-Credit Union.pdf
  • PDFNET-51357 Vaccine Data (2).pdf

We will look into the details of these tickets as well and let you know once they are resolved. Please be patient and give us some time.

We are sorry for the inconvenience.

Hi,
Any updates?

@pooja.jayan

We regret to share that the earlier logged tickets have not been resolved yet because of other pending issues in the queue. We will inform you as soon as we have definite updates regarding their resolution. Please be patient and give us some time.

We are sorry for the inconvenience.

Hi,

Any updates? If possible, could you please share the expected time it would take to get this issue resolved?

@pooja.jayan

We regret to inform you that we cannot share a reliable ETA at the moment, as the tickets have not been fully investigated. As soon as we complete the analysis, we will share updates with you. We highly appreciate your patience and understanding in this regard.

We apologize for the inconvenience.

Hi,

Any updates on any of the tickets PDFNET-51347, PDFNET-51356, PDFNET-51357?

@pooja.jayan

We are afraid that the earlier logged tickets are not yet resolved, as they have not been investigated yet. We will inform you via this forum thread once we have updates regarding their fix. Please give us some time.

We apologize for the inconvenience.

Hi,

Any updates?

@pooja.jayan

We are afraid that the logged tickets are not yet resolved because of other tasks that were logged before them. We will notify you via this forum thread once we have definite updates about the resolution of the logged issues. We apologize for the inconvenience.

Hi,
Any updates on the logged tickets?

@pooja.jayan

Sadly, no updates are available about the ticket resolution at the moment. Your concerns have been recorded, and we will notify you as soon as we have important updates about the resolution of the tickets. We apologize for the inconvenience.