PDF to DOCX Conversion in Java with Aspose.PDF - converted document layout completely collapsed

Hi, I am evaluating Aspose for converting PDF to DOCX, but the layout of the converted document has completely collapsed and nothing is properly aligned. I have attached the affected document, and my code is below. Can anyone help me with this?
Result_E2020-G000-326.zip (31.2 KB)

package com.ratilan.csr.controller;

import com.aspose.pdf.DocSaveOptions;
import com.aspose.pdf.Document;
import com.aspose.pdf.SaveFormat;
import com.aspose.words.License;

public class ConvertPDFToDOCOrDOCXFormat {

public static void main(String[] args) throws Exception {
    //savingToDoc();
    savingToDOCX();
    //usingTheDocSaveOptionsClass();
}

/*public static void savingToDoc() {
    // Open the source PDF document
    Document pdfDocument = new Document(dataDir + "SampleDataTable.pdf");
    // Save the file into Microsoft document format
    pdfDocument.save(dataDir + "TableHeightIssue.doc", SaveFormat.Doc);
}

*/
public static void savingToDOCX() throws Exception {
    License license = new License();
    license.setLicense("F://Aspose.Total.Java.lic");
    // Load source PDF file
    Document doc = new Document("F://E2020-G000-326.pdf");
    // Instantiate Doc SaveOptions instance
    DocSaveOptions saveOptions = new DocSaveOptions();
    // Set output file format as DOCX
    saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
    // Save resultant DOCX file
    doc.save("F://Result_E2020-G000-326.docx", saveOptions);
}

public static void usingTheDocSaveOptionsClass() {
    // Open a document
    // Path of input PDF document
    String filePath =  "source.pdf";
    // Instantiate the Document object
    Document document = new Document(filePath);
    // Create DocSaveOptions object
    DocSaveOptions saveOption = new DocSaveOptions();
    // Set the recognition mode as Flow
    saveOption.setMode(DocSaveOptions.RecognitionMode.Flow);
    // Set the Horizontal proximity as 2.5
    saveOption.setRelativeHorizontalProximity(2.5f);
    // Enable the value to recognize bullets during conversion process
    saveOption.setRecognizeBullets(true);
    // Save the resultant DOC file
    document.save( "Resultant.doc", saveOption);
}

}

@Sam0527

Would you kindly attach the source PDF document as well, so that we can test the scenario in our environment and address it accordingly?

Hi Asad,

I have attached the document below; kindly take a look at it.

Regards
Dilip Kumar
E2020-G000-326.pdf (689.7 KB)

@Sam0527

We converted your document into DOCX using Aspose.PDF for Java 20.11.1 and the following code snippet:

Document doc = new Document(dataDir + "E2020-G000-326.pdf");
DocSaveOptions saveOption = new DocSaveOptions();
saveOption.setMode(DocSaveOptions.RecognitionMode.Flow);
saveOption.setFormat(DocSaveOptions.DocFormat.DocX);
saveOption.setRecognizeBullets(true);
doc.save(dataDir + "sample20.11.1.docx", saveOption);

sample20.11.1.zip (529.3 KB)

Please check the attached output document as well. We could not identify any issues in it. Would you kindly highlight the issues you see and share screenshots with us?

Hi Asad,

I can see a lot of alignment issues when comparing the original PDF with the converted DOCX. Please look into this and let me know whether it can be fixed. I have attached a list of the alignment issues in the DOCX.

Regards
Dilip Kumar K
Issues List.zip (181.2 KB)

@Sam0527

Strangely, we did not notice these issues when we opened the converted file on our side in MS Word 2016. Please check the attached screenshots to see how the file is displayed:

tables.png (30.5 KB)
paragraphs.png (25.9 KB)
logo.png (46.6 KB)

Would you please share which viewer you are using to view the file? What OS and Office version are you using? We will then proceed to assist you accordingly.

Hi Asad,

I am using only MS Office Word, on Windows 10. Kindly prioritize this and give me a proper solution so that we can go ahead with Aspose. We are evaluating PDF to DOCX and DOCX to PDF conversion, and once the evaluation is complete we need to purchase.

Regards
Dilip Kumar K
Word Version.png (15.9 KB)

@Sam0527

We are afraid that we cannot share a solution, as we were not able to reproduce this issue on our side. We also observed that you are using the 2011 version of MS Word. Would you kindly try MS Word 2016 on your side and let us know if you still face any issue? We will then proceed to assist you accordingly.

Hi Asad,

I am using Office 365, so I updated it and checked again; the document now displays as expected. However, I am still facing some alignment issues. Please check my attached file, which contains both the input and output files. I can see table border issues. Can you please suggest a solution for that?

Also, I am getting only four pages in the output document. May I know the reason?
Aspose Alingment Issues.zip (1.5 MB)

@Sam0527

The four-page limitation is due to trial version usage. You can use a 30-day free temporary license and set it properly before calling any method of the API. This will allow you to evaluate the API without any restrictions.
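
One thing that may be worth checking here: the code in the first post imports com.aspose.words.License, which applies the license to Aspose.Words, whereas the class for Aspose.PDF is com.aspose.pdf.License. That mismatch could be why the output is still limited to four pages. The following is only a minimal preflight sketch under stated assumptions: the class name LicensePreflight, the helper isUsableLicenseFile and the default file name are hypothetical, and the Aspose call itself is left as a comment so that the snippet compiles without the library on the classpath.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LicensePreflight {

    /**
     * Hypothetical helper: returns true only if the path points at a
     * readable, non-empty regular file. It does not validate the license
     * contents; only the Aspose library can do that.
     */
    public static boolean isUsableLicenseFile(String licensePath) {
        Path path = Paths.get(licensePath);
        try {
            return Files.isRegularFile(path)
                    && Files.isReadable(path)
                    && Files.size(path) > 0;
        } catch (IOException e) {
            // Treat any I/O problem as "not usable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed default location; pass the real path as the first argument.
        String licensePath = args.length > 0 ? args[0] : "Aspose.Total.Java.lic";

        if (!isUsableLicenseFile(licensePath)) {
            System.err.println("License file missing or empty: " + licensePath);
            return;
        }

        // With Aspose.PDF on the classpath, the license would be applied here,
        // before any com.aspose.pdf.Document is created:
        //
        //   com.aspose.pdf.License license = new com.aspose.pdf.License();
        //   license.setLicense(licensePath);
        //
        System.out.println("License file found: " + licensePath);
    }
}
```

Running the check first makes a wrong path fail loudly before any conversion starts, instead of surfacing later as a truncated output document.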

Furthermore, we noticed an alignment issue in the footer of every page in the output DOCX. Would you please confirm whether you see other issues as well, and point them out in the attached converted document? We will then log an issue in our issue tracking system and share the ID with you.
sample20.11.1.zip (947.6 KB)