PDF to Word conversion - bold formatting from original PDF missing in Word

cader · May 26, 2020, 5:35am

Hi

I’m trying out the evaluation version. When I converted a PDF document to a DOCX file,
some of the formatting was lost. The bold formatting from the original PDF file is missing in the DOCX file,
and the numbered bullets in the original PDF file came out as plain text rather than bullets.

Please find my Java code below:
// Load the source PDF
Document doc = new Document("reefer.pdf");
DocSaveOptions saveOptions = new DocSaveOptions();
// Request DOCX output in Flow recognition mode
saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
saveOptions.setMode(DocSaveOptions.RecognitionMode.Flow);
saveOptions.setRelativeHorizontalProximity(2.5f);
// Ask the converter to recognize bullets
saveOptions.setRecognizeBullets(true);
doc.save("reefer4.doc", saveOptions);

Thanks

asad.ali · May 26, 2020, 7:45pm

@cader

Would you kindly share your sample PDF document with us? We will test the scenario in our environment and address it accordingly.

cader · May 28, 2020, 2:35am

testdoc.pdf (474.5 KB)

@asad.ali

Please find the file attached. In this case the bold font is preserved, but the numbered bullets are not converted to bullets.

asad.ali · May 28, 2020, 1:45pm

@cader

We tested the scenario in our environment using Aspose.PDF for Java 20.5 and did not notice any issue. We have attached the output for your reference.

sample.zip (19.6 KB)

Would you please try the latest version of the API and let us know if you still face any issue?

PS: Please save the document with a .docx extension, i.e. doc.save("reefer4.docx", saveOptions);
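
To guard against the extension mismatch mentioned in the PS, the output name can be normalized before calling save. The following is only a minimal sketch in plain Java: DocxOutputName and withDocxExtension are hypothetical names chosen for illustration and are not part of Aspose.PDF.

```java
public class DocxOutputName {

    // Hypothetical helper (not an Aspose.PDF API): returns the file name with
    // its extension replaced by ".docx", or with ".docx" appended if the name
    // has no extension.
    public static String withDocxExtension(String fileName) {
        int slash = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        int dot = fileName.lastIndexOf('.');
        // Treat the dot as an extension separator only if it sits inside the
        // last path segment and is not that segment's first character.
        String base = (dot > slash + 1) ? fileName.substring(0, dot) : fileName;
        return base + ".docx";
    }

    public static void main(String[] args) {
        System.out.println(withDocxExtension("reefer4.doc"));
        System.out.println(withDocxExtension("reefer4"));
    }
}
```

The result could then be passed to the save call in place of the hard-coded name, for example doc.save(DocxOutputName.withDocxExtension("reefer4.doc"), saveOptions);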

cader · May 29, 2020, 2:05am

@asad.ali

Thanks for looking into my issue.
What I meant was that the numbered bullets look like bullets, but they are not actual Word bullets when I try to edit them. They are just text with tab spacing.

image.png (1.1 KB)
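
To make the symptom concrete: in a real Word list the number comes from the list formatting and is not part of the paragraph text, whereas here the number is typed into the text and followed by a tab. A rough check for that pattern might look like the sketch below. It assumes the paragraph text has already been read from the DOCX by some other means, and PlainTextListCheck and looksLikeTypedNumber are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

public class PlainTextListCheck {

    // One or more digits, then "." or ")", then at least one tab, then text.
    private static final Pattern NUMBER_THEN_TAB =
            Pattern.compile("^\\s*\\d+[.)]\\t+\\S.*$");

    // Returns true when the paragraph text itself starts with a typed number
    // and a tab, which suggests a plain-text imitation of a numbered list item.
    public static boolean looksLikeTypedNumber(String paragraphText) {
        return NUMBER_THEN_TAB.matcher(paragraphText).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeTypedNumber("1.\tFirst item"));
        System.out.println(looksLikeTypedNumber("First item"));
    }
}
```

This is only a heuristic for spotting affected paragraphs; it does not convert them into Word lists.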

asad.ali · May 29, 2020, 5:29pm

@cader

We have logged an issue as PDFJAVA-39459 in our issue tracking system. We will further look into its details and keep you posted with the status of its correction. Please be patient and spare us some time.

We are sorry for the inconvenience.

cader · June 3, 2020, 4:14am

@asad.ali

Thanks for your support.
Could you guide me on how to change the font or font style (bold/italic) of text in certain fonts when converting from PDF to Word DOCX? Thanks

asad.ali · June 3, 2020, 5:41pm

@cader

Searching text on the basis of a specific font is not yet available in Aspose.PDF for .NET, and a feature request (PDFNET-48185) has already been logged in our issue tracking system for its implementation. We will inform you as soon as this feature is available.

Furthermore, you can extract or search text in a PDF document and modify its formatting using the API. Please go through the linked documentation articles for more information.