Create Word from PDF

When I create a Word document from a PDF, the text comes out with font names that look invalid, such as “HWJABL+Helvetica” or “RONPLM+Helvetica-Bold”, and it prints as blank.


Code:

// Apply the Aspose.PDF license from a file stream
com.aspose.pdf.License license = new com.aspose.pdf.License();
try {
    license.setLicense(new java.io.FileInputStream("C:\\lib\\Aspose.Pdf.lic"));
} catch (java.io.FileNotFoundException e2) {
    // The license file was not found at the given path
    e2.printStackTrace();
}
// Load the source PDF
com.aspose.pdf.Document document = new com.aspose.pdf.Document(filePath);
// Create DocSaveOptions object
com.aspose.pdf.DocSaveOptions saveOptions = new com.aspose.pdf.DocSaveOptions();
// Set the recognition mode as Flow
saveOptions.setMode(com.aspose.pdf.DocSaveOptions.RecognitionMode.Flow);
// Write the output as DOCX
saveOptions.setFormat(com.aspose.pdf.DocSaveOptions.DocFormat.DocX);
// Set the horizontal proximity as 2.5
saveOptions.setRelativeHorizontalProximity(2.5f);
// Recognize bullets during the conversion process
saveOptions.setRecognizeBullets(true);
document.save(fileOut, saveOptions);

Hi Mario,


Thanks for your inquiry. We would appreciate it if you could share your sample input document. We will test the scenario on our end and guide you accordingly.

We are sorry for the inconvenience caused.

Best Regards,

OK, here are the input document (PDF) and the Word output.

If you open the Word document with WinZip, you can see that these typeface names have been inserted…


Something goes wrong in your jar when it creates the document…

I am attaching the XML of this document…

Hi Mario,


Thanks for your patience. Please note that this font naming behavior is by design. A PDF file can contain several fonts with the same name, so prefixes are added to guarantee that each font name is unique. I printed the generated DOC file without any issue. Could you please share more details about the issue you are facing, so that we can guide you accordingly?

Moreover, the prefixed names are font subsets embedded in the document. The fonts are subsets in the PDF, where they are also defined with prefixes. The prefixes differ because new font subsets, with new prefixes, are generated from the source fonts during document processing. This behavior cannot be turned off.

The prefix generation algorithm is simple:
[six randomly generated characters][plus sign][font name].
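If you only need the readable font name, for example when inspecting the generated document XML, the prefix can be stripped on your side. The following is a minimal sketch in plain Java, not part of the Aspose API. The class name `SubsetFontNames` and the method `stripSubsetPrefix` are hypothetical, and it assumes the six characters are uppercase ASCII letters, as in the two names quoted in this thread.

```java
import java.util.regex.Pattern;

public class SubsetFontNames {
    // Six characters followed by a plus sign, at the start of the name.
    // Assumption: the characters are uppercase ASCII letters (A-Z).
    private static final Pattern SUBSET_PREFIX = Pattern.compile("^[A-Z]{6}\\+");

    /** Returns the name without a leading subset prefix, or unchanged if there is none. */
    public static String stripSubsetPrefix(String fontName) {
        return SUBSET_PREFIX.matcher(fontName).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(stripSubsetPrefix("HWJABL+Helvetica"));      // Helvetica
        System.out.println(stripSubsetPrefix("RONPLM+Helvetica-Bold")); // Helvetica-Bold
        System.out.println(stripSubsetPrefix("Helvetica"));             // Helvetica
    }
}
```

Names without a prefix pass through unchanged, so the helper can be applied to every font name without checking first.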

Please feel free to contact us for any further assistance.


Best Regards,

Is there a way to set a default font that replaces the fonts coming from the original PDF?

Hi Mario,


Thanks for your inquiry. I am afraid there is no specific property for this at the level of the whole PDF document, but you can replace the font of all TextFragments before conversion, as follows. Hopefully this will help you accomplish the task.

com.aspose.pdf.Document document = new com.aspose.pdf.Document(filePath);
// Collect every text fragment in the document
com.aspose.pdf.TextFragmentAbsorber tfa = new com.aspose.pdf.TextFragmentAbsorber();
document.getPages().accept(tfa);
com.aspose.pdf.TextFragmentCollection tfc = tfa.getTextFragments();
// Set the default font on each fragment
for (com.aspose.pdf.TextFragment tf : (Iterable<com.aspose.pdf.TextFragment>) tfc) {
    tf.getTextState().setFont(com.aspose.pdf.FontRepository.findFont("MSGothic"));
}
// Create DocSaveOptions object
com.aspose.pdf.DocSaveOptions saveOptions = new com.aspose.pdf.DocSaveOptions();
// Set the recognition mode as Flow
saveOptions.setMode(com.aspose.pdf.DocSaveOptions.RecognitionMode.Flow);
// Write the output as DOCX
saveOptions.setFormat(com.aspose.pdf.DocSaveOptions.DocFormat.DocX);
// Set the horizontal proximity as 2.5
saveOptions.setRelativeHorizontalProximity(2.5f);
// Recognize bullets during the conversion process
saveOptions.setRecognizeBullets(true);
document.save(fileOut, saveOptions);

Please feel free to contact us for any further assistance.


Best Regards,

It works perfectly, with only one problem: the bold formatting in the document is no longer preserved. Is there any way to keep the original bold formatting?

Hi Mario,


Do you mean that the source PDF contains bold text, but the text in the resulting DOC file is not bold? If that is the case, please share the source file so that we can test the scenario on our end. We are sorry for the inconvenience.
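In the meantime, one thing worth checking: the font names quoted earlier in this thread (“RONPLM+Helvetica-Bold”) suggest that the bold face is carried by the font name itself, so setting a single regular font on every fragment may be what drops it. Below is a minimal sketch in plain Java of choosing the replacement per fragment. It has not been confirmed against the sample file. The class `BoldFontChooser`, its methods, and the replacement names "Arial" and "Arial Bold" are hypothetical placeholders.

```java
import java.util.Locale;

public class BoldFontChooser {
    /** True if the font name suggests a bold face, e.g. "RONPLM+Helvetica-Bold". */
    public static boolean looksBold(String fontName) {
        return fontName != null && fontName.toLowerCase(Locale.ROOT).contains("bold");
    }

    /** Picks the bold or the regular replacement name based on the original font name. */
    public static String chooseReplacement(String originalFontName, String regular, String bold) {
        return looksBold(originalFontName) ? bold : regular;
    }

    public static void main(String[] args) {
        // "Arial" and "Arial Bold" are placeholders; use font names available on your system.
        System.out.println(chooseReplacement("HWJABL+Helvetica", "Arial", "Arial Bold"));      // Arial
        System.out.println(chooseReplacement("RONPLM+Helvetica-Bold", "Arial", "Arial Bold")); // Arial Bold
    }
}
```

Inside the replacement loop, the original name would be read from each fragment before its font is overwritten (assuming it is available via `tf.getTextState().getFont().getFontName()`), and the chosen name passed to `FontRepository.findFont`. Whether a given bold font name resolves there depends on the fonts installed on the machine.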