The docx I converted now has its titles formatted as body text
To retain the formatting attributes of titles when converting a PDF to DOCX with Aspose.PDF for Java, you can use the DocSaveOptions class, which lets you specify settings for the conversion process.
Here is a simple example that converts a PDF to DOCX using the Flow recognition mode, which may help titles be recognized correctly:
import com.aspose.pdf.DocSaveOptions;
import com.aspose.pdf.Document;

public class ConvertPDFToDOCX {
    public static void main(String[] args) {
        // Load the source PDF document
        Document pdfDocument = new Document("path/to/your/input.pdf");
        // Create an instance of DocSaveOptions
        DocSaveOptions saveOptions = new DocSaveOptions();
        // Set the output format to DOCX
        saveOptions.setFormat(DocSaveOptions.DocFormat.DocX);
        // Optionally, set recognition mode to Flow to improve text layout
        saveOptions.setMode(DocSaveOptions.RecognitionMode.Flow);
        // Save the document in DOCX format
        pdfDocument.save("path/to/your/output.docx", saveOptions);
    }
}
In this code, the DocSaveOptions class specifies that the output format should be DOCX. Setting the recognition mode to Flow makes the converter produce editable, flowing text rather than fixed-position text boxes, which may help keep titles distinct from body text.
Make sure to adjust the paths to your input and output files accordingly. This approach should help in retaining the formatting attributes of titles during the conversion process.
For more advanced options and configurations, you may want to refer to the official documentation on Aspose.PDF for Java, which provides detailed information on handling various conversion scenarios.
Can you please share which code snippet you are using to perform PDF to DOCX conversion? We will test the scenario in our environment and address it accordingly.
import aspose.pdf as ap

# Load the source PDF document
docc = ap.Document("【pdf-书签】江苏鲜之源水产食品有限公司报告.pdf")
save_options = ap.DocSaveOptions()
# Note: DOC is requested here although the output file below is named .docx;
# DocFormat.DOC_X is the value that produces DOCX
save_options.format = ap.DocSaveOptions.DocFormat.DOC
# Set the recognition mode as Flow
save_options.mode = ap.DocSaveOptions.RecognitionMode.FLOW
# Set the horizontal proximity as 2.5
save_options.relative_horizontal_proximity = 2.5
# Enable bullet recognition during the conversion process
save_options.recognize_bullets = True
docc.save("江苏鲜之源水产食品有限公司报告.docx", save_options)
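To see whether titles really came out as body text, it can help to list the paragraph style recorded for each paragraph in the converted file. The following is only a rough sketch using the Python standard library, not part of Aspose.PDF: the helper name paragraph_styles is hypothetical, it assumes the output is a genuine DOCX (a zip archive containing word/document.xml), and it only reads explicitly assigned paragraph styles, so titles that were kept through direct font formatting alone will still show None.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def paragraph_styles(docx_path):
    """Return (style_id, text) for each non-empty paragraph in a DOCX file.

    style_id is None when the paragraph has no explicit paragraph style,
    in which case Word applies the document's default paragraph style.
    """
    if not zipfile.is_zipfile(docx_path):
        # A binary .doc saved under a .docx name ends up here.
        raise ValueError(f"{docx_path!r} is not a DOCX (OOXML zip) file")

    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    results = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # The style reference lives at w:pPr/w:pStyle, attribute w:val
        style = para.find(f"{{{W_NS}}}pPr/{{{W_NS}}}pStyle")
        style_id = style.get(f"{{{W_NS}}}val") if style is not None else None
        text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
        if text.strip():
            results.append((style_id, text))
    return results
```

Calling paragraph_styles("江苏鲜之源水产食品有限公司报告.docx") and looking for heading-like style IDs on the title lines would show whether the converter assigned any heading styles. Because the snippet above requests the DOC format, the file it writes may not be a zip archive at all, in which case the helper raises ValueError.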
We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.
Issue ID(s): PDFNET-58234
You can obtain Paid Support Services if you need support on a priority basis, along with direct access to our Paid Support management team.