Prevent adding text to frame container

We are evaluating Aspose.PDF (v21.8.0) for converting PDF into DOCX.

In the DOCX output (test-output-actual.docx), the text content ends up inside frames:

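// Requires: using Aspose.Pdf;
var pdfDocument = new Document("test-input.pdf");
// Default DOCX conversion; this is the call that produces the framed output
pdfDocument.Save("test-output-actual.docx", SaveFormat.DocX);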

Is there an option that prevents this? My expected result (test-output-expected.docx) is plain paragraphs without frames.

test-input.pdf (14.7 KB)
test-output-actual.docx (9.3 KB)
test-output-expected.docx (13.4 KB)

@AdamSh

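Please try the following code and share your feedback. It saves through DocSaveOptions with the recognition mode set to Flow, which aims to rebuild the text as regular, editable paragraphs instead of positioned text boxes.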

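// Requires: using Aspose.Pdf;
// dataDir is the folder that holds the input PDF (defined elsewhere)
Document document = new Document(dataDir + "test-input.pdf");
DocSaveOptions options = new DocSaveOptions();
// Flow mode reconstructs the text as regular paragraphs
options.Mode = DocSaveOptions.RecognitionMode.Flow;
options.Format = DocSaveOptions.DocFormat.DocX;
document.Save(dataDir + "testframe.docx", options);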
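
If you would like to confirm the result without opening Word, the following is a minimal sketch that inspects the saved DOCX directly. It uses only the .NET base class library, not Aspose.PDF. The class and method names (DocxFrameCheck, ContainsFramesOrTextBoxes) are hypothetical, and the check rests on an assumption: that framed text appears in word/document.xml as either a w:framePr or a w:txbxContent element. Headers, footers, and other parts of the package are not examined.

```csharp
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of Aspose.PDF.
public static class DocxFrameCheck
{
    // WordprocessingML main namespace (the usual "w:" prefix).
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns true if the main document part contains frame properties
    // or text box content. Assumption: these two elements are how the
    // framed text is represented in the output file.
    public static bool ContainsFramesOrTextBoxes(string docxPath)
    {
        // A DOCX file is a ZIP package; the body lives in word/document.xml.
        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            ZipArchiveEntry entry = archive.GetEntry("word/document.xml");
            if (entry == null)
                throw new InvalidDataException("word/document.xml not found in " + docxPath);

            using (Stream stream = entry.Open())
            {
                XDocument xml = XDocument.Load(stream);
                return xml.Descendants(W + "framePr").Any()
                    || xml.Descendants(W + "txbxContent").Any();
            }
        }
    }
}
```

Calling DocxFrameCheck.ContainsFramesOrTextBoxes(dataDir + "testframe.docx") should then return false if the Flow output contains only ordinary paragraphs. On .NET Framework, the project may also need a reference to the System.IO.Compression.FileSystem assembly.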

This works! Thank you.

@AdamSh

It’s good to know that the suggested option works on your end.