Malformed PDF saved to PDF through Aspose.Words 21.12 is corrupted

Hello,

A PDF that is not well formed becomes corrupted when it is saved through Aspose.Words.

The original PDF is wrapped in a multipart form-data envelope:

--12345678-1234-1234-1234-123456789012
Content-Disposition: form-data; name=mainDocument

%PDF-1.3

%%EOF

--12345678-1234-1234-1234-123456789012--

With Aspose.Words 25.4 the output is still corrupted.

The original file is a single page of Latin text in the Tahoma font.

The saved file has 15 pages of Chinese characters.

If the PDF is extracted from the envelope manually, it is a valid PDF file.
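
The manual extraction mentioned above could be automated along these lines. This is only a sketch under assumptions: `PdfPayloadExtractor` is a hypothetical helper (not part of any Aspose API), and it assumes the envelope holds exactly one PDF that starts at `%PDF-` and ends at the last `%%EOF` marker. It only helps with files that still have the original wrapped content; it does not repair output that was already re-saved.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper: cuts the raw PDF bytes out of a multipart
// form-data envelope like the one shown above.
public static class PdfPayloadExtractor
{
    public static byte[] Extract(byte[] data)
    {
        byte[] header = Encoding.ASCII.GetBytes("%PDF-");
        byte[] trailer = Encoding.ASCII.GetBytes("%%EOF");

        // First "%PDF-" marks the start of the payload.
        int start = IndexOf(data, header);
        // Last "%%EOF" marks the end; a PDF with incremental updates
        // can contain several "%%EOF" markers.
        int end = LastIndexOf(data, trailer);
        if (start < 0 || end < start)
            throw new InvalidDataException("No PDF payload found.");

        int length = end + trailer.Length - start;
        byte[] pdf = new byte[length];
        Array.Copy(data, start, pdf, 0, length);
        return pdf;
    }

    private static bool MatchesAt(byte[] haystack, byte[] needle, int pos)
    {
        for (int j = 0; j < needle.Length; j++)
            if (haystack[pos + j] != needle[j]) return false;
        return true;
    }

    private static int IndexOf(byte[] haystack, byte[] needle)
    {
        for (int i = 0; i <= haystack.Length - needle.Length; i++)
            if (MatchesAt(haystack, needle, i)) return i;
        return -1;
    }

    private static int LastIndexOf(byte[] haystack, byte[] needle)
    {
        for (int i = haystack.Length - needle.Length; i >= 0; i--)
            if (MatchesAt(haystack, needle, i)) return i;
        return -1;
    }
}
```

A call such as `File.WriteAllBytes("extracted.pdf", PdfPayloadExtractor.Extract(File.ReadAllBytes("not_well_formatted.pdf")))` would then write the bare PDF to a new file (the output name is illustrative) and leave the wrapped original untouched.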

With Aspose.Pdf 25.4, the malformed file is saved correctly and is readable.

Is there a way to recover or re-convert the corrupted files?

I already tried re-saving a corrupted file and extracting its text, but the text still comes out as Chinese characters.

Regards,

@CedricMasson

Can you please provide more details about the steps you took to save the PDF through Aspose.Words and clarify what you mean by ‘retrieve/retransform the corrupted files’?

The basic code used:

Aspose.Words.Document doc = new Aspose.Words.Document("not_well_formatted.pdf");
doc.Save("corrupted.pdf", SaveFormat.Pdf);

“Corrupted.pdf” is now a 15-page document containing only Chinese characters, and “not_well_formatted.pdf” is no longer available.
In this case the content of the file is unreadable.

Is there a way to retrieve readable content from “Corrupted.pdf”, as it was in “not_well_formatted.pdf”?

Regards,

@CedricMasson Unfortunately, your question is not clear enough. Could you please attach the problematic input and output documents here, along with the expected output if possible? We will check the issue and provide you with more information.

In addition, please note that Aspose.Words is designed to work with MS Word documents. MS Word documents are flow documents, and their structure is very similar to the Aspose.Words Document Object Model. PDF documents, on the other hand, are fixed-page format documents. While loading a PDF document, Aspose.Words converts the fixed-page document structure into the flow Document Object Model. Unfortunately, such a conversion does not guarantee 100% fidelity.

@alexey.noskov
Thank you for your answer.

The document contains personal information about end users and cannot be uploaded publicly.
Is there another way to send it?

The original PDF is quite small (49 KB), but the corrupted output PDF is much larger (5.64 MB).

We’re aware that the PDF shouldn’t have been handled by Aspose.Words.
It was a misuse coded years ago.

I ran some tests, and Aspose.Pdf 21.12 already handles our original PDF file perfectly.

@CedricMasson Thank you for the additional information, but it is still not clear what the problem is. Do you want to revert the changes made by Aspose.Words after open/save and restore the original PDF document? If so, there is no way to achieve this.