This returns input + “\r\n\r\n”, which is even worse.
I also don’t see why a txt document should be handled differently from an rtf or docx document with the same content. In my opinion, getText() should return exactly the same string (just the input) in all these cases.
I’ve spent some time debugging and found out that the structure of the document in my repro is:
- Paragraph 0
  - Run: “This string has no line breaks.”
- Paragraph 1 (no children)
On paragraph 0: getText() returns “This string has no line breaks.\r”, toString(SaveFormat.TEXT) returns “This string has no line breaks.\r\n”. On paragraph 1: getText() returns “\f”, toString(SaveFormat.TEXT) returns “\r\n”.
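In case anyone wants to reproduce the observations above, here is a small helper I would use to make the control characters visible when printing the results. It is only a sketch in plain Java: the class name `VisibleText` and the method `escape` are my own illustration and not part of the Aspose API. It only handles the three characters that show up in this thread.

```java
// Hypothetical debugging helper (not part of Aspose): makes CR, LF and
// form feed visible so that trailing breaks can be compared in log output.
final class VisibleText {
    private VisibleText() {}

    /** Returns s with '\r', '\n' and '\f' replaced by their two-character escape sequences. */
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 8);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\r': out.append("\\r"); break;
                case '\n': out.append("\\n"); break;
                case '\f': out.append("\\f"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Wrapping each result in it, for example `VisibleText.escape(paragraph.getText())`, should make the difference between the getText() and toString(SaveFormat.TEXT) output easy to see.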
Update 2: the bug boils down to Aspose adding an empty Paragraph 1 to the document structure of a plaintext document without line breaks. This does not happen with docx/rtf documents without line breaks. It also does not happen with plaintext documents that do contain line breaks (if you change input in my repro to “This string has a line break.\n”, the document structure remains exactly as above).
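Until this is addressed, one possible workaround on my side is to trim the trailing break characters from whatever getText() or toString(SaveFormat.TEXT) returns. The sketch below is plain Java with hypothetical names (`TrailingBreaks.strip`), and it assumes the extra characters are always a trailing run of “\r”, “\n” and “\f”, which is all I have observed so far. Note the drawback: it also removes a line break that was really at the end of the input, so it does not give the “exactly the input” behaviour I am asking for.

```java
// Hypothetical workaround (not an Aspose API): drops the trailing run of
// CR, LF and form-feed characters from extracted text.
final class TrailingBreaks {
    private TrailingBreaks() {}

    /** Returns text without any trailing '\r', '\n' or '\f'; inner breaks are kept. */
    static String strip(String text) {
        int end = text.length();
        while (end > 0) {
            char c = text.charAt(end - 1);
            if (c != '\r' && c != '\n' && c != '\f') {
                break;
            }
            end--;
        }
        return text.substring(0, end);
    }
}
```

With this, the results for the txt, rtf and docx versions of my repro should compare equal after stripping, but I would still prefer a fix in the loader itself.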