Text looks broken after PDF conversion

Hello!

We are using Aspose.PDF 23.9.0 in our project, which runs in Linux containers. Most documents we convert look fine, but some come out with broken text. Numbers render correctly, but letters do not. I’ve tried the latest package, Aspose.PDF.Drawing 24.5.0, without any improvement. I have also installed the Windows fonts in the Linux environment.

When running this solution locally on a Windows computer, I have no issues. But when I run the solution on WSL with Ubuntu, I’m able to reproduce the text error. The text looks broken.

The converted document with broken text looks like this:
image.png (2.5 KB)

But it is supposed to look like this:
image.png (4.7 KB)

The PDF file’s font data is as follows:

11 0 obj
<<
/Type /Font
/Subtype /TrueType
/BaseFont /Arial
/FontDescriptor 13 0 R
/Encoding 12 0 R
/FirstChar 32
/LastChar 65
/Widths [ 278 278 355 556 556 889 667 191 333 333 389 584 278 333 278 278 556 556 556 556 556 556 556 556 556 556 278 278 584 584 584 556 999 350 ]
/Name /F0
>>
endobj

12 0 obj
<<
/Type /Encoding
/BaseEncoding /WinAnsiEncoding
/Differences [64 /ellipsis/bullet]
>>
endobj

13 0 obj
<<
/Type /FontDescriptor
/FontName /Arial
/Flags 32
/Ascent 917
/CapHeight 0
/Descent 229
/FontBBox [0 0 0 0]
/ItalicAngle 0
/StemV 0
>>
endobj

14 0 obj
<< >>
endobj

15 0 obj
<<
/Type /Font
/Subtype /TrueType
/BaseFont /Microsoft#20Sans#20Serif,Bold
/FontDescriptor 17 0 R
/Encoding 16 0 R
/FirstChar 32
/LastChar 97
/Widths [ 265 298 375 576 576 909 687 211 353 353 409 604 298 353 298 298 576 576 576 576 576 576 576 576 576 576 298 298 604 604 604 576 687 742 687 631 298 687 742 797 687 687 631 576 576 576 576 298 576 248 520 248 852 576 576 576 353 520 298 576 576 520 687 576 585 370 ]
/Name /F1
>>
endobj

16 0 obj
<<
/Type /Encoding
/BaseEncoding /WinAnsiEncoding
/Differences [64 /A/D/E/F/I/K/N/O/P/S/T/a/b/d/e/f/h/i/k/l/m/n/o/p/r/s/t/u/u/v/Aring/aring/ellipsis/bullet]
>>
endobj

17 0 obj
<<
/Type /FontDescriptor
/FontName /Microsoft#20Sans#20Serif,Bold
/Flags 32
/Ascent 903
/CapHeight 0
/Descent 225
/FontBBox [0 0 0 0]
/ItalicAngle 0
/StemV 0
>>
endobj

18 0 obj
<< >>
endobj

19 0 obj
<<
/Type /Font
/Subtype /TrueType
/BaseFont /Microsoft#20Sans#20Serif
/FontDescriptor 21 0 R
/Encoding 20 0 R
/FirstChar 32
/LastChar 113
/Widths [ 265 278 355 556 556 889 667 191 333 333 389 584 278 333 278 278 556 556 556 556 556 556 556 556 556 556 278 278 584 584 584 556 1015 667 667 722 722 667 611 778 722 278 500 667 556 833 722 778 667 722 667 611 722 667 667 556 556 556 556 278 278 556 556 228 500 228 833 556 556 556 333 500 278 556 500 265 667 778 556 556 565 350 ]
/Name /F2
>>
endobj

20 0 obj
<<
/Type /Encoding
/BaseEncoding /WinAnsiEncoding
/Differences [64 /at/A/B/C/D/E/F/G/H/I/J/K/L/M/N/O/P/R/S/T/U/V/Y/a/b/d/e/f/f/g/h/i/k/l/m/n/o/p/r/s/t/u/v/nbspace/Aring/Oslash/aring/oslash/ellipsis/bullet]
>>
endobj

21 0 obj
<<
/Type /FontDescriptor
/FontName /Microsoft#20Sans#20Serif
/Flags 32
/Ascent 903
/CapHeight 0
/Descent 225
/FontBBox [0 0 0 0]
/ItalicAngle 0
/StemV 0
>>
endobj
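
One thing that may be relevant: the /Differences arrays above remap character codes 64 and up to different glyphs, while the digits (codes 48 to 57) keep their standard WinAnsi meaning. That would be consistent with numbers rendering correctly and letters not, although it has not been confirmed as the cause. Below is a minimal Python sketch of how the mapping for /F1 resolves. The helpers `parse_differences` and `decode` and the small `GLYPH_TO_CHAR` table are illustrative names written for this post, not part of Aspose.PDF, and Python's `cp1252` codec is used only as an approximation of WinAnsiEncoding.

```python
import re

# Body of the /Differences array from object 16 (font /F1) above.
F1_DIFFERENCES = (
    "64 /A/D/E/F/I/K/N/O/P/S/T/a/b/d/e/f/h/i/k/l/m/n/o/p/r/s/t/u/u/v"
    "/Aring/aring/ellipsis/bullet"
)

# Hand-written subset of glyph names (assumption: enough for these fonts only).
# Single-letter glyph names are treated as the letter itself.
GLYPH_TO_CHAR = {
    "at": "@",
    "Aring": "Å",
    "aring": "å",
    "Oslash": "Ø",
    "oslash": "ø",
    "ellipsis": "…",
    "bullet": "•",
    "nbspace": "\u00a0",
}


def parse_differences(body: str) -> dict[int, str]:
    """Map character codes to glyph names from the body of a /Differences array."""
    mapping: dict[int, str] = {}
    code = None
    # Each token is either an integer (new starting code) or a /GlyphName.
    for number, name in re.findall(r"(\d+)|/([^\s/\[\]]+)", body):
        if number:
            code = int(number)
        elif code is not None:
            mapping[code] = name
            code += 1  # consecutive names take consecutive codes
    return mapping


def decode(codes: list[int], differences: dict[int, str]) -> str:
    """Decode character codes, applying /Differences on top of the base encoding."""
    out = []
    for code in codes:
        name = differences.get(code)
        if name is None:
            # Code not overridden: fall back to the base encoding (approximated by cp1252).
            out.append(bytes([code]).decode("cp1252", errors="replace"))
        else:
            out.append(GLYPH_TO_CHAR.get(name, name if len(name) == 1 else "?"))
    return "".join(out)


f1 = parse_differences(F1_DIFFERENCES)
print(decode([73, 75, 84, 78], f1))               # with /Differences applied
print(bytes([73, 75, 84, 78]).decode("cp1252"))  # base encoding only
```

With this mapping, the codes 73 75 84 78 decode to "Same", whereas reading them as plain WinAnsi gives "IKTN". Digits such as 49 50 51 come out as "123" either way, because the array does not touch them.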

@simensnc

Would you please try using the latest version of Aspose.Pdf.Drawing instead of Aspose.PDF for .NET? Also, please share your sample PDF (input/output) along with your sample Dockerfile if the issue persists with the latest version as well. We will proceed accordingly.

I have tried the latest Aspose.PDF.Drawing version (24.5.0), with no luck.

PDF before conversion:
Note that I’ve manually removed some information from the document, which is why some text is unaffected in the output after conversion.
invoice-pre-conversion.pdf (98.6 KB)

PDF after conversion:
invoice-post-conversion.pdf (97.9 KB)

I’m not able to share the Dockerfile for the solution, and locally I’m not using a Dockerfile at all: I run the solution directly in WSL from Visual Studio. I’ve installed the Microsoft fonts, so that is not the issue.
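
To double-check that the fonts are actually visible inside WSL or the container, one option is to compare the families named in the PDF (Arial and Microsoft Sans Serif) against what fontconfig reports. Below is a minimal Python sketch, assuming fontconfig's `fc-list` is installed. `missing_families` and `fc_list_families` are hypothetical helper names, and this only shows what fontconfig sees, which may differ from the folders Aspose.PDF actually scans.

```python
import subprocess

# Font families referenced by the /BaseFont entries in the PDF.
REQUIRED_FAMILIES = ["Arial", "Microsoft Sans Serif"]


def missing_families(fc_list_output: str, required: list[str]) -> list[str]:
    """Return the required families that do not appear in `fc-list : family` output."""
    installed = set()
    for line in fc_list_output.splitlines():
        # One line may carry several comma-separated aliases for the same font.
        for family in line.split(","):
            installed.add(family.strip().lower())
    return [name for name in required if name.lower() not in installed]


def fc_list_families() -> str:
    """Run `fc-list : family`; raises FileNotFoundError if fontconfig is absent."""
    result = subprocess.run(
        ["fc-list", ":", "family"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Usage on the Linux side (not executed here):
#   print(missing_families(fc_list_families(), REQUIRED_FAMILIES))
```

An empty list means fontconfig can see both families. Any name it returns is a font the Linux environment does not expose, at least through fontconfig.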

@simensnc

We have opened the following new ticket(s) in our internal issue tracking system and will deliver their fixes according to the terms mentioned in Free Support Policies.

Issue ID(s): PDFNET-57414

You can obtain Paid Support Services if you need support on a priority basis, along with the direct access to our Paid Support management team.