How to read a document without character corruption?

Hi,

The attached document renders as garbled characters when processed with Aspose.PDF (see the picture below).
TestPDF.pdf (3.9 KB)
result comparison.png (4.3 KB)

Is it possible to read it without character corruption, and if so, how?
Its font settings are as follows:

  • MS-Gothic
  • Type1(CID)
  • encoding: 90msp-RKSJ-H
  • actual font: MSゴシック
  • actual font type: TrueType

For your information, the language is Japanese, and the document displays correctly in Adobe Acrobat Reader DC.
The source code is shown below:

string fileName = "";      // path of the PDF file to read
string imageFileName = ""; // path of the output image, without extension

// Load the PDF; the using block disposes the document when done
using (Aspose.Pdf.Document pdfDocument = new Aspose.Pdf.Document(fileName))
{
    // Aspose.PDF page collections are 1-based
    if (pdfDocument.Pages.Count > 0)
    {
        Aspose.Pdf.Page pdfPage = pdfDocument.Pages[1];
        const int imageResolution = 300;

        using (System.IO.MemoryStream imageStream = new System.IO.MemoryStream())
        {
            Aspose.Pdf.Devices.Resolution resolution = new Aspose.Pdf.Devices.Resolution(imageResolution);
            Aspose.Pdf.Devices.PngDevice imgDevice = new Aspose.Pdf.Devices.PngDevice(resolution);
            imgDevice.Process(pdfPage, imageStream); // render the page as PNG into the stream

            imageStream.Position = 0; // rewind before reading the rendered image back
            using (System.Drawing.Image pdfImage = System.Drawing.Image.FromStream(imageStream))
            {
                // the text is already garbled in the image saved here;
                // the format is stated explicitly so the file content matches the .bmp extension
                pdfImage.Save(imageFileName + ".bmp", System.Drawing.Imaging.ImageFormat.Bmp);
            }
        }
    }
}

Thank you for your help.

@yImaizumi

We tested the scenario in our environment with Aspose.PDF for .NET 21.2 and were able to reproduce the issue. We also tried specifying a font name in the rendering options, without success:

var pngDevice = new PngDevice(new Resolution(300))
{
    RenderingOptions = new RenderingOptions
    {
        UseFontHinting = true,
        DefaultFontName = "SimSun"
    }
};

We have therefore logged the issue as PDFNET-49411 in our issue tracking system. We will investigate it further and keep you posted on the status of the fix. Please allow us some time.

We are sorry for the inconvenience.

Thank you for your time and effort.
I look forward to hearing from you about the fix.

@yImaizumi

We will investigate and resolve the ticket on a first-come, first-served basis and let you know once it is resolved.

The issues you have found earlier (filed as PDFNET-49411) have been fixed in Aspose.PDF for .NET 24.1.