Hi,
The attached document is rendered with garbled characters when processed with Aspose.PDF (see the picture below).
TestPDF.pdf (3.9 KB)
result comparison.png (4.3 KB)
Is it possible to read it without character corruption, and if so, how?
Its font settings are as follows:
- MS-Gothic
- Type1(CID)
- encoding: 90msp-RKSJ-H
- actual font: MSゴシック
- actual font type: TrueType
For your information, the language is Japanese, and the document displays correctly in Adobe Acrobat Reader DC.
The source code is shown below:
string fileName = ""; // PDF file to read
string imageFileName = ""; // output the PDF as an image

// if the file is a PDF
Aspose.Pdf.Document pdfDocument = null;
pdfDocument = new Aspose.Pdf.Document(fileName);
if (pdfDocument != null)
{
    Aspose.Pdf.Page pdfPage = null;
    if (pdfDocument.Pages.Count > 0) pdfPage = pdfDocument.Pages[1];
    if (pdfPage != null)
    {
        const int imageResolution = 300;
        System.IO.MemoryStream imageStream = new System.IO.MemoryStream();
        Aspose.Pdf.Devices.Resolution resolution = new Aspose.Pdf.Devices.Resolution(imageResolution);
        Aspose.Pdf.Devices.PngDevice imgDevice = new Aspose.Pdf.Devices.PngDevice(resolution);
        imgDevice.Process(pdfPage, imageStream);
        System.Drawing.Image pdfImage = System.Drawing.Image.FromStream(imageStream);
        pdfImage.Save(System.IO.Path.Combine(imageFileName + ".bmp")); // here the file gets garbled
    }
    pdfDocument.Dispose();
}
Thank you for your help.