Hi Robert,
Thanks for the quick response! I will send a document directly to you.
Hi Robert,
Thanks for your help on this. We would need all of the text content. When we generate an image from the PDF, the contents look blurry, even with the resolution set to 300.
Hi Robert,
We investigated the issue you raised as follows:
Exercise #1:
1. Save one page of the PDF manually using Adobe.
2. Perform OCR
Exercise #2:
1. Read PDF document page by page.
2. Convert each page into an image, without any special setting.
3. Perform OCR on each image.
Exercise #3:
1. Read PDF document page by page.
2. Convert each page into an image, with special settings such as:
var resolution = new Aspose.Pdf.Devices.Resolution(300);
var jpegDevice = new JpegDevice(Convert.ToInt32(pdfDocument.Pages[pageCount].PageInfo.Width),
3. Perform OCR on each image.
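For readers following along, the page-by-page flow described in Exercise #3 might look roughly like the sketch below. It is only an illustration, not the code used in the investigation: it assumes the Aspose.Pdf and legacy Aspose.OCR (OcrEngine) assemblies are referenced, reuses the C:\Files\test1.pdf path that appears elsewhere in this thread, and the page_N.jpg output names and the PdfPageOcrSketch class name are hypothetical. Depending on the Aspose.OCR version, the engine may need additional configuration before Process() succeeds.

```csharp
using System;
using System.IO;
using System.Text;
using Aspose.Pdf.Devices;
using Aspose.OCR;

class PdfPageOcrSketch
{
    static void Main()
    {
        // Input path taken from the thread; adjust as needed.
        var pdfDocument = new Aspose.Pdf.Document(@"C:\Files\test1.pdf");

        // 300 DPI and JPEG quality 100, as mentioned in the thread.
        var resolution = new Aspose.Pdf.Devices.Resolution(300);
        var jpegDevice = new JpegDevice(resolution, 100);

        var sb = new StringBuilder();

        // Aspose.Pdf page collections are 1-based.
        for (int pageCount = 1; pageCount <= pdfDocument.Pages.Count; pageCount++)
        {
            // Hypothetical output name for the rendered page image.
            string imagePath = Path.Combine(@"C:\Files", "page_" + pageCount + ".jpg");

            // Render the current page to a JPEG file.
            using (var output = new FileStream(imagePath, FileMode.Create))
            {
                jpegDevice.Process(pdfDocument.Pages[pageCount], output);
            }

            // Run OCR on the rendered image and collect the recognized text.
            var ocrEngine = new OcrEngine();
            ocrEngine.Image = ImageStream.FromFile(imagePath);
            if (ocrEngine.Process())
            {
                sb.Append(ocrEngine.Text);
                sb.Append(Environment.NewLine);
            }
        }

        Console.WriteLine(sb.ToString());
    }
}
```

Swapping `new JpegDevice(resolution, 100)` for the parameterless `new JpegDevice()` would presumably correspond to the "without any special setting" variant in Exercise #2, though the thread does not confirm which constructor was used there.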
We observed that Exercise #2 produced the most accurate results, although they
still fall short of what is expected. Please note that the issue has been
logged in our issue tracking system with ID OCR-34045.
We will update you accordingly. We truly appreciate your support and
understanding.
Can you elaborate on Exercise #2 a bit? What do you mean by "without any special setting"?
var jpegDevice = new JpegDevice(resolution, 100);
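// Load the PDF and take the first embedded image on page 1 (Aspose.Pdf collections are 1-based).
Document pdfDocument = new Document(@"C:\Files\test1.pdf");
XImage xImage = pdfDocument.Pages[1].Resources.Images[1];
// Save the extracted image to disk as a JPEG.
FileStream outputImage = new FileStream(@"C:\Files\output.jpg", FileMode.Create);
xImage.Save(outputImage, ImageFormat.Jpeg);
outputImage.Close();
// Run OCR on the saved image and collect the recognized text.
// sb is assumed to be a StringBuilder declared earlier in the calling code.
OcrEngine ocrEngine = new OcrEngine();
ocrEngine.Image = ImageStream.FromFile(@"C:\Files\output.jpg");
if (ocrEngine.Process())
{
    sb.Append(ocrEngine.Text);
    sb.Append(Environment.NewLine);
}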
The issues you have found earlier (filed as ) have been fixed in this Aspose.Words for JasperReports 18.3 update.