Please find the attached PDF, which has image text inside the red box at the top of the page. I need to read the text in the red box, but that part of the PDF is an image.
Is this possible with the Aspose DLL? If so, please send me the trial DLL and sample code. If it works, we will purchase your product.
To convert an image to a searchable PDF document, you can use the free Google Tesseract OCR engine. First convert your image to a PDF, then convert that PDF into a searchable PDF document as described below. Please install Google Tesseract OCR on your computer from http://code.google.com/p/tesseract-ocr/downloads/list; the installation provides the tesseract.exe console application. Below you can see a usage example.
Moreover, I am sorry to inform you that the callback method used to convert a PDF to a searchable PDF document is malfunctioning in the current Aspose.Pdf version. However, the issue has been resolved in the upcoming release, 10.9.0, which will be published at the start of October 2015. We have linked your post to the issue ID (PDFNEWNET-38495), and you will be notified as soon as the release is published.
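[C#]
using System.Diagnostics;
using System.IO;
using Aspose.Pdf;

// Both methods below belong inside a class of your application.

// Callback passed to Document.Convert: saves the image it receives,
// runs Tesseract on it, and returns the hOCR markup.
private string CallBackGetHocr(System.Drawing.Image img)
{
    string dir = @"E:\Data\";
    string imagePath = Path.Combine(dir, "ocrtest.jpg");
    string outputBase = Path.Combine(dir, "out");
    img.Save(imagePath);

    ProcessStartInfo info = new ProcessStartInfo(@"C:\Program Files (x86)\Tesseract-OCR\tesseract.exe");
    info.WindowStyle = ProcessWindowStyle.Hidden;
    // Quote the paths so they still work if they contain spaces; "hocr" selects hOCR output.
    info.Arguments = "\"" + imagePath + "\" \"" + outputBase + "\" hocr";

    using (Process p = new Process())
    {
        p.StartInfo = info;
        p.Start();
        p.WaitForExit();
    }

    // This sample expects Tesseract to write out.html; other Tesseract versions
    // may use a different extension (for example out.hocr), so adjust if needed.
    return File.ReadAllText(outputBase + ".html");
}

public void Main()
{
    Aspose.Pdf.License license = new Aspose.Pdf.License();
    license.SetLicense("E:/Data/AsposeLicense/asposetotal/Aspose.Total.lic");

    // Step 1: place the image on a page of a new PDF document.
    Document doc = new Document();
    Page page = doc.Pages.Add();
    Aspose.Pdf.Image image = new Aspose.Pdf.Image();
    image.File = "E:/Data/invoice13.jpg";
    page.Paragraphs.Add(image);

    // Step 2: save to memory and reload the document as a PDF.
    MemoryStream ms = new MemoryStream();
    doc.Save(ms);
    doc = new Document(ms);

    // Step 3: run OCR through the callback and save the searchable PDF.
    doc.Convert(CallBackGetHocr);
    doc.Save("E:/Data/invoice13.jpg_output.pdf");
}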
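If you only need the recognized text itself (as in the original question about the red box), the hOCR markup that Tesseract produces can also be reduced to plain words. The following is a minimal sketch and not part of the Aspose.Pdf API: the class name HocrText and the method ExtractWords are hypothetical, and it assumes each word appears in a span with class "ocrx_word", which is how Tesseract usually writes hOCR.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper (not an Aspose or Tesseract API): pulls the recognized
// words out of hOCR markup and joins them with single spaces.
public static class HocrText
{
    // Assumes word spans look like:
    // <span class='ocrx_word' id='word_1_1' title='bbox 10 10 50 30'>Invoice</span>
    private static readonly Regex WordSpan = new Regex(
        @"<span\s+class=['""]ocrx_word['""][^>]*>(.*?)</span>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Removes nested markup such as <strong> or <em> inside a word span.
    private static readonly Regex InnerTag = new Regex(@"<[^>]+>");

    public static string ExtractWords(string hocr)
    {
        var words = new List<string>();
        foreach (Match m in WordSpan.Matches(hocr))
        {
            string word = InnerTag.Replace(m.Groups[1].Value, string.Empty);
            word = WebUtility.HtmlDecode(word).Trim(); // turn &amp; etc. back into characters
            if (word.Length > 0)
            {
                words.Add(word);
            }
        }
        return string.Join(" ", words);
    }
}
```

For example, you could pass the string returned by CallBackGetHocr to HocrText.ExtractWords to get the recognized words. Note that this sketch returns every word on the page; it does not limit the result to a particular region such as the red box.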
Please feel free to contact us for any further assistance.