Barcode reader: Page Number

I am using the PdfExtractor class method HasNextImage(), and our process appears to stop when multiple images are detected. We believe we could handle this more gracefully if we could determine the page number within the PDF at which the issue occurred. Some of the PDFs we are scanning suffer from problems with the Xerox unit that scans them (blurry or dirty scans); this is to be expected, as some units are due for repair or replacement. If we can overcome this issue, we would be able to move the process along without our application timing out. Is there any way to return the page number that the PdfExtractor is currently processing?

Also, I would like to know if there is a way to get the collection of images that is created when calling ExtractImage(). Is this available publicly?

Hi George,

Thank you for contacting support. You can extract images from specific pages by setting the StartPage and EndPage properties, which also tell you which page is being processed. Second, iterating with the HasNextImage method lets you retrieve each extracted image in turn. Please go through the following help topics:

Extract Images from a Particular Page of a PDF Facades

Extract Images from a Range of Pages of a PDF Facades

If these articles don’t help, please provide us with the problematic PDF document along with your sample code. We’ll check and guide you accordingly. We look forward to helping you.

I do appreciate your prompt response; however, we are already using these properties (StartPage and EndPage) within our code. Unfortunately, they don’t provide enough information to let us detect errors while scanning for bar codes. We simply want to know whether any method call exists that would tell us which page of the PDF document is being handled within HasNextImage.


Example: a PDF document with 50 pages, where each page has two bar codes that identify the document. Those 50 pages contain 3 different documents; based on the bar codes, we establish bookmarks that are saved after the scan is completed. When the first page of the PDF is scanned and no bar code is detected, the process appears to keep scanning the first page until it finally kills the EXE.

NEED: On each iteration of the HasNextImage loop, is there any method that will tell us which page of the PDF HasNextImage is currently on?

We would like to know which page of the 50-page PDF is being scanned for a bar code image. If the page number doesn’t change, we want to force the reader to move to the next page within the PDF. Is there anything within Aspose.BarCode that could return this page number?

Thanks in advance,

George W

Hi George,


Thank you for updating us. Your query is related to the Aspose.Pdf API, so I’m moving this thread to the Aspose.Pdf forum. One of my colleagues will reply to you soon.

Hi George,


Thanks for your inquiry. You can iterate through the PDF document page by page, as follows, and keep track of the page number for your own logic. Hopefully the following code snippet will help you accomplish the task.

//C# snippet; myDir is assumed to be the folder that holds Input.pdf
using System;
using System.IO;
using Aspose.Pdf.Facades;

//open input PDF
PdfExtractor pdfExtractor = new PdfExtractor();
pdfExtractor.BindPdf(myDir + "Input.pdf");

for (int pgno = 1; pgno <= pdfExtractor.Document.Pages.Count; pgno++)
{
    //set StartPage and EndPage to the page number
    //you want to extract images from
    pdfExtractor.StartPage = pgno;
    pdfExtractor.EndPage = pgno;

    //extract images
    pdfExtractor.ExtractImage();

    //get extracted images; pgno is the page they came from
    while (pdfExtractor.HasNextImage())
    {
        Console.WriteLine("extracting image from page number {0}", pgno);

        //read image into memory stream
        using (MemoryStream memoryStream = new MemoryStream())
        {
            pdfExtractor.GetNextImage(memoryStream);

            //write to disk, if you like, or use it otherwise
            using (FileStream fileStream = new FileStream(myDir + DateTime.Now.Ticks.ToString() + "_page" + pgno + ".jpg", FileMode.Create))
            {
                memoryStream.WriteTo(fileStream);
            }
        }
    }
}

Please feel free to contact us for any further assistance.


Best Regards,