Finding barcode in PDF

I am having trouble finding a barcode in a PDF. We have the following process.
Workflow Steps
1> We initially create a barcode using DecodeType.Code128 with Aspose.BarCode and put it on a PDF page (our clients use this PDF page as a separator sheet).
2> Our clients then insert this barcode page between several physical documents and scan them all, which creates one big PDF.
3> Our splitting process then loops through all pages, checks whether each page is a barcode page, and splits the big PDF into individual small PDFs.
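The splitting step itself can be sketched independently of any PDF library. The following is only a minimal illustration, assuming barcode detection has already produced one boolean per page; `split_ranges` and its input list are hypothetical names, not part of the attached solution or of any Aspose API.

```python
from typing import List, Tuple


def split_ranges(is_separator: List[bool]) -> List[Tuple[int, int]]:
    """Return inclusive 1-based (first_page, last_page) ranges, one per document.

    is_separator[i] is True when page i + 1 was recognised as a barcode
    separator sheet. Separator pages are dropped from the output.
    """
    ranges: List[Tuple[int, int]] = []
    start = None  # first page of the document currently being collected
    for page_number, separator in enumerate(is_separator, start=1):
        if separator:
            # A separator closes the document in progress, if there is one.
            if start is not None:
                ranges.append((start, page_number - 1))
                start = None
        elif start is None:
            # First non-separator page after a separator starts a new document.
            start = page_number
    if start is not None:
        # The last document runs to the end of the scan.
        ranges.append((start, len(is_separator)))
    return ranges
```

Each returned range could then be handed to whatever page-extraction routine the splitter already uses. Consecutive separator sheets and a missing trailing separator are both tolerated.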
Depending on the client’s scanner settings, the scan quality of the barcode page is sometimes poor, and in such cases Aspose.BarCode is unable to read the barcode.
After going through several posts, I came up with 3 different approaches to find a barcode in a PDF.

1> Using Aspose.Pdf.Document
2> Using Aspose.Pdf.Facades.PdfExtractor
3> Using Aspose.Pdf.Facades.PdfConverter
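One way to combine the three approaches, rather than picking a single one, is to try them in turn and accept the first successful read. The sketch below is library-agnostic and purely illustrative: each "reader" is a hypothetical callable that wraps one of the approaches above and returns the decoded text or `None`; none of these names come from Aspose or from the attached solution.

```python
from typing import Callable, Iterable, Optional

# Hypothetical reader signature: page number -> decoded barcode text, or None.
Reader = Callable[[int], Optional[str]]


def read_with_fallback(page_number: int, readers: Iterable[Reader]) -> Optional[str]:
    """Return the first non-empty barcode text any reader produces, else None."""
    for reader in readers:
        try:
            text = reader(page_number)
        except Exception:
            # Treat a failing reader as "nothing found" and try the next one.
            continue
        if text:
            return text
    return None


def is_separator_page(page_number: int, readers: Iterable[Reader], expected_text: str) -> bool:
    """True when some reader decodes exactly the separator sheet's barcode text."""
    return read_with_fallback(page_number, readers) == expected_text
```

The order of `readers` is a design choice: putting the cheapest approach first keeps the common case fast, while the slower approaches only run on pages the earlier ones could not decode.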

(Please see the attached Visual Studio 2017 solution. You will have to update the license file path in the code and restore the Aspose packages.)

BarcodeFinder.zip (256.5 KB)

Questions
1> Out of these 3 approaches, which is the preferred approach to find a barcode in a PDF? (I would really like to get a 100% detection rate.)
2> In the attached solution I have included 3 test PDFs. test1.pdf and test2.pdf work with all approaches, but test3.pdf only works with PdfConverter. Why are the other 2 approaches not finding the barcode in test3.pdf?
3> In all 3 approaches I read the barcode text and compare it to determine whether the page is our barcode page. Is there any way to check just for the existence of our barcode, so I don’t have to read it?

@Laksh

Thank you for contacting support.

We are investigating the scenario in our environment and will get back to you with our findings soon.

@Laksh

We have investigated the data you shared and noticed that you are using the Aspose.PDF API together with the Aspose.BarCode API to detect a page containing a barcode. However, we also noticed that the page in question contains a single image the size of the whole page. Provided your source PDF files do not contain any full-page images other than barcodes, you can read the properties of the images in a PDF file and pick out the black and white ones with the ImagePlacementAbsorber class exposed by the Aspose.PDF for .NET API. Please see Manipulate Images for reference.

We hope this will be helpful. Please feel free to contact us if you need any further assistance.

Thanks @Farhan.Raza

I tried the suggested ImagePlacementAbsorber, but it may not work. As I mentioned in my workflow, the clients receive hard copies of the documents in the postal mail, and after each document they manually insert a hard copy of the barcode as a separator. The whole batch is then fed to a scanner, which produces a single large PDF.

When the scanner scans a hard copy, it actually takes an image of the page and inserts it into the PDF. (I am assuming that's how all scanners work: they just take an image of the hard copy.)
So eventually every page in the PDF is a single grayscale image, and the ImagePlacementAbsorber approach may not be able to tell barcode pages from document pages.

However, I would still like to know why test3.pdf works only with Aspose.Pdf.Facades.PdfConverter and not with the other two approaches. It's the same barcode at a different quality.
In one of the posts I read that in vector-based PDF documents barcodes are represented not as images but as lines, so PdfConverter must be used in such cases. Is test3.pdf vector-based? (I don't think it is, because it's a scanned hard copy.)

@Laksh

Thank you for sharing your kind feedback.

We have worked with the data you shared and found that the Aspose.BarCode API cannot read the barcode text because the Aspose.PDF API is not extracting the image correctly from the test3.pdf document. A ticket with ID PDFNET-45130 has been logged in our issue management system for further investigation and resolution. The ticket ID has been linked to this thread, so you will receive a notification as soon as the ticket is resolved.

We are sorry for the inconvenience.

@Farhan.Raza Any updates on ticket PDFNET-45130?

@Laksh

Thank you for getting back to us.

We would like to update you that PDFNET-45130 is pending investigation because of previously logged tickets. It will be investigated in its due turn, which can take several months. We appreciate your patience and understanding in this regard.

However, we also offer Paid Support, where issues are investigated with higher priority. Customers with a paid support subscription report their issues there so that they are investigated urgently. If your reported issue is a blocker, please consider subscribing to Paid Support. For further information, please visit Paid Support FAQs.