Huge performance penalty when scanning multiple regions of the same image

matthias.heubi · December 12, 2019, 3:19am

Our use case is interactive and thus very performance-sensitive, as we have very long scans: double-sided, 8000+ pixels in length each.

To improve performance, we logically divide the image into slices and run heuristics on those slices to estimate how likely each slice is to contain a barcode, which only takes a few milliseconds. We then point the Aspose library, using the area parameter, at the most promising slices first (in descending order of probability) and stop as soon as we’ve found what we’re looking for. Testing shows that our heuristics generally manage to “bubble up” the barcode to the first 1-2 attempted slices. This greatly speeds up the entire process, as we get results from the Aspose library within ~100ms (barcode found on first slice) rather than 1.4s (running the reader on the full image without an area constraint).
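A minimal sketch of the slicing and ordering step described above, assuming full-width horizontal slices of a fixed height. The `score` delegate is a hypothetical stand-in for the heuristic, which is not shown in this thread; `SlicePlanner` and its method names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Drawing;
using System.Linq;

public static class SlicePlanner
{
    // Split an image of the given size into full-width horizontal bands.
    // The last band is shortened so it never extends past the image.
    public static List<Rectangle> Slice(int width, int height, int sliceHeight)
    {
        if (sliceHeight <= 0) throw new ArgumentOutOfRangeException(nameof(sliceHeight));
        var slices = new List<Rectangle>();
        for (int y = 0; y < height; y += sliceHeight)
        {
            slices.Add(new Rectangle(0, y, width, Math.Min(sliceHeight, height - y)));
        }
        return slices;
    }

    // Order slices so the most promising ones are tried first.
    // 'score' is a hypothetical heuristic: higher means "more likely a barcode".
    public static List<Rectangle> OrderByLikelihood(
        IEnumerable<Rectangle> slices, Func<Rectangle, double> score)
    {
        return slices.OrderByDescending(score).ToList();
    }
}
```

The ordered list can then be walked front to back, stopping at the first slice that yields a result.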

HOWEVER: There seems to be a substantial setup cost somewhere inside the library, which is incurred each time we instantiate a reader. So when an image does not contain a readable barcode at all, reading time explodes from 1.4s (running reader on full image) to 15s (trying all slices individually).

It seems to me that the v9.11.0 API is missing an Area property or a SetArea(Rectangle rect) method, or alternatively an area argument to the ReadBarCodes() method, so that an already instantiated reader can be refocused on a different area of an image.

(Pseudo)Code:

// analyze image and identify regions likely containing a barcode
Rectangle[] sliceRects = new ImageSlicer(image).GetProbabilisticSliceList();

// instantiate reader upfront, incurring setup cost only once
BarCodeReader reader = new BarCodeReader(image, barcodeType);

// try slices in descending order of probability
foreach (var rect in sliceRects) {
    reader.SetArea(rect);
    BarCodeResult[] result = reader.ReadBarCodes();
    // analyze result and break from loop early if satisfied
}

Any thoughts?

(I’m happy to provide a stripped-down example application in C# that demonstrates the performance gap as it exists today)

ahsaniqbalsidiqui · December 12, 2019, 8:41am

@matthias.heubi,
Could you please provide the sample application and sample data? We will reproduce the problem and provide our feedback after analysis.

matthias.heubi · December 12, 2019, 2:29pm

You can download the example here:
<sorry, had to delete the link here, because I just realized the project contained the OEM license file.
Will post a new link as a follow-up message once I have created a new download excluding the file>

It processes three different images. Each image is processed once as a single region (“full”) and once in incremental slices (“sliced”).

To keep the example simple, it does not analyze and sort the slice regions; it just slices each image linearly from top to bottom. The three images contain the barcode at different Y coordinates (or not at all), which simulates a barcode being bubbled up to the first slice, to a later slice, or having to go through all the slices because no barcode exists on the scan.

It also logs to the console where you can see the setup cost:
Trying slice at: 7328
setup: 35ms
read: 7ms

Trying slice at: 7360
setup: 39ms
read: 13ms

Trying slice at: 7392
setup: 39ms
read: 4ms

Trying slice at: 7424
setup: 43ms
read: 2ms

setup is the time to instantiate the reader, read is the actual call to ReadBarCodes(). Mind you that the setup is redundant for all successive slices, because the same image is reused again and again, but the current API does not provide a way to specify different regions for repeated invocations of ReadBarCodes().
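For reference, a minimal sketch of how the two phases can be timed separately with `System.Diagnostics.Stopwatch`. The `setup` and `read` delegates are placeholders for "construct a reader for this slice" and "call ReadBarCodes() on it"; `SliceTimer` is a hypothetical helper and is not taken from the example project.

```csharp
using System;
using System.Diagnostics;

public static class SliceTimer
{
    // Time the two phases of one slice attempt separately.
    // 'setup' stands in for constructing a reader for a region;
    // 'read' stands in for the recognition call on that reader.
    // Both delegates are placeholders, not Aspose.BarCode API.
    public static (TimeSpan Setup, TimeSpan Read, TResult Result) Measure<TReader, TResult>(
        Func<TReader> setup, Func<TReader, TResult> read)
    {
        var sw = Stopwatch.StartNew();
        TReader reader = setup();
        TimeSpan setupTime = sw.Elapsed;

        sw.Restart();
        TResult result = read(reader);
        TimeSpan readTime = sw.Elapsed;

        return (setupTime, readTime, result);
    }
}
```

Logging `Setup.TotalMilliseconds` and `Read.TotalMilliseconds` for each slice gives output in the same shape as the console log above.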

Thanks for looking into this.

On a side note, your algorithm really kicks ass on low-res grayscale/color scans. None of the competing libraries even reach the same ballpark of recognition accuracy. I suspect you’re the only guys who actually take grayscale intensity into account rather than just doing thresholding as the first processing step!

matthias.heubi · December 12, 2019, 2:59pm

Here’s the new download link with the license file removed:
https://drive.google.com/file/d/14tIcrK5YCs22syHfcdF86WJ9HR-Eb6ZY/view?usp=sharing

Make sure to add a valid Aspose.BarCode.lic file to the project, otherwise it won’t run correctly.

ahsaniqbalsidiqui · December 12, 2019, 6:00pm

@matthias.heubi,
We were able to observe the issue, but we need to look into it further. We have logged it in our database for investigation and a fix. Once we have news, we will update you in this topic.

This issue has been logged as

BARCODENET-37343 – Huge performance penalty when scanning multiple regions of the same image

alexander.gavriluk · December 16, 2019, 10:49pm

First, thank you for your question, which allows us to improve the performance of some parts of our project.

Regarding your problem:

As you can see, setting the image takes ~39 ms, and with missing.jpg you have 233 setups. That adds up to ~10 sec for setting the image alone, without any recognition.

The main problem is that the whole image is copied before region extraction or recognition. We can improve this code, but that only cuts the setup from 39 to 18 ms, which is still a long time.

You can manually copy parts of the image (slices) and recognize them, as in Issue37343.ReadSlicedBarcode. This avoids the whole-image setup problem, but it defeats the barcode area detector algorithm, which improves performance.
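The figures quoted above can be checked with a short calculation. This is only worked arithmetic on the numbers from this post (233 setups, ~39 ms each today, ~18 ms after the possible optimization); `SetupCost` is an illustrative name.

```csharp
using System;

public static class SetupCost
{
    // Total time spent on reader setup alone, in milliseconds.
    public static int TotalSetupMs(int sliceCount, int setupMsPerSlice)
    {
        return sliceCount * setupMsPerSlice;
    }
}

// Figures quoted in the post above:
// SetupCost.TotalSetupMs(233, 39) == 9087  (about 9 s, the "~10 sec" mentioned)
// SetupCost.TotalSetupMs(233, 18) == 4194  (about 4 s even with the faster setup)
```

Even at 18 ms per setup, the sliced path would spend several seconds on setup for an image with no barcode, well above the 1.4s full-image read reported earlier in the thread.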
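A rough sketch of what copying one slice by hand can look like, assuming the scan is already available as a row-major 8-bit grayscale buffer. That layout is an assumption made for illustration; the attached Issue37343.ReadSlicedBarcode may work differently. `SliceCopy.CopyRows` is a hypothetical helper, and wrapping the result in an image and passing it to the reader is not shown.

```csharp
using System;

public static class SliceCopy
{
    // Copy rows [top, top + sliceHeight) out of a row-major 8-bit grayscale
    // buffer. 'stride' is the number of bytes per row.
    public static byte[] CopyRows(byte[] pixels, int stride, int top, int sliceHeight)
    {
        if (stride <= 0 || pixels.Length % stride != 0)
            throw new ArgumentException("Buffer length must be a multiple of stride.");
        int totalRows = pixels.Length / stride;
        if (top < 0 || sliceHeight < 0 || top + sliceHeight > totalRows)
            throw new ArgumentOutOfRangeException(nameof(top));

        // Only the rows of the slice are copied, so the cost scales with the
        // slice size rather than with the size of the whole scan.
        var slice = new byte[sliceHeight * stride];
        Buffer.BlockCopy(pixels, top * stride, slice, 0, slice.Length);
        return slice;
    }
}
```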

However, we have barcode area detectors which scan the whole image and find regions with possible barcodes. In most cases this works well. So you could set QualitySettings.NormalQuality or QualitySettings.HighQualityDetection (which detects special barcodes that are noisy or have a low height), and in this mode (Issue37343.ReadBarcodeFull) recognition of the whole image takes ~500 ms.

QualitySettings.HighQuality is required only for special barcodes; in most cases QualitySettings.NormalQuality gives satisfactory quality and recognizes 99% of barcodes.

In 2020 we will start improving barcode recognition performance and will add more options to QualitySettings for skipping unnecessary recognition cases, which could improve performance further.

Issue37343.zip (1.8 KB)

ahsaniqbalsidiqui · August 10, 2020, 2:20pm

@matthias.heubi,
We are glad to share that this issue is resolved; the fix will be available in our next regular release, 20.8, in about two weeks. We improved image loading performance, but reading the whole image may still be faster, because our barcode reading algorithm uses the barcode area precision algorithm and skips unusable regions. You will be notified here once the new version is released and available online for download.

aspose.notifier · August 31, 2020, 4:21pm

The issues you have found earlier (filed as BARCODENET-37343) have been fixed in Aspose.BarCode for .NET 20.8. This message was posted using Bugs notification tool by Amjad_Sahi