Product recognition in store shelves as a sub-graph isomorphism problem

Alessio Tonioni and Luigi Di Stefano - ICIAP17 --> Arxiv link

We address the problem of visual shelf monitoring for planogram compliance, i.e. detecting whether items on store shelves comply with a planned layout. We propose a computer vision pipeline that, given the planogram (ideal layout), an image of the observed shelf and one reference image for each product on sale, localizes each product and checks the compliance of the real arrangement with the planned one.

Step 1: Unconstrained product recognition

This step operates only on the model images and the observed shelf image, without leveraging the expected product arrangement, and is thus referred to as Unconstrained. We rely on a classical multi-object, multi-instance object recognition pipeline based on local features, as described in [1]. We experiment with deploying multiple types of features that jointly vote in the same Hough space to pursue higher sensitivity. The result is a set of noisy detections (affected by both missed products and false detections) that we refine in the following steps.
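
As an illustration of the joint voting idea, here is a minimal sketch in Python. It is not the implementation used in the paper: it assumes that feature matching has already been done, ignores rotation, and uses a hypothetical match tuple layout, bin size and vote threshold. Matches coming from different feature types (here labelled BRISK and SURF) are simply concatenated so that they vote in the same accumulator.

```python
from collections import defaultdict


def hough_vote(matches, bin_size=20.0, min_votes=3):
    """Accumulate votes for (product, centre bin) pairs and keep the peaks.

    Each match is a hypothetical tuple:
    (product_id, scene_x, scene_y, scene_scale, model_dx, model_dy, model_scale)
    where (model_dx, model_dy) is the offset from the model keypoint to the
    centre of the model image. Rotation is ignored in this sketch.
    """
    accumulator = defaultdict(int)
    for product, sx, sy, s_scale, dx, dy, m_scale in matches:
        ratio = s_scale / m_scale          # relative scale scene/model
        cx = sx + dx * ratio               # predicted product centre (x)
        cy = sy + dy * ratio               # predicted product centre (y)
        key = (product, int(cx // bin_size), int(cy // bin_size))
        accumulator[key] += 1
    # Keep only bins that gathered enough agreeing votes.
    return {key: votes for key, votes in accumulator.items() if votes >= min_votes}


# Matches from two feature types vote in the same Hough space.
brisk_matches = [
    ("cereal", 100.0, 100.0, 1.0, 10.0, 10.0, 1.0),
    ("cereal", 90.0, 120.0, 2.0, 10.0, -5.0, 1.0),
]
surf_matches = [
    ("cereal", 115.0, 105.0, 1.0, -5.0, 5.0, 1.0),
    ("pasta", 300.0, 40.0, 1.0, 0.0, 0.0, 1.0),   # isolated vote, discarded
]
detections = hough_vote(brisk_matches + surf_matches)
```

In this toy example neither feature type alone reaches the vote threshold, while the pooled votes do, which is the kind of sensitivity gain that joint voting is meant to provide.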


Step 2: Graph-based consistency check

We start leveraging the expected product layout by representing the planogram as a grid-like fully connected graph where each node corresponds to a facing on the shelf and is linked to at most eight neighbours, the closest facings along the cardinal and intercardinal directions. We build two such graphs, one based on the ideal layout (reference planogram) and one based on the noisy detections obtained in the previous step (observed planogram). We then search for a sub-graph isomorphism between the two. If found, it identifies sets of products placed in the same relative positions in both graphs, removes false detections from the observed graph and seamlessly localizes the pictured shelf within the whole aisle modeled in the reference graph.
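
The following Python sketch shows one simple way to look for such a correspondence. It is a greedy seed-and-grow search, not the algorithm from the paper, and it assumes that the observed detections have already been arranged on a regular grid. The helper names (build_graph, grow_match, largest_common_subgraph) and the toy product labels are hypothetical.

```python
from collections import deque

# Offsets (row, column) of the eight neighbours of a facing.
DIRECTIONS = {"N": (-1, 0), "NE": (-1, 1), "E": (0, 1), "SE": (1, 1),
              "S": (1, 0), "SW": (1, -1), "W": (0, -1), "NW": (-1, -1)}


def build_graph(grid):
    """Turn a grid of product ids (None = empty) into (labels, edges)."""
    labels = {(r, c): p for r, row in enumerate(grid)
              for c, p in enumerate(row) if p is not None}
    edges = {
        node: {d: (node[0] + dr, node[1] + dc)
               for d, (dr, dc) in DIRECTIONS.items()
               if (node[0] + dr, node[1] + dc) in labels}
        for node in labels
    }
    return labels, edges


def grow_match(ref, obs, seed_obs, seed_ref):
    """Grow a mapping observed -> reference that preserves labels and directions."""
    ref_labels, ref_edges = ref
    obs_labels, obs_edges = obs
    mapping = {seed_obs: seed_ref}
    queue = deque([seed_obs])
    while queue:
        u = queue.popleft()
        for direction, u2 in obs_edges[u].items():
            v2 = ref_edges[mapping[u]].get(direction)
            if u2 in mapping or v2 is None:
                continue
            if obs_labels[u2] == ref_labels[v2]:   # same product, same direction
                mapping[u2] = v2
                queue.append(u2)
    return mapping


def largest_common_subgraph(ref, obs):
    """Try every pair of equally labelled seeds and keep the largest mapping."""
    best = {}
    for u, label_u in obs[0].items():
        for v, label_v in ref[0].items():
            if label_u == label_v:
                candidate = grow_match(ref, obs, u, v)
                if len(candidate) > len(best):
                    best = candidate
    return best


reference = build_graph([["A", "B", "C", "D"],
                         ["E", "F", "G", "H"]])
observed = build_graph([["B", "C"],
                        ["F", "X"]])          # "X" plays the role of a false detection
mapping = largest_common_subgraph(reference, observed)
false_detections = sorted(set(observed[0]) - set(mapping))
```

Here the three consistent detections are mapped onto the reference one column to the right, which localizes the observed shelf within the reference layout, while the node left out of the mapping is treated as a false detection.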


Step 3: Product verification

We use an iterative procedure in which each iteration tries to fill the cleaned-up observed planogram with one missing object. Starting from the missing product with the most detected neighbors, we estimate its position and size from the already detected products at one-edge distance; this information defines a ROI where the missing product can be searched for by different techniques. If no item is found, our system raises a planogram compliance issue. Otherwise, the new detection is added to the observed planogram and the iteration continues with another missing product.
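
The loop can be sketched in Python as follows. This is only an illustration under several assumptions that are not taken from the paper: boxes are (centre x, centre y, width, height) tuples, the ROI is obtained by averaging the predictions of the detected neighbours, a missing product with no detected neighbour is directly reported as an issue, and search_roi is a hypothetical callback standing in for whatever verification technique is used.

```python
def neighbours(cell, detected):
    """Detected boxes at one-edge distance from cell, with their grid offset."""
    r, c = cell
    return [(dr, dc, detected[(r + dr, c + dc)])
            for dr in (-1, 0, 1) for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0) and (r + dr, c + dc) in detected]


def estimate_roi(cell, detected):
    """Average the position and size predicted by each detected neighbour."""
    nbrs = neighbours(cell, detected)
    n = len(nbrs)
    w = sum(box[2] for _, _, box in nbrs) / n
    h = sum(box[3] for _, _, box in nbrs) / n
    cx = sum(box[0] - dc * box[2] for _, dc, box in nbrs) / n
    cy = sum(box[1] - dr * box[3] for dr, _, box in nbrs) / n
    return (cx, cy, w, h)


def verify_planogram(reference, detected, search_roi):
    """Fill missing facings one at a time; return (detections, issues)."""
    detected = dict(detected)             # do not modify the caller's dict
    issues = []
    missing = [cell for cell in reference if cell not in detected]
    while missing:
        # Process first the missing product with the most detected neighbours.
        missing.sort(key=lambda cell: len(neighbours(cell, detected)), reverse=True)
        cell = missing.pop(0)
        if not neighbours(cell, detected):
            issues.append(cell)           # assumption: no neighbours, no ROI
            continue
        roi = estimate_roi(cell, detected)
        box = search_roi(reference[cell], roi)
        if box is None:
            issues.append(cell)           # planogram compliance issue
        else:
            detected[cell] = box          # new detection helps later iterations
    return detected, issues


reference = {(0, 0): "A", (0, 1): "B", (0, 2): "C"}
detected = {(0, 0): (50, 50, 100, 100), (0, 2): (250, 50, 100, 100)}

# Hypothetical verifiers: one always confirms the product, one never does.
filled, issues_ok = verify_planogram(reference, detected, lambda product, roi: roi)
_, issues_missing = verify_planogram(reference, detected, lambda product, roi: None)
```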


Experimental Results

We manually annotated 70 shelf pictures, featuring 181 different products, with instance-level bounding boxes and reference planograms. The pictures come from the public Grocery Products Dataset [2], which can be downloaded from Marian George's page.
Our custom annotations and reference planograms can be downloaded here: --> Planogram Dataset
Please refer to the README.txt inside the downloaded zip file to understand how to use the annotations. If you have any trouble, feel free to contact me.

We report mean Precision, Recall and F-Measure obtained with different configurations of each step of our pipeline.
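
For reference, the three measures can be computed from the counts of true positives, false positives and false negatives as in the Python sketch below. The criterion used to decide whether a detection is correct is not restated here, and averaging the scores per image is an assumption of this sketch; the counts in the example are made up.

```python
def precision_recall_f(tp, fp, fn):
    """Precision, recall and F-measure from detection counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def mean_scores(per_image_counts):
    """Average the three scores over a list of (tp, fp, fn) tuples, one per image."""
    scores = [precision_recall_f(*counts) for counts in per_image_counts]
    n = len(scores)
    return tuple(sum(score[i] for score in scores) / n for i in range(3))


# Made-up counts for two shelf images: (true pos., false pos., false neg.)
counts = [(9, 1, 1), (8, 0, 2)]
mean_p, mean_r, mean_f = mean_scores(counts)
```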

For the first step, we tried different feature detector/descriptor pairs using either the OpenCV implementation or the original authors' implementation. We also report results using multiple pipelines jointly (e.g. BRISK+SURF).

To test step 2 of our pipeline, we keep the configuration with the best F-measure (BRISK) and that with the highest recall (BRISK+SURF).

Our graph isomorphism formulation massively improves the precision of the system, removing almost all false detections.

We choose BRISK and test different design choices for the final product verification step: template matching, Best Buddies [3] and, again, local invariant features. We report here the best results.

By using local features for both the first and the last step, we can achieve an F-Measure as high as 90%.


We would like to thank Centro Studi s.r.l. for funding this research project.


[1] Lowe, D.G.: Distinctive image features from scale-invariant keypoints. IJCV 60(2), 91–110 (2004)

[2] George, M., Floerkemeier, C.: Recognizing products: A per-exemplar multi-label image classification approach. In: ECCV 2014, pp. 440–455. Springer (2014)

[3] Dekel, T., Oron, S., Rubinstein, M., Avidan, S., Freeman, W.T.: Best-buddies similarity for robust template matching. In: CVPR 2015, pp. 2021–2029. IEEE (2015)