In Data in Brief
To assess how reliably current neural network architectures can identify packaged products in a retail environment, we created an open-source dataset of 295 shelf images of vending machines with 10,035 labelled instances of 109 products. The photos show vending machines operated by Selecta, the largest European operator of vending machines, located in both public spaces and private offices. The machines contain food as well as beverage products. Each product instance is labelled with a bounding box that encloses the entire product with as little overlap as possible. The label attached to each bounding box is structured and human-readable: it comprises the brand, product name and size, as well as the GTIN of the product. The GTIN is the global standard for identifying products in retail, which increases the dataset's value for the retail industry. In contrast to typical object detection datasets, which assign higher-level labels such as "can" or "bottle" to a much wider variety of objects, this dataset uses a far more detailed label that depends less on the shape of the product than on its exact design. The dataset belongs to the category of object detection datasets with a large number of objects, which, next to the GTIN label, is its main differentiator from other object detection datasets.
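The label structure described above can be sketched in code. This is a minimal illustration only, not the dataset's actual file format: the class name `ProductLabel`, its field names, the bounding-box convention (x_min, y_min, x_max, y_max in pixels) and all example values are assumptions. The check-digit test follows the standard GS1 modulo-10 rule for GTIN-8, GTIN-12, GTIN-13 and GTIN-14.

```python
from dataclasses import dataclass
from typing import Tuple


def gtin_is_valid(gtin: str) -> bool:
    """Check a GTIN with the GS1 modulo-10 check-digit rule."""
    if not (gtin.isascii() and gtin.isdigit()) or len(gtin) not in (8, 12, 13, 14):
        return False
    digits = [int(c) for c in gtin]
    payload, check = digits[:-1], digits[-1]
    # Starting next to the check digit and moving left,
    # the weights alternate 3, 1, 3, 1, ...
    total = sum(d * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(reversed(payload)))
    return (10 - total % 10) % 10 == check


@dataclass(frozen=True)
class ProductLabel:
    """Hypothetical record for one labelled product instance."""
    brand: str
    product_name: str
    size: str
    gtin: str
    # Assumed convention: (x_min, y_min, x_max, y_max) in pixels.
    bbox: Tuple[float, float, float, float]


# Illustrative values only; not taken from the dataset.
example = ProductLabel(
    brand="ExampleBrand",
    product_name="Example Cola",
    size="500 ml",
    gtin="4006381333931",  # a syntactically valid GTIN-13
    bbox=(120.0, 80.0, 210.0, 330.0),
)
print(gtin_is_valid(example.gtin))
```

Validating the check digit when loading annotations is an inexpensive way to catch transcription errors in the GTIN field before training.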
Fuchs K, Grundmann T, Haldimann M, Fleisch E
Computer vision, Deep learning, GTIN, Object detection, Packaged products