AI-Powered Grocery Data Validation

A prominent online grocery delivery service streamlined its data validation and classification processes by implementing an AI-powered pipeline solution from Everforth Apex.

SITUATION
Our client, a leading online grocery delivery service, struggled to manually audit and validate thousands of grocery product entries. Auditors had to cross-reference product images, web pages, and internal taxonomy documents to ensure consistent product descriptions, categorization, and data quality. The process was labor-intensive and error-prone, and it could not scale to millions of Stock Keeping Units (SKUs) and frequent catalog updates across more than 100 retail partners.

92% Accuracy In Automated Classification Validation Compared To Human Baseline

SOLUTION
Everforth Apex implemented an AI-powered pipeline to automate product data validation and classification auditing using deep learning, computer vision, natural language processing (NLP), and Google Cloud Platform (GCP)-native AI tools. The solution included:

Image-to-Text Extraction: Optical character recognition (OCR) models extracted brand names, net weights, and product names from packaging images.
Visual Object Detection: YOLOv8 (You Only Look Once) models detected branded logos, product types, and package features.
Textual NLP Classification: Hugging Face models, fine-tuned on grocery taxonomy datasets, classified products based on textual descriptions.
Semantic Comparison Engine: Google’s Gemma large language model (LLM) compared OCR and YOLO outputs against expected taxonomy definitions to identify discrepancies.
Audit Report Generation: Automated reports highlighted mismatches between image-derived data, web text content, and internal taxonomy expectations.
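
To make the audit step more concrete, here is a minimal Python sketch of how mismatches between image-derived data, web text, and taxonomy expectations might be collected into a report. It is an illustration only, not the pipeline described above: the `ProductRecord` structure, the field names, and the `audit` function are hypothetical, and a simple normalized string comparison stands in for the LLM-based semantic comparison.

```python
from dataclasses import dataclass


@dataclass
class ProductRecord:
    """Hypothetical container for one catalog entry (not the client's schema)."""
    sku: str
    image_fields: dict[str, str]     # values assumed to come from OCR / detection
    web_fields: dict[str, str]       # values assumed to come from the product page
    taxonomy_fields: dict[str, str]  # values expected by the internal taxonomy


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(value.lower().split())


def audit(record: ProductRecord) -> list[dict[str, str]]:
    """Return one report row per field whose observed value differs from the taxonomy.

    Assumption: exact match after normalization. A production system would
    likely need a semantic comparison rather than string equality.
    """
    mismatches = []
    sources = {"image": record.image_fields, "web": record.web_fields}
    for field_name, expected in record.taxonomy_fields.items():
        for source_name, fields in sources.items():
            observed = fields.get(field_name)
            if observed is None:
                continue  # this source has no value for the field; nothing to compare
            if normalize(observed) != normalize(expected):
                mismatches.append({
                    "sku": record.sku,
                    "field": field_name,
                    "source": source_name,
                    "observed": observed,
                    "expected": expected,
                })
    return mismatches


# Illustrative record with made-up values
record = ProductRecord(
    sku="SKU-001",
    image_fields={"brand": "ACME ", "net_weight": "450 g"},
    web_fields={"brand": "acme"},
    taxonomy_fields={"brand": "Acme", "net_weight": "500 g"},
)
for row in audit(record):
    print(row)
```

In this sketch the brand values agree once case and whitespace are normalized, so only the net weight read from the image is flagged for review.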

RESULTS
The implementation of this AI-powered solution yielded significant improvements:

80% reduction in manual auditing time per product entry.
92% accuracy in automated classification validation, measured against the human baseline.
35% reduction in data errors caused by image-text mismatches during the first rollout.
Continuous improvement through human-in-the-loop retraining of models.
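
The case study does not state how accuracy against the human baseline was measured. One common approach, sketched below under that assumption, is to compute the share of entries where the automated verdict matches the human auditor's verdict. The function name `agreement_rate` and the verdict labels are hypothetical, and the sample values are made up for illustration.

```python
def agreement_rate(automated: list[str], human: list[str]) -> float:
    """Fraction of entries where the automated verdict equals the human verdict."""
    if len(automated) != len(human):
        raise ValueError("verdict lists must have the same length")
    if not human:
        raise ValueError("at least one verdict is required")
    matches = sum(a == h for a, h in zip(automated, human))
    return matches / len(human)


# Made-up verdicts for four product entries
automated_verdicts = ["ok", "mismatch", "ok", "ok"]
human_verdicts = ["ok", "mismatch", "mismatch", "ok"]
print(f"{agreement_rate(automated_verdicts, human_verdicts):.0%}")
```

Entries where the two verdicts disagree are natural candidates for the human-in-the-loop review that feeds model retraining.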

Connect with our experts.
