A prominent online grocery delivery service streamlined its data validation and classification processes by implementing an AI-powered pipeline solution from Everforth Apex.

SITUATION
Our client, a leading online grocery delivery service, struggled with manually auditing and validating thousands of grocery product entries. Auditors had to cross-reference product images, web pages, and internal taxonomy documents to ensure consistency in product descriptions, categorization, and data quality. The process was labor-intensive, error-prone, and could not scale to millions of Stock Keeping Units (SKUs) and frequent catalog updates across more than 100 retail partners.

SOLUTION
Everforth Apex implemented an AI-powered pipeline to automate product data validation and classification auditing using deep learning, computer vision, natural language processing (NLP), and AI tools native to Google Cloud Platform (GCP). The solution included:

  • Image-to-Text Extraction: Optical character recognition (OCR) models extracted brand names, net weights, and product names from packaging images.​

  • Visual Object Detection: YOLOv8 (You Only Look Once) models detected branded logos, product types, and package features.

  • Textual NLP Classification: Hugging Face models, fine-tuned on grocery taxonomy datasets, classified products based on their text descriptions.

  • Semantic Comparison Engine: Google’s Gemma large language model (LLM) compared OCR and YOLO outputs against expected taxonomy definitions to identify discrepancies.​

  • Audit Report Generation: Automated reports highlighted mismatches between image-derived data, web text content, and internal taxonomy expectations.​
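
The comparison and reporting steps above can be illustrated with a minimal sketch. It is not the production pipeline: the `ProductRecord` class, the `audit` function, and all field names and sample values are hypothetical, and a simple normalized string match stands in for the LLM-based semantic comparison described above. It assumes the OCR, detection, and web-scraping steps have already produced plain key-value fields for each SKU.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProductRecord:
    """Hypothetical container for one SKU's signals; field names are illustrative."""
    sku: str
    image_fields: Dict[str, str]     # values derived from the packaging image
    web_fields: Dict[str, str]       # values taken from the product web page
    taxonomy_fields: Dict[str, str]  # values the internal taxonomy expects


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(str(value).lower().split())


def audit(record: ProductRecord) -> List[dict]:
    """Return one entry per field where a source disagrees with the taxonomy."""
    mismatches = []
    sources = (("image", record.image_fields), ("web", record.web_fields))
    for name, expected in record.taxonomy_fields.items():
        for source, fields in sources:
            found = fields.get(name)
            if found is None:
                continue  # this source did not yield the field; nothing to compare
            # Assumption: exact match after normalization, in place of an LLM comparison.
            if normalize(found) != normalize(expected):
                mismatches.append({
                    "sku": record.sku,
                    "field": name,
                    "source": source,
                    "expected": expected,
                    "found": found,
                })
    return mismatches


# Made-up sample data, for illustration only.
record = ProductRecord(
    sku="SKU-001",
    image_fields={"brand": "Acme", "net_weight": "500 g"},
    web_fields={"brand": "ACME ", "net_weight": "450 g"},
    taxonomy_fields={"brand": "Acme", "net_weight": "500 g", "category": "Pasta"},
)
report = audit(record)
for row in report:
    print(row)
```

In this sample, only the web page's net weight disagrees with the taxonomy, so the report holds a single entry. A real audit would likely need fuzzier matching (unit conversion, brand aliases), which is where a semantic comparison step earns its place.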

RESULTS
The implementation of this AI-powered solution yielded significant improvements:​

  • 80% reduction in manual auditing time per product entry.​

  • 92% accuracy in automated classification validation compared to the human baseline.​

  • 35% reduction in data errors caused by image-text mismatches during the first rollout.

  • Continuous improvement through human-in-the-loop retraining of models.
