Companies that manage thousands of products face a recurring problem: product data from manufacturers often arrives only as unstructured text. To use this data in a Product Information Management (PIM) system, it must be classified according to a uniform standard.
ECLASS is an international standard for describing products unambiguously, which facilitates digital collaboration. However, manually assigning products to one of more than 45,000 ECLASS product classes demands considerable expertise and is extremely resource-intensive.
To address this challenge, an intelligent, agent-based system was developed. At the core of the solution is an autonomous agent controlled by a large language model (LLM).
The system works in three steps:
Knowledge base: First, the four-level hierarchy of the ECLASS standard was modelled in a flexible graph database (Neo4j). With around 200,000 nodes and 2 million edges, the graph serves as the agent's knowledge base for classification.
Iterative navigation: Instead of committing to a class immediately, the AI agent navigates through the knowledge graph step by step. At each level of the hierarchy, it analyses the product description, selects the semantically best-fitting branch and thus narrows the classification until it reaches the most specific category. Thanks to the ReAct framework (Reasoning & Acting), every ‘thought’ and every action of the agent is logged, making the entire classification process transparent and traceable.
Forward-looking architecture: The Model Context Protocol (MCP) handles communication between the agent and the database. It is a kind of universal interface that can be described as ‘USB for AI applications’. This modular structure decouples the agent from the database, which keeps the system flexible and future-proof.
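The three steps above can be illustrated with a minimal sketch. It is not the system's actual implementation: a small in-memory dictionary stands in for the Neo4j graph, a simple word-overlap heuristic stands in for the LLM's decision, and a plain function call stands in for a tool exposed via MCP. The names TOY_GRAPH, get_children, choose_child and classify are hypothetical, and all codes and labels are invented rather than taken from ECLASS.

```python
# Toy stand-in for the graph database: parent code -> [(child code, label)].
# All codes and labels are invented for illustration; they are NOT real ECLASS entries.
TOY_GRAPH = {
    "ROOT": [("10", "electrical engineering automation"),
             ("20", "tools fasteners hardware")],
    "10": [("10-01", "cable wire"),
           ("10-02", "sensor measurement")],
    "10-02": [("10-02-01", "proximity sensor"),
              ("10-02-02", "temperature sensor")],
    "10-02-01": [("10-02-01-01", "inductive proximity sensor"),
                 ("10-02-01-02", "capacitive proximity sensor")],
}


def get_children(code):
    """Hypothetical tool: return the child classes of a node (here a dict lookup)."""
    return TOY_GRAPH.get(code, [])


def choose_child(description, children):
    """Stand-in for the LLM decision: pick the label sharing the most words."""
    words = set(description.lower().split())
    return max(children, key=lambda child: len(words & set(child[1].split())))


def classify(description, max_depth=4):
    """Walk down the hierarchy one level at a time and log every step."""
    path, trace, current = [], [], "ROOT"
    for level in range(1, max_depth + 1):
        children = get_children(current)
        if not children:  # reached a leaf before the maximum depth
            break
        code, label = choose_child(description, children)
        trace.append({
            "level": level,
            "candidates": [c for c, _ in children],
            "chosen": code,
            "reason": f"best word overlap with '{label}'",
        })
        path.append(code)
        current = code
    return path, trace


path, trace = classify("Inductive proximity sensor M12 for automation lines")
```

Here path holds one code per hierarchy level, and trace records the candidates, the choice and a short reason for each step, which mirrors the idea of logging every thought and action. In a real setup, choose_child would be an LLM call and get_children a graph query.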
At the end of the process, the agent outputs not only the assigned category but also all relevant product features in a clean, structured JSON format, ready for integration into a PIM system.
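What such a structured result might look like can be sketched as follows. The schema is an assumption made for illustration: the field names class_code, class_path and features, the helper to_pim_json and all values are hypothetical, and the class code is invented rather than a real ECLASS entry.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ClassificationResult:
    """Hypothetical result schema; the real system's fields are not specified."""
    class_code: str    # most specific class reached by the agent
    class_path: list   # one code per hierarchy level, top to bottom
    features: dict     # product features extracted from the description


def to_pim_json(result):
    """Serialise the result to JSON with a stable key order."""
    return json.dumps(asdict(result), ensure_ascii=False, indent=2, sort_keys=True)


# Invented example values, not real ECLASS codes or properties.
result = ClassificationResult(
    class_code="10-02-01-01",
    class_path=["10", "10-02", "10-02-01", "10-02-01-01"],
    features={"housing_thread": "M12", "sensing_range_mm": 4},
)
payload = to_pim_json(result)
```

A fixed schema like this lets the receiving PIM system validate each record before import, instead of parsing free text.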
The evaluation results are encouraging: when the assignment to ECLASS categories was assessed hierarchically, that is, level by level along the classification path, the system achieved an average path match of 90%. This suggests that the iterative, graph-based navigation approach is robust.
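The exact definition of the path match metric is not given here. One plausible reading, stated purely as an assumption, is the share of hierarchy levels on which the predicted path agrees with the expected path before the first deviation, averaged over all products. The functions path_match and average_path_match below are hypothetical, and the codes are invented.

```python
def path_match(predicted, expected):
    """Fraction of the expected path matched before the first deviation.

    Assumed definition for illustration; the evaluation may use a different one.
    """
    if not expected:
        raise ValueError("expected path must not be empty")
    shared = 0
    for pred_code, true_code in zip(predicted, expected):
        if pred_code != true_code:
            break
        shared += 1
    return shared / len(expected)


def average_path_match(pairs):
    """Mean path match over (predicted, expected) pairs."""
    scores = [path_match(predicted, expected) for predicted, expected in pairs]
    return sum(scores) / len(scores)


# Invented codes: the first prediction is fully correct, the second
# deviates only on the fourth and last level.
gold = ["10", "10-02", "10-02-01", "10-02-01-01"]
near_miss = ["10", "10-02", "10-02-01", "10-02-01-02"]
score = average_path_match([(gold, gold), (near_miss, gold)])
```

Under this reading, a prediction that is wrong only on the last level still earns partial credit, which is why a path-based score can be higher than exact-match accuracy.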
For companies, the use of such technology offers clear added value:
Increased efficiency: The manual effort and associated costs for data preparation are drastically reduced.
Error reduction: Automation minimises human errors that can occur during manual classification.
Better data quality: Structured and correctly classified product data is the basis for an excellent customer experience and smooth internal processes.
The system demonstrates the potential of LLM-based agents for navigating complex taxonomies. More than a theoretical concept, the solution serves as an intelligent tool that generates high-quality classification suggestions and thereby greatly reduces the workload for subject matter experts.
In the future, self-correction mechanisms or hybrid search approaches could make the solution even more efficient. One thing is clear: AI agents are a big step towards fully automated, smart product data management.