

Edge, AI and Large Language Models (LLMs): a fascinating combination for the future of technology

Edge AI is becoming increasingly popular, but what exactly is behind it? The original vision was to move AI models developed in the cloud out to the edge, enabling use cases directly on the device that were previously typically realised in the cloud.

The introduction of Large Language Models (LLMs) opens up new possibilities, such as interaction with machines via voice assistants.

We examined three typical challenges for this use case:

1. Challenge: Resource requirements

A key question when running LLMs on edge devices is how much computing power is actually required. The model size and the so-called context window, i.e. the amount of text a model can take into account when generating output, both play a role here, and they are decisive for the choice of suitable edge hardware. We tested this on two systems: an experimental setup based on NVIDIA's "Jetson Orin Nano" and an industrial edge computer from WAGO that could plausibly be used in real-life deployments.

The performance of the models was measured in tokens processed per unit of time. A token is the elementary unit an LLM operates on and represents a small piece of text, roughly a word fragment. When comparing the two devices with Meta's Llama 2 model with 7 billion parameters, it quickly became clear that a GPU provides a significant performance boost. In the context of industrial automation, this led to the wish for devices with a passively cooled, industrial-grade GPU.
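Tokens per second can be determined with a simple timing loop around the generation call. The sketch below is only illustrative: `measure_throughput` and `dummy_generate` are our own names, and the stub stands in for a real inference backend (e.g. llama.cpp bindings on the Jetson).

```python
import time

def measure_throughput(generate, prompt, n_runs=3):
    """Average token throughput (tokens/second) of a generate() callable,
    where generate() is assumed to return the list of generated tokens."""
    rates = []
    for _ in range(n_runs):
        start = time.perf_counter()
        tokens = generate(prompt)
        elapsed = time.perf_counter() - start
        rates.append(len(tokens) / elapsed)
    return sum(rates) / len(rates)

def dummy_generate(prompt):
    """Stand-in for a real LLM backend: 32 tokens after a fixed delay."""
    time.sleep(0.01)
    return ["tok"] * 32
```

In practice the first run is often discarded (model warm-up), and prompt processing and token generation are usually timed separately, as they stress the hardware differently.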

2. Challenge: Software architecture

Using LLMs on less powerful devices brings challenges such as managing Python code, its dependencies, version conflicts and unpredictable library lifecycles. In an industrial environment, where long-lived and maintainable solutions are required, this clashes with the rapid pace of development in the AI industry.

A microservice architecture has proven to be an effective solution for us. It allows each service to be implemented and deployed independently using container technology, and it also simplifies model selection and dependency management.
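As a minimal sketch of this idea, the model runtime can be wrapped behind a small HTTP inference service and shipped as its own container, keeping its dependencies isolated from the rest of the system. Everything below is illustrative: `run_model` is a hypothetical stand-in for a real LLM backend, and a production service would add streaming, error handling and authentication.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_model(prompt: str) -> str:
    """Hypothetical stand-in for the real LLM backend inside this container."""
    return f"echo: {prompt}"

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON request body, e.g. {"prompt": "..."}
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"completion": run_model(payload["prompt"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet

def make_server(port: int = 8080) -> HTTPServer:
    """Create the inference server; call .serve_forever() to run it."""
    return HTTPServer(("127.0.0.1", port), InferenceHandler)
```

Because the service speaks plain HTTP, swapping the model only means rebuilding and redeploying this one container; the other services remain untouched.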

3. Challenge: Selecting suitable models 

A microservice architecture allows us to integrate and test new models quickly. Quantisation is particularly interesting for devices with limited resources: significant memory and compute savings can be achieved by reducing the data type of the model weights (e.g. from 16-bit float to 4-bit integer). However, this usually comes at the cost of some accuracy when the models are executed. Another important factor is the number of parameters; our tests have shown that models with 3 to 7 billion parameters are practical on this class of hardware.
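As a toy illustration of the savings, a symmetric 4-bit quantisation maps each float weight to an integer in [-8, 7] plus one shared scale factor, i.e. roughly a quarter of the 16-bit storage. The plain-Python sketch below is our own; real toolchains use more elaborate block-wise schemes.

```python
def quantize_4bit(weights):
    """Symmetric 4-bit quantisation: map each float to an integer in [-8, 7]."""
    scale = max(abs(w) for w in weights) / 7 or 1.0  # one shared scale factor
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the 4-bit integers."""
    return [v * scale for v in q]

weights = [0.21, -0.83, 0.47, 0.05, -0.32]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)
# Each weight now fits in 4 bits instead of 16, at the cost of a
# reconstruction error of at most half a quantisation step per weight.
```

That bounded rounding error per weight is exactly the accuracy trade-off described above; whether it is acceptable has to be validated per model and per task.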

Overall, Edge AI and LLMs offer a promising combination for the future by opening up new fields of application. The associated challenges can be overcome by using suitable hardware, a well thought-out software architecture and careful model selection.

About the author


Michael Heller is a computer scientist with a passion for automation and operational technology (OT). As Group Leader and expert in the field of Industrial IoT and Edge at M&M Software, he is enthusiastic about the "things" of Industrial IoT and their connection to neighbouring systems, such as the cloud.

About the author


Pascal Scheck is studying computer science, specialising in artificial intelligence. During his practical semester at M&M Software, he evaluated the possible applications of large language models (LLMs) on edge devices. This practical experience now flows into his work as a student trainee in the Data & AI team.
