
The aim is to strengthen Europe’s digital sovereignty and, in particular, to give German authorities and institutions access to powerful GenAI technology. A first milestone is the Teuken-7B language model, developed under the leadership of the Fraunhofer Institutes IAIS and IIS and available under an open source license since November 2024. In this episode of the podcast “Ausgesprochen digital”, we talk about the importance of this project and the path to AI “Made in Europe”.
In conversation with Dr. Nicolas Flores-Herr and Dr. Thomas Wächter
Nicolas Flores-Herr is team leader of Foundation Models & Gen AI Systems and site manager of the Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS) in Dresden. He brings extensive expertise in the development of intelligent dialog systems for business – including voice assistants, chat and writing bots that enable fast, intuitive access to information.
His focus is on solutions that are not only powerful, but also compliant with data protection rules and culturally tailored to Europe. As project manager for OpenGPT-X, he plays a key role in developing large AI language models “Made in Europe” and making them available as open source.
Thomas Wächter is Head of AI & Natural Language Processing at Telekom MMS and is responsible for a team of experts who develop AI-based services and solutions for companies in Germany. With over 20 years of experience in text mining, NLP, artificial intelligence and semantic technologies, he creates practical added value from data and language.
He works closely with leading research and innovation partners to translate the latest developments in generative AI into marketable applications. He sees OpenGPT-X and the Teuken-7B model as a key resource for trustworthy AI applications in a wide range of industries.
From the very beginning: The creation of OpenGPT-X
The foundation stone for OpenGPT-X was laid in summer 2021: A consortium led by the Fraunhofer Institutes IAIS and IIS submitted a research proposal to the German Federal Ministry for Economic Affairs and Climate Action (BMWK) – with the aim of developing powerful, open AI language models “Made in Europe”. The project launch in January 2022 came at a time of growing public attention to generative AI, which peaked with the “ChatGPT moment” in the fall of the same year.
In November 2024, the consortium – which includes TU Dresden, DFKI, Forschungszentrum Jülich and other partners – published Teuken-7B, the project’s first major language model. It was trained on the Jülich supercomputer JUWELS, is completely open source (Apache 2.0) and optimized for use in all 24 official EU languages. OpenGPT-X thus makes an important contribution to Europe’s digital sovereignty and sets new standards for responsible AI applications.
Teuken-7B: An open source language model with European strengths
Teuken-7B is more than just a large AI language model – it is a strategic tool for digital sovereignty, innovation and trustworthy applications in Europe. It combines openness, adaptability and efficiency. Its special features and potential at a glance:
- Open for all: open source as a driver of innovation. With over 60,000 downloads in a very short time, Teuken-7B shows how powerful a committed community can be. Publication under the Apache 2.0 license enables free use, further development and integration – even for commercial purposes. Companies can fully customize the model to their own requirements without having to rely on proprietary black box technology.
- Multilingual from the ground up: Built for a diverse Europe. Unlike many existing models, Teuken-7B has been consistently trained multilingually – in all 24 official EU languages and with 50% non-English training data. This ensures stable performance across language barriers and enables reliable AI applications for internationally active companies and institutions.
- Tailor-made instead of off-the-shelf: customizability included. With instruction tuning for dialogue-based use and open access to the architecture, organizations can train Teuken-7B for their specific scenarios – whether for customer service, knowledge management or specialist applications with sensitive content. Security-critical areas of application such as robotics, the automotive industry or medicine also benefit from complete control over the model and data.
- Efficiency meets performance: European tokenizer as a game changer. Thanks to a specially developed tokenizer, Teuken-7B works more energy- and cost-efficiently than many other language models – especially for complex European languages such as German, Finnish or Hungarian. This technological basis was specifically researched in the project to enable sustainable and powerful AI applications.
- Trust through transparency: traceable origin, clear standards. Before publication, the training data was thoroughly checked for licensing compliance to ensure that it may be used for model training. This procedure is an important aspect in the context of the European AI Act.
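The tokenizer point in the list above can be made concrete with a common efficiency measure, often called fertility: the average number of tokens a tokenizer produces per word. Fewer tokens per word means less computation per text. The following is a minimal sketch, not the project's evaluation code. The greedy longest-match tokenizer and both vocabularies are hypothetical toy stand-ins, chosen only to show how a vocabulary that fits the language lowers the token count.

```python
from typing import Callable, Iterable, List, Set


def fertility(tokenize: Callable[[str], List[str]], texts: Iterable[str]) -> float:
    """Average number of tokens per whitespace-separated word."""
    total_tokens = 0
    total_words = 0
    for text in texts:
        total_tokens += len(tokenize(text))
        total_words += len(text.split())
    if total_words == 0:
        raise ValueError("texts contain no words")
    return total_tokens / total_words


def greedy_subword_tokenizer(vocab: Set[str]) -> Callable[[str], List[str]]:
    """Toy longest-match tokenizer (hypothetical); unknown text falls back to single characters."""
    max_len = max([len(piece) for piece in vocab] + [1])

    def tokenize(text: str) -> List[str]:
        tokens: List[str] = []
        for word in text.split():
            i = 0
            while i < len(word):
                # Try the longest candidate first, shrink down to one character.
                for j in range(min(len(word), i + max_len), i, -1):
                    piece = word[i:j]
                    if piece in vocab or j == i + 1:
                        tokens.append(piece)
                        i = j
                        break
        return tokens

    return tokenize


# Hypothetical vocabularies: one without German subwords, one with them.
english_centric = greedy_subword_tokenizer({"data", "the", "ing"})
european = greedy_subword_tokenizer({"Daten", "schutz", "Grund", "verordnung"})

sample = ["Datenschutz Grundverordnung"]
print("English-centric vocabulary:", fertility(english_centric, sample))
print("European vocabulary:", fertility(european, sample))
```

In practice, the same `fertility` function could be applied to any real tokenizer by passing its tokenize function in place of the toy ones.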
Evaluation of AI models: Why size isn’t everything
The size of a model is not automatically a sign of quality. The decisive factor is how well it fits the specific requirements. Teuken-7B proves that a lean, transparent model with European values can not only be a powerful alternative, but in many cases the superior choice.
Application areas and hosting options of Teuken-7B: Flexibility meets data sovereignty
Teuken-7B is a versatile and adaptable AI model that companies can use in various industries. Whether for text analysis, document management or semantic search – the model offers customized solutions that can be tailored to industry-specific requirements through fine-tuning. In document analysis and knowledge management in particular, Teuken-7B enables targeted and efficient processing of large volumes of data through Retrieval Augmented Generation (RAG).
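To illustrate the RAG idea mentioned above, here is a minimal sketch of the retrieve-then-generate pattern. It rests on several assumptions: the bag-of-words similarity is a toy stand-in for the embedding search a real system would use, all function names are hypothetical, and `generate` is a placeholder for a call to a hosted language model such as Teuken-7B. No real model API is used.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a toy replacement for a learned embedding."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(documents, key=lambda d: cosine(q, bag_of_words(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, passages: List[str]) -> str:
    """Place the retrieved passages in front of the question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )


def answer(query: str, documents: List[str],
           generate: Callable[[str], str], k: int = 2) -> str:
    """Retrieve relevant passages, then hand the augmented prompt to the model."""
    return generate(build_prompt(query, retrieve(query, documents, k)))


docs = [
    "Travel expenses must be submitted within 30 days.",
    "The cafeteria opens at 11:30 on weekdays.",
    "Remote work requires approval from the team lead.",
]
question = "When must travel expenses be submitted?"

# Placeholder "model" that simply echoes the prompt it receives.
print(answer(question, docs, generate=lambda prompt: prompt, k=1))
```

The point of the design is that the model only sees the passages selected for the question, so internal documents can stay in the organization's own store and be updated without retraining the model.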
- AI Foundation Services: Security and scalability at the highest level.
With AI Foundation Services, Deutsche Telekom subsidiary T-Systems offers a comprehensive range of operating models that are specially tailored to the needs of European companies and public authorities. Teuken-7B can be operated in highly secure, GDPR-compliant data centers in Germany and Europe, or even on dedicated infrastructure at the customer’s premises – ideal for sensitive and highly sensitive data. This flexibility is particularly important for organizations in regulated sectors such as finance, healthcare or the public sector, which place the highest demands on security and compliance. The AI Foundation Services thus offer companies the opportunity to develop and operate generative AI applications on scalable, secure platforms.
- Business GPT: Integrated AI solutions for companies.
Teuken-7B is directly integrated into the standardized Telekom product Business GPT, which provides “out-of-the-box” RAG applications and functions for processing documents and internal company chats. Using a standardized API, companies can seamlessly integrate the model into existing AI assistants, agents and chatbots and thus quickly benefit from the power of generative AI.
Future prospects: Further development and challenges
The journey of OpenGPT-X and Teuken-7B is just the beginning of an exciting development in the field of generative AI. The consortium will continue to work on improving and expanding the model variants. Nicolas Flores-Herr emphasizes that the ongoing research aims to further increase the performance and adaptability of the models. Future developments could result in models that are not only broadly applicable, but also tailored in depth to specific industry needs.
An important focus is on the integration of multimodal data types such as text, images and audio. This development could make the AI models of OpenGPT-X even more powerful and enable a broader range of applications that go beyond pure text processing. Thomas Wächter emphasizes that by integrating these data types, AI systems could provide even more comprehensive and contextualized answers – for example in areas such as media analysis, product development or medical diagnostics.
Another key topic remains data sovereignty and the explainability of AI models. The transparency and traceability of models are becoming increasingly important, especially in light of the EU AI Act. OpenGPT-X sets new standards here with its open structure. In this podcast episode, you can hear what further developments and opportunities are emerging. Enjoy listening!
This episode is hosted by Steffen Wenzel, co-founder and Managing Director of politik-digital, and Stefanie Liße, Senior Sales Manager at Telekom MMS.
– – – – – –
Further links
👉 www.telekom-mms.com
👉 To the podcast “AI ‘Made in Germany’: How OpenGPT-X contributes to Europe’s digital sovereignty”
👉 Telekom offers ‘Made in Germany’ language model from OpenGPT-X
Graphic: Telekom MMS