Preserving linguistic and cultural diversity in Europe and promoting technological excellence and leadership
A European Digital Infrastructure Consortium (EDIC) is a new mechanism for multi-country projects, created under the Digital Decade Policy Programme 2030. EDICs allow Member States to pool funding and other resources flexibly and efficiently in order to invest in transformative digital projects. EDICs can also ensure common standards and interoperability.
For more information on what an EDIC is and what its main features are, please have a look at our News section.
The Alliance for Language Technologies
The ALT-EDIC, the Alliance for Language Technologies, was proposed in December 2023 as one of the first EDICs. On 7 February 2024, the European Commission officially set up the ALT-EDIC with the Implementing Decision (EU) 2024/458.
Coordinated by France, the ALT-EDIC counts:
- Seventeen Member States: Bulgaria, Croatia, Czechia, Denmark, Finland, France, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Luxembourg, Netherlands, Poland, Slovenia, and Spain;
- Eight observing Member States: Austria, Belgium, Cyprus, Estonia, Malta, Portugal, Romania, and Slovakia.
The role of ALT-EDIC is to create a common European data infrastructure and services for language technologies in order to strengthen Europe's technological competitiveness while supporting its cultural diversity. ALT-EDIC's primary action involves collecting and federating language and multimodal data from across the European Union and its Member States. The consolidation of this language data will enable ALT-EDIC to foster the development of innovative Large Language Models with robust multilingual and multimodal capabilities.
Specifically, ALT-EDIC shall carry out the following activities:
- ALT-EDIC will build on the Language Data Space and federate existing language and multimodal resources from the EU and Member States in all European, national and regional languages. This includes creating strategic data, for example for languages with few speakers (fewer than 10 million), for which training Large Language Models (LLMs) faces inherent limitations.
- ALT-EDIC will create a repository of existing open-source language models for reuse by industrial actors, develop specific fine-tuning methods, especially for SMEs, and provide evaluation, certification and normalisation methodologies, with a particular focus on potential discrimination and bias introduced by Natural Language Processing (NLP) models.
- ALT-EDIC will act as a pooled seed fund, bringing together public and private resources to launch and develop new Large Language Model projects and Foundation Models with multimodal capabilities, including by providing access to the necessary European high-performance computing resources.
- ALT-EDIC will contribute to the development of evaluation methodologies, with a particular focus on potential discrimination and bias introduced by NLP models, and will provide dedicated support to institutions investing in language technologies (LTs).
- ALT-EDIC will act as an advisory point for public administrations and will reach the public through a cultural programme based on artificial intelligence for languages. The programme is meant to enable LT end-users, who are also data producers, to take up the challenges of artificial intelligence and language technologies in a multilingual context, and to help inform European citizens about artificial intelligence.
What is the link between the ALT-EDIC and the Language Data Space (LDS)?
The ALT-EDIC is a multi-country project, run and funded by the Member States that have agreed to join it. By pooling resources, the members should achieve the critical mass of data and other resources needed to create and fine-tune Large Language Models, which any single member would find difficult to do alone.
The LDS is one of several data spaces supported by the Commission to nurture a data ecosystem across many sectors. The LDS will establish a governance structure for the exchange of data from various sectors that can be used to develop language technology tools. This data will also be available to the ALT-EDIC. The LDS is financed by a contract under the Digital Europe Programme.