The groundwork for the Language Data Space (LDS) was laid by the European Language Resource Coordination (ELRC) initiative, which was carried out from June 2014 to January 2023 in the framework of the Connecting Europe Facility (CEF) Automated Translation (AT) programme.
ELRC was a keystone in establishing a multilingual digital ecosystem in Europe, facilitating the collection and sharing of over 3,300 language resources, including written and spoken corpora, grammars, and terminology databases, from and for the public sector. Through a number of conferences and workshops, the initiative added impetus to the collection and utilisation of multilingual data across Europe and fostered a community of practice dedicated to breaking down language barriers and promoting inclusivity.
Within the scope of the ELRC initiative, assessments and evaluations were also carried out to support the development of several CEF Generic Services projects and of eTranslation, the European Commission’s Machine Translation system available under Digital Europe Language Tools.
Please note that the e-mail address info[at]lr-coordination.eu is no longer active. For any enquiries, please contact secretariat[at]language-data-space[dot]eu.
From 2015 to 2022, ELRC hosted a variety of events at both European and national level. These events aimed to heighten awareness about the importance of language data, encourage its sharing, address legal and technical challenges related to data sharing, and discuss practices in using machine translation and other language technologies.
The ELRC Workshops prompted local initiatives to underline the indispensable value of language data, focusing on effective language management and the development of appropriate guidelines.
The ELRC Conferences established a platform for sharing knowledge and updates, which encouraged collaborative efforts in envisioning and shaping Europe's multilingual future.
ELRC Learning Centre
- What are language resources?
The term language resources refers to sets of language data and descriptions in machine-readable form, including written and spoken corpora, grammars, and terminology databases. Language resources can be used to build, improve, or evaluate natural language systems such as machine translation engines.
- ELRC White Paper
The first edition of the ELRC White Paper, titled Sustainable Language Data Sharing to Support Language Equality in Multilingual Europe – Why Language Data Matters, was published in December 2019. It provided an analysis of European practices for sharing language data and the challenges posed by this endeavour, as well as clear recommendations for policy-level decision makers on how to overcome such challenges.
The second edition of the ELRC White Paper, titled AI for Multilingual Europe – Why Language Data Matters, was published in December 2022. It outlined recent developments in language technology and the role of language data in artificial intelligence (AI). The paper investigated practices, challenges, and recommendations, providing insights into AI applications, data policies, and the major players in the CELT-affiliated countries.