Fostering Innovation: KIT and UM’s Collaborative Leap in NLP and Machine Translation

On January 26th, 2024, the Maastricht University (UM) VOXReality team was hosted by the Artificial Intelligence 4 Language Technologies (AI4LT) group of the Karlsruhe Institute of Technology (KIT). In a day-long workshop, both groups presented their work in Natural Language Processing (NLP) and, more specifically, Machine Translation (MT). Synergies between the two groups promise a bright future for applied language technologies!

UM kicked off the day by presenting the VOXReality project, and more specifically its three use cases along with its general objectives: (1) improve human-to-machine and human-to-human XR experiences, (2) widen multilingual translation and adapt it to different contexts, (3) extend and improve the visual grounding of language models, (4) provide accessible pretrained XR models optimized for deployment, and (5) demonstrate clear integration paths for the pretrained models. UM’s team member Yusuf Can Semerci, the project’s scientific and technical coordinator, elaborated on the technical excellence of the project, which rests on applying state-of-the-art methods in automatic speech recognition (ASR), multilingual machine translation, vision and language models, and generative dialogue systems.

UM’s team has two active PhD candidates, who shared their latest research. Abderrahmane Issam explained his latest work on efficient simultaneous machine translation (SiMT). The goal of SiMT is to provide translations that are accurate and as close to real time as possible. This requires policies that balance the quality of the produced translation against the lag that is sometimes necessary for the model to have enough information to translate properly. UM’s proposed method learns when to wait for more input in the source language before starting to produce the translation, taking into account the uncertainty that comes with real-time applications. Results are promising both in translation accuracy and in reducing the necessary lag.
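To make the quality-versus-lag trade-off concrete, here is a minimal sketch of the well-known fixed "wait-k" baseline policy together with the Average Lagging metric commonly used to measure delay in SiMT. This is not UM’s learned policy, which the workshop summary does not describe in detail; the function names and the example lengths are assumptions chosen for illustration only.

```python
from typing import List


def wait_k_reads(src_len: int, tgt_len: int, k: int) -> List[int]:
    """Fixed wait-k baseline (NOT the learned policy described above).

    Returns, for each target position t (0-indexed), how many source
    tokens have been read before target token t is written.
    """
    return [min(k + t, src_len) for t in range(tgt_len)]


def average_lagging(reads: List[int], src_len: int) -> float:
    """Average Lagging: mean number of source tokens the output lags behind."""
    tgt_len = len(reads)
    gamma = tgt_len / src_len  # target-to-source length ratio
    # tau: first target position at which the full source has been read
    tau = next((t for t, g in enumerate(reads) if g >= src_len), tgt_len - 1)
    return sum(reads[t] - t / gamma for t in range(tau + 1)) / (tau + 1)


if __name__ == "__main__":
    for k in (1, 3, 5):
        reads = wait_k_reads(src_len=6, tgt_len=6, k=k)
        print(k, reads, average_lagging(reads, src_len=6))
```

A larger k gives the model more source context before it commits to each word, at the cost of higher lag. A learned policy replaces the fixed k with a per-step decision on whether to read more input or write the next word.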

Pawel Maka presented his published paper on context-aware machine translation. Context plays an important role in all language applications: in machine translation it is essential for resolving ambiguity, e.g. which pronoun should be used. Context can be represented in different ways and usually consists of the sentences preceding (or following) the one to be translated, taken from either the source or the target language. Of course, the bigger the context, the more computationally expensive it is to run a translation model. Therefore, UM proposed different methods for efficiently “compressing” context through techniques like caching and shortening. The proposed methods are competitive both in accuracy and in the resources used (e.g. memory).
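As a rough illustration of why larger context costs more, the sketch below builds a model input by simply concatenating preceding source sentences, which is the plain baseline that caching and shortening aim to improve on. It does not implement UM’s methods. The separator token `<sep>` and the function name are hypothetical and not taken from any particular toolkit.

```python
from typing import List

SEP = "<sep>"  # hypothetical separator token between sentences


def with_source_context(sentences: List[str], index: int, context_size: int) -> str:
    """Prepend up to `context_size` preceding source sentences to sentence `index`."""
    start = max(0, index - context_size)
    return f" {SEP} ".join(sentences[start:index + 1])


if __name__ == "__main__":
    document = ["The cat sat.", "It purred.", "Then it slept."]
    for size in (0, 1, 2):
        model_input = with_source_context(document, index=2, context_size=size)
        print(size, len(model_input.split()), repr(model_input))
```

The input length grows with every added context sentence, so the cost of encoding it grows as well. That growth is what motivates storing or compressing context representations rather than re-encoding the full text each time.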

KIT’s team, in turn, presented the EU project Meetween. Meetween aims to revolutionize video conferencing platforms by breaking linguistic barriers and geographical constraints. It aspires to deliver open-source AI models and datasets: multilingual AI models that focus on speech but support text, audio and video as both inputs and outputs, and multimodal, multilingual datasets that cover all official EU languages.

KIT’s team of PhD candidates presented their work in (1) multilingual translation in low-resource cases (i.e. for languages that are not widely spoken or for cases where data is not available), (2) low-resource automatic speech recognition, (3) the use of Large Language Models (LLMs) in context-aware machine translation and (4) quality/confidence estimation for machine translation.

We were happy to identify the overlaps between the two EU projects (VOXReality and Meetween) as well as between the UM and KIT teams. At the heart of both projects lies a common objective: harness the power of advanced AI technologies, particularly in Natural Language Processing (NLP) and Machine Translation (MT), to facilitate seamless communication across linguistic and geographical barriers. While the applications and approaches may differ, the goals are closely intertwined. VOXReality (by UM) seeks to enhance extended reality (XR) experiences by integrating natural language understanding with computer vision. KIT’s Meetween project takes a different but complementary approach, aiming to revolutionize communication platforms. UM and KIT are fostering an environment of open collaboration and knowledge exchange, and both teams are excited about what the future holds for their collaboration.

Jerry Spanakis

Assistant Professor in Data Mining & Machine Learning at Maastricht University
