
Jörg Tiedemann: Releasing the MAMMOTH - a framework for large-scale modular multilingual NLP models

Abstract: Neural language models have grown in size and importance over the past years. We address two challenging aspects in the field of NLP: support for a wide variety of languages and the runtime efficiency of such models. We focus on encoder-decoder models and modular architectures that balance task-specific components against parameter sharing. In particular, we want to achieve effective cross-lingual transfer learning while keeping language-specific modules that can operate independently. The latter is important for efficient inference, which reduces computational costs and energy consumption at runtime, a crucial concern for modern NLP.

There are several ways of implementing multilingual NLP systems, but little consensus as to whether different approaches exhibit similar effects. Are the trends that we observe when adding more languages the same as those we observe when sharing more parameters? MAMMOTH (https://github.com/Helsinki-NLP/Mammoth) is a flexible framework for training various types of modular architectures, making it possible to compare different approaches systematically.

Special care is taken to optimize scalability in multi-node training on large HPC clusters such as LUMI. I will report on the current state of our research, including initial results, our efforts on hyper-parameter tuning, the optimization of modular architectures, scalability benchmarks, and the final goal of training a large-scale multilingual translation model on massively parallel data sets.

Speaker: Jörg Tiedemann is a professor of language technology at the Department of Digital Humanities at the University of Helsinki. His main research interests are cross-lingual NLP and machine translation.

Affiliation: University of Helsinki

Place of Seminar: Kumpula Exactum CK111 (in person) and Zoom (Meeting ID: 640 5738 7231; Passcode: 825217)
