Forbes AI Award to FCAI-related research on image-creating GAN models

The magazine pays tribute to the development of a model that sharply reduces the amount of data needed for generating new, artificial images.

GAN-generated images created with a limited amount of training data. Source: Karras et al. (2020): Training Generative Adversarial Networks with Limited Data.

The American business magazine Forbes recognizes FCAI-related research in its AI Awards 2020 listing. Among the six award recipients, Forbes has named the FCAI partner company Nvidia “Most disruptive innovator”.

Specifically, Forbes praises “a series of breakthroughs” that reduce the amount of data needed in generative adversarial networks (GANs). The research was conducted by a group of researchers at Nvidia, one of whom is Aalto University Associate Professor and FCAI member Jaakko Lehtinen.

GAN models, Lehtinen explains, learn from pre-existing data sets in order to create new data with similar properties. In the work of Lehtinen and his colleagues, the data in question are pictures. Lehtinen uses cats as an example:

“If you’re shown pictures of cats, you can, although you’ve never seen them before, recognize that they portray cats. This is because there are particular similarities – cats tend to look a particular way, be in particular places, positions, and so on”, he explains.

The aim of a GAN model, then, is to find such regularities in the original set and to produce new data, for instance artificial cat photos, that follow the regularities identified. The problem is that high-quality results typically require tens or even hundreds of thousands of pictures for the algorithm to learn from. Often, these are not available.

Lehtinen and his colleagues, however, have succeeded in reducing the number of pictures to as little as one tenth of what was previously needed. This has been made possible by changes in the process by which the algorithm learns from the original data.

“This is achieved by, so to speak, equipping the algorithm with glasses that are broken in very particular ways as it studies the set of example pictures. This has an exciting combined effect that helps the algorithm produce better pictures”, Lehtinen says.
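The “broken glasses” idea can be illustrated with a minimal sketch. It assumes that the glasses correspond to random image distortions applied, with some probability, to every picture the algorithm studies. The function name `augment_batch`, the probability parameter `p`, and the choice of flips and rotations are illustrative assumptions, not details taken from this article.

```python
import numpy as np

def augment_batch(images, p, rng):
    """Distort each image independently with probability p.

    images: array of shape (batch, height, width, channels), assumed square
            (height == width) so that 90-degree rotations keep the shape.
    p:      probability of applying each distortion (hypothetical parameter).
    rng:    a numpy random Generator.
    """
    out = images.copy()
    for i in range(images.shape[0]):
        img = images[i]
        if rng.random() < p:
            img = img[:, ::-1, :]  # horizontal flip
        if rng.random() < p:
            k = int(rng.integers(1, 4))  # rotate by 90, 180 or 270 degrees
            img = np.rot90(img, k=k, axes=(0, 1))
        out[i] = img
    return out

rng = np.random.default_rng(0)
pictures = rng.random((4, 8, 8, 3))        # stand-in for a batch of example pictures
distorted = augment_batch(pictures, p=0.5, rng=rng)
```

In a training loop, a function of this kind would plausibly be applied to both the example pictures and the generated pictures before the part of the model that judges them sees either, so that the distortions themselves do not leak into the generated images.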

In practice, the results of such research are increasingly used in fields such as medical research, where data tends to be sensitive and therefore difficult to acquire. Hospitals are understandably protective of their patients’ image data, so a model that generates new data sets from a limited number of original pictures, and whose output no longer contains sensitive personal information, can be of great value.

The model that Lehtinen and his colleagues have developed and now improved belongs to the StyleGAN model family. In the context of FCAI, this research closely relates to the research programs concerned with simulator-based inference (R2) as well as data-efficient deep learning (R3).

Further information

Link to the research article: https://papers.nips.cc/paper/2020/file/8d30aa96e72440759f74bd2306c1fa3d-Paper.pdf

Tero Karras’ talk about the GAN model at the Machine Learning Coffee Seminar