Ievgen Redko: "An 18th century math theory steers my work on 21st century AI"

Using cross-cutting theoretical approaches, Ievgen Redko explores how and why machine learning works. He is currently (2021-2022) a visiting professor at FCAI hosted by Ville Kyrki’s Intelligent Robotics group. 

What brought you to FCAI? 

I was looking for an opportunity to gain experience working in a different environment abroad and to apply my knowledge in new fields, like robotics and reinforcement learning. I didn’t know about FCAI before, but I found it and got in contact with Ville Kyrki. I knew Aalto University and its machine learning group, but less about robotics and the electrical engineering department. Aalto is pretty famous in our community. The whole process was smooth: Ville had a postdoc working on topics related to what I do, so it was easy to get started and begin collaborating. It turns out that robotics uses transfer learning in much the same way as machine learning does [more on that below].

Discovering the AI and machine learning community in Finland was huge. I found that Finland is avant-garde in these fields. The Finnish system of financing and supporting research in AI is one of the best in Europe and in the world.

What are your research interests and how did you end up in this field? 

I come from an applied mathematics background and when I started my PhD, I had never heard of machine learning. It was a coincidence that I applied for the position. The topic of my PhD was transfer learning, that is, transferring knowledge from one task or domain to another. In machine learning, each new task is commonly learned from scratch with heavy supervision, which is not how people usually learn. People leverage knowledge from tasks they already know and carry it over to new ones.

Transfer learning is my primary area of research. Why do I find it interesting? When AI started, people didn’t care much about how efficiently machine learning algorithms learn, or how closely their learning resembles human learning, because there was lots of data for AI systems to learn from. Now a shift is happening, towards efficiency in learning. Can we fine-tune without having to re-learn from scratch each time? This is interesting because it’s intuitive, and it’s how people learn: the more knowledge we gather, the more we can fine-tune it to each new task we try to solve.

The tool that I’m using is the theory of optimal transportation. It’s a mathematical theory, now used in machine learning, that was originally proposed in the 18th century by a French mathematician. It existed for centuries but was rediscovered recently and has grown into an active subfield. It’s a nice match between the old school and the modern.

A compilation illustrating (from left to right) one optimal transport problem of getting milk into cheese; a piece of text from the original French by mathematician Gaspard Monge; and an application of this theory in an interpolation of Redko’s face on the Mona Lisa. Image by Ievgen Redko, originally published on his blog
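
To make the idea of optimal transportation concrete, here is a minimal sketch of the discrete version of Monge's problem: move a few unit "piles" to the same number of "holes" on a line so that the total distance travelled is as small as possible. The function names, the toy positions and the absolute-distance cost are illustrative assumptions, not taken from Redko's work. The brute-force search only works for tiny inputs; it is compared with the sorted matching, which is known to be optimal for this cost in one dimension.

```python
from itertools import permutations


def monge_assignment(sources, targets):
    """Brute-force the cheapest one-to-one assignment of sources to targets.

    Cost of moving one unit of mass is the absolute distance on the line
    (an illustrative choice). Only feasible for a handful of points.
    """
    n = len(sources)
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        # perm[i] is the index of the target that source i is sent to
        cost = sum(abs(sources[i] - targets[perm[i]]) for i in range(n))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


def sorted_matching_cost(sources, targets):
    """In 1D with distance cost, matching points in sorted order is optimal."""
    return sum(abs(s - t) for s, t in zip(sorted(sources), sorted(targets)))


# Hypothetical toy positions of three piles and three holes
sources = [0.0, 2.0, 5.0]
targets = [6.0, 1.0, 3.0]

cost, plan = monge_assignment(sources, targets)
print("cheapest total distance:", cost)
print("source i goes to target plan[i]:", plan)
print("sorted matching cost:", sorted_matching_cost(sources, targets))
```

In this toy case both approaches agree. Practical optimal transport solvers avoid enumerating permutations, but the objective they minimize has the same shape.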

At FCAI you have pursued a new direction that you outlined in a recent seminar. Why should deep neural networks be analyzed?  

Because they are mysterious. If you have a huge model with more parameters than data points, the problem becomes ill-posed, with too many solutions, yet deep neural networks converge to a good solution, against the principles of learning and statistics. Why do they work and how do they lead to good results? This is an important and emerging challenge, to understand overparametrized models and deep neural networks. 
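
The "too many solutions" point can be seen in a toy linear model with three parameters and only two data points. This is a sketch with made-up numbers (the data, the `solution` family and the new input are all hypothetical), not an analysis of a neural network: it only shows that different parameter choices can fit the training data perfectly and still disagree on an unseen input.

```python
def predict(weights, x):
    """Linear model: weighted sum of the input features."""
    return sum(w * xi for w, xi in zip(weights, x))


# Two training points, three parameters: more unknowns than equations
X = [(1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]
y = [2.0, 3.0]


def solution(t):
    """Every value of t gives weights that fit both training points exactly."""
    return (2.0 - t, 3.0 - t, t)


def fits_exactly(weights):
    return all(abs(predict(weights, x) - target) < 1e-12
               for x, target in zip(X, y))


w_a = solution(0.0)
w_b = solution(1.0)

# An unseen input on which the two perfect fits disagree
x_new = (1.0, 1.0, 0.0)
pred_a = predict(w_a, x_new)
pred_b = predict(w_b, x_new)

print("w_a fits training data:", fits_exactly(w_a), "prediction:", pred_a)
print("w_b fits training data:", fits_exactly(w_b), "prediction:", pred_b)
```

Which of the many perfect fits a training procedure ends up with, and why that one tends to generalize well in deep networks, is the open question described above.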

When I was a student, deep learning seemed like overkill to me, because it’s not how human beings learn. We don’t process that many data points to become efficient. Now I see deep learning and human learning differently. Maybe deep learning is closer to how we function. We do process a huge amount of data, including all the prior knowledge in our DNA and everything coming in through our eyes and other senses. 

What’s the societal significance of your research? Are there any practical impacts? 

Most of my work is curiosity-driven and far from applications, but recently I worked on a collaboration about algorithmic fairness in machine learning. This means ensuring that algorithms are not biased based on attributes like sex or ethnicity. 

There are two steps: identifying the source of bias and correcting for it. First we analyze the predictions of an algorithm to see whether they favor certain outcomes for certain inputs. Does your IP address, which indicates where you live, bias the offers and prices you receive for insurance, for example? The predictions can be very flawed, not due to machine learning itself but because of the underlying data and how it was collected.

We should correct for biases when we collect data, or we should try to make sure that algorithms recognize biases and give equally likely outcomes across groups. In the United States, the law limits how strongly outcomes may depend on sensitive attributes [what is legally known as disparate impact]. This is especially important on social media platforms and anywhere online that serves ads.
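
The first step, checking whether predictions favor some groups, can be sketched with a simple ratio of favorable-outcome rates between two groups. The data, group labels and function names below are hypothetical. The 0.8 threshold is the "four-fifths" rule of thumb from US employment guidelines, used here only as an illustrative cut-off; it is not mentioned in the interview and is not a complete legal test.

```python
def positive_rate(outcomes, groups, group):
    """Share of favorable outcomes (1) among members of one group."""
    selected = [o for o, g in zip(outcomes, groups) if g == group]
    return sum(selected) / len(selected)


def disparate_impact_ratio(outcomes, groups, protected, reference):
    """Favorable-outcome rate of the protected group relative to the reference group."""
    return (positive_rate(outcomes, groups, protected)
            / positive_rate(outcomes, groups, reference))


# Hypothetical predictions: 1 = favorable offer, 0 = unfavorable
outcomes = [1, 0, 0, 1, 0, 1, 1, 0, 1, 1]
groups = ["a"] * 5 + ["b"] * 5

ratio = disparate_impact_ratio(outcomes, groups, protected="a", reference="b")
print("disparate impact ratio:", ratio)

# Illustrative rule of thumb: flag ratios below four-fifths
FOUR_FIFTHS = 0.8
flagged = ratio < FOUR_FIFTHS
print("flagged for review:", flagged)
```

A low ratio only signals that something deserves a closer look. Whether the cause lies in the data, in how it was collected, or in the model is the second part of the analysis.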

How has your time been at FCAI so far? 

Everything has been good. After six months we are starting to have results that will lead to a publication. The whole experience has been great: the infrastructure, the attitudes of people and the spirit of collaboration. The whole environment was really nice. FCAI is a really great initiative, and I did not expect that it would be so simple to apply and come to work here. I would encourage others to discover this culture and environment, which is excellent for being productive and for launching collaborations and new projects. I really enjoyed Finland, because I had missed true winters. I was happy with the snow and freezing temperatures this past winter!