Improving data literacy to support ethical and responsible decision-making
This video is part of the FCAI success stories series. In the video series, we explain why fundamental research in AI is needed and how research results create solutions to the needs of people, society, and companies.
Big data analytics, machine learning and AI have created unprecedented possibilities for using data in a way that benefits us all – but there are also many possibilities for doing harm.
“Technology is advancing from laboratories to society, but is our society ready for this? Can citizens and decision-makers understand data and AI, and utilise data analysis tools responsibly, ethically, and for the benefit of us all?”, asks Karoliina Snell, interaction coordinator of DataLit.
DataLit is an interdisciplinary collaboration spanning the social sciences, law, and AI methods that aims at understandable and trustworthy practices for utilising Finnish health, social, and welfare data. It brings together researchers from many different fields.
“We need discussion between many experts and fields of science, like computer science, sociology, law, and cognitive science. We need tools and understanding on many levels”, says Snell.
DataLit wants to raise awareness about the need for data literacy. It approaches this by busting myths and common misunderstandings about data.
Ethical problems with data
The data sets around us do not always represent the whole population they claim to represent; for instance, a data set might not include data from minorities or from all socio-economic groups. A well-known example comes from facial recognition software, whose training data has often consisted mainly of images of white men. Consequently, such systems have had problems recognising people of colour. Data bias can produce inequality and, in the worst cases, racism and discrimination.
“Another ethical problem is not related to how machines learn and interpret data, but to how we humans understand it. If we don’t know how models, algorithms and databases are constructed, or cannot interpret numbers or visualisations, we cannot make responsible and informed decisions”, says Snell.
The project also analyses key ethical concepts such as privacy. To find out whether there is a constant trade-off between privacy and the effective use of data, it is necessary both to understand how tools such as machine learning and differential privacy work and to grasp the details of legal regulation as well as citizens’ beliefs about legitimate uses of data.
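As a rough illustration of what differential privacy means in practice (a general sketch, not code from the DataLit project), the classic Laplace mechanism answers a statistical query, such as a count, by adding calibrated random noise, so that the presence or absence of any single person changes the published answer only slightly:

```python
import numpy as np

def dp_count(values, threshold, epsilon, rng=None):
    """Differentially private count of values above a threshold.

    A count query has sensitivity 1 (adding or removing one person
    changes the true count by at most 1), so Laplace noise with
    scale 1/epsilon gives epsilon-differential privacy. Smaller
    epsilon means more noise and stronger privacy.
    """
    if rng is None:
        rng = np.random.default_rng()
    true_count = sum(v > threshold for v in values)
    noise = rng.laplace(loc=0.0, scale=1.0 / epsilon)
    return true_count + noise
```

The trade-off mentioned above is visible directly in the `epsilon` parameter: stronger privacy guarantees make the released statistic noisier and thus less useful, which is why both the technical mechanism and people's expectations about data use matter.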
“Securing privacy is highly important, but even if we can guarantee privacy, people might not want to share their information for all uses”, says Snell.
Aiming at good data governance
The DataLit project develops understandable and trustworthy practices for processing, analysing, and utilising Finnish health, social, and welfare data. For example, the project develops concrete methods for creating anonymised synthetic data in order to make the use of personal data in various tasks easier.
"Good data governance requires both an understanding of what data is and a grasp of the complex socio-legal-technical issues related to it”, says Petri Ylikoski, the leader of the DataLit consortium and professor at the University of Helsinki.
According to Ylikoski, the organisations that hold our data must be transparent and responsible in how they take care of it. Nevertheless, citizens and decision-makers also have to understand what they are dealing with. In other words, both need data literacy.
"Ultimately, improved data literacy means better decision-making, more realistic expectations about the possibilities of data analytics, and sharper critical discourse on the future dangers of these technologies”, says Ylikoski, summarising the purpose of the project.
DataLit was granted major funding from the Academy of Finland. The first phase of the project will take three years, and its total budget is €3.9 million. In addition to the University of Helsinki and Aalto University, the University of Eastern Finland and several other cooperating partners are involved in the project.