Organizers
Talks
Machine Learning in the Presence of Adversaries
Abstract: AI, and machine learning in particular, is being applied in a wide range of problem domains. Security and privacy problems are no exception. Typically, the effectiveness of ML applications is evaluated in terms of considerations like accuracy and cost. An equally important consideration is how adversaries can interfere with ML applications. Considering the adversary at different stages of an ML pipeline can help us understand different security and privacy problems of ML applications.
Speaker: N. Asokan
Android Malware Classification: How to Deal With Extremely Sparse Features
Abstract: In this talk I provide some insights into the specifics of working with high-dimensional features from Android files. Since each package to be classified can be described by strings extracted from its files, the overall number of features grows drastically with the size of the training set. To deal with such sparse feature sets, we experimented with various approaches, including log-odds ratio, random projections, feature clustering and non-negative matrix factorization. I describe our framework for Android malware classification with a focus on the proposed dimensionality reduction approaches.
Speaker: Luiza Sayfullina
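As a rough illustration of one of the dimensionality reduction approaches mentioned in the abstract, the sketch below projects high-dimensional bag-of-strings features into a much smaller space using sparse random projections. The toy data, feature extraction and parameters are hypothetical and chosen for brevity; this is not the speaker's implementation.

```python
# Minimal, hypothetical sketch: sparse random projection of bag-of-strings
# features (e.g. strings extracted from Android packages).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.random_projection import SparseRandomProjection

# Toy data: each "document" is the set of strings extracted from one package.
packages = [
    "android.permission.SEND_SMS com.example.ads http://tracker.example",
    "android.permission.INTERNET com.benign.app libcrypto.so",
    "android.permission.READ_SMS com.example.ads premium-sms.php",
]

# Binary bag-of-strings features; dimensionality grows with the training set.
vectorizer = CountVectorizer(binary=True, token_pattern=r"\S+")
X_sparse = vectorizer.fit_transform(packages)

# Project into a much lower-dimensional space while approximately preserving
# pairwise distances (in practice n_components would be far larger).
projector = SparseRandomProjection(n_components=4, random_state=0)
X_reduced = projector.fit_transform(X_sparse)
print(X_reduced.shape)  # (3, 4)
```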
Reducing False Positives in Intrusion Detection
Abstract: The F-Secure Rapid Detection and Response Service is an intrusion detection service that F-Secure provides to companies. In this solution, we analyze the events generated by clients' systems and raise an alarm when suspicious behavior occurs. These alarms are further analyzed by experts and, if needed, the client is contacted. Some of these alarms are false positives, resulting in unnecessary analysis work for the experts. In this talk we describe the challenges and our approach to reducing such false positives.
Speaker: Nikolaj Tatti
Stealing DNN Models: Attacks and Defenses
Abstract: Today, machine learning models constitute business advantages for several companies. Companies want to leverage ML models to provide prediction services to clients. However, direct (i.e. white-box) access to models has been shown to be vulnerable to adversarial machine learning, where a malicious client may craft adversarial examples: samples that by design are misclassified by the model. This has serious consequences for several business sectors, including autonomous driving and malware detection. Model confidentiality is paramount in these scenarios.
Consequently, model owners do not want to reveal the model to the client, but may instead provide black-box predictions to them via well-defined APIs. Nevertheless, prediction APIs still leak information (the predictions themselves) that makes it possible to mount model extraction attacks by repeatedly querying the model via the prediction API. Model extraction attacks threaten not only the confidentiality of the model but also its integrity, since the stolen model can be used to create transferable adversarial examples.
We analyze model extraction attacks on DNNs via structured tests and present a new way of generating synthetic queries that outperforms the state of the art. We then propose PRADA, a generic approach to effectively detect model extraction attacks. It analyzes how the distribution of successive queries to the model evolves and detects abrupt deviations. We show that PRADA detects all known model extraction attacks with a 100% success rate and no false positives.
Speaker: Mika Juuti
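The abstract summarizes PRADA only at a high level; the sketch below illustrates the general idea of monitoring how inter-query distances evolve, using a Shapiro-Wilk normality statistic as one plausible deviation measure. The thresholds and the choice of test are assumptions for illustration, not the published PRADA detector.

```python
# Hypothetical sketch: flag a client whose stream of prediction-API queries
# produces an anomalous distribution of minimum inter-query distances.
import numpy as np
from scipy import stats

class QueryStreamMonitor:
    def __init__(self, threshold=0.95, warmup=30):
        self.threshold = threshold  # normality-statistic threshold (assumed)
        self.warmup = warmup        # minimum queries before flagging
        self.queries = []
        self.min_dists = []

    def observe(self, x):
        """Record one query; return True if the stream looks anomalous."""
        x = np.asarray(x, dtype=float)
        if self.queries:
            dists = [np.linalg.norm(x - q) for q in self.queries]
            self.min_dists.append(min(dists))
        self.queries.append(x)
        if len(self.min_dists) < self.warmup:
            return False
        # The Shapiro-Wilk statistic drops when the distance distribution
        # deviates sharply from what benign query streams tend to produce.
        w_stat, _ = stats.shapiro(self.min_dists)
        return w_stat < self.threshold
```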
Differential Privacy and Machine Learning
Abstract: Differential privacy provides a flexible framework for privacy-aware computation. It provides strong privacy guarantees by requiring that the results of the computation do not depend too strongly on any single individual's data. In my talk I will introduce differentially private machine learning, with an emphasis on Bayesian methods. I will also present an application of differentially private machine learning to personalised cancer drug sensitivity prediction using gene expression data.
Speaker: Antti Honkela
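For reference, the standard (ε, δ)-differential privacy guarantee referred to above can be stated as follows (the generic definition, not anything specific to this talk), where M is the randomized computation:

```latex
\Pr[\mathcal{M}(D) \in S] \le e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta
\quad \text{for all neighbouring datasets } D, D' \text{ and all outcome sets } S.
```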
Differentially Private Bayesian Learning on Distributed Data
Abstract: Many applications of machine learning, for example in health care, would benefit from methods that can guarantee the privacy of data subjects. Differential privacy (DP) has become established as a standard for protecting learning results. Standard DP algorithms either require a single trusted party to have access to the entire data, which is a clear weakness, or add prohibitive amounts of noise to learning. I discuss a novel method for DP Bayesian learning in a distributed setting, where each party holds only a single sample or a few samples of the data. The method relies on secure multi-party computation combined with the well-established Gaussian mechanism for DP. The talk is based on our recent paper.
Speaker: Mikko Heikkilä
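As background for the Gaussian mechanism mentioned in the abstract, here is a minimal sketch of adding calibrated Gaussian noise to a clipped sum of per-party contributions. The clipping bound, ε and δ are illustrative assumptions; the secure multi-party aggregation from the talk is not shown.

```python
# Hypothetical sketch of the Gaussian mechanism applied to a clipped sum.
import numpy as np

def gaussian_mechanism_sum(values, clip=1.0, epsilon=1.0, delta=1e-5, rng=None):
    """Return a differentially private sum of scalar contributions."""
    rng = np.random.default_rng() if rng is None else rng
    clipped = np.clip(values, -clip, clip)   # bound each party's influence
    sensitivity = 2.0 * clip                 # worst case: one value flips sign
    # Standard Gaussian-mechanism noise calibration for (epsilon, delta)-DP.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return clipped.sum() + rng.normal(0.0, sigma)

# Example: each party contributes one sample statistic.
contributions = np.array([0.2, -0.4, 0.9, 0.1])
private_sum = gaussian_mechanism_sum(contributions)
```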
Privacy Preservation with Federated Learning in Personalized Recommendation Systems
Abstract: Recent events have brought public attention to how companies capture, store and exploit users' personal data in their various services.
Enforcement of the EU's GDPR starts in May 2018, regulating how companies access, store and process user data. Companies can now suffer reputational damage and large financial penalties if they fail to respect users' rights in how they manage their data. At Huawei we are looking at different approaches to enhancing user privacy while at the same time providing an optimal user experience. In this talk we discuss one approach, based on federated learning, to generating personalized recommendations for use in different Huawei mobile services. The goal of the research has been to generate high-quality personalized recommendations on mobile devices without moving user data off the user's own device.
Speaker: Adrian Flanagan
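A minimal sketch of the federated learning idea described above: each device computes a local update on its own data, and only the model parameters are sent to the server for aggregation. The model, toy data and weighting scheme below are assumptions for illustration, not Huawei's recommendation system.

```python
# Hypothetical sketch of federated averaging for a simple linear model.
import numpy as np

def local_update(weights, user_data, lr=0.1):
    """One round of on-device training: a single least-squares gradient step."""
    X, y = user_data
    grad = X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(global_weights, all_user_data):
    """Each device trains locally; the server averages the resulting weights."""
    local_weights = [local_update(global_weights.copy(), d) for d in all_user_data]
    sizes = np.array([len(d[1]) for d in all_user_data], dtype=float)
    return np.average(local_weights, axis=0, weights=sizes)

# Toy setup: 3 users, 2-dimensional linear model; raw data never leaves "devices".
rng = np.random.default_rng(0)
users = [(rng.normal(size=(20, 2)), rng.normal(size=20)) for _ in range(3)]
w = np.zeros(2)
for _ in range(50):
    w = federated_round(w, users)
```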