Abstract: Reliable computation at scale is a key challenge in large-scale machine learning today. Unreliability in computation can manifest itself in many forms, e.g., (i) "straggling" of a few slow processing nodes, which can delay the entire computation, as in synchronous gradient descent; (ii) processor failures; (iii) "soft-errors," which are undetected errors in which nodes produce garbage outputs. My focus is on the problem of training using unreliable nodes.
First, I will introduce the problem of training model-parallel neural networks in the presence of soft-errors. This problem was in fact the motivation for von Neumann's 1956 study, which started the field of computing using unreliable components. We propose "CodeNet", a unified, error-correction-coding-based strategy that is woven into the linear algebraic operations of neural network training to provide resilience to errors in every operation during training. I will also survey some of the notable results in the emerging area of "coded computing," including my own work on matrix-vector and matrix-matrix products, which outperform classical results in fault-tolerant computing by arbitrarily large factors in expected time. Next, I will discuss the error-runtime trade-offs of various data-parallel approaches to training machine learning models in the presence of stragglers, in particular, synchronous and asynchronous variants of SGD. Finally, I will discuss some open problems in this exciting and interdisciplinary area.
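To give a flavor of the coded-computing idea referenced above, the following is a minimal sketch (not the speaker's specific construction) of a (3, 2) MDS-coded matrix-vector product: the matrix is split into two row blocks and a third "parity" worker computes their sum, so the full product is recoverable from any two of the three workers, tolerating one straggler or erroneous node.

```python
# Illustrative sketch of coded matrix-vector multiplication (generic example,
# not the speaker's exact scheme). Worker names and dimensions are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
x = rng.standard_normal(4)

A1, A2 = A[:3], A[3:]           # two systematic row blocks of A
results = {
    "worker1": A1 @ x,          # top half of A x
    "worker2": A2 @ x,          # bottom half of A x
    "worker3": (A1 + A2) @ x,   # parity block: sum of the two halves
}

# Suppose worker2 straggles or returns garbage: its block is decoded
# from the parity result, so the computation need not wait for it.
recovered_bottom = results["worker3"] - results["worker1"]
Ax_decoded = np.concatenate([results["worker1"], recovered_bottom])

assert np.allclose(Ax_decoded, A @ x)
```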
Parts of this work have been accepted at AISTATS 2018 and ISIT 2018.
Speaker: Sanghamitra Dutta
Affiliation: PhD Candidate, Carnegie Mellon University
Place of Seminar: Aalto University