Abstract: In the era of big data, large-scale classification involving tens of thousands of target categories is not uncommon. This setting is referred to as Extreme Classification. It has also recently been shown that machine learning challenges arising in ranking, recommendation systems and web advertising can be effectively addressed by reducing them to the extreme multi-label classification framework. In this talk, I will discuss two of my recent works and present the TerseSVM and DiSMEC algorithms for extreme multi-class and multi-label classification. The precision@k and nDCG@k results using DiSMEC improve by up to 20% on benchmark datasets over state-of-the-art methods, which are used by Microsoft in the production system of Bing Search. The training process for these algorithms makes use of OpenMP-based distributed architectures and is able to leverage thousands of cores for computation.
Speaker: Rohit Babbar
Affiliation: Professor of Computer Science, Aalto University
Place of Seminar: Aalto University
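
Note on the evaluation metrics: the abstract reports results in terms of precision@k and nDCG@k. As a rough illustration only, the sketch below computes both for a single test instance using the definitions commonly used in extreme multi-label classification with binary relevance. The function names, the example scores and the label set are hypothetical and are not taken from the talk or from the DiSMEC code.

```python
import math


def top_k_labels(scores, k):
    """Return the indices of the k highest-scoring labels, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def precision_at_k(scores, relevant, k):
    """Fraction of the top-k predicted labels that are truly relevant."""
    hits = sum(1 for label in top_k_labels(scores, k) if label in relevant)
    return hits / k


def ndcg_at_k(scores, relevant, k):
    """DCG of the top-k predictions divided by the best achievable DCG.

    Assumes binary relevance: a label is either relevant or not.
    """
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 gives log2(2)
        for rank, label in enumerate(top_k_labels(scores, k))
        if label in relevant
    )
    ideal_hits = min(k, len(relevant))
    if ideal_hits == 0:
        return 0.0
    ideal_dcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / ideal_dcg


# Hypothetical example: five labels, of which labels 1 and 2 are relevant.
example_scores = [0.1, 0.9, 0.4, 0.8, 0.3]
example_relevant = {1, 2}
p_at_3 = precision_at_k(example_scores, example_relevant, 3)
ndcg_at_3 = ndcg_at_k(example_scores, example_relevant, 3)
```

In practice these values are averaged over all test instances. nDCG@k additionally rewards placing relevant labels nearer the top of the ranking, which precision@k ignores.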