ECLT Joint Seminar
LSD, The Distortion of Locality Sensitive Hashing
Alessandro Panconesi, Sapienza University of Rome
A common theme in machine learning and data mining is how to compute similarities and distances efficiently between data points. Locality sensitive hashing (LSH) is a powerful and elegant paradigm by which distances can be quickly estimated by means of a familiy of hash functions. A lot of work has been done to understand when LSH can be used to estimate distances and similarties.
In this paper we focus on similarities that are provably non-LSHable and propose a notion of distortion to capture their approximability by means of LSH-able similarities. We consider several well-known similarities and show tight upper and lower bounds on their distortion. We also experimentally show that our upper bounds translate to effective algorithms in practice.
Alessandro Panconesi is full professor of computer science at Sapienza, University of Rome, Computer Science department. He earned a PhD in computer science from Cornell University. His main research interest is the design and analysis of algorithms, especially probabilistic ones. For his research he has received several international awards such as: the ACM Danny Lewin Award, two Google Focused Awards, faculty awards from IBM and Yahoo! and the 2019 Edsger W. Dijkstra Prize. He is staunch supporter of AS Roma.
Twin Peaks, a stochastic model for recurrent cascades
Giuseppe Re, Sapienza University of Rome
Understanding information dynamics and their resulting cascades is a central topic in social network analysis and economics. In recent seminal work, Cheng et al. analyzed multiples cascades on Facebook over several months, and noticed that many of them exhibit a recurring behaviour. They tend to have multiple peaks of popularity, with periods of quiescence in between. In this paper, we propose the first mathematical model that provably explains this interesting phenomenon. Our model is simple and shows that a good clustering structure suffices to observe this interesting recurring behaviour with a standard information diffusion model. Furthermore, we complement our theoretical analysis with an experimental evaluation where we show that our model is able to reproduce the observed phenomenon on several social networks.
Giuseppe Re is a Computer Science PhD student at Sapienza University of Rome. He pursued his BSc in Mathematics and his MSc in Computer Science at Sapienza University of Rome. His research is oriented towards different problems at the intersection of Theory of Algorithms and Machine Learning, mostly related to Clustering. He also interned at Google London as a Software Engineer.