Lectures on
"High-Dimensional Data Analysis"
21-22 October 2021


Program

Thursday, October 21st
Aurélien Garivier
This course will introduce the notion of concentration of measure and highlight its applications, notably in high-dimensional data processing and machine learning. The course will start from deviation inequalities for averages of independent variables, and illustrate their usefulness for the analysis of random graphs and of random projections for dimension reduction. It will then be shown how other high-dimensional random functions concentrate, and what guarantees this concentration yields for randomized algorithms and for machine learning procedures that learn from large training collections.
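Two reference points for the material of the day. First, a canonical deviation inequality for averages of bounded independent variables is Hoeffding's inequality (the exact inequalities covered in the lecture may differ): for independent X_i with a_i <= X_i <= b_i,

    \[ \mathbb{P}\Big( \Big| \frac{1}{n} \sum_{i=1}^n (X_i - \mathbb{E}[X_i]) \Big| \ge t \Big) \le 2 \exp\Big( - \frac{2 n^2 t^2}{\sum_{i=1}^n (b_i - a_i)^2} \Big). \]

Second, a minimal NumPy sketch of random projection for dimension reduction, in the Johnson-Lindenstrauss spirit (all sizes and names here are illustrative, not taken from the course):

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, k = 100, 10000, 200                 # points, ambient dim, target dim
    X = rng.normal(size=(n, d))

    # Gaussian projection with variance 1/k preserves squared distances
    # in expectation; concentration makes the ratio close to 1 w.h.p.
    P = rng.normal(size=(d, k)) / np.sqrt(k)
    Y = X @ P

    orig = np.sum((X[0] - X[1]) ** 2)
    proj = np.sum((Y[0] - Y[1]) ** 2)
    print(proj / orig)                        # typically close to 1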

[Video of the morning session] (restricted access)
[Video of the afternoon session] (restricted access)
8:45 Welcome
9:00-10:50 Lecture
10:50-11:10 Break
11:10-12:15 Tutorial
[Exercises]
12:15-14:00 Lunch
14:00-15:00 Lecture
15:00-15:15 Break
15:15-16:15 Tutorial
16:15-16:30 Break
16:30-17:30 Lecture

Friday, October 22nd
9:30-10:00 Coffee and introduction
10:00-10:45 Agnès Desolneux A Wasserstein-type distance in the space of Gaussian mixture models
[Slides]
[Video of the talk] (restricted access)
Abstract
In this talk, we introduce a Wasserstein-type distance on the set of Gaussian mixture models. This distance is defined by restricting the set of possible coupling measures in the optimal transport problem to Gaussian mixture models. We derive a very simple discrete formulation for this distance, which makes it suitable for high-dimensional problems. We also study the corresponding multi-marginal and barycenter formulations. We show some properties of this Wasserstein-type distance, and we illustrate its practical use with some examples in image processing. This is joint work with Julie Delon.
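As a pointer for readers, the discrete formulation mentioned above can be sketched as follows (the notation MW_2 and the exact form are a reconstruction from the standard presentation of this distance, not taken from the talk). Writing the two mixtures as \mu_0 = \sum_k \pi_0^k \mu_0^k and \mu_1 = \sum_l \pi_1^l \mu_1^l with Gaussian components, the distance reduces to a finite linear program over couplings of the weight vectors:

    \[ MW_2^2(\mu_0, \mu_1) = \min_{w \in \Pi(\pi_0, \pi_1)} \sum_{k,l} w_{k,l} \, W_2^2(\mu_0^k, \mu_1^l), \]

where each W_2^2 between two Gaussian components has a closed form in terms of their means and covariances, which is what keeps the problem tractable in high dimension.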
10:45-11:15 François-Xavier Vialard Breaking the Curse of Dimension in Smooth Optimal Transport Estimation
[Slides]
Abstract
TBD
11:15-11:45 Break
11:45-12:15 Elsa Cazelles A novel notion of barycenter for probability distributions based on optimal weak mass transport
[Slides]
[Video of the talk] (restricted access)
Abstract
We introduce weak barycenters of a family of probability distributions, based on the recently developed notion of optimal weak transport of mass. We provide a theoretical analysis of this object and discuss its interpretation in the light of convex ordering between probability measures. In particular, we show that, rather than averaging in a geometric way the input distributions, as the Wasserstein barycenter based on classic optimal transport does, weak barycenters extract common geometric information shared by all the input distributions, encoded as a latent random variable that underlies all of them. We also provide an iterative algorithm to compute a weak barycenter for a finite family of input distributions, and a stochastic algorithm that computes them for arbitrary populations of laws. The latter approach is particularly well suited for the streaming setting, i.e., when distributions are observed sequentially.
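For context, the optimal weak transport cost underlying this construction can be sketched as follows (we state the quadratic barycentric variant; which variant the talk uses is an assumption on our part):

    \[ V(\mu \mid \nu) = \inf_{\pi \in \Pi(\nu, \mu)} \int \Big\| x - \int y \, \mathrm{d}\pi_x(y) \Big\|^2 \mathrm{d}\nu(x), \]

where \pi_x denotes the disintegration of the coupling \pi with respect to its first marginal. The cost penalizes the distance from x to the conditional mean of the transported mass rather than to each target point, and a weak barycenter then minimizes the total such cost to the input distributions.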
12:15-14:00 Lunch
14:00-14:45 Lénaïc Chizat Analysis of gradient descent on wide two-layer ReLU neural networks
[Slides]
[Video of the talk] (restricted access)
Abstract
In this talk, we propose an analysis of gradient descent on wide two-layer ReLU neural networks that leads to sharp characterizations of the learned predictor. The main idea is to study the training dynamics when the width of the hidden layer goes to infinity, which is a Wasserstein gradient flow. While this dynamics evolves on a non-convex landscape, we show that for appropriate initializations, its limit, when it exists, is a global minimizer. We also study the implicit regularization of this algorithm when the objective is the unregularized logistic loss, which leads to a max-margin classifier in a certain functional space. We finally discuss what these results tell us about the generalization performance, and in particular how these models compare to kernel methods.
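As a companion to the abstract, here is a minimal NumPy sketch of the object under study: a wide two-layer ReLU network trained by full-batch gradient descent on the logistic loss. The width, initialization, step size, and the 1/m output scaling are illustrative assumptions, not the talk's exact setting.

    import numpy as np

    # Toy data: n points in dimension d with labels in {-1, +1}.
    rng = np.random.default_rng(0)
    n, d, m = 200, 2, 2000                    # samples, input dim, hidden width
    X = rng.normal(size=(n, d))
    y = np.sign(X[:, 0] + 0.3 * X[:, 1])

    W = rng.normal(size=(m, d))               # hidden-layer weights
    a = rng.normal(size=m) / m                # output weights, mean-field 1/m scaling

    lr = 0.5
    for step in range(500):
        Z = X @ W.T                           # (n, m) pre-activations
        H = np.maximum(Z, 0.0)                # ReLU activations
        f = H @ a                             # network outputs, shape (n,)
        # derivative of log(1 + exp(-y f)) in f, clipped for numerical stability
        g = -y / (1.0 + np.exp(np.clip(y * f, -30, 30)))
        a -= lr * (H.T @ g / n)
        W -= lr * (((g[:, None] * (Z > 0)) * a).T @ X / n)

    f = np.maximum(X @ W.T, 0.0) @ a
    print(np.mean(np.sign(f) == y))           # training accuracy of the sketch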
14:45-15:15 Geneviève Robin Outlier detection in networks with missing links
[Slides]
[Video of the talk] (restricted access)
Abstract
Networks are a powerful tool for analysing complex systems: agents are represented as nodes, and pairwise interactions between agents are recorded as edges between these nodes. An important field of application is sociology, where the recent development of online social networks offers unprecedented possibilities. In this setting, estimating the underlying probabilities of interactions between agents is a key problem. However, in many cases existing models fail to describe networks containing a small number of outlier nodes with abnormal behaviour, for example fake accounts created to spread fake news. In addition, real-life networks are likely to be incompletely observed, with missing links due to individual non-response or machine failures. Identifying outliers in the presence of missing links is therefore a crucial problem. In this work, we introduce a new algorithm to detect outliers in a network that simultaneously predicts the missing links. Under fairly general assumptions, our algorithm exactly detects the outliers, and achieves the best known error for the prediction of missing links with polynomial computation cost. We also illustrate the method numerically and present two applications in epidemiology and social network analysis.
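One natural way to make the simultaneous task above concrete (a hedged reconstruction, not necessarily the estimator of the talk) is a penalized decomposition of the adjacency matrix A, observed only on a set \Omega of entries, into a low-rank part L for the nominal connection probabilities and a column-sparse part S carrying the outliers:

    \[ \min_{L, S} \; \frac{1}{2} \| P_\Omega(A - L - S) \|_F^2 + \lambda_1 \|L\|_* + \lambda_2 \|S\|_{2,1}, \]

where \|\cdot\|_* is the nuclear norm (promoting low rank) and \|\cdot\|_{2,1} sums the Euclidean norms of the columns, so that only a few columns of S, the candidate outliers, are nonzero.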
15:15-15:45 Break
15:45-16:15 Nicolas Keriven Graph neural networks on large random graphs: convergence, stability, universality
[Slides]
[Video of the talk] (restricted access)
Abstract
In this talk, we will discuss some theoretical properties of Graph Neural Networks (GNNs) on large graphs. Indeed, most existing analyses of GNNs are purely combinatorial and may fail to reflect their behavior regarding large-scale structures in graphs, such as communities of well-connected nodes or manifold structures. To address this, we assume that the graphs of interest are generated with classical models of random graphs. We first give non-asymptotic convergence bounds of GNNs toward “continuous” equivalents as the number of nodes grows. We then study their stability to small deformations of the underlying random graph model, a crucial property of traditional CNNs. Finally, we study their universality and approximation power, and show how some recent GNNs are more powerful than others.
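For concreteness, the classical random graph models alluded to above can be summarized by a latent-position construction (a standard formulation; the exact models of the talk may differ): draw latent variables x_i i.i.d. on a latent space and connect nodes independently with

    \[ \mathbb{P}(A_{ij} = 1 \mid x_i, x_j) = \alpha_n \, W(x_i, x_j), \]

where W is a bounded symmetric kernel (a graphon when the latent space is [0, 1]) and \alpha_n sets the sparsity level; the “continuous” equivalent of a GNN is then obtained by replacing the normalized adjacency matrix with the integral operator associated with W.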
16:15-16:45 Olga Mula Model order reduction by interpolation in Wasserstein spaces
[Video of the talk] (restricted access)
Abstract
TBD
16:45-17:00 Conclusions