BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//wp-events-plugin.com//7.2.3.1//EN
X-WR-TIMEZONE:Europe/Paris
BEGIN:VEVENT
UID:2769@i2m.univ-amu.fr
DTSTART;TZID=Europe/Paris:20190301T140000
DTEND;TZID=Europe/Paris:20190301T150000
DTSTAMP:20190214T130000Z
URL:https://www.i2m.univ-amu.fr/evenements/a-general-system-of-differentia
 l-equations-to-model-first-order-adaptive-algorithms-application-to-adam/
SUMMARY:(...): A general system of differential equations to model first-
 order adaptive algorithms. Application to ADAM.
DESCRIPTION:A couple of years ago\, adaptive algorithms such as ADAM\, R
 MSPROP\, AMSGRAD and ADAGRAD became the default methods of choice for t
 raining machine learning models. Practitioners commonly observed that t
 he value of the training loss decays faster than for stochastic gradien
 t descent\, but the underlying reason is still not understood. A motiva
 tion of our work was to understand what properties make them so well su
 ited for deep learning. In this talk\, I will analyze adaptive algorith
 ms by studying their continuous-time counterparts. I will first explain
  the connection between the optimization algorithms and continuous diff
 erential equations. Then\, I will give sufficient conditions to guarant
 ee convergence of trajectories towards a critical value and will discus
 s some properties of adaptive algorithms. This is joint work with A. Be
 lotto Da Silva.\nhttp://www.math.toronto.edu/gazeauma/
END:VEVENT
BEGIN:VTIMEZONE
TZID:Europe/Paris
X-LIC-LOCATION:Europe/Paris
BEGIN:STANDARD
DTSTART:20181028T020000
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
END:STANDARD
END:VTIMEZONE
END:VCALENDAR