BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//wp-events-plugin.com//7.4.0.1//EN
TZID:Europe/Paris
X-WR-TIMEZONE:Europe/Paris
BEGIN:VEVENT
UID:6160@i2m.univ-amu.fr
DTSTART;TZID=Europe/Paris:20220204T143000
DTEND;TZID=Europe/Paris:20220204T153000
DTSTAMP:20241120T200924Z
URL:https://www.i2m.univ-amu.fr/evenements/vision-language-transformers-le
 arning-multimodal-representations-a-probing-perspective/
SUMMARY:Emmanuelle SALIN (LIS\, Aix-Marseille Université): Are
  Vision-Language Transformers Learning Multimodal Representations? A
  Probing Perspective
DESCRIPTION:Emmanuelle SALIN: In recent years\, joint text-image embeddings
  have significantly improved thanks to the development of transformer-base
 d Vision-Language models. Despite these advances\, we still need to better
  understand the representations produced by those models. In this paper\, 
 we compare pre-trained and fine-tuned representations at a vision\, langua
 ge and multimodal level. To that end\, we use a set of probing tasks to ev
 aluate the performance of state-of-the-art Vision-Language models and intr
 oduce new datasets specifically for multimodal probing. These datasets are
  carefully designed to address a range of multimodal capabilities while mi
 nimizing the potential for models to rely on bias. Although the results co
 nfirm the ability of Vision-Language models to understand color at a multi
 modal level\, the models seem to prefer relying on bias in text data for o
 bject position and size. On semantically adversarial examples\, we find th
 at those models are able to pinpoint fine-grained multimodal differences. 
 Finally\, we also notice that fine-tuning a Vision-Language model on multi
 modal tasks does not necessarily improve its multimodal ability. We make a
 ll datasets and code available to replicate experiments.
CATEGORIES:Séminaire,Signal et Apprentissage
END:VEVENT
BEGIN:VTIMEZONE
TZID:Europe/Paris
X-LIC-LOCATION:Europe/Paris
BEGIN:STANDARD
DTSTART:20211031T030000
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
END:STANDARD
END:VTIMEZONE
END:VCALENDAR