A Brief Overview of the Success Story of Large Language Models

Timo Denk, Zalando

May 20, 2021

Abstract

BERT here, GPT there, [you name it] is all you need... Self-supervised, large-scale language models have been vastly successful in recent years. Today's top scorers on NLP benchmarks are almost exclusively Transformer-based architectures. This DSC talk will give a brief overview of the success story of large language models. We will look into the most prominent papers that are built on top of the Transformer, discuss current research directions, and develop an intuition for why ultra-large language models work so exceptionally well.

Bio

Timo Denk received his Bachelor's degree in Computer Science from DHBW Karlsruhe in 2019, working on Wordgrid, an approach to understanding documents with 2-dimensional structure using word-level information. Reflecting his wide range of interests, he worked on a variety of projects during his studies, from 3D-printing a violin to building a smart poker table that detects players' cards and calculates each player's winning probability. Since 2016, he has also operated a hardware and software development company focusing on microcontroller programming. In 2019, he joined Zalando's outfits team as an Applied Scientist, where he works on outfit generation with methods from natural language processing (NLP).
For more information about Timo, please check out his website timodenk.com,
and for more information about this Meetup, have a look at freiburg.ai.

Event Info

The event will take place on Thursday, May 20th, 2021 at 8:00 pm. It will be a joint event with Freiburg AI and will be held online. You can join the live stream at this URL! Kindly help us plan ahead by registering for the event on our Meetup page.