New Advances in Multimodal Reasoning by Professor Paul Liang

22 Jan 2026, 11.30 AM – 1.00 PM, LT10. Audience: Current Students, Industry/Academic Partners

Abstract

Today's language models are increasingly capable of reasoning over multiple steps with verification and backtracking to solve challenging problems. However, multimodal reasoning models that can reason over integrated modalities—text, images, audio, video, sensors, and external knowledge—are still lacking and represent a key frontier for AI.

This talk describes the group’s work on advancing multimodal reasoning through new benchmarks and the training of multimodal foundation models capable of interactive and long‑range reasoning, with real‑world applications in sensing, health, and wellbeing.


Biography

Professor Paul Liang is an Assistant Professor at the MIT Media Lab and MIT EECS. His research advances the foundations of multisensory artificial intelligence to enhance human experience.

He has received numerous awards, including the Siebel Scholars Award, Waibel Presidential Fellowship, Facebook PhD Fellowship, Center for ML and Health Fellowship, Rising Stars in Data Science, and three best paper awards. He also received the Alan J. Perlis Graduate Student Teaching Award for developing new courses on multimodal AI.