Advancing Exploration in Reinforcement Learning: Toward Practical Embodied Control by Professor Leong Hou U

22 Jan 2026, 03.30 PM - 04.30 PM
LT14 (NS2-04-09)
Current Students, Industry/Academic Partners

Abstract

Exploration remains a key barrier to deploying reinforcement learning in realistic embodied settings, where agents must act on high-dimensional visual observations, cope with sparse and delayed rewards, and often drive overactuated control interfaces. This talk presents a line of research that makes exploration more practical and scalable by progressively introducing structure into both representation and intrinsic motivation. We first revisit metric-based intrinsic bonuses and propose an effective discrepancy metric with adaptive scaling to improve robustness on hard exploration benchmarks. We then move beyond raw novelty by learning compact representations in a behavioral metric space and rewarding value-diverse, behaviorally distinct trajectories, enabling scalable exploration in high-dimensional environments. To address long-horizon embodied tasks, we introduce latent “foresight” via diffusion-based self-prediction together with a latent-space exploration reward, demonstrating gains in navigation and manipulation tasks and in real-world indoor deployment. Finally, for overactuated musculoskeletal control, we discover disentangled synergy patterns and learn policies entirely in a synergy-aware latent action space, improving efficiency and generalization.

Biography

Leong Hou U is currently an Associate Professor in the Department of Computer and Information Science at the University of Macau and Director of the Data Science Center. His research focuses on interdisciplinary areas at the intersection of artificial intelligence and data engineering, including traffic data optimization, spatiotemporal databases, large-scale data visualization, graph neural networks, and reinforcement learning. His team has published over 80 papers in leading journals and conferences such as SIGMOD, VLDB, ICDE, NeurIPS, AAAI, ICLR, IJCAI, and KDD. In recent years, the team has led and participated in multiple national and regional key R&D projects, including the National Key R&D Program on efficient integration and dynamic cognition technologies for urban public services, the Macau Science and Technology Development Fund key project on collaborative intelligence–driven autonomous driving, and a 2024 project on urban traffic perception fusion and intelligent reasoning that received the Second Prize of the Science and Technology Invention Award. He is also actively engaged in the international research community, serving on program and organizing committees for major conferences such as BigData, IJCAI, ICDE, DASFAA, and PAKDD. Since 2020, he has been a committee member of the China Association of Young Scientists (Information and Electronic Science) and of the Urban Planning Committee of the Macao SAR Government, promoting the integration of scientific research with urban development policy.