Characterizing the Influence of Training Data to Enable Trustworthy Machine Learning by Hengrui (Nick) Jia
Abstract
As machine learning (ML) systems become pervasive in our daily lives, it is urgent to understand the extent to which they can be trusted. Demonstrating the trustworthiness of an ML system is challenging due to the “black‑box” nature of ML models: with access only to the final model parameters, it is difficult to pinpoint the root causes of specific vulnerabilities or behaviors. For example, we cannot easily tell which underlying training data is responsible for a given model behavior. In this talk, I will demonstrate how we can tackle this challenge by looking beyond model parameters: my analysis of the training process characterizes how training data contributes to the learned model parameters. Specifically, analyzing and controlling the significance of data over the course of training allows me not only to provide security guarantees for ML systems, but also to help these systems meet critical societal and regulatory demands such as privacy and AI governance.
About the Speaker
Hengrui (Nick) Jia is a PhD student at the University of Toronto and the Vector Institute, advised by Prof. Nicolas Papernot. His research interests lie at the intersection of security and machine learning. To enable more trustworthy machine learning, he has worked on topics including model ownership resolution, machine unlearning, backdoor attacks, and differential privacy. He is a recipient of the Vector Scholarship in AI, the Mary H. Beatty Fellowship, and the Ontario Graduate Scholarship.