H. B. Keller Colloquium
Annenberg 105
Recent Advances in AI Safety and Robustness
As AI systems become more powerful, it is increasingly important that developers be able to strictly enforce desired policies on these systems. Unfortunately, techniques such as adversarial attacks have traditionally made it possible to circumvent model policies, allowing bad actors to manipulate LLMs for unintended and potentially harmful purposes. In this talk, I will highlight several recent lines of work making progress on these challenges, including methods for robustness to jailbreaks, safety pre-training, and techniques for preventing undesirable model distillation. I will also discuss the areas I believe are most crucial for future work in the field.
For more information, please contact Narin Seraydarian by phone at (626) 395-6580 or by email at [email protected].
Event Series
H. B. Keller Colloquium Series