AI Safety Hub Edinburgh

Evaluating Social and Ethical Risks from Generative AI

How do we know when an AI system is “safe”? Laura Weidinger and Verena Rieser present a sociotechnical approach to AI safety, and discuss how AI risks can be evaluated.

Abstract

How do we know when an AI system is “safe”? In this talk, we canvass notions of safety and introduce a sociotechnical approach to AI safety. We discuss what safety risks generative AI systems produce and how those can be assessed.

We introduce two main new artefacts. First, we propose a three-layered framework that takes a structured, sociotechnical approach to evaluating these risks. This framework encompasses capability evaluations, which are the main current approach to safety evaluation. It then reaches further by building on system safety principles, particularly the insight that context determines whether a given capability may cause harm. To account for relevant context, our framework adds human interaction and systemic impacts as additional layers of evaluation.

Second, we survey the current state of safety evaluation of generative AI systems and create a repository of existing evaluations. Three salient evaluation gaps emerge from this analysis. We propose ways to close these gaps, outlining practical steps as well as roles and responsibilities for different actors. Sociotechnical safety evaluation is a tractable approach to the robust and comprehensive safety evaluation of generative AI systems.

We close by sharing some of the concrete work we are doing at Google DeepMind (GDM) to build novel approaches to safety evaluation.

Speaker Bios

Laura Weidinger is a Senior Research Scientist at Google DeepMind, where she is part of the Ethics Research team. Laura’s research focuses on sociotechnical topics including the safety evaluation of Artificial Intelligence systems.

Verena Rieser [f e: r e: n a r i: z ɐ] joined Google DeepMind as a Senior Staff Research Scientist, where she is part of the Scalable Alignment Team. She is also an honorary professor at Heriot-Watt University in Edinburgh and a co-founder of the conversational AI company ALANAai.com.

In-Person Housekeeping

For those attending in person, the talk will take place in room G.07 of the Informatics Forum. Please ask at reception if you need access to the room.

We advise those with a symptomatic transmissible illness not to attend in person, and to use the remote attendance option instead. Other attendees may wear a face covering if they wish, though this is not required or expected.