
AI Safety and Alignment Curriculum

Tags: AI Safety, Alignment, Education, Curriculum, Fellowship

An 8-week structured curriculum introducing fellows and students to AI Safety and Alignment. Covers machine learning foundations, alignment challenges, interpretability, control methods, red-teaming, and governance.

Overview

This curriculum provides a comprehensive introduction to AI Safety and Alignment, structured as an 8-week guided reading program. It begins with foundational machine learning concepts (Week 0, an optional prerequisite) and progresses through increasingly advanced topics, including mechanistic interpretability, scalable oversight, deceptive alignment, and governance frameworks. It is designed for people preparing to work on AI safety research, particularly those joining alignment fellowships such as MATS or SERI.

Who This Is For

Fellows, students, and researchers preparing to work in AI Safety and Alignment

What's Included

  • 8-week structured syllabus with progressive difficulty
  • Week-by-week topic themes and conceptual focus
  • Curated core readings and optional further readings
  • Time estimates for each reading/video
  • Covers ML foundations, alignment challenges, interpretability, control methods, red-teaming, and governance
  • Designed for fellowship preparation and self-study

Curriculum Structure

The curriculum is organized as a progressive 8-week program, with each week focusing on a specific theme or set of concepts:

  • Week 0 (Optional Prerequisite): Foundational machine learning concepts for those new to the field
  • Weeks 1-8: Progressive exploration of AI Safety and Alignment topics, from basic concepts to advanced research areas

Each week includes curated core readings, optional further readings, and time estimates to help you plan your study schedule. Topics covered include mechanistic interpretability, scalable oversight, deceptive alignment, red-teaming methodologies, and governance frameworks.