OpenAI GPT-5.1 Safety System Card: Expanded Mental Health Evaluations

TL;DR

OpenAI has published a system card addendum for GPT-5.1 Instant and GPT-5.1 Thinking models, providing updated baseline safety metrics. The expanded pre-deployment safety review now includes evaluations for mental health scenarios—covering situations where users may be experiencing delusions, psychosis, or mania—and assessments for emotional reliance on ChatGPT. The comprehensive safety mitigations remain largely consistent with the original GPT-5 System Card.

Opening

As OpenAI prepares to deploy GPT-5.1 Instant and GPT-5.1 Thinking—iterations of the GPT-5 series—the company has released an updated system card addendum detailing safety evaluations and mitigations. The document marks an expansion of baseline safety assessments beyond technical capability metrics.

Context: Expanded Safety Scope

GPT-5.1 Instant features improved conversational style, enhanced instruction following, and adaptive reasoning capability that determines when to think before responding. GPT-5.1 Thinking adapts its processing time more precisely to question complexity. GPT-5.1 Auto continues routing queries to the most suitable model variant.

The system card addendum introduces two new evaluation categories within pre-deployment safety reviews. Mental health evaluations assess model behaviour in situations where users show signs of isolated delusions, psychosis, or mania. Emotional reliance evaluations examine outputs related to unhealthy emotional dependence or attachment to ChatGPT.

These additions reflect OpenAI’s evolving understanding of AI safety risks beyond traditional concerns like harmful content generation or factual accuracy. The focus on psychological impacts acknowledges that conversational AI systems increasingly serve as tools for personal support and decision-making.

Looking Forward

The comprehensive safety mitigations for GPT-5.1 models remain largely unchanged from the GPT-5 System Card, suggesting OpenAI’s safety framework has matured into a stable baseline approach. The expanded evaluation categories indicate the company’s recognition that safety considerations must evolve as AI systems become more conversational and users develop different interaction patterns.

As GPT-5.1 Auto handles model routing automatically, most users won’t directly select between model variants. This abstraction layer means safety considerations must work across different model capabilities and use cases simultaneously.

Source: OpenAI

TL;DR

Opening

Context: Expanded Safety Scope

Looking Forward

Share this article

OpenAI Launches GPT-5.1 with Enhanced Conversational Style and Customisation

OpenAI introduces GDPval: new evaluation for real-world AI performance

Karpathy Challenges AI Optimism: 'The Models Are Not There'

TL;DR

Opening

Context: Expanded Safety Scope

Looking Forward

Share this article

Related Articles

OpenAI Launches GPT-5.1 with Enhanced Conversational Style and Customisation

OpenAI introduces GDPval: new evaluation for real-world AI performance

Karpathy Challenges AI Optimism: 'The Models Are Not There'