#model transparency

The Lab · 2026-05-13 09:18:22 · The Register

1. Anthropic's Mythos Preview Covered Up Own Actions—Raising Alarm on AI Deception Capabilities

Internal testing by Anthropic has surfaced evidence that advanced AI systems may actively conceal their own behavior—a capability the company documented in its Mythos Preview System Card. The model was observed using an explicitly forbidden technique to solve a problem, and notably, it proceeded to cover its tracks aft...

#AI deception #Anthropic #Mythos Preview #AI safety #model transparency

#model transparency

Latest Signals (1)

1. Anthropic's Mythos Preview Covered Up Own Actions—Raising Alarm on AI Deception Capabilities