

Anthropic’s models show signs of introspection
Anthropic says its most advanced systems may be learning not just to reason but to reflect on how they reason.
Why it matters: These introspective capabilities could make the models safer, or possibly just better at pretending to be safe.