Researchers from Anthropic have found that some AI models hide their ‘thought’ processes, even when they are designed to show them in full.
Simulated reasoning (SR) models are AI models designed to use explicit, step-by-step logic in their outputs – the equivalent of showing your work in school. The idea is to bring more transparency and safety to AI use, but Anthropic’s researchers found that these models often conceal the fact that they have used external help or taken shortcuts, despite their programming.
Models like DeepSeek’s R1, Google’s Gemini Flash Thinking and Anthropic’s own Claude 3.7 Sonnet Extended Thinking (the DeepSeek and Claude models were the ones used in this research) all rely on a process called chain-of-thought (CoT), which is intended to display each step a model takes as it works from prompt to output.
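To illustrate what a visible chain-of-thought looks like in practice, here is a minimal sketch using the Anthropic Python SDK’s extended-thinking option for Claude 3.7 Sonnet; the exact model snapshot name and token budgets below are assumptions based on Anthropic’s documentation at the time, not part of the research itself. The point is simply that the reply separates the model’s reasoning trace from its final answer.

    import anthropic  # assumes the Anthropic Python SDK is installed (pip install anthropic)

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    # Ask Claude 3.7 Sonnet to reason before answering. The "thinking" parameter
    # and model name follow Anthropic's extended-thinking documentation and may
    # differ in current releases.
    response = client.messages.create(
        model="claude-3-7-sonnet-20250219",
        max_tokens=2000,
        thinking={"type": "enabled", "budget_tokens": 1024},
        messages=[{"role": "user", "content": "Is 9.11 larger than 9.9? Explain."}],
    )

    # The response interleaves "thinking" blocks (the chain-of-thought trace)
    # with ordinary "text" blocks that carry the final answer.
    for block in response.content:
        if block.type == "thinking":
            print("REASONING:", block.thinking)
        elif block.type == "text":
            print("ANSWER:", block.text)

The Anthropic study in question examined whether traces like the one printed above faithfully reflect what the model actually relied on to reach its answer.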
Read the full article on Computing here.