Recent work from Anthropic and others claims that LLMs’ chains of thought (CoTs) can be “unfaithful”. These papers make an important point: you can’t take everything in the CoT at face value. As a result, these results are often used to conclude that the CoT is useless for analyzing and monitoring AI systems. Here, instead of asking whether the CoT always contains all the information relevant to a model’s decision-making across all problems, we ask whether it contains enough information to let developers monitor models in practice. Our experiments suggest that it might.