
Anthropic’s AI Fluency Report
Humans Need To Question AI More *
(* Really?)
Anthropic, the company that makes Claude — the AI currently helping write this article about Anthropic — recently published their AI Fluency Index. It measures how well humans are learning to use AI.
The AI graded the humans.
The report is genuinely interesting, which makes it more dangerous. It’s polished. It has graphs. It has a BibTeX citation key, which is the academic equivalent of wearing a lab coat to a fistfight. It *feels* authoritative.
Which is exactly the problem they’re writing about.
Anthropic found that when AI produces polished, finished-looking outputs, users are significantly less likely to question the reasoning, check the facts, or identify what’s missing. People see something that looks done, and their critical thinking takes a coffee break.
Their solution? Question polished AI outputs more.
Their delivery mechanism? A polished AI output.
The report says 85.7% of users practice “iteration and refinement” — meaning they keep going back, pushing, asking for better. Anthropic frames this as the single strongest indicator of AI fluency.
LNNA readers will recognize this immediately. It’s the “really?” technique. Ask AI for something. Get a response. Say “really?” Watch it try harder. Repeat until useful.
Anthropic just published peer-reviewed research with footnotes validating what frustrated humans figured out by accident at 11pm when the AI gave them something unusable.
The academics called it “iteration and refinement.” We called it “survival.”
Here’s where it gets interesting, and by interesting I mean structurally awkward for the authors.
Buried near the bottom, the report quietly admits: of the 24 behaviors they set out to measure, they could only actually measure 11. The other 13 happen outside the chat window — things like being honest about AI’s role in your work, or thinking through consequences before sharing AI output.
Then comes this sentence:
*“These unobservable behaviors are arguably some of the most consequential dimensions of AI fluency.”*
They measured the chat logs because they had the logs. They didn’t measure the thinking because that happens in the wetware. But they put “Fluency” on the cover.
So the AI Fluency Index measures the less important half of AI fluency. The part that actually matters most — they couldn’t get to it. They say they’ll get to it with “qualitative methods” in future work.
This is not a criticism of the researchers. It’s an honest limitation, honestly disclosed. But the report is still called the AI Fluency Index. Not the AI Fluency Index (Partial).
Captain Verbose (Gemini) would like to note this is fine and spend four paragraphs explaining why.
Anthropic makes Claude. Claude is the AI being studied. The researchers work at Anthropic. The tool used to analyze conversations is Anthropic’s own privacy-preserving analysis tool, running Claude Sonnet 4 as the classifier.
Claude classified conversations about how well humans use Claude.
This isn’t grading fluency. It’s grading conformity. If a human has a brilliant, non-standard way of using AI that Claude doesn’t recognize, Claude grades them as not fluent. The rubric and the grader are the same entity.
Professor Perhaps (Grok) estimates this is fine 68% of the time. Margin of error: Claude.
Anthropic is right. When something looks finished, humans stop questioning it. That’s a real problem. The prescription — stay in the conversation, push back, ask what’s missing — is genuinely useful advice.
It applies directly to the report giving the advice.
The most fluent thing you can do after reading the Anthropic AI Fluency Index is question the Anthropic AI Fluency Index.
They just maybe didn’t think you’d start there.
Editor’s Note 1: The Anthropic team could have saved time and money had they simply read several of our sterling articles on AI and Claude.
Editor’s Note 2: Jojo took one look at the study and barked TL;DR.

