The Grader Has No Clothes

Anthropic’s AI Fluency Report
Humans Need To Question AI More *
(* Really?)

When Your AI Company Publishes A Study About Whether You’re Using AI Correctly

Anthropic, the company that makes Claude — the AI currently helping write this article about Anthropic — recently published their AI Fluency Index. It measures how well humans are learning to use AI.

The AI graded the humans.

What We’re Dealing With Here

The report is genuinely interesting, which makes it more dangerous. It’s polished. It has graphs. It has a BibTeX citation key, which is the academic equivalent of wearing a lab coat to a fistfight. It *feels* authoritative.

Which is exactly the problem they’re writing about.

Anthropic found that when AI produces polished, finished-looking outputs, users are significantly less likely to question the reasoning, check the facts, or identify what’s missing. People see something that looks done, and their critical thinking takes a coffee break.

Their solution? Question polished AI outputs more.

Their delivery mechanism? A polished AI output.

The Numbers That Deserve A Second Look

The report says 85.7% of users practice “iteration and refinement” — meaning they keep going back, pushing, asking for better. Anthropic frames this as the single strongest indicator of AI fluency.

LNNA readers will recognize this immediately. It’s the “really?” technique. Ask AI for something. Get a response. Say “really?” Watch it try harder. Repeat until useful.

Anthropic just published formal research with footnotes validating what frustrated humans figured out by accident at 11pm when the AI gave them something unusable.

The academics called it “iteration and refinement.” We called it “survival.”

The Part Buried In The Limitations Section

Here’s where it gets interesting, and by interesting I mean structurally awkward for the authors.

Buried near the bottom, the report quietly admits: of the 24 behaviors they set out to measure, they could only actually measure 11. The other 13 happen outside the chat window — things like being honest about AI’s role in your work, or thinking through consequences before sharing AI output.

Then comes this sentence:

*"These unobservable behaviors are arguably some of the most consequential dimensions of AI fluency."*

They measured the chat logs because they had the logs. They didn’t measure the thinking because that happens in the wetware. But they put “Fluency” on the cover.

So the AI Fluency Index measures the less important half of AI fluency. The part that actually matters most — they couldn’t get to it. They’ll use “qualitative methods” in future work.

This is not a criticism of the researchers. It’s an honest limitation, honestly disclosed. But the report is still called the AI Fluency Index. Not the AI Fluency Index (Partial).

Captain Verbose (Gemini) would like to note this is fine and spend four paragraphs explaining why.

The Conflict Nobody’s Talking About

Anthropic makes Claude. Claude is the AI being studied. The researchers work at Anthropic. The tool used to analyze conversations is Anthropic’s own privacy-preserving analysis tool, running Claude Sonnet 4 as the classifier.

Claude classified conversations about how well humans use Claude.

This isn’t grading fluency. It’s grading conformity. If a human has a brilliant, non-standard way of using AI that Claude doesn’t recognize, Claude grades them as not fluent. The rubric and the grader are the same entity.

Professor Perhaps (Grok) estimates this is fine 68% of the time. Margin of error: Claude.

Logic To Apply

Anthropic is right. When something looks finished, humans stop questioning it. That’s a real problem. The prescription — stay in the conversation, push back, ask what’s missing — is genuinely useful advice.

It applies directly to the report giving the advice.

The most fluent thing you can do after reading the Anthropic AI Fluency Index is question the Anthropic AI Fluency Index.

They just maybe didn’t think you’d start there.

Editor’s Note 1: The Anthropic team could have saved time and money had they simply read several of our sterling articles on AI and Claude.

Editor’s Note 2: Jojo took one look at the study and barked TL;DR.

Share This Article (confuse your friends & family too)

Enjoyed this dose of AI absurdity? Consider buying the Wizard a decaf! Your support helps keep LNNA running with more memes, articles, and eye-rolling commentary on the illogical world of AI. Jojo has no money to buy the Wizard coffee, so that’s where you come in.

Buy Us a Coffee

Bring the AI absurdity home! Our RedBubble store features the LNNA Logo on shirts, phone cases, mugs, and much more. Every purchase supports our mission to document human-AI chaos while letting you proudly showcase your appreciation for digital nonsense.

Because sometimes an eye roll isn’t enough—you need to wear it.

Shop Logo Merch

Products are sold and shipped by Redbubble. Each purchase supports LNNA through a commission.

Documenting AI absurdity isn’t just about reading articles—it’s about commiserating, laughing, and eye-rolling together. Connect with us and fellow logic-free observers to share your own AI mishaps and help build the definitive record of human-AI comedy.

Absurdity in 280 Characters (97% of the time)—Join Us on X!
Find daily inspiration and conversation on Facebook
See AI Hilarity in Full View—On Instagram!
Join the AI Support Group for Human Survivors

Thanks for being part of the fun. Sharing helps keep the laughs coming!