
The AI literacy paradox — The real leap is when we let go of AI as a tool

Mar 26, 2026 | Expeditions, GenAI Misc, Log Diaries, Pod Chronicles

When a technology shift is small, the existing mental models stretch to accommodate it. When the shift is categorical, everything cracks. AI isn’t exposing a skills gap. It’s exposing a paradigm gap — between what we think we’re working with and what we’re actually working with. And for the first time, the turbulence is strong enough to trace exactly where the cracks run.

The blueprint nobody questions

For decades, every significant technology that entered the workplace followed the same pattern. A new capability arrived. We learned its features. We measured proficiency by how well people operated it. We built training programmes around it, certification levels for it, organisational structures to support it. And then we moved on to the next one.

This worked. It worked because the technologies were tools. Word processors, spreadsheets, CRM systems, design software, project management platforms — they all shared a fundamental characteristic: you operated them. You put something in. You got something out. The relationship was one-directional. The intelligence lived in the person; the tool executed.

Over those decades, we didn’t just learn the tools. We built a mental model about what technology is in the workplace. A deep, unexamined assumption: new technology means new features to learn, new interfaces to master, new skills to certify. The model became so natural it stopped being visible. Like water to fish.

AI entered the workplace through that same mental model. And at first, it fitted. Early encounters with ChatGPT really were tool-like — you typed a prompt, you got an output. The mental model held. It felt accurate because, for that moment, it was.

The AI literacy frameworks that followed encoded this model faithfully. Researchers examining 16 major AI literacy instruments found that every single one measures operational sophistication — your ability to use AI as a tool. Can you prompt effectively? Can you navigate platforms? Can you integrate AI outputs into your workflow? Thirteen of the sixteen are pure self-report. And not one measures what might be the most consequential dimension: whether you understand what AI changes about your work. Not what it can do. What it changes.

That’s not a gap in the research. That’s the mental model reproducing itself. The instruments measure what the model says matters: tool proficiency. The model says tool proficiency is what matters because that’s what the instruments measure. The loop is already running before anyone notices it’s a loop.

But here’s the number that should stop us. McKinsey’s 2025 research found that 88% of organisations now regularly use AI. Only 5–6% capture meaningful enterprise-level value. Sit with that for a moment. Nearly nine out of ten organisations have adopted AI. Fewer than one in twenty have figured out how to make it actually matter. McKinsey calls this “adoption without absorption” — deploying a technology without metabolising it into how the organisation thinks and works.

That’s not a small gap. That’s a chasm. And it’s telling us something fundamental about the nature of the problem. If this were a skills gap, training would close it. If this were a tool gap, better tools would close it. If this were an awareness gap, information would close it. But 88% adoption means the awareness is there, the tools are there, the access is there. And still — 5%. Something structural is preventing the translation from adoption to value. Something that operates below the level of tools, training, and intention.

That something is the mental model itself.

Why this one is different

Every previous technology fit into the category “tool” because it was a tool. AI doesn’t fit — and the cognitive science tells us precisely why the mismatch is so hard to see.

Michelene Chi’s research on conceptual change identifies three types, each progressively harder. The easiest is belief revision — updating a single fact (“this model is more capable than I thought”). The middle is mental model transformation — restructuring how the pieces of your understanding fit together, not just swapping one fact for another. The hardest, and the one that applies here, is categorical shift: moving a concept from one fundamental category to another entirely.

AI is migrating from what Chi calls a “direct process” entity — linear, deterministic, controlled by the user — to an “emergent process” entity — non-deterministic, adaptive, capable of initiative. This isn’t an upgrade within the tool category. It’s a departure from it. And categorical shifts are, according to Chi’s research, the most resistant form of conceptual change there is.

Why so resistant? Because the existing category doesn’t passively wait to be replaced. It actively filters incoming evidence.

Dedre Gentner’s structural mapping theory shows the mechanism. The “AI = software tool” analogy maps surface features accurately — there’s an interface, there’s input and output, there’s a screen. These surface similarities make the analogy feel correct. But it fails at the relational level. The relationship between you and AI is no longer the same as between you and a software tool. When AI behaves unexpectedly, instead of questioning the category, people revise their beliefs within it: “this tool has bugs,” “the output isn’t reliable,” “it needs better training data.” The category is preserved. The anomaly is absorbed. The shift never happens.

Stella Vosniadou’s research on conceptual change adds the next layer. When people encounter observations genuinely incompatible with their paradigm, they don’t immediately shift. They construct synthetic mental models — hybrids that blend new evidence with the old framework. For AI, this sounds like: “AI is a very powerful, somewhat unpredictable application that I control through prompts.”

That sentence feels reasonable. It probably describes your own working model, or something close to it. And that’s the danger — it’s coherent enough to feel like understanding while preventing the deeper realisation that the relationship itself has changed. From command to collaboration. From operating to orchestrating. From tool to something we don’t yet have a settled word for.

This pattern has a physical precedent that makes it easier to see. When the automobile arrived at the turn of the twentieth century, the first designs were horse carriages without horses — literally called “horseless carriages.” The driver sat high up where the coachman had sat, positioned to see over horses that weren’t there. The wheels were wooden and spoked, designed for horse speed. The suspension was built for trotting, not engines. One 1899 design — the “Horsey Horseless” by Uriah Smith — even bolted a carved wooden horse head to the front to avoid frightening real horses on the road. Every design choice came from the old category applied to the new reality. It took only about five years before the most advanced designs escaped the carriage entirely and became something genuinely new. But during that transition, the inherited mental model shaped everything that got built — including the parts that made no sense for what the technology actually was.

We’re in the horseless carriage phase of AI. The LinkedIn skill lists, the tool-proficiency frameworks, the feature-based maturity models — these are the wooden wheels and the coachman’s seat. They map the old category onto the new reality with enough surface accuracy to feel right. But they encode assumptions that belong to the previous paradigm: that the relationship is one-directional, that proficiency means operation, that literacy means knowing what buttons to press. The carved horse head on the front.

This is where AI parts company with every technology adoption that came before it. We’ve never had to make a categorical shift about what workplace technology is. Every previous adoption was an upgrade within the same category. This one requires leaving the category altogether. And the mental model we’ve spent decades building — the one that served us perfectly for every previous technology — is now the primary obstacle.

It has never been this clear. Small shifts don’t produce visible turbulence. When the change is incremental — a new version of software, a better interface, additional features — the existing mental model stretches to accommodate it without cracking. But when the shift is categorical, when it requires leaving one ontological class for another, the cracks appear everywhere. In how individuals think. In how organisations structure. In how we measure progress. And for the first time, those cracks are visible enough to trace — from the individual all the way through the organisation and back.

The loop that locks itself

Here’s where the research reveals something that hasn’t been this traceable before.

The individual’s frozen mental model doesn’t just sit quietly inside one person’s head. It shapes what they propose, what they evaluate, what they consider possible. And it enters the organisation through every meeting, every strategy document, every portfolio decision, every training programme designed by people operating from the tool paradigm.

But it doesn’t stop there. The organisation — whose structures, measurement systems, and decision frameworks were built by people with the same mental model — mirrors the freeze back. Your proposal gets evaluated through tool-paradigm criteria. Your maturity gets assessed against tool-paradigm frameworks. Your environment confirms: yes, AI is a tool. You are right to think of it that way. Here is your Level 3 score.

The individual sees confirmation. The loop tightens. And this is where the research gets genuinely new — because we can now trace the mechanisms at each stage of the loop, name them, and see why they’re so resistant to intervention.

The individual mechanisms

In 2008, Merim Bilalic and colleagues ran a study that should be required reading for anyone leading AI transformation. They put chess masters in front of problems with both a familiar solution and a better unfamiliar one, then tracked their eye movements.

The chess masters’ gaze continued fixating on features of the familiar solution while they claimed to be searching for alternatives. Performance dropped three standard deviations below normal. The bias — called the Einstellung Effect — doesn’t operate through conscious choice. It operates through attentional allocation. The first schema activated by familiar features literally directs where your eyes go. Better solutions become invisible. Not metaphorically. Literally.

Now apply this to every AI evaluation meeting you’ve ever sat in. The senior enterprise architect with decades of valuable experience evaluates AI agents through software criteria — determinism, auditability, latency — finds them wanting on those criteria, and concludes applications are still the right choice. The evaluation feels rigorous. The criteria feel objective. And the lens is wrong without anyone being able to see that it’s wrong — because the Einstellung Effect controls where their attention goes.

Nęcka, Gruszka, and Orzechowski’s research adds a critical finding: experts resist rigidity well inside their own domain (they are hard to mislead on familiar ground) but show heightened inter-domain rigidity, performing worse than non-experts when the fundamental rules change. The shift from tool-use to AI collaboration is precisely an inter-domain shift — and expertise in the old domain makes it harder, not easier, to navigate.

Erik Dane’s work on cognitive entrenchment explains why this doesn’t feel like inflexibility from the inside. Deep expertise offers “perceived optimal efficiency” — the entrenched individual stays with familiar patterns because they minimise cognitive load and maintain the feeling of competence. Entrenchment feels like competence. The expert doesn’t experience themselves as stuck. They experience themselves as experienced.

And then there’s what happens when someone has made the shift and tries to bring it into the room. Research by Ackerhans and Wehkamp on medical professionals found that the loss of autonomy in decision-making drives resistance more than fear of replacement. The primary psychological mechanism isn’t “AI will take my job.” It’s “AI will take my agency in my own work.” When you propose a fundamentally different way of working with AI — collaboration instead of operation — you’re not just suggesting a new tool. You’re implicitly suggesting that the expert’s model of their own role needs updating. That triggers identity threat at four distinct levels: self-esteem (“Am I still valuable?”), self-efficacy (“Can I remain effective?”), continuity (“How do I maintain my identity through this?”), and distinction (“What makes me uniquely human?”).

No wonder the proposal gets dismissed.

The organisational amplification

If these were just individual cognitive patterns, training might address them. But individual patterns don’t persist in isolation. They persist because the social environment rewards them and the organisational structure sustains them.

A field experiment with 450 workers revealed something counterintuitive: when AI use is visible to evaluators, people reduce their reliance on AI by 14%. Accuracy declined 3.4% — they were performing worse by hiding their AI use — but the social calculus made hiding rational. Being seen using AI carries risk. So people who’ve made the categorical shift retreat into private experimentation and perform as tool operators in public.

This connects to a pattern confirmed across 14 countries by the Behavioural Insights Team: stated acceptance of AI is systematically more positive than actual behavioural adoption. People say they’re ready. Their behaviour says otherwise. The gap between stated acceptance and behavioural adoption tells us something important about the environment, not the individuals.

And this is where the social cognitive biases enter — not as individual quirks but as structural load-bearing mechanisms that keep the loop running.

Groupthink validates frozen mental models: “We all agree AI isn’t mature enough for that.” The consensus makes each individual’s Snapshot Freeze feel like shared knowledge rather than shared limitation.

Conformity bias makes the person who sees the shift feel like the problem: proposing something outside the group’s paradigm doesn’t feel like offering insight — it feels like breaking social contract.

Authority bias amplifies the Expertise Shield: senior people’s outdated models carry more weight precisely because of their seniority, regardless of whether their models are current.

Status quo bias operates at group level: existing tools, existing processes, existing portfolio decisions are “proven.” Change requires justification; staying the same does not.

Adolfo Carreno’s research on organisational immune systems describes what happens when these biases institutionalise. Resistance to change functions like a biological immune response — operating through pattern recognition, learned responses, and selective memory to protect organisational stability. When defensive responses harden into what Carreno calls “immunity memory,” the organisation begins to misidentify productive novelty as pathogen. Innovation gets neutralised not because it doesn’t work, but because it’s perceived as instability threatening homeostasis.

Chris Argyris identified the mechanism that holds it all together: organisational defensive routines. Recent research by Yang, Secchi, and Homberg maps four manifestations: rigidity (resistance to changing established procedures), embarrassment avoidance (suppression of critical doubts), cover-up (concealing mistakes or using intentional vagueness), and pretense (acting as if the official strategy is functional when everyone quietly knows it isn’t).

The result is what Argyris called “the undiscussable” — patterns that cannot be discussed without threatening social belonging. Everyone in the room knows the official AI picture doesn’t quite match ground reality. Saying so has costs. Not saying so is safer. The gap between official story and lived experience normalises until it becomes the water everyone swims in.

And here’s the insight that changes the frame entirely: these biases aren’t bugs. They’re adaptive collective coping mechanisms. When measurement tools show green and daily experience shows friction, the human brain has to resolve that dissonance. Social biases are how groups resolve it collectively. They’re the immune system of the status quo — and they serve a protective function. Attacking the biases without addressing the structural conditions that make them necessary produces anxiety, not progress.

Meyer and Rowan described this dynamic as decoupling — organisations deliberately adopt formal policies that satisfy external stakeholders while buffering their internal core from actual disruption. Formal AI strategy documents live on one track. How people actually work lives on another. The organisation maintains what researchers call “dual consciousness” — two simultaneous operating realities. This isn’t pathological. It’s structural. And it’s the mechanism by which organisations absorb the pressure of transformation without actually transforming.

The loop is now complete. Individual mental model → organisational structures → social dynamics that protect the structures → confirmation signals back to the individual. Each component sustains the others. And the loop operates at every scale — between individuals in a team, between teams in an organisation, between organisations in an industry, between institutions and the sectors they serve. Same mechanisms. Different boundaries. Same result: the categorical shift that needs to happen gets absorbed, buffered, and neutralised.

This is not a dark picture. It’s a human one. These are coping mechanisms for genuine structural incoherence, not character flaws. But seeing them clearly — seeing the full loop — is the first step toward intervening in it. Because you can’t change a system you can’t see.

What the mirror shows you

Existing AI literacy frameworks share a structural flaw that the loop makes visible: they only face upward. They describe aspiration — here’s Level 1, here’s Level 5, here’s the path between them. They assume linear progression. And they measure the individual in isolation, as if the environment doesn’t determine what’s possible.

The result is a framework that behaves like a doctor who can only describe health, never diagnose illness. “Here’s what Level 4 looks like. You’re not there yet.” But no explanation of what’s preventing movement. No distinction between someone who lacks capability and someone who lacks the conditions to demonstrate capability. No recognition that the environment might be the constraint, not the individual.

What would it look like to build a diagnostic that captures both sides?

I’ve been developing something I call the Dual-Perspective AI Literacy Model. It starts with a simple observation: your effective capability isn’t determined by your level alone. It’s determined by the relationship between your level and your environment’s level.

Two gauges, not one

Imagine two gauges side by side. The left shows your AI literacy — your personal understanding, practice, and flexibility with AI. The right shows your environment’s collective mental model — the paradigm operating in your team, your organisation, your industry.

When the readings align, there’s no friction. Both at Level 2? Everything feels functional. You’re productive, valued, making sense to others.

When you’re ahead of your environment, the gap becomes friction. Your proposals can’t be parsed by the paradigm doing the evaluating. At one level apart: mild frustration. At two or three: structural disconnection. Your effective impact is constrained to the lower reading — a Level 4 practitioner in a Level 1 environment operates at Level 1 in that context.

When your environment is ahead of you, you feel something different: things moving too fast, conversations in a language you don’t quite speak, a vague sense that the ground rules changed while you weren’t looking.
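
Before looking at what each level mirrors back, here is the gauge logic as a minimal sketch in code. The min() gating (your effective impact is constrained to the lower reading) comes straight from the model above; the numeric encoding and the friction labels are illustrative choices of mine, not part of the model itself.

```python
# Minimal sketch of the dual-gauge logic. The min() gating comes from
# the model described above; the numeric encoding and friction labels
# are illustrative assumptions, not part of the model itself.

LEVELS = {
    1: "Frozen Frame",
    2: "Tool Operator",
    3: "Pattern Recogniser",
    4: "Paradigm Navigator",
    5: "System Architect of Change",
}

def effective_level(individual: int, environment: int) -> int:
    """Effective capability is constrained to the lower of the two gauges."""
    return min(individual, environment)

def friction(individual: int, environment: int) -> str:
    """Rough direction and severity of the gap between the gauges."""
    gap = individual - environment
    if gap == 0:
        return "aligned: no felt friction"
    if gap > 0:
        severity = "mild frustration" if gap == 1 else "structural disconnection"
        return f"{severity} (you are ahead of your environment)"
    return "the ground rules feel changed (your environment is ahead of you)"

# The example from the text: a Level 4 practitioner in a Level 1
# environment operates at Level 1 in that context.
print(LEVELS[effective_level(4, 1)])  # -> Frozen Frame
print(friction(4, 1))                 # -> structural disconnection (...)
```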

What the organisation mirrors back

The dual gauge shows the distance. But to understand what you’ll actually encounter, you need to see what the organisation reflects back at each level — including the social cognitive artifacts that maintain the freeze.

Level 1 — The Frozen Frame. The mental model of AI was formed from early encounters and cached as settled knowledge. Chi’s ontological mismatch is operating invisibly — the category was set, everything since has been filtered through it. Groupthink validates: “We all agree AI isn’t ready for serious work.” There’s no felt friction because you’re producing the current, not swimming against it. The organisation mirrors back confirmation: “We’re being appropriately cautious.”

Level 2 — The Tool Operator. The comfort zone. The organisation rewards tool proficiency, and you are genuinely proficient. The Expertise Shield is forming — AI is “a tool I operate with skill,” evaluated by speed, accuracy, reliability. Dane’s cognitive entrenchment is active, and it feels like competence. Someone proposing AI as a “collaborator” sounds impractical from here. Bandwagon effect normalises the position: “Nobody else is doing this differently either.” The mirror says: “You’re doing well. Keep developing your skills.”

Level 3 — The Pattern Recogniser. The gap opens. You’ve crossed what Chi would call the ontological threshold — you’ve begun experiencing AI as a different kind of entity. Your environment hasn’t. The same colleagues who felt like peers now feel like barriers. The Einstellung Effect is visible to you, operating in others. Conformity pressure pushes back toward Level 2 — the 14% visibility retreat is rational here. You use AI as a collaborator privately and perform as a tool operator in public. The mirror says: “You’re overthinking this. Just focus on what works.”

Level 4 — The Paradigm Navigator. The gap becomes structural. You think in capabilities and redesigned workflows. Your environment thinks in tools and applications. Your proposals sound “too ambitious” for the meeting format. And here Sen’s Capability Approach becomes startlingly relevant: you may have the capability, but you lack the conversion factors to demonstrate it — role, visibility, access, time, platform. The maturity framework says Level 4 is achievable. The organisation’s architecture won’t let you operate above Level 2. The mirror says: “Interesting ideas, but let’s be realistic about what we can implement.”

Level 5 — The System Architect of Change. You see the full loop — including yourself in it. The risk is isolation: seeing so clearly that you lose patience with the pace. Carreno’s research on organisational immune systems becomes practical knowledge — you understand that successful local pilots fail to scale when the parent organisation’s immune system treats them as foreign bodies. You design for gradual shifts, not dramatic reveals. Participatory governance — giving practitioners voice in defining what “progress” means — becomes more effective than mandate. The mirror, if the environment has begun to shift, says: “Help us see what you see.” If it hasn’t: silence.

What the dual perspective reveals

The model does something no existing framework does: it diagnoses the present condition, not just the aspiration. It makes visible that a person’s stuckness might not be their own — it might be the environment’s. It names the social cognitive biases that operate at each level, not as character flaws but as artifacts of the gap between individual understanding and organisational structure.

And it reveals the loop in actionable terms. The reason the gap persists isn’t that people refuse to learn. It’s that the environment — built on the same frozen mental model — confirms the freeze and punishes the shift. The measurement frameworks score the wrong dimension. The social dynamics protect the status quo. And the individual, seeing their environment’s response, rationally concludes that the shift isn’t valued.

Repenning and Sterman at MIT identified the system dynamic that keeps this stable. Organisations operate with two competing loops: “Work Harder” (pressure for throughput, which crowds out investment in learning) and “Work Smarter” (invest in capability, which initially produces lower output). When “Work Harder” dominates — and it almost always does, because it produces immediate visible results — capability erodes slowly, with a time delay that masks the erosion. Managers misattribute the erosion to individual motivation rather than system dynamics. The measurement systems provide cover for this misattribution.

The dashboard shows throughput. It doesn’t show capability decay. And so the loop deepens.
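
To make the shape of that dynamic concrete, here is a toy simulation of the two loops. Every parameter is invented; the only thing the sketch is meant to show is the signature Repenning and Sterman describe: “Work Harder” gives an immediate, visible throughput boost while the underlying capability stock erodes too slowly for the dashboard to register.

```python
# Toy simulation of the "Work Harder" vs "Work Smarter" dynamic.
# All parameters are invented for illustration. The shape is the point:
# with zero investment in capability, throughput looks fine for months
# while the capability stock quietly erodes underneath it.

def simulate(invest: float, periods: int = 24):
    """invest = fraction of effort diverted from output to learning
    (0.0 means pure 'Work Harder')."""
    capability = 1.0
    history = []
    for month in range(periods):
        effort_on_output = 1.0 - invest
        # Pressure for throughput yields an immediate, visible boost.
        throughput = capability * effort_on_output * 1.2
        history.append((month, throughput, capability))
        # Capability decays under pure delivery pressure and is rebuilt
        # by investment -- the erosion is slow, so the delay masks it.
        capability = max(0.0, capability + 0.08 * invest - 0.02 * effort_on_output)
    return history

for month, throughput, capability in simulate(invest=0.0):
    if month % 6 == 0:
        print(f"month {month:2d}: dashboard throughput {throughput:.2f}, "
              f"underlying capability {capability:.2f}")
```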

Here is each level in detail, from the top of the scale down: the flexibility marker that defines it, the capability profile that comes with it, and what the organisation mirrors back.

Level 5 — The System Architect of Change (high flexibility)

Flexibility marker. You include yourself in the picture you’re diagnosing. You’ve stopped blaming “resistant people” and started seeing structural dynamics. You design for gradual chain effects, not dramatic reveals. The highest flexibility is knowing when NOT to reconfigure — when stability serves the transition better than disruption.

Capability profile:

  • Understands the social amplification layer
  • Can design skill/agent/application portfolios that match actual need
  • Works WITH organisational immune responses, not against them
  • Addresses the structural conditions that make biases necessary
  • Knows that participatory governance — giving practitioners voice in defining progress — is more effective than top-down mandate

What the organisation mirrors back:

  • The temptation to “bang on the big drum” rather than building understanding gradually
  • The challenge of meeting people where they are without condescending
  • A rare level — most organisations have no one here, which is why the structural dynamic remains invisible
  • Decoupling everywhere: formal AI policies for external legitimacy, internal reality unchanged

The paradox: the more clearly you see the system, the harder it is to communicate without triggering the very defences you’ve mapped.

Level 4 — The Paradigm Navigator (growing flexibility)

Flexibility marker. You see the full picture but struggle to communicate it without triggering the Abstract Defence. Your challenge is translating paradigm-level insight into concrete, non-threatening demonstrations. The conversion factor problem: your capability exists, but the organisational conditions for demonstrating it may not. That’s a structural gap, not a personal one.

Capability profile:

  • Can design skill/agent/application portfolios
  • Sees the connection between individual understanding and organisational capability
  • Understands the three-tier capability model (systems of record, hybrid workflow, autonomous)
  • Questions governing assumptions, not just processes (double-loop learning)
  • Asks “what category of solution should be doing this?” not “how do we use AI to do what we already do?”

What the organisation mirrors back:

  • Perceived as threatening by the organisational immune system — your insights challenge the official picture
  • Evaluated on criteria from the paradigm you’ve moved beyond
  • The organisation makes wrong-category decisions (building apps when it needs skills) and you can trace the mechanism
  • The gap between the management dashboard and ground reality is fully visible to you — and painful

The Aspiration Trap: maturity scales measure absent conversion factors, not absent capability. The bitter taste of being measured against a ladder that doesn’t fit the terrain.

Level 3 — The Pattern Recogniser (emerging flexibility)

Flexibility marker. You can release the tool-operator model and adopt the collaboration model. But you may not yet have the language or frameworks to explain what you see to others. The flexibility is personal but not yet communicable. You feel frustrated in meetings about AI. You can see what others can’t but can’t make them see it. You’ve stopped arguing and started just doing it quietly.

Capability profile:

  • Has crossed from the “tool-use” to the “collaboration” mental model
  • Recognises when framing limits output quality
  • Begins to see workflow redesign opportunities, not just efficiency gains
  • Understands AI as reasoning partner, not execution engine

The shift: from “How do I prompt this better?” to “How should this work be structured differently?”

What the organisation mirrors back:

  • You’ve crossed the threshold — the environment hasn’t. THIS is where the gap opens
  • The same Level 1 colleagues who felt like peers now feel like barriers
  • Conformity pressure to perform at Level 2 — “just use it as a tool”
  • Your proposals evaluated through the wrong criteria by people who can’t see what you see
  • AI shaming: when AI use is visible to evaluators, people hide their actual practice

The morning meeting: you propose something real and get dismissed by people applying frozen mental models with confidence. The frustration IS the gap.

Level 2 — The Tool Operator (low flexibility)

Flexibility marker. The tool-operator schema is efficient and rewarding. It produces visible value. Releasing it would mean a period of apparent performance degradation — the cost of transitioning to a new model. This is the cognitive entrenchment threshold. You haven’t changed HOW you work, only added a faster step. AI is “my tool” — not “my collaborator”. This feels like competence because it is. The question is whether it’s sufficient.

Capability profile:

  • Can produce good results with AI for known tasks
  • Understands that prompt quality affects output quality
  • Has integrated AI into the daily workflow for specific functions
  • Can evaluate AI output within their domain expertise

Competent and productive — but the question “how should this work be structured differently?” hasn’t yet surfaced.

What the organisation mirrors back:

  • AI gets bolted onto existing workflows rather than triggering workflow redesign
  • Single-loop learning: “How do we do what we already do, faster?”
  • “It’s useful for X but you can’t trust it for Y” — framing that protects the tool model
  • Someone proposing AI as “collaborator” sounds impractical from here — and slightly threatening
  • Authority bias: senior operators define what counts as “appropriate” AI use for the team

The comfort of alignment: the environment validates your model, so there’s no signal that a different model exists. The gap hasn’t opened yet — which is precisely why it’s hard to move from here.

Level 1 — The Frozen Frame (rigid)

Flexibility marker. The mental model is cached and treated as complete. New evidence is interpreted within the existing category (“this is just a fancier version”). The category itself — “AI = unreliable tool” — is not questioned, because it was formed from direct experience, which feels like knowledge. You reference AI experiences from two or more years ago as current. You evaluate today’s AI by your first encounter. You feel confident in your assessment without recent hands-on use.

Capability profile:

  • Has basic awareness that AI exists and can produce text and images
  • May have tried ChatGPT or similar once or a few times
  • Formed an assessment based on those early encounters
  • That assessment was accurate for its moment — but the moment has passed

The first encounter wasn’t wrong — it was a snapshot. The problem is treating a snapshot as a portrait.

What the organisation mirrors back:

  • Confidence in outdated judgements: “I’ve tried AI, it doesn’t work well”
  • Dismissal of demonstrations — evidence absorbed into the existing model, not used to update it
  • “Yes, but in general…” — the Abstract Defence that can’t be falsified by specific examples
  • When leaders are here, portfolio decisions default to “build an application”, because the skill/agent category doesn’t exist in their model
  • Groupthink validates the frozen model: “We all agree AI is unreliable” feels like consensus, not limitation

You don’t experience resistance because you are the resistance. The frozen model feels like knowledge. This is the hardest level to diagnose from the inside.

What no instrument measures

There’s a Swedish book from 2004 that keeps surfacing in this work — Jansson’s “Validering: att synliggöra individens resurser” (“Validation: making the individual’s resources visible”), which draws on Finnish research into competence validation. Jansson made the case that competence assessed through a single lens produces an incomplete picture. Formal education without practice is half the story. Deep experience without formal knowledge is the other half. Both dimensions are needed.

Twenty-two years later, we’re facing the same structural problem with AI literacy — but with a third dimension that Jansson didn’t need and that no current framework accounts for.

Knowledge — what you know about AI formally. Necessary. But Chi’s research is clear: knowledge accumulation doesn’t produce categorical shifts. You can describe AI’s emergent properties accurately on a test while your working mental model stays firmly in the “tool” category. Knowledge is a capacity dimension. It grows by addition. And addition alone doesn’t produce the shift we need.

Practice — what you’ve done with AI, hands-on. Also necessary. Also insufficient on its own. The Einstellung research is unambiguous: deep experience can entrench as readily as it liberates. Bilalic’s chess masters weren’t lacking practice. They were trapped by it.

And practice with AI carries its own hidden cost. Research by Crowston, and separately by Collins and colleagues, found that AI assistance can accelerate skill decay in experts and hinder skill acquisition in learners — without anyone noticing, because AI produces adequate outputs that mask the degradation. Clinicians using AI for polyp detection showed significant decline in independent detection skills after just three months. Researchers have begun calling this “never-skilling” — trainees using heavy AI assistance never develop the foundational abilities they’re ostensibly learning. The productive struggle that drives genuine skill formation gets removed. And with it, the mechanism that builds the very competence the tool was supposed to augment.

Metacognition — and this is the missing dimension. The ability to see the pattern you’re in. The flexibility to recognise when your mental model is no longer serving you and to release it. Can you feel the difference between knowledge and assumption? Can you recognise when you’re applying a cached model to a changed reality? Can you sit with the discomfort of a paradigm that doesn’t yet have a settled name?

Sixteen major AI literacy instruments. Not one assesses this dimension. They measure what you know and what you can do. They don’t measure whether you can see the frame you’re operating within.

Intelligence as flexibility

This connects to something that reframes the entire landscape. The emerging scientific understanding of intelligence is shifting — away from capacity (how much you know, how fast you process) and toward flexibility (how fluidly you can reconfigure what you already know when the situation demands it).

This is not a minor academic distinction. It’s the theoretical keystone for everything we’ve been tracing.

If intelligence is capacity, then more training, more knowledge, more practice should solve the AI literacy problem. But the research tells us they don’t. The 88% adoption / 5% value capture gap isn’t caused by insufficient training. 82% of employees have received no formal AI training at all — but training the other 18% hasn’t closed the gap either. Because the gap isn’t about capacity. It’s about flexibility.

Knowledge and practice are capacity dimensions. They accumulate. Metacognition is the flexibility dimension. It reconfigures. And flexibility is what determines whether someone can cross the threshold between Level 2 and Level 3 — the threshold where the category changes.

This is why the Expertise Shield is the most resistant pattern in the dual-perspective model. High capacity plus low flexibility equals entrenchment. The most knowledgeable, most experienced professionals can be the most deeply stuck — not despite their expertise but because of it. Dane’s research confirms: perceived optimal efficiency is the trap. It feels like competence. It functions as a cage.
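
A deliberately crude way to encode that: treat knowledge and practice as capacity, and metacognition as the gate that converts capacity into adaptive capability. The multiplicative gating below is my illustrative assumption, not a measurement model from the cited research; it only shows why adding capacity without flexibility raises entrenchment risk rather than capability.

```python
# Toy encoding of "high capacity plus low flexibility equals entrenchment".
# The multiplicative gating is an illustrative assumption, not a published
# measurement model: it just shows why adding capacity without flexibility
# raises entrenchment risk instead of adaptive capability.

def profile(knowledge: float, practice: float, metacognition: float) -> dict:
    """All inputs on a 0..1 scale. Knowledge and practice accumulate
    (capacity); metacognition reconfigures (flexibility)."""
    capacity = (knowledge + practice) / 2
    return {
        "capacity": round(capacity, 2),
        "flexibility": round(metacognition, 2),
        # Capacity only converts to adaptive capability through flexibility.
        "adaptive_capability": round(capacity * metacognition, 2),
        # High capacity with low flexibility: the Expertise Shield pattern.
        "entrenchment_risk": round(capacity * (1 - metacognition), 2),
    }

print(profile(knowledge=0.9, practice=0.9, metacognition=0.2))
# -> lots of capacity, little of it converts; entrenchment risk is high
print(profile(knowledge=0.6, practice=0.5, metacognition=0.8))
# -> less capacity overall, but far more of it actually converts
```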

And the Aspiration Trap doesn’t just trap capability — it traps flexibility. When you anchor self-assessment to a fixed ladder (“How do I get to Level 4?”), you replace the adaptive question (“What does my situation actually need?”) with a rigid one. The ladder becomes its own Einstellung — a perceptual attractor at the level of career development rather than problem-solving.

The “signals you’re here” in the dual-perspective model aren’t asking what you can do or what you know. They’re asking whether you can see the pattern you’re in. They’re flexibility diagnostics. Can you recognise when your expertise is creating blind spots? Can you shift your evaluation criteria when the paradigm changes? Can you hold your model loosely enough to be surprised?

This measurement shift — from tool-skill checklists to knowledge, practice, and metacognition — isn’t an academic refinement. It’s the difference between measuring what keeps people busy and measuring what actually moves them.

The portfolio built on yesterday’s logic

Everything we’ve traced — the frozen mental model, the loop between individual and organisation, the social cognitive biases that keep it stable, the missing metacognitive dimension — converges in a place where the cost becomes concrete and measurable: the organisation’s capability decisions.

When the collective mental model is “AI = tool,” the organisation builds accordingly. Application portfolios expand. Every operational need gets translated into a dedicated application — because in the tool paradigm, that’s what capability means. A persistent software system. Built once, maintained forever. Deterministic, auditable, controlled.

Large organisations already carry portfolios of thousands of applications. Research shows 30% of software budgets are wasted on redundant or unused systems. And those portfolios represent something deeper than technology choices — they represent fossilised assumptions about how work should happen. Each application encodes a moment’s understanding of a process, frozen in code, maintained at significant cost, increasingly divergent from how people actually work.

Here’s what the frozen paradigm can’t see: AI has collapsed the cost of contextual, ad-hoc work. What previously required building and maintaining a dedicated application can increasingly be handled by skills, agents, and orchestrated workflows — flexible capability that adapts to context rather than forcing context into a rigid structure. Gartner predicts 40% of enterprise applications will embed AI agents by the end of 2026, up from under 5% in 2025. The infrastructure is already decomposing: APIs, vector databases, MCP servers, modular retrieval systems — the monolithic backend is giving way to orchestrated, composable architectures. The specific forms will evolve. What we call skills and agents today may carry different names tomorrow. But the shift they represent — from rigid, persistent applications toward flexible capability that lives close to the work — that direction isn’t reversing.

Applications will remain essential for persistent cognitive systems with state, cross-references, and deterministic auditability. But that’s the core, not the default for everything. The long tail of organisational work — infrequent, variable, context-dependent — doesn’t need applications. It needs intelligence that adapts. And when the mental model only contains two categories — “manual process” and “dedicated application” — every analysis arrives at “build or buy an application.” The new category doesn’t exist in the cognitive map. Introducing it triggers the exact resistance mechanisms we’ve traced: the Einstellung Effect activates tool-paradigm schemas, the Expertise Shield protects existing architectural knowledge, the organisational immune system treats the proposal as foreign, and the social dynamics punish the person proposing something outside consensus.
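
As a sketch of what that missing category does to the decision space, here is an illustrative triage heuristic. The inputs and thresholds are invented for the sketch; the point is that the analysis has three possible outcomes, where the two-category mental model collapses everything into “build or buy an application”.

```python
# Illustrative triage heuristic for capability decisions. Inputs and
# thresholds are invented for the sketch; the substance is the three-way
# outcome that the two-category mental model cannot represent.

def capability_category(runs_per_month: float,
                        variability: float,  # 0..1: how much each run differs
                        needs_audit_trail: bool,
                        needs_persistent_state: bool) -> str:
    if needs_persistent_state or needs_audit_trail:
        # The essential core: deterministic, auditable, stateful.
        return "dedicated application (system of record)"
    if runs_per_month < 5 or variability > 0.6:
        # The long tail: infrequent, variable, context-dependent work.
        return "skill / orchestrated agent"
    # Frequent and regular, but not a system of record.
    return "hybrid workflow (AI embedded in an existing process)"

# An occasional, highly variable analysis task:
print(capability_category(2, 0.8, False, False))
# -> skill / orchestrated agent

# A high-volume transactional process with audit requirements:
print(capability_category(400, 0.1, True, True))
# -> dedicated application (system of record)
```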

The consequence is measurable. When organisations evaluate AI through the tool paradigm, they apply tool-paradigm KPIs: efficiency gains, time saved, error reduction. These metrics make sense for tools. They miss entirely what AI enables when understood differently — workflow redesign, capability redistribution, knowledge returning from rigid application logic to flexible, portable, maintainable process intelligence that people can actually update and own.

McKinsey’s finding is relevant again here: the organisations capturing real value — that 5–6% — are three times more likely to have fundamentally redesigned workflows rather than bolting AI onto existing processes. They’re not better at operating tools. They’ve left the tool paradigm.

And the cost of not leaving compounds. Technical debt accumulates as the portfolio grows. Maintenance costs rise. Flexibility decreases. The gap between how work is officially represented in systems and how it actually happens on the ground widens. And the organisation, measuring tool efficiency while missing paradigm readiness, reports progress on dashboards while the structural problem deepens underneath.

Goodhart’s Law is operating at full force here — “when a measure becomes a target, it ceases to be a good measure.” When maturity level becomes the organisational target, behaviour optimises for the metric. Application deployment rates. Training completion percentages. AI adoption scores. All trending upward. All measuring the wrong thing. Recent research suggests this isn’t just a rational response — fMRI evidence indicates that metric-seeking behaviour is partially neurologically hard-wired. We are built to optimise for the number, even when the number isn’t measuring what matters.

The four-stage blindness mechanism that researchers have identified operates relentlessly: the framework defines the problem space (foreclosing alternatives), measurement creates behavioural response (optimising for metric rather than capability), the score produces false certainty (“Level 3 of 5” closes inquiry), and structural causes become invisible (gaps get attributed to individual inadequacy rather than organisational architecture).

The result: “The frameworks are not failing — they are succeeding at what they actually do: create organisational confidence in systems that are systematically obscuring what matters.”

If the organisation doesn’t have the maturity to see what AI actually does and how it could be used — not as a bolt-on tool but as a fundamentally different way of structuring capability — then every portfolio decision, every architectural choice, every measurement framework reinforces the paradigm that needs to change. The mental model doesn’t just affect how individuals think. It determines what the organisation builds. And what it builds determines what it can become.

What the cracks make visible

I didn’t start this exploration with a framework. I started with a crack.

A moment in a meeting where I proposed something that couldn’t be heard — not because it was wrong, but because the room’s shared mental model didn’t contain the category it belonged to. That experience sent me into the research. And what the research revealed was not a single cause but a system — a loop between individual cognition and organisational structure that reinforces itself at every level, sustained by social dynamics that serve a genuinely protective function, and measured by instruments that confirm progress while obscuring the problem.

What makes this traceable now — for the first time, as far as I can find — is the scale of the turbulence. Decades of technology adoption never required a categorical shift. The mental model of “technology = tool” stretched comfortably to accommodate every previous wave. But AI breaks that accommodation. The shift is too large. The mismatch between the old category and the new reality generates friction at every boundary — individual, team, organisational, inter-institutional — and that friction makes the mechanisms visible.

Small shifts don’t produce this kind of turbulence. You don’t see the cracks when the change is incremental. But when the shift is categorical, everything that was hidden becomes exposed. The frozen mental models. The social biases holding them in place. The measurement frameworks confirming a picture that doesn’t match reality. The portfolio decisions encoding the wrong paradigm. The Einstellung Effect directing expert attention away from better solutions. The dual consciousness of organisations performing transformation while buffering against it.

All of this was always there. It operated in every previous technology adoption. But it was never visible because the turbulence was never this strong.

The research doesn’t just describe the problem. It traces cause and effect from the individual’s cognitive category all the way through the organisation’s defensive architecture and back. Chi explains why the category is frozen. Bilalic shows how expertise reinforces the freeze. Gentner reveals why the tool analogy persists. Vosniadou shows the hybrid models people construct to avoid the shift. Dane shows why entrenchment feels like competence. Carreno shows how the organisation’s immune system neutralises innovation. Argyris shows how defensive routines make the problem undiscussable. Meyer and Rowan show how decoupling allows organisations to perform change without changing. Goodhart and Campbell show how measurement confirms the performance. And Repenning and Sterman show how the system dynamics ensure capability erodes while dashboards show green.

Each of these was known. The cascade connecting them — from an individual’s ontological category assignment through social amplification to organisational architecture to portfolio decisions and back — that’s what hasn’t been traced before. Not because the pieces weren’t available, but because the turbulence wasn’t strong enough to make the connections visible.

So what changes?

The mental model has to change — at the individual level first, then cascading through the organisation. Not through training programmes that teach more tools. Not through maturity frameworks that score the wrong dimension. Through the development of metacognitive flexibility: the ability to see the pattern you’re in, release the model that’s no longer serving you, and operate in the uncertainty of a paradigm that hasn’t fully arrived yet.

That means measuring differently. Knowledge, practice, and metacognition — not tool checklists. Flexibility, not just capacity. Diagnostic frameworks that show where you’re stuck and why, not just where you aspire to be. Dual-perspective assessment that accounts for the environment, not just the individual.

And it means building differently. Organisations that understand AI as a categorical shift will structure their capability portfolios differently — skills and agents for the flexible majority, applications for the essential core. They’ll measure readiness by paradigm understanding, not tool proficiency. They’ll create environments where the people who’ve made the shift can operate visibly, not just privately. They’ll address the structural conditions that make social biases necessary, rather than blaming individuals for having them.

The 88% who’ve adopted and the 5% who’ve absorbed are separated by a categorical shift that no tool list will bridge. The cracks are visible now. What we do with what we can see — that’s the work ahead.

Disclaimer

AI-assisted content: This post was researched and developed with assistance from Claude (Anthropic), Gemini Deep Research (Google), and Perplexity Pro. The research foundation draws on 12 independent research reports commissioned across two AI platforms, covering cognitive science, organisational learning, social psychology, and measurement theory. Research synthesis, structural development, and visual modelling were collaborative processes between the author and AI. The thinking, editorial decisions, and conclusions are the author’s own.

Opinion: This is a personal exploration blog. Views expressed are the author’s own, informed by 20+ years of UX design practice and ongoing research into human-AI interaction.
Sources: Key references are listed below. The full research corpus is available on request.

Research & academic sources

    • Chi, M.T.H. — Three Types of Conceptual Change: Belief Revision, Mental Model Transformation, and Categorical Shift
    • Bilalic, M., McLeod, P., & Gobet, F. (2008) — Why Good Thoughts Block Better Ones: The Mechanism of the Pernicious Einstellung Effect
    • Dane, E. (2010) — Reconsidering the Trade-off Between Expertise and Flexibility: A Cognitive Entrenchment Perspective
    • Vosniadou, S. — Conceptual Change in Learning and Instruction: The Framework Theory Approach
    • Gentner, D. (1983) — Structure-Mapping: A Theoretical Framework for Analogy
    • Goodhart, C.A.E. (1984) — Problems of Monetary Management: The U.K. Experience
    • Campbell, D.T. (1976) — Assessing the Impact of Planned Social Change
    • Sen, A. (1999) — Development as Freedom (Capability Approach)
    • Repenning, N.P. & Sterman, J.D. (2001) — Nobody Ever Gets Credit for Fixing Problems that Never Happened
    • Nęcka, E., Gruszka, A., & Orzechowski, J. (2012) — Cognitive Flexibility and Inter-domain Rigidity
    • Jansson, S. (2004) — Validering: att synliggöra individens resurser
    • Argyris, C. & Schön, D.A. (1978) — Organizational Learning: A Theory of Action Perspective
    • Carreno, A. (2025) — Why Organizations Resist Their Own Evolution
    • Ackerhans & Wehkamp (2022) — Professional Identity Threat and AI Resistance (JMIR)
    • Yang, Secchi, & Homberg (2025) — Organisational Defensive Routines (IJPSM)
    • Collins et al. (2024) — AI-Assisted Skill Decay in Clinical Settings (PMC)
    • Crowston, K. — Deskilling and AI Assistance in Expert Work
    • Behavioural Insights Team (2025) — AI Adoption Across 14 Countries
    • McKinsey (2025) — The State of AI: From Adoption to Absorption
    • Luhmann, N. — Social Systems and Decision Premises Theory
    • Meyer, J.W. & Rowan, B. (1977) — Institutionalized Organizations: Formal Structure as Myth and Ceremony
    • Elmqvist et al. (2025) — Participatory AI (consortium of 46 researchers)
    • Reich et al. (2026) — Psychological Safety and AI Adoption (arXiv)
