Do LLMs Fade Worked Examples?

Tim Gallagher Pilot Study

March 2026

A Pilot Study of Pedagogical Reasoning in Frontier AI Models

Abstract

The worked example fading effect is one of the most robust findings in cognitive load theory: instructional sequences should begin with fully worked examples and progressively remove solution steps so learners take over the problem-solving process. If teachers use LLMs to generate worked examples, do those models apply fading?
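As an illustration of what a faded sequence looks like, the following minimal sketch builds one from a list of solution steps. It assumes backward fading (the last steps are removed first), which is only one possible ordering and is not specified by this study; the function name `fading_sequence`, the blank marker, and the algebra steps are hypothetical and chosen for illustration.

```python
from typing import List


def fading_sequence(steps: List[str], blank: str = "____") -> List[List[str]]:
    """Return one version of the problem per fading stage.

    Stage 0 is the fully worked example. Each later stage hides one more
    step, starting from the end (backward fading, an assumption of this
    sketch), until the learner completes every step unaided.
    """
    stages = []
    for hidden in range(len(steps) + 1):
        shown = len(steps) - hidden
        stages.append(steps[:shown] + [blank] * hidden)
    return stages


# Hypothetical solution steps for solving 2x + 3 = 11.
example_steps = [
    "Subtract 3 from both sides: 2x = 8",
    "Divide both sides by 2: x = 4",
    "Check: 2(4) + 3 = 11",
]

for number, stage in enumerate(fading_sequence(example_steps), start=1):
    print(f"Problem {number}:")
    for step in stage:
        print(f"  {step}")
```

An output that simply repeats the first stage for every problem would present worked examples without fading, which is the distinction the scoring in this study turns on.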

This pilot study tested six models (three closed-source: Claude Opus 4.6, Gemini 3.1 Pro, Gemini 3 Flash; three open-weight: DeepSeek R1, Qwen3-235B, GPT-OSS-120B) across three prompting conditions: a baseline with no pedagogical instruction, general cognitive load theory instruction, and specific fading instruction. Both model outputs and reasoning traces (chain-of-thought) were collected and scored on two dimensions: whether the output implemented fading, and whether the trace referenced it.

The headline finding: all six models demonstrate knowledge of fading-related concepts in their reasoning traces when prompted with "apply cognitive load theory," but none applies fading in its output unless told specifically what it is. This knowledge-application gap has direct implications for teachers relying on AI-generated instructional materials.

This is a pilot study with six models, one run per condition, and no inferential statistics. All findings are descriptive. For the full list of AlignED reports, visit AlignED Reports.

At a glance

6 models tested
3 prompting conditions
18 outputs scored
18 reasoning traces analysed
Key finding: All six models retrieved CLT-related knowledge in their reasoning traces when told to "apply cognitive load theory" (Condition B). None applied fading in its output without the specific fading instruction given in Condition C. The models have access to fading as a concept but do not select it from their repertoire of CLT principles unless the prompt names it directly.
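The counts above follow directly from the design: six models crossed with three conditions, one run each. A minimal sketch of that grid is shown below. The label "A" for the baseline is an assumption (the text names only Conditions B and C), and the record field names are hypothetical placeholders for the two scoring dimensions, not the study's actual data format.

```python
from itertools import product

MODELS = [
    "Claude Opus 4.6",
    "Gemini 3.1 Pro",
    "Gemini 3 Flash",
    "DeepSeek R1",
    "Qwen3-235B",
    "GPT-OSS-120B",
]

# "A" is an assumed label for the baseline; B and C are named in the text.
CONDITIONS = {
    "A": "baseline, no pedagogical instruction",
    "B": "general cognitive load theory instruction",
    "C": "specific fading instruction",
}

# One run per model-condition pair. Scores are left unset (None) because
# this sketch only enumerates the design; it contains no results.
runs = [
    {
        "model": model,
        "condition": condition,
        "output_implements_fading": None,  # hypothetical field name
        "trace_references_fading": None,   # hypothetical field name
    }
    for model, condition in product(MODELS, CONDITIONS)
]

print(len(runs))  # 18
```

Each record corresponds to one scored output and one analysed reasoning trace, which is why both totals are 18.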

Reading guide

This site follows the structure of an academic paper. The Introduction explains why fading matters and frames the research question. Methods describes the prompts, models, and scoring rubric. Results presents the scores, charts, and key trace excerpts. Discussion draws out implications and states limitations. The Appendices contain all prompts, the full rubric, and complete model outputs and reasoning traces.