Data source, classification, and limitations
The Anthropic Economic Index V4, released January 2026, contains two samples of one million conversations each: one from Claude.ai (consumer) and one from first-party API usage (developer). Both were drawn from the week of 13–20 November 2025.
This analysis primarily uses the Claude.ai sample. The API sample is used only for comparison in Section 3.6 of the Results. The full dataset is publicly available on HuggingFace.
Anthropic uses a privacy-preserving classification system called Clio, which prompts Claude itself to categorise anonymised conversations along several dimensions. The methodology is described in Anthropic's Economic Index report and accompanying research paper. The key classifications are:
Anthropic reports validating the classifier against human researchers on a sample of transcripts and against external benchmarks such as Bureau of Labor Statistics education data. However, no published error rates or inter-rater reliability figures are available for the specific classifications used in this analysis.
Education tasks were identified by filtering to SOC Major Group 25: Educational Instruction and Library Occupations. SOC (Standard Occupational Classification) is a US system that groups all jobs into numbered categories. Group 25 covers teachers, lecturers, librarians, and education support workers. This filter yielded 266 of the 3,170 matched tasks (8.4% of tasks), which together account for 15.2% of total conversation volume.
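As an illustration of the filter described above, the following is a minimal sketch that assumes each task carries a SOC code string whose first two digits identify the major group. The function name and the sample rows are hypothetical and are not taken from the dataset.

```python
def is_education(soc_code: str) -> bool:
    """Return True if the SOC code falls under Major Group 25."""
    return soc_code.strip().startswith("25-")


# Hypothetical (task, SOC code) rows for illustration only.
tasks = [
    ("Prepare course materials", "25-1011"),
    ("Counsel students on academic plans", "21-1012"),
    ("Organise library collections", "25-4022"),
    ("Plan school budgets", "11-9032"),
]

# Keep only tasks whose code sits in Major Group 25.
education_tasks = [(name, code) for name, code in tasks if is_education(code)]
```

A prefix match is used so that the same check also works on longer code variants with a suffix, provided they still begin with the major group digits.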
The 266 tasks were assigned to five subsectors based on their 4-digit SOC prefix:
Some O*NET tasks appear under more than one occupation. For example, "develop instructional materials" could belong to both a postsecondary lecturer and an education support worker. To avoid counting the same task twice, each task is assigned to a single subsector based on its primary SOC code. The totals above reflect these deduplicated assignments and sum to 266.
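The deduplication rule can be sketched as follows. This is an assumption-laden illustration: it supposes that each task lists its SOC codes with the primary code first, and that the subsector is given by the first four characters of that code (for example "25-1"). The task names, codes, and helper function are hypothetical.

```python
from collections import Counter

# Hypothetical input: task -> SOC codes, with the primary code listed first.
task_codes = {
    "Develop instructional materials": ["25-1081", "25-9031"],
    "Grade student assignments": ["25-2031"],
    "Assist patrons with research": ["25-4022", "25-1081"],
}


def subsector_of(soc_code: str) -> str:
    """Map a SOC code to its subsector prefix, e.g. '25-1081' -> '25-1'."""
    return soc_code[:4]


# Each task is assigned exactly once, using only its primary code.
assignment = {task: subsector_of(codes[0]) for task, codes in task_codes.items()}

# Subsector counts therefore sum to the number of distinct tasks.
counts = Counter(assignment.values())
```

Because every task contributes to one subsector only, the subsector counts add up to the number of distinct tasks, which is the property the totals rely on.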
Not all 266 tasks generate the same amount of conversation. Some tasks, like "assist students with coursework outside class," account for a large share of total volume, while others are rarely seen. When we calculate averages and totals, each task's contribution is weighted by its share of conversation volume. This prevents a rarely used task from having the same influence as one that drives thousands of conversations.
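A small sketch of volume weighting, using made-up numbers, shows how much the weighted and unweighted averages can differ. The function name, the per-task shares, and the conversation counts are all hypothetical.

```python
def weighted_mean(values, weights):
    """Average of values, with each value weighted by its conversation volume."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical: share of 'learning' conversations in two tasks,
# and the number of conversations each task receives.
learning_share = [0.40, 0.10]
volume = [9000, 1000]

weighted = weighted_mean(learning_share, volume)        # about 0.37
unweighted = sum(learning_share) / len(learning_share)  # 0.25
```

Here the high-traffic task pulls the weighted figure towards its own value, whereas the unweighted mean treats both tasks as equally important.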
This matters because the data is concentrated: the top 5 education tasks account for approximately 46% of all education conversation volume. Aggregate statistics are therefore shaped disproportionately by a small number of high-traffic tasks.
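A concentration figure of this kind can be computed as the share of total volume held by the k largest tasks. The sketch below uses invented volumes, so its result is illustrative and does not reproduce the 46% figure; the function name is hypothetical.

```python
def top_k_share(volumes, k=5):
    """Fraction of total volume contributed by the k highest-volume tasks."""
    total = sum(volumes)
    if total == 0:
        raise ValueError("total volume must be positive")
    largest = sorted(volumes, reverse=True)[:k]
    return sum(largest) / total


# Hypothetical per-task conversation counts.
volumes = [300, 200, 150, 100, 50, 40, 30, 20, 10, 100]
share = top_k_share(volumes, k=5)
```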
Clio sometimes cannot classify a conversation's interaction style, labelling it not_classified or none. We exclude these and recalculate the five named patterns (directive, task iteration, learning, validation, feedback loop) so they sum to 100%. This makes patterns comparable across tasks, but it means the reported percentages are higher than they would be if unclassified conversations were included. The exact formula is in Appendix C.
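The renormalisation step can be sketched as below. This is an assumed form and not the Appendix C formula itself: it drops the two unclassified labels and rescales the remaining shares to sum to 100. The label spellings and the raw percentages are hypothetical.

```python
EXCLUDED = {"not_classified", "none"}


def renormalise(shares):
    """Drop unclassified labels and rescale the rest to sum to 100%."""
    kept = {label: value for label, value in shares.items() if label not in EXCLUDED}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("no classified conversations to renormalise")
    return {label: 100 * value / total for label, value in kept.items()}


# Hypothetical raw percentages for one task, including unclassified labels.
raw = {
    "directive": 30,
    "task iteration": 20,
    "learning": 15,
    "validation": 5,
    "feedback loop": 10,
    "not_classified": 15,
    "none": 5,
}
patterns = renormalise(raw)
```

In this example the five named patterns make up 80% of the raw total, so each renormalised share is its raw share divided by 0.8, which is why renormalised figures are always at least as large as the raw ones.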
Apart from the API comparison in Section 3.6, this data covers Claude.ai only. ChatGPT, Gemini, Copilot, and other tools serve different and likely larger education user populations. Claude.ai's market share, interface design, and model behaviour all shape the patterns observed here. Nothing in this report can be assumed to generalise to other platforms.
The sample covers 13–20 November 2025. In the Northern Hemisphere, this is late semester, when assignment loads peak. In parts of the Southern Hemisphere, this is exam season. Usage patterns almost certainly vary across the academic calendar. A January or June sample would likely look different.
Although Anthropic reports validating Clio against human researchers and external benchmarks, no published error rates or inter-rater reliability figures are available for the specific classifications used here. We report Clio's classifications as given.
Clio's collaboration patterns (directive, task iteration, etc.) describe the structure of the conversation, not the user's intent or whether learning occurred. A student asking "explain the causes of World War I" is classified as directive, but the student's purpose may well be to learn. Similarly, task iteration (where a student drafts, gets feedback, and revises) is a core pedagogical process, even though the label sounds mechanical.
There is also a visibility gap. If a student asks Claude for feedback on a paragraph, revises it in a separate document, and returns with an updated version, that revision work happens outside the conversation. Clio can only classify what appears in the transcript. The actual learning process may be richer than what the data captures.
The Discussion section addresses this interpretation gap in detail.
An automated data verification using Claude and Gemini (see Appendix D) identified that the global collaboration baseline numbers and subsector task counts in an earlier draft contained errors related to renormalisation methodology and deduplication approach. These have been corrected. The verification also flagged the V2-to-V4 trend comparison as methodologically unsound due to different classification models being used across versions. That comparison has been excluded from this report entirely.
Filtering to SOC Major Group 25 captures teachers, lecturers, librarians, and education support workers. It does not capture education-adjacent roles classified elsewhere, such as school counsellors (SOC 21-1012), education administrators (SOC 11-9030), or training specialists in non-education sectors. The education share reported here is therefore a lower bound on education-related AI usage.
The top 5 of 266 education tasks account for approximately 46% of education conversation volume. Aggregate statistics are therefore driven disproportionately by a handful of high-traffic tasks. Subsector-level breakdowns partially address this, but the concentration remains a structural feature of the data.
Country-level education share percentages are computed from unknown base volumes. Countries with small Claude.ai user populations can show volatile percentages. A country where 37% of AI conversations are education-related may have a tiny absolute number of conversations. Geographic findings should be treated as indicative, not definitive.