Attention is all you ever needed

By Iain Harper

For seventy years, generations of management consultants have repeated Joseph Juran’s line about the vital few and the trivial many as though it described a permanent feature of commercial life. 20% of customers generate 80% of revenue. 20% of products account for 80% of sales. 20% of bugs cause 80% of errors, as Steve Ballmer once put it in a famous 2002 memo.

This is correct as an observation. It is wrong as a permanent worldview. The skewed distributions that defined mid-twentieth-century capitalism were not the imprint of some natural law of commerce. They were what commerce looked like when production required large factories, distribution required national logistics, and persuasion required monolithic channels like network television. When those constraints changed, the distributions changed too.

Artificial intelligence is changing them again now, not because AI will replace us all, which is the current all-consuming conversation, but because it reverses the economics of specificity. For agencies and small businesses, this is a highly consequential shift, and many of us are behaving as though it hasn’t happened, or as though we haven’t noticed.

Galbraith’s world and ours

In The Affluent Society, John Kenneth Galbraith described a strange loop at the centre of the post-war economy. Factories needed large production runs to be profitable, which meant they needed large, uniform buyer audiences. Advertising existed to produce those audiences. Washing machines, televisions, and refrigerators were manufactured in enormous quantities, and the desire to own them was built through advertising across a handful of national channels. Galbraith called this the dependence effect. The wants the economy claimed to satisfy were, in his view, mostly wants the manufacturing complex had created to keep those factories humming.

Galbraith was writing in 1958. His critique worked because the machinery he described was real. A consumer in 1958 encountered a finite set of products sold by a finite set of firms advertising via a very limited number of outlets. The median customer was the object of production because the economics could reliably reach only that customer.

None of this is true any more. The media environment is not concentrated. Distribution costs for digital goods are close to zero. Online advertising does not work by gathering a crowd and selling them a single message. It finds a specific person who wants a specific thing and shows them a tailored ad. The infrastructure that made the Pareto principle look like physics has been dismantled over the past 30 years, and AI is delivering the final blow.

What was the concentration made of?

Concentration, in the Juran sense, had four ingredients. You needed capital to build production capacity. You needed shelf space or retail distribution. You needed awareness, which meant buying time on one of a small number of broadcast and media properties. And you needed logistics to move goods from the factory to the customer. Firms that cleared all four thresholds tended to win, and the winners tended to keep winning, resulting in a skew that looked like an eighty-twenty rule.

The first digital era dented this but did not break it. Software products removed the marginal cost of replication. Freemium economics broadened monetisation by letting different users pay different amounts for different feature bundles. But discovery stayed scarce. App stores surfaced only a handful of apps per category. Google returned ten results per page. Infinite shelf space coexisted with finite attention, and attention is what determines whether a product is commercially alive.

Three things are now collapsing at once. AI lowers the cost of producing variants, so a firm can design twenty product configurations for the price it used to spend on three. AI improves the inference of latent preferences, so the matching layer between customers and products keeps getting sharper. AI-generated creative assets can be tailored to each cohort or even each individual, so the same underlying product reaches different buyers through different framings without significant additional overhead.

The compound compounds

As matching precision improves, the minimum viable audience shrinks. A product that once needed a hundred thousand buyers to justify its existence might now need only five thousand, provided those five thousand can be found and reached. This is not a marginal adjustment. It fundamentally reorganises what is worth building.

A loop forms. Better matching increases conversion and retention, improving unit economics. Improved unit economics attract producers who previously judged a market too niche. More niche producers enlarge the catalogue. A richer catalogue tells the system more about its customers, because a person who picks one item from a menu of forty reveals far more than a person who picks one from a menu of three. That richer information further improves matching. The cycle continually compounds.
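The arithmetic of that loop is simple enough to sketch. The toy model below is purely illustrative: every number in it (the fixed cost, unit price, conversion rate, and growth factors) is an assumption chosen to show the shape of the feedback, not an empirical estimate.

```python
# Toy model of the matching/catalogue feedback loop described above.
# All parameters are illustrative assumptions, not empirical estimates.

def min_viable_audience(fixed_cost, price, conversion):
    """Buyers a producer must reach to break even at a given conversion rate."""
    return fixed_cost / (price * conversion)

fixed_cost = 50_000   # assumed cost of bringing a niche product to market
price = 100.0         # assumed unit price
conversion = 0.01     # matching quality: share of reached buyers who convert
catalogue = 3         # number of options the system can offer

for round_ in range(5):
    mva = min_viable_audience(fixed_cost, price, conversion)
    print(f"round {round_}: catalogue={catalogue}, "
          f"conversion={conversion:.3f}, min audience={mva:,.0f}")
    # richer catalogue -> more revealed preference -> better matching
    conversion *= 1 + 0.1 * (catalogue ** 0.5)
    # better unit economics -> more niche producers enter the market
    catalogue = int(catalogue * 1.5) + 1
```

Run it and the minimum viable audience falls round after round while the catalogue grows, which is the compounding the paragraph above describes: neither variable improves on its own, each improves because the other did.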

This is not the dependence effect. Galbraith’s model assumed advertising fabricated demand to absorb standardised output. The AI-mediated system reveals pre-existing but dormant demand by widening the set of options a person might plausibly encounter. A niche is not the same as a fabrication. A woodworker in Vermont who will pay £300 for a specific Japanese chisel is not a consumer whose wants have been manufactured. They are a customer who was previously unreachable by the chisel maker and is reachable now.

Hegel (mercifully briefly)

There is a philosophical point worth noting. Hegel thought the self was not something you discovered through introspection. It became visible through action, through the encounters you chose and the options you rejected. You find out who you are by doing things and observing the patterns you keep.

Recommendation systems, search engines, and AI-mediated interfaces work in reverse. They start with no model of the user and refine one through engagement. Every click, dismissal, purchase, and micro-action is a partial disclosure of taste. The better the signal, the more revealing the disclosure. Commerce becomes a medium of self-knowledge, which sounds grandiose until you consider that I seriously weighed purchasing a cajón (a type of drum you sit on) because Amazon kept recommending them to me, and I assumed it had inferred some latent, undiscovered talent.

The commercial version of the point is this. Most business thinking assumes customers arrive with fixed tastes and that the job is to match inventory to them. That is roughly backwards. Tastes sharpen in contact with choice, which is why AI-driven supply expansion and AI-driven personalisation are not separate trends. They are the same trend, feeding each other.

The common factory floor

The product-side story is the easy half. A catalogue of ten thousand different chisels lives in a warehouse and, in theory, ships to ten thousand buyers, each found by a matching layer that has learned their exact chiselling preferences. A consultancy with ten thousand clients does not exist, because services are produced by humans with calendars, and no matching layer can yet stretch time.

The Management Consultancies Association’s January 2026 survey reports that 77% of UK consulting firms have built AI into delivery or enabled staff to use it. The number understates the real situation. Most of the remaining 23% are undoubtedly using the tools anyway and are declining to say so on the record. So treat AI tooling as table stakes. The interesting question is how a firm competes when everyone has an AI factory floor.

The old answer was scale. The pyramid of partners, associates, and analysts was not really a delivery model. It was an arbitrage. Junior work billed at premium rates generated the margin, and the firm’s moat was having enough junior labour to execute while competitors ran out of capacity. McKinsey now runs 20,000 AI agents alongside 40,000 humans, and a Harvard–BCG study of consultants using generative AI found 25% faster task completion with 40% higher quality. HBR calls the structure replacing the pyramid the “obelisk”: fewer layers, smaller teams, more output per head. The pyramid is flattening because the arbitrage is vanishing. Global consulting job postings already bear this out, with non-senior roles in Canada down 40% from early 2022.

A solo operator with Claude and a set of prompts tuned to their own case pattern now produces in a day what a four-person team took a week to produce in 2021. The gap between what a small organisation can ship and what a fifty-person firm can ship has collapsed for a large fraction of the work that used to consume pyramids.

The four bases of differentiation

The first is judgment. AI amplifies whatever you point it at, making it disproportionately important to know what to point it at. A firm that uses AI to produce a wrong answer faster than the competition is not at an advantage. It is just compromised faster. The senior judgment about what problem is in front of the client, what approach might work, and what has been tried and failed in adjacent situations does not scale with tooling. It scales with exposure and experience, which means years spent looking at specific kinds of problems.

The second is taste. Plausible-looking outputs have become abundant. The ability to distinguish plausible from genuinely good has become a scarce skill. AI will cheerfully produce a deck that says nothing very clearly, or a strategy that reads coherently and advises superficially. The firms whose senior operators can tell the difference are doing something the rest of the market cannot copy just by buying more Claude licences.

The third is corpus. Every firm is now sitting on a latent asset, the record of its own past work. Project files, discovery call transcripts, retrospective notes, outcome data. Most firms’ versions of this are poorly organised, often lost, and never used for anything other than case studies. A corpus is deep, not wide. AI is finally the tool that operationalises that depth, because a well-curated corpus can train a firm’s internal models, inform its proposals, sharpen its diagnostic work, and increasingly train the external models that shape how the firm appears in buyer research.

The fourth is specificity. In a world where buyers converse with chatbots that pull from the retrievable content of what a firm has published, written, and been referenced for, being specifically known to a narrow audience matters more than being broadly known to a diffuse one. The default output of a general-purpose chatbot surfaces firms with the deepest web presence, which means the incumbents.

But buyers do not stop at the default. They ask follow-ups. “What about smaller firms specialising in X for Y-sized companies in Sacramento?” The specialist firm that surfaces on the follow-up gets the call. Whether it surfaces depends on whether its writing, case patterns, and referenced work have made their way into the retrievable material the model draws from. This is a different skill from SEO, and most firms have not built it yet.

None of these four is a novel category. Judgment, taste, proprietary data, and focus have always mattered in services. What has changed is the relative weight of each. When execution was expensive, execution capacity was a genuine moat, and the four above could be good enough without being great.

The practical consequence is a different reading of what a service firm is. The old reading described professional services as a mechanism for applying cheap junior labour to client problems under the supervision of senior expertise. The new reading is closer to something simpler. A firm is a curated corpus of work, plus the senior judgment to apply it, and everything else is instrumentation. The corpus is the inventory. The judgment is the operator. AI is the tooling. The client pays for the match between their problem and the firm’s record.

For agencies and smaller outfits, the window is open in a way it has not been in decades. Incumbents are of course investing in the same things. McKinsey’s 20,000 agents are not merely a productivity move. They are an attempt to industrialise AI tooling. The success of this brute-force approach may seem inevitable, but there is an advantage hidden in the factory-floor argument: the floor itself is being continuously rebuilt beneath everyone.

Qwen releases a new model. Anthropic ships new skills, and toolsets advance continuously. A new orchestration framework arrives. A prompt technique that worked in March is obsolete by April. An MCP server that wasn’t viable last quarter becomes the best way to do something this quarter. The operational stack of an AI-augmented consultancy is not something you build once. It is something you rebuild every few weeks, sometimes every few days or even hours. And the firms that rebuild fastest compound an advantage the slower ones cannot close.

A three-person organisation can rip out its research workflow on a Tuesday morning because someone discovered a radically better approach over the weekend. McKinsey’s 20,000 agents were procured, approved, risk-assessed, legally reviewed, and rolled out through change management. The next material improvement to how those agents work will go through the same slow process. Speed of adaptation is not a minor variable here. It is the variable. The factory floor is not a fixed asset. It is a living one, and small firms are uniquely shaped not only to keep up with it but to stay ahead of it.

The end of the blunt instrument

Pareto will continue to describe some distributions because some concentrations arise from genuine asymmetries. Talent, network effects, cumulative reputation, capital intensity. These persist. What does not persist is the assumption that concentration is destiny. When the cost of specificity falls, specificity always wins where scale used to.

Galbraith worried in 1958 that the affluent society would drift into private affluence and public squalor, numb to standardised goods sold through repetitive persuasion. He was right about his world. The world now being built looks very different. Abundance means a hundred-person audience for a piece of writing that would never previously have been published, a bespoke tool for a professional community too small to have formerly justified its development, a specialised Japanese chisel reaching its woodworker in Vermont.

The 2017 paper that made all of this possible was called “Attention Is All You Need.” It described a neural network architecture. It turned out also to describe the evolving commercial market for goods and services. Attention was always a scarce resource in commerce. Galbraith’s factories, Juran’s pyramids, and the whole twentieth-century apparatus of advertising were crude machinery for capturing it at scale.

The architecture the old economy used was sequential. Build awareness first, then consideration, then preference, then purchase, with each stage losing most of the audience to the next. Every firm in the funnel fought for the same few points of attention at every step, because that was the only option available. Attention mechanisms in transformers replaced that kind of sequential compression with a parallel, weighted lookup, where every token is considered against every other token and relevance is determined by comparison rather than by the queue.
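For readers who want the analogy made concrete, here is a minimal, dependency-free sketch of scaled dot-product attention for a single query, the core operation that paper describes. The “supplier” feature vectors and the buyer’s query below are invented for illustration; the point is only that every option is scored against the query at once, and weight flows to whichever compares best.

```python
import math

def attention(query, keys, values):
    """One query is compared against every key in parallel; softmax
    weights decide how much of each value flows into the answer."""
    d = len(query)
    # all-pairs relevance scores: no funnel, every key considered at once
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
              for key in keys]
    # numerically stable softmax over the scores
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # weighted mix of the values, one output per value dimension
    out = [sum(w * v[i] for w, v in zip(weights, values))
           for i in range(len(values[0]))]
    return out, weights

# three hypothetical "suppliers" described by feature vectors
keys = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
values = [[10.0], [20.0], [30.0]]
query = [0.0, 1.0]  # the buyer's specific need
out, weights = attention(query, keys, values)
print([round(w, 2) for w in weights])  # best-matching supplier gets most weight
```

The second supplier, whose key matches the query exactly, receives the largest weight without any sequential filtering: relevance is allocated by comparison, not by the queue, which is the property the paragraph above borrows for commerce.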

Commerce is undergoing a similar transition. The funnel is not dead, but it is no longer the only architecture. A specialist firm does not need to win the broadcast round, then the shortlist round, then the pitch round. It needs to be intelligible enough that when a buyer describes a specific problem to a specific system, their relevance is obvious on the first pass. The twenty right clients for a small organisation do not travel through a funnel. They ask a specific question and get a specific answer.

Attention has always been scarce. The machinery for allocating it is what is rapidly changing.
