Insights tagged ‘AI’
-
Go to the actual place and see the actual thing
Somewhere in a Toyota plant in the early 1950s, a young engineer stood inside a chalk circle drawn on the factory floor. Taiichi Ohno, the architect of the Toyota Production System, had put him there with a single instruction. Watch. No clipboard, no agenda, just observe what ha…
-
Climbing the Claude ladder: from prompting to orchestrating
Most people using Claude are stuck on the first rung of a very tall ladder. They open a chat, type a question, get an answer, and move on with their day. Which is fine, but it’s a bit like buying a full workshop and only using the tape measure. I’ve spent the better part of a y…
-
The path to an agent-first web
For three decades, the web has operated on an implicit contract between the people who build websites and the people who visit them. You design pages for human eyes and organise information for human brains, monetising attention through ads, upsells, and sticky navigation patter…
-
Why AI models hallucinate
In September 2025, OpenAI published a paper that said something the AI industry already suspected but hadn’t quite articulated. The paper, “Why Language Models Hallucinate”, authored by Adam Tauman Kalai, Ofir Nachum, Santosh Vempala, and Edwin Zhang, didn’t just catalogue the p…
-
The trust problem that you already solved
Every developer who has spent time with AI coding tools carries the same low-grade anxiety. You ask the model to build something, it hands you back a file, and then you stare at it like a customs inspector wondering whether the suitcase has a false bottom. Line by line, function…
-
Yes, the models got dumber
In March 2023, GPT-4 could identify prime numbers with 97.6% accuracy. By June, that figure had cratered to 2.4%. Not a rounding error, not a minor regression, but a 95-point collapse on the same task with the same prompts. If a bridge lost 95% of its load-bearing capacity in th…
-
The sunk cost of being good at something
There is a particular conversational move that has become common in discussions about AI. Someone demonstrates a new capability, shares a use case, or describes how their workflow has changed, and a familiar response arrives. What about security? What about governance? What abou…
-
The flatness of the machine
You can feel it before you can name it. A paragraph arrives, fluent and frictionless, and something in the back of your reading brain flinches. The sentences are grammatically flawless, the structure orderly, the tone warm but not too warm, authoritative but not too authoritativ…
-
Meta’s GEM: what the largest ads foundation model means for your marketing
Meta has been quietly building something significant. Most marketers haven’t fully grasped the importance because it has been wrapped in machine learning jargon and engineering blog posts. The Generative Ads Recommendation Model, which Meta calls GEM, is the largest foundation …
-
The narrow window for probabilistic agents
You can see the exact moment it goes wrong. The CIO sits through a vendor demo, watches an “AI agent” process a support ticket, look up an order, apply a returns policy, issue a refund, and send a confirmation email. It is slick, fast, and in every meaningful way, a workflow aut…
-
Your org chart is not your AI strategy
If you’ve spent any time in enterprise technology over the past two decades, you’ll recognise the pattern immediately. A new category of tool emerges. Employees start using it because it makes their working lives easier. IT discovers this unsanctioned adoption, panics about secu…
-
The machine that improves the machine
In May 2025, Google DeepMind released AlphaEvolve, an AI system that discovers better algorithms by evolving code through thousands of iterations. Within months, it had already optimised parts of Google’s data centre operations, improved hardware chip designs, and, most tellingl…
-
The vibe coding spectrum: from weekend hacks to the dark factory
A year ago, Andrej Karpathy posted a tweet that would come to define how an entire industry talks about itself. “There’s a new kind of coding I call ‘vibe coding,’” he wrote, “where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.” He d…
-
A $10K Mac Studio won’t replace your API bill
Caveat: this article contains a detailed examination of the state of open-source and open-weight AI technology that is accurate as of February 2026. Things move fast. I don’t make a habit of writing about wonky AI takes on social media, for obvious reasons. However, a post from an AI s…
-
Claude Opus 4.6 just shipped agent teams. But can you trust them?
Anthropic shipped Claude Opus 4.6 this week. The headline features are strong: a 1M token context window (a first for Opus models), 128K output tokens, adaptive thinking that adjusts reasoning depth to the task, and top-of-the-table benchmark scores across coding, finance, and l…
-
Out of context: strategies for managing agent memory
The ongoing contest in AI technology, a “strange arms race”, is the relentless expansion of the context window: the maximum input size a large language model can accept. This race is driven by the persistent notion that a larger context equals greater intelligence and capab…
-
Escaping prototype purgatory: where is AWS for AI agents?
This question has been running around my brain for a while, driven by two factors. First, building robust, production-ready enterprise agents that can handle scale, complexity and security is genuinely hard. Second, what if we could abstract away all of that comple…
-
The Hot Mess: large AI models and the scaling mirage
There is a chart circulating among machine-learning circles that, depending on your outlook, will either alarm you or confirm something you have long suspected about the computers that are, at this point, writing our code, summarising our meetings, and helping decide who gets ba…
-
Tooling around: letting agents do stuff is hard
Giving AI agents tools to work with is a messy business. This is particularly true now that the Model Context Protocol (MCP) has become the default way to connect AI models to external tools. That shift has happened faster than anyone expected, and faster than the security a…
-
Building a simple agent with Claude
This article covers how to build a simple AI agent using Claude, using a hypothetical sales function as a worked example. A sales team does not need a fully autonomous agent that orchestrates twelve tools and makes decisions about deal strategy. What it needs, at first, is a th…
-
How Claude Code and Cowork talk to your other systems
Anthropic’s products have become the most aggressive movers in the race to connect AI to the messy sprawl of software that runs modern businesses. Claude Code talks to GitHub, Sentry, Postgres, and Jira. Cowork reads your local files, pulls data from your CRM, and drafts message…
-
Security for production AI agents in 2026
Note: This article represents the state of the art as of January 2026. The field evolves rapidly. Validate specific implementations against current documentation. This article is for anyone building, deploying, or managing AI-powered systems. Whether you’re a technical leader e…
-
In the jungle: a reality check on AI agents
One of my all-time favourite films is Francis Ford Coppola’s Apocalypse Now. The making of the film, however, was a carnival of catastrophe, itself captured in the excellent documentary Hearts of Darkness: A Filmmaker’s Apocalypse. There’s a quote from the embattled director tha…
-
AI governance: between the committee and the catastrophe
Every large organisation deploying AI currently faces two failure modes. Moving too slowly by requiring extensive committee approvals and detailed risk assessments causes the technology to become outdated before it can deliver results. Conversely, moving too quickly by allowing …
-
AI slop: psychology, history, and the problem of the ersatz
In 2025, the term “slop” emerged as the dominant descriptor for low-quality AI-generated output. It has quickly joined our shared lexicon, and Merriam-Webster’s human editors chose it as their Word of the Year. As a techno-optimist, I am at worst ambivalent about AI outputs, so…
-
Why AI agents keep forgetting things, and the race to fix it
Ask ChatGPT something on Monday and return on Wednesday, and it will greet you with the warmth of a stranger. It has no recollection of your project, preferences, or the three hours you spent refining a prompt together. This amnesia is not a flaw in the traditional sense but a c…
-
Zero busy work
AI has given us a lot of things. When used incorrectly, your brain turns to mush. When used correctly, it frees you to be original, strategic and creative. Something I’ve been thinking a lot about lately is the idea of zero busy work. This isn’t just about productivity, but abo…