10 August
News
OpenAI does Open, and launches ChatGPT5
OpenAI aimed for two big splashes this week, releasing a set of open-source models for the first time since 2019 and launching GPT5.
OpenAI was supposed to be, well, open, but it stopped making its models public, as a matter of responsibility and principle, on the claim that this was too ‘dangerous’. However, first Meta and then a wave of Chinese companies have released a series of best-in-class open models, and so Sam Altman has been strategising: relevance might be more important than principles. GPT-OSS does nothing very different, but it comes in the standard range of sizes and weights and gets into the top 10 or so on most benchmarks: if you need to run and customise models yourself, and Llama is too far behind and you can’t use the Chinese models, then OpenAI is there for you, presuming of course you trust OpenAI not to change course again. Sam Altman is playing chess again.
GPT5 is a more complex story: it continues the steady improvement of LLMs and puts OpenAI back at the top of the benchmarks, but isn’t really a step change in capability, with one exception. Like most model releases these days, ‘GPT5’ is actually a family of large and small models, trading off cost and quality against speed, and running on a PC against running in a server farm. But that has given us the much-mocked ‘model picker’, where you’re supposed to know whether your task needs o3, o4, 4o or Mini Pro Plus. GPT5 is a system with a built-in router that decides for itself which underlying model to send your task to. See this week’s column. GPT-5, OPEN SOURCE
Google does video simulation
The third big story this week was Google’s release of Genie 3, which can generate realistic 3D worlds in real time as you move through them. AAA games have become enormously expensive to produce in recent years, with huge numbers of people building all of those elaborate environments by hand; it seems pretty clear that AI will automate a lot of that, at a minimum, and it may also lead to entirely new kinds of experiences, much as 3D itself or networking did. LINK
The week in AI
Google has been talking to advertisers about its plans to include ads in ‘AI Mode’ search results. LINK
Elon Musk also said he plans to include ads in results from xAI’s LLM Grok (the one that called itself ‘MechaHitler’), building on his success in attracting advertisers back to Twitter. LINK
Cloudflare (CDN) accused Perplexity of not just ignoring robots.txt requests not to crawl websites, but of hiding the identity of its crawlers to read websites that are actively trying to block it. LINK
Eleven Music’s latest generator is out, and worth playing with. I’m old enough to remember when synths destroyed creativity. LINK
News from autonomy
Tesla shut down its ‘Dojo’ project to build its own supercomputer to analyse driving data (which last year was supposedly worth tens of billions of dollars) - this might be because Elon Musk’s xAI has plenty of its own compute available on easy terms to Tesla. LINK
Meanwhile, Amazon’s Zoox got clearance to test in public. Zoox has the unusual approach to autonomy of making an entirely new vehicle, which I honestly don’t understand - once autonomy works then cars can be redesigned with no steering wheel, sure, but why spend the money on that in advance? LINK
And further out again, Joby (electric helicopters) is buying Blade, which does helicopter shuttles around NYC, planning to use it as a route-to-market. LINK
Apple gives Trump a trinket
In ‘Godfather II’, the Cuban representative of ITT gave President Batista a solid gold telephone. In 2025, Apple’s Tim Cook gave President Trump a piece of Corning Glass on a gold plinth. LINK, CUBA
Ideas
Google published a blog post pushing back on the narrative that its AI Overviews have slashed traffic to web publishers. There’s a lot of wiggle-room in the language it uses (referral traffic is ‘relatively stable’ - what does that mean?), but the more important underlying point is that the move to LLMs will change the distribution patterns. LLMs will change what kinds of searches people do and what kinds of sites they visit, not just shift traffic from publishers to Google. LINK
Yet another attempt to analyse jobs exposed to AI, this one from the EIG. Reading it, though, I’m struck by an interesting implicit assumption that’s common to a lot of this kind of work: that the more ‘in person’ and ‘physical’ a job is, the harder it is to capture with AI. In particular, it suggests that physical trainers are much harder to automate. But couldn’t we have personalised AI physical trainers, at scale, with generative video and live analysis of what you’re doing? Maybe that will convert to AI much faster than some ‘desk-based’ jobs? This reminds me a little of the notorious attempt to analyse the TAM for Uber by calculating the TAM for taxis: disruption changes our definitions of the market. LINK
Shopify has always wanted to move up the stack from commodity SaaS provider to build a network that can make recommendations and drive traffic to merchants. Now AI makes that a lot more complex. LINK
Analysing the effect of the AI capex boom on the broader US economy. LINK
Apollo’s first move into data centres as real estate investment. LINK
Meta’s paper on how it uses AI to rewrite ads to get better response rates. LINK
The former head of EY UK says AI will be big for new entrants, and takes board seats at new entrants. LINK
It’s very unclear what AI means for consultants and accountants, but ‘machines that can write code’ are clearly a very big deal for the ~$300bn Indian IT outsourcing industry. LINK
A taxonomy of LLM ‘hallucinations’. LINK
Why Apple cares about F1. LINK
Amazon’s ‘ad-stick’ retail advertising dongle. LINK
Russia and Ukraine are using huge numbers of cheap battlefield drones but are still mostly in stalemate. How do drones change from a tool in existing force structures and tactics to a new way to win? LINK
Outside interests
I decided to bury the most important story of the week down here - AOL is discontinuing its dial-up internet service. LINK
Data
A survey of AI use by newsrooms. LINK
Sky News got data on UK ‘de minimis’ imports from China - the loophole just closed by the USA that let Shein and Temu ship directly from China to the consumer without paying import duties. The UK figure in 2024 was about £6bn (~$8bn), compared to a bit over $30bn for the USA. LINK
Column
AI products
It seems to me that in the last two years, generative AI has developed in two ways. On one hand, the models get ‘better’, and on the other, the labs try to wrap things around them - ‘thin GPT wrappers’ - to turn them into products. If you’re not an AI researcher, GPT5 does nothing very interesting for the first of these - it continues the steady, incremental progress we’ve seen since 2022. But it represents a pretty useful step towards a product.
I’ve written a few times that LLMs are a raw technology that looks like a product. Because the nature of the technology itself is that you can type in natural-language questions and get answers, you don’t need to be an engineer or have any training to start using them, unlike, say, SQL. You can just ask for what you want!
But the reason we have thousands of different ‘database’ products for different use cases, instead of just one, is not only that SQL is not based on natural language - it’s that those use cases work a lot better when they have dedicated UI, tooling, connections and decisions about how they should work wrapped around them - when they become a product. We don’t have GUIs only because C++ and SQL are hard - we have GUIs because it’s hard to work out what the task is and how it should work from scratch at a blank screen. The GUI represents decisions and institutional knowledge about the problem and the task.
I think that means that LLMs will mostly become APIs and features embedded inside products built by other companies, and hundreds or even thousands of companies are trying to do this now. But it also means that the model companies themselves have spent the last two years trying to add that tooling, in new LLM-native ways, to turn the raw chatbot itself into a product. So, we have connectors to Google Docs and similar services, screen sharing, web search, web use, memory, and of course ‘agents’ - all sorts of ways to extend what the LLM can do and extend what you can tell it.
However, the core of everyone’s daily experience with ChatGPT, Claude, Gemini, and all the others has not been that row of inscrutable little buttons at the bottom of the prompt or hidden in a hamburger menu - the core is the model picker. This absurd and much-mocked piece of GUI is the perfect expression of how thin the GPT wrapper is and how close we still are to the raw tech. The labs have spent the last two years making better models, but part of that has meant making different models that work in different ways, and that are better suited to different kinds of questions - but how is any normal person supposed to know which? Why should I have to know that I’d have got a better result if I’d given my query to o3 instead of 4o? When we use Google, we don’t have to choose which version of the search model to use, after all: this should be abstracted away. And now, finally, ChatGPT is abstracting away the model picker.
There’s an old principle in UI theory that a computer should never ask you a question if it ought to be able to work out the answer for itself. An LLM should be able to work out which model to use. This is a step towards a product.
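As a footnote, here is a minimal sketch of what ‘routing’ means in practice. This is not OpenAI’s implementation - the model names are illustrative and the complexity scorer is a toy heuristic standing in for what would really be a learned model - but it shows the product point: the request comes in, the system estimates how hard it is and picks an underlying model, and the user never sees a model picker.

```python
# Hypothetical sketch of a model router: one prompt in, and a routing step
# decides which underlying model should handle it. Model names and the
# complexity heuristic are illustrative only, not any lab's real system.

from dataclasses import dataclass


@dataclass
class Route:
    model: str       # which underlying model to call
    reasoning: bool  # whether to enable a slower 'thinking' mode


def estimate_complexity(prompt: str) -> float:
    """Crude stand-in for a learned router: score 0..1 from cues in the prompt."""
    cues = ["prove", "step by step", "debug", "plan", "analyse", "compare"]
    score = min(len(prompt) / 2000, 0.5)                          # longer prompts look harder
    score += 0.5 if any(cue in prompt.lower() for cue in cues) else 0.0
    return min(score, 1.0)


def route(prompt: str) -> Route:
    """Pick a model so the user never has to."""
    score = estimate_complexity(prompt)
    if score > 0.6:
        return Route(model="big-reasoning-model", reasoning=True)   # slow, expensive, better
    if score > 0.3:
        return Route(model="default-model", reasoning=False)        # the everyday default
    return Route(model="small-fast-model", reasoning=False)         # fast and cheap


if __name__ == "__main__":
    for p in ["What's the capital of France?",
              "Debug this race condition and plan a fix step by step"]:
        print(p, "->", route(p))
```

In the real thing the router is presumably trained on usage signals rather than a handful of keyword rules, but either way the question ‘which model?’ gets answered by the machine, not the user.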