28 September 2025
News
OpenAI + Nvidia
Nvidia will invest up to $100bn in OpenAI over the next few years, with the cash going towards Nvidia chips for data centres with 10GW of capacity, which could cost $500-600bn in total. This is in addition to the $300bn, 5-year deal that OpenAI announced with Oracle earlier this month.
Taken together (and at face value), this is an ambition for OpenAI to spend as much on infrastructure as any of the hyperscalers are spending now, or more: Microsoft will probably spend $100bn in 2025 and Google $85bn, and that’s with huge existing cloud businesses and cash flows. OpenAI wants to join the club. For Nvidia, this looks a lot like vendor financing (apparently OpenAI might lease rather than buy the chips), a popular option in the telecoms bubble and, like a lot of the other deals it’s done lately, a way for it to turn its exploding cash flow into a stimulus for more AI and more cash flow. This is also, though, a form of leverage, and leverage goes well when things go well and tends to unwind painfully when they don’t.
Also - remember Microsoft? LINK
OpenAI makes its first product?
Something like 10-15% of US consumers are daily active users of generative AI chatbots, so everything is still to play for. But how do you differentiate when the underlying models keep converging on roughly the same experience? You can add incremental features, but those are easily copied. You can go for distribution (hence all the browser assistants, and see Google adding Gemini to all its surfaces). And you can try to unbundle the core experience into products dedicated to specific use cases. Indeed, it’s reasonable to argue that the raw LLM chatbot itself isn’t really a product, only a technology, and that we need a product on top. So, now we have OpenAI Pulse, an AI assistant that you’re supposed to connect to all of your other platforms, and that can proactively make suggestions. This looks a lot like Google Now from a decade ago, except that now LLMs could make it work. But it’s also a symbol of all the ex-Meta product and growth people flooding into the company right now. LINK
TikTok resolution?
After almost a year of uncertainty, Donald Trump now claims to have a TikTok deal. We don’t quite know what that is, however. TikTok’s US operation will be transferred to a new JV with a majority of US investors, including Oracle, which will run it. Bytedance will retain a share of no more than 20%. However, the US Vice President, JD Vance, said that the JV would be valued at only $14bn, while most estimates ranged anywhere from $20bn to $100bn. Bloomberg reported one possible answer: that Bytedance will get 50% of the profits. LINK, VALUATION
Meanwhile, this means Larry Ellison is now (almost) a major power player, in both tech and US politics. Oracle is an important part of a lot of big-company legacy tech stacks, but it hasn’t been relevant to any broader tech conversation since the 90s - now it’s becoming a significant AI cloud provider, and will control a major US online media platform with the power to shape what Americans see (the reason that Bytedance is being forced to sell in the first place). His son, David Ellison, owns Paramount and CBS and is reportedly about to bid for Warner (which includes CNN). That poses a lot of fun anti-trust questions. LINK
The week in AI
Distribution, distribution, distribution: last week Google added Gemini to Chrome, and this week it’s adding it to Google TV. The launch video does a pretty good job of showing realistic use cases. LINK
Meanwhile, Google Labs launched an LLM-enabled mood-board tool. LINK
Apparently, Meta is now working on software for humanoid robots, as an ‘AR level’ bet (meaning tens of billions of dollars). LINK
OpenAI is hiring an ad team. LINK
And running its first brand ads. LINK
Jony Ive may, perhaps, be working with OpenAI on a new device, but meanwhile his firm LoveFrom continues to create random luxury objects, free of all the constraints and commercial imperatives that the people there spent their careers working within. This week, a limited-edition sailing lantern for $4,800. Lifestyle business? LINK
Accenture reported $1.8bn of new ‘generative AI’ bookings in the last quarter. It will also lay off 11k people that it thinks can’t be ‘upskilled to AI’, which got a lot of attention, but that’s only 1.4% of headcount, and ‘AI’ might just be a convenient cover. LINK, LAYOFFS
Amazon pays $2.5bn for dark patterns
Amazon’s checkout flow used to be full of deceptive ‘dark patterns’ that tried to trick you into subscribing to Prime without realising you were doing it (I remember posting on Twitter about how customer-hostile this was). The FTC eventually sued, and this week, just as the case was about to go to court, Amazon settled for $2.5bn. LINK
The New York network plot?
Apparently, US police stopped a plot to swamp US cellular networks during the annual ‘UN Week’ conference season, finding 200 SIM servers and 100k SIM cards. LINK
Ideas
OpenAI published a paper trying to create a library of discrete tasks done by expert, experienced white-collar workers, and then to benchmark LLMs against them. Conclusion: AI will reach parity with industry experts sometime next year. There’s a profound naivety in these kinds of analyses, which act as though you can reduce the job of someone in their mid or late 30s to ‘how well did they make that PPT/XLS/DOC?’ and ignore everything else they do, and why they do it, and indeed what exactly went into that document. It reminds me of the joke about the physicists who are asked to predict which horse will win a race, and say “First, we presume the horse is a perfect sphere…” See the next link. LINK
A detailed analysis of why Geoff Hinton (godfather of machine learning) was completely wrong when he said that ML would replace radiologists. Expert work is almost never about how well you process a file. LINK
Apparently Brussels wants to try to clear up the cookie mess. Meanwhile, Google and Apple have mounted a lobbying push to get Brussels to understand just how many trade-offs and costs have come with the DMA. Yes, this is self-serving, but in every other kind of policy we understand that there are trade-offs, and tech is no different. COOKIES, APPLE, GOOGLE
Unilever is spending half its ad budget with influencers. Trying to, anyway… LINK
A profile of MrBeast’s efforts to turn $450m a year in revenue into a profitable company that might IPO. LINK
‘Foom’ - a continuous feed of interesting AI-generated videos. LINK
Half the population of the Philippines uses online gambling, which generates $1bn of fees for the government, and plenty of human wreckage as a consequence. LINK
The on-device models in iOS 26. LINK
Today in hacking - a criminal group hacked a UK nursery chain, stole information about the children it looks after, and published that information on the web as part of an extortion plan. LINK
Outside interests
A medieval church on stilts. LINK
Vertical Art - the Eddi van Auken cane collection. Seriously. LINK
Data
Sometimes micro data is the best - how much is the pet food industry using AI? Not much, yet. LINK
IAB Europe on ad industry deployment of AI. LINK
Meta has a $10bn business messaging business. LINK
Deloitte and BCG studies on corporate AI deployment challenges. BCG, DELOITTE
Bain’s annual state of tech report. LINK
And useful Deloitte data on US consumer AI adoption. LINK
Someone leaked an Andreessen Horowitz LP deck: $25bn returned to LPs since founding and the 2012 fund had a 9x return (the industry aims for 3x minimum). LINK
According to Similarweb, ChatGPT is 20% of Walmart’s referral traffic. Great headline! Context, though: referral is 5% of Walmart traffic. "ChatGPT, what is 20% of 5%?" LINK
Pew released a new report on consumer use of and attitudes to ‘AI’ in the USA. However, it makes no attempt to define ‘AI’ or distinguish the current wave of generative AI systems, meaning that much of the data is worthless. Hence, the last page claims that over half of Americans were weekly ‘AI’ users in December 2022 (ChatGPT launched in November). This kind of thing is widespread and frustrating: if you have data on AI use but do not define AI or use (Daily? Monthly? Once?), you have no data. LINK
Column
AI PhDs and the measurement problem
People in Silicon Valley are prone to the disease of thinking that they’re the only clever people, and that their jobs are the only hard jobs. It’s a little like the way first-year undergraduates think, or pretend to think, that their subject is the only hard one and the others are easy. You’re supposed to grow out of this at about the age you can buy alcohol, but in the SF Bay Area you find plenty of people in their 30s and 40s who never did. Of course, that’s the attitude that gets you Uber, but it’s also the attitude that gets you WeWork.
Meanwhile, one of the basic conceptual challenges in AI research is that we don’t have a coherent and systematic way to measure human intelligence, nor an equivalent way to measure machine intelligence. We have dozens or hundreds of ways to measure something, but they’re effectively the equivalent of putting something into a black box and then looking at what comes out - you don’t really know what the box did, or why, or what it was trying to do.
Part of the problem is knowing what you’re measuring in the first place. There’s a whole class of experiments into animal intelligence that, it was eventually realised, were mostly testing how close the animal is to a primate. “Is this creature failing the test because it lacks cognition, or because it lacks opposable thumbs, binocular vision and fine motor skills?”
Conversely, does your test show that your subject is very good at something that isn’t really intelligence, or perhaps a different kind of intelligence to whatever you think you’re testing? In the famous case of ‘Clever Hans’, a man thought he’d taught his horse to do simple maths, tapping out the answer with its hoof, but in fact the horse was reading his body language and tapping until he relaxed.
So. A calculator can do superhuman maths, and a database has superhuman memory. They far surpass human performance, but we don’t call them ‘intelligent’. They do things that computers are good at and people are bad at. A year or two ago, LLMs started passing professional exams - famously, the US bar exam. Does that mean they have some kind of intelligence equivalent to a law student? (Set aside the question of whether the training data contained the exams.) Not really: they’re doing information retrieval and pattern matching. They’re doing what machines are good at, albeit in a new form, but they can’t do all the other things a first-year law student can do, even in law. A week ago Demis Hassabis made exactly this point, talking about ‘competitors’ who now claim that their LLMs have ‘PhD-level ability’ - he called this ‘a nonsense’. Just ask the question a different way, and watch the model collapse. Like a database or a calculator, they can do some things that a PhD can do, and do them faster and without getting tired, but that doesn’t make them a PhD. There is more to it than that.
It seems to me that this week’s paper from OpenAI falls into the same fallacy. Make a set of ‘work products’ for people in their 30s in a range of professions - a deal memo, a sales forecast, a performance review. Now ask ChatGPT to make them. Is the AI version 75% as good? Well then, we’re at 75% of human expert level! Really? Is that what you measured? Is that what you think the job is? Do you think a VP or SVP is judged by how well they make work product? As I wrote above, this reminds me of the joke about the physicists who are asked to predict which horse will win a race, and say “First, we presume the horse is a perfect sphere…” You’re certainly measuring something, and it’s certainly getting better quickly, but you don’t have 75% of a VP of HR, and you don’t really know how far away that is. There is a profound confusion here: first, in pointing at an exponential curve and just presuming it will remain on the same trajectory, and second, in thinking that if you’ve labelled a point on that chart ‘humans’ then that’s what your benchmark is measuring, and what your curve will reach.
Indeed, I think it’s much more interesting to ask what happens if we get systems that are 10x or 100x what an expert can do on one axis, but remain quite incapable of everything else involved in the job. How would that change things? After all, that’s how this has always worked before.