9 March 2025
News
Siri 2 officially delayed
Apple confirmed that, as widely suspected, it will delay its relaunch of Siri until at least the summer and possibly much longer. Apple showed a bunch of demos at WWDC last summer, and this was supposed to have launched by now, but the technology isn’t ready, and it’s debatable whether Apple really has the capability to do this at all.
The Siri demos we saw last year were the best of Apple, turning technology into something useful. You could ask Siri ‘is my mother’s flight on time’ and it would check texts and emails to find the right flight, check that it’s the flight today and not the one last year, and use the web to see if it’s on time. Apple called this ‘personal context’, and it leverages access to individual user data that only Apple and Google really have (OpenAI, in contrast, has none of this). It’s also a lot more than just making a good LLM (or downloading Llama) - you need a probabilistic system to talk to dozens of deterministic systems and produce coherent answers. Really, Apple proposed an agent model, running on edge compute, plugged into a dozen legacy data types, and who does have that working?
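To make that shape concrete, here’s a minimal sketch of the pattern - a model orchestrating deterministic lookups and composing one answer. Every function name and data point below is hypothetical illustration, not Apple’s actual architecture.

```python
# Hypothetical sketch of the 'personal context' pattern: a probabilistic model
# orchestrating deterministic systems (an on-device message index, a flight-status
# service) and composing one coherent answer. All names and data are made up.
from dataclasses import dataclass
from datetime import date


@dataclass
class Flight:
    number: str
    departs: date
    contact: str


def search_personal_data(contact: str) -> list[Flight]:
    # Deterministic lookup: scan an on-device index of texts and emails.
    # Dummy results standing in for whatever a real index would return.
    return [
        Flight("BA117", date(2024, 6, 2), contact),  # last year's trip
        Flight("BA175", date(2025, 3, 9), contact),  # today's flight
    ]


def resolve_candidates(flights: list[Flight], today: date) -> Flight:
    # The 'probabilistic' step: the model has to judge which mention the user
    # means. Here that judgement is reduced to a trivial nearest-date rule.
    return min(flights, key=lambda f: abs((f.departs - today).days))


def check_status(flight: Flight) -> str:
    # Deterministic lookup: a flight-status web service, stubbed out here.
    return "on time"


def answer(query: str, today: date) -> str:
    candidates = search_personal_data(contact="mother")
    flight = resolve_candidates(candidates, today)
    status = check_status(flight)
    return f"Your mother's flight {flight.number} is {status}."


print(answer("Is my mother's flight on time?", today=date(2025, 3, 9)))
```

The hard part is the middle step: resolving that kind of ambiguity reliably, across dozens of such systems, on the device, at iPhone scale.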
But the other side of Apple recently is that more and more things have slipped later and later, and don’t necessarily work as promised. A lot of people even in Cupertino are baffled by the decision to ship the Vision Pro, which remains a beautiful and technically impressive dev kit with no devs and no mainstream use cases (not that anyone else really knows the xR use cases either). “Siri 2” isn’t that and it isn’t Apple Maps (after all, that shipped), but Apple doesn’t normally pre-announce anything that’s not done, and it looks like the company gave in to pressure to tell an AI story too soon. ANALYSIS, SIRI DEMO
Manus gets buzz
This week’s buzzy AI thing is Manus, which launched a very cool demo of an AI agent that can apparently process a bunch of complex multi-stage tasks, including ‘Operator’-like use of third-party websites. It’s based (or at least domiciled) in Singapore, without much clarity about backers, but there are some suggestions that under the hood it’s really Anthropic’s Claude with a bunch of other tools bolted together. Meanwhile, the demo use-cases are indeed cool, but they remind me painfully of Rabbit in their breezy and slightly suspicious over-simplification - ‘arrange a two month holiday in five countries!’ If the demo is real, though, and if it’s their model, then this is a little bit of another DeepSeek - in the sense that it demonstrates the increasing commodification of this technology while the real use-cases are built elsewhere. LINK
OpenAI explores ‘salaries’ for agents
OpenAI is apparently considering tiered pricing for Deep Research style agents - $2k/month for work equivalent to ‘high income knowledge workers’ up to $20k/month for ‘PhD-level research agents’. Setting aside general amusement at the comparison with actual PhD incomes, Deep Research today consistently makes mistakes that would be unforgivable in a summer intern. More fundamentally, presuming that gets solved (and today we don’t know if it will), it seems very unlikely that automation will end up costing more than people rather than less, especially if (see Manus) there aren’t really any moats. LINK
Coreweave IPO
Coreweave is a cloud GPU company that got a big allocation (250k units) from Nvidia as a way for Nvidia to manage the market, and has gone from $229m in revenue in 2023 to $1.9bn in 2024 (and operating cashflow of $2.9bn). Microsoft was 62% of that and the next two customers (of which one is probably Meta) were another 11%. Now it’s filed for IPO, and the risk factors section is a thing of wonder. There’s some tech analysis to be done here, but to me this is a piece of financial engineering as much as it is a company. LINK
Coding is hot
Anysphere, an LLM coding company, is in talks to raise ‘hundreds of millions’ at a $10bn valuation... up from $2.5bn in January. Yes, January 2025. Apparently it’s already at a $100m revenue run rate in just 12 months, which makes that valuation a little easier to grasp. Coding is one place where LLMs really do have product/market/revenue. LINK
To this point, a YC partner said that a quarter of startups in YC’s current cohort have codebases that are 90% AI-generated. LINK
Meanwhile, Anthropic raised another $3.5bn at $61.5bn. Pretty soon they’ll be a big company. LINK
Sesame voice
Sesame, co-founded by Brendan Iribe from Oculus, is demoing some pretty impressive voice synthesis - natural language, emotion and emphasis are pretty much done. The use cases (‘artificial friends’) seem rather more speculative, but then frontier companies often have little idea what the tech will really be used for. LINK
The week in AI
Google is extending its AI search overviews. LINK
Microsoft now offers lightweight versions of DeepSeek R1 to run on your AI PC. LINK
TSMC says it will invest a further $100bn in US manufacturing by the end of the decade. LINK
Singapore arrested some grey market traders, who buy Nvidia GPUs in Singapore and forward them to China, breaking US sanctions. Singapore is 18% of Nvidia revenue but only 2% of use, so there’s a lot more of this going on. LINK
Robots are hot
The Information reports that Figure AI, which uses AI to make humanoid robots that don’t fall over, is raising $1-2bn at a $40bn valuation. Meanwhile, The Information reports that Larry Page has started a new company using AI for robotic manufacturing. FIGURE, PAGE
Apple versus UK spooks
It emerged a few weeks ago that the UK government had issued a secret order that Apple create a ‘backdoor’ to its encrypted cloud backups, to be applied globally. Apple refused and turned off the feature in the UK, and now it’s going to court. LINK
Tiktok tick-tock
Donald Trump postponed the forced sale of Tiktok (or at least claimed to - the legality of his action is unclear) to give more time for buyers to emerge, and this week Reddit co-founder Alexis Ohanian joined a consortium. However, there’s a steady flow of reports that the company isn’t actually engaging with the sale process - it hasn’t hired bankers or talked much to any of the would-be buyers. Last time around, Bytedance just called the bluff and dared the US to shut it down. BUYERS, PROCESS
America’s former allies look at Plan Bs for defence tech
With the US cutting off intelligence to Ukraine this week as Trump presses it to surrender to the Russian invasion, Ukraine’s reliance on Starlink for much of its battlefield comms is now a vulnerability: what if that gets cut off too? Ukraine is looking at switching to Eutelsat, which doesn’t have as big a constellation as Starlink, but can probably handle this.
More generally, since the US has spent the last few weeks systematically and explicitly breaking its alliances, any developed country with dependencies on US military systems is now thinking about contingency plans. The UK is now discussing a fallback position if the US cuts off Trident, and Germany is wondering if it’s safe to buy F35s when the US might cut them off. Some of this is a narrative from European defence contractors, and any real change would be very painful (both for national budgets and US defence contractor revenues), but the only clear message from DC in the last few weeks is that you’d be unwise to depend on America for anything. That will also make life harder for Anduril, and boost non-US defence-tech. EUTELSAT, TRIDENT 1, TRIDENT 2, GERMANY
Lockheed’s ‘affordable mass’
Lockheed Martin announced a new modular cruise missile, designed for rapid mass-production at much lower cost - $150k versus existing equivalents at 10x the price. The Russian invasion of Ukraine has concentrated minds on the reality that real wars against near-peers (which the US hasn’t fought since Korea, if not WW2) use enormous quantities of munitions, which conflicts with the military tendency to buy smaller and smaller numbers of more and more sophisticated and expensive systems. Ukraine produced roughly 2m drones last year. This is of course the Anduril thesis - but, see above. LINK
Ideas
DeepSeek shifted the vibe in Chinese tech. LINK
China’s ‘Six Little Dragons’ - games, robots and of course DeepSeek. LINK
DeepSeek spent this week releasing a lot of new open source tools. LINK
Yet another story on how Microsoft would like its own AI without relying on OpenAI, and struggles to get them to co-operate. LINK
A Google blog post on personalisation and ecommerce - which also mentions that Google is now seeing 5tr searches per year. LINK
Profile of Jessica Lessin, the journalist-turned entrepreneur behind The Information. LINK
Sigma, a Japanese camera company, launched a new high-end product, the ‘BF’, which is a nice expression of the way obsolete product categories can sometimes retreat to luxury. LINK
Anthropic wrote a doomery recommendation for the US AI action plan. LINK
Outside interests
A beautiful sale from the Chinese Imperial Wardrobe. LINK
A minor masterpiece on sale in Hampstead. LINK
Data
Denmark’s post office will stop delivering letters, as volumes have fallen too far. LINK
Amazon’s 750k warehouse robots. LINK
OpenAI and its alumni have now raised $100bn. LINK
YouTube Music has 125m subs. LINK
Column
AI and creativity
It’s easy to see how AI will help creative people with automation, just as has happened dozens of times in the past. Boring, labour-intensive grunt work will get automated - most obviously, the VFX industry will have massive cost deflation. Translation and dubbing will be much better, and flatten the world. It will be much easier and cheaper to fix mistakes, or even swap out characters. Creative tools of every kind will get a lot cheaper and more accessible. One way or another, this is another wave of a process we’ve seen before. Thirty years ago, Kevin Smith signed up for a cooking course so that he could get a student discount on film stock, and today he could use an iPhone and get better images.
After that, though, things get a bit fuzzier. How much further up the creative chain can generative AI go? LLMs can suggest ten ideas for action-comedies with two mismatched cops who have to work together. They can suggest ten new Marvel movies, and even write the scripts. But would it suggest a pirate movie based on a theme park ride? Would it cast Johnny Depp, and suggest he do an imitation of Keith Richards?
That is, the other side of making creative tools cheaper, more accessible, and more powerful is that you need someone creative to tell the tool what to do. I can ask Runway or Midjourney for images, but what images, and which results do I choose? I can buy the same camera that Cartier-Bresson used, and go to the same places, but I won’t get the same pictures. And I have an iPhone, but that doesn’t make me a filmmaker. The AI can make any picture but it doesn't know what picture to make.
There’s an interesting science question here. AlphaGo created new kinds of moves and strategies in Go, based on working the game out from first principles, but the game had a scoring system. AlphaGo could invent millions of strategies and then see which ones led to victory, by playing against itself. There was a built-in feedback loop. What’s the feedback loop for movies, and for generative AI in general? It can see how much the result matches what's already been done, but is that what you want?
Conversely, the fundamental nature of generative AI is in matching the average. It says ‘what would most people probably say here?’ The score is ‘how well does this match the average?' which means, really, how well it matches the mediocre. The new, different and original will get a low score, and creativity means things that are new, different, original and good. AlphaGo had a mechanism - an automated, scalable mechanism - to tell it what was good. How would you do that in media?
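Put in rough notation (my framing, not anything from DeepMind or the model labs), the difference is between optimising against an explicit reward and optimising fit to what already exists:

```latex
% Illustrative notation only.
% AlphaGo: choose the policy \pi that maximises expected reward from self-play,
% where R(\tau) = 1 for a won game and 0 for a lost one.
\[ \pi^{*} = \arg\max_{\pi} \; \mathbb{E}_{\tau \sim \pi}\left[ R(\tau) \right] \]

% A generative model: choose parameters \theta that maximise the likelihood of
% the existing corpus D, i.e. how closely output matches what has already been made.
\[ \theta^{*} = \arg\max_{\theta} \; \sum_{x \in D} \log p_{\theta}(x) \]
```

The first objective can be computed automatically, at scale, with no humans involved; the second only measures resemblance to the corpus.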
Hence, it’s pretty easy to see that generative AI can create generic hip-hop or generic punk rock. It could make songs that sound like the Sex Pistols. But how would it invent punk or hip-hop and the culture they came from? How would it know that people felt a certain way in the 1970s, and wanted to feel another way, and that punk could express that? It might suggest 50 ways people felt, and 50 ways to react to that, but how would it know what was good?
Narrowly, that might be a ‘human in the loop’ issue. A model asked to brainstorm ideas for a series of children’s books might well propose that you combine the boarding school story genre with the wizard genre and add the ‘lost prince’ trope, as one of a list of 50 or 100 other ideas - but a human would still have to spot which of those was Harry Potter.
But stepping back further, it’s always useful to think of LLMs (and indeed all internet systems down to Google or Instagram) as vast Mechanical Turks - they work by capturing the insights of humans. So how do you plug an LLM back into the Mechanical Turk to get that feedback?