12 April 2026

News

Anthropic’s Security model

In the last couple of months, it’s been clear that agentic AI will automate broad classes of software development, and this week Anthropic announced a new model, Mythos, that demonstrates the other side of this - Mythos is very good at finding security exploits automatically. In particular, it’s good at searching across many different parts of a complex system to find multiple different weak spots that can be exploited and chained together to break in.

I often compare AI to interns (or associates), and that’s a comparison in two parts: the first part is to imagine you have a million interns, and so now you can have an intern listen to every customer call and tell you if the customer is angry; but the second part is to imagine you have one intern who can listen to a million customer calls at once. That kind of ‘intern’ can give you insights that no human ever could, not because they’re more ‘clever’ but because of that scale. Mythos is that intern: a security researcher that can check everything all at once, and remember everything and make every connection.

So, Anthropic says that Mythos already found a whole bunch of exploits in lots of widely deployed software, including one in FreeBSD that’s been there for 27 years that would let anyone crash it. Accordingly, it’s not making Mythos publicly available yet, because it doesn’t want just anyone to get that kind of hacking capability - instead, it announced a partnership project (‘Project Glasswing’) with other major tech companies to use this to fix their systems. Anthropic has a track record of making apocalyptic and somewhat questionable statements, but this one is getting a lot more credence from security people, and it’s ironic that the US Department of Defence claims Anthropic is a supply chain threat when the company now claims to have the capability to hack anything on Earth, more or less. LINK, OPINION

Anthropic coding blows the doors off

Meanwhile, Anthropic did a deal to buy TPU AI accelerator capacity from Google and its own chips from Broadcom. The announcement includes the fact that the company now has annualised revenue (past four weeks multiplied by 13) of $30bn, up from $9bn at the end of December 2025, $14bn in February, and $19bn in March. I don’t like this ‘annualising’, but on a monthly basis, that means the company has gone from ~$75m monthly revenue at the beginning of last year to close to $2.5bn last month.
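To make the ‘annualising’ arithmetic concrete, here is a minimal sketch in Python. The function names are my own, invented for illustration; the only inputs are the run-rate figures quoted above, and the conversions assume nothing more than 13 four-week periods or 12 months in a year.

```python
# Sketch of the 'annualised revenue' arithmetic quoted above.
# Function names are hypothetical; figures are in billions of dollars.

FOUR_WEEK_PERIODS_PER_YEAR = 13  # 52 weeks / 4
MONTHS_PER_YEAR = 12


def annualise(four_week_revenue_bn: float) -> float:
    """Scale one four-week period up to a yearly run rate."""
    return four_week_revenue_bn * FOUR_WEEK_PERIODS_PER_YEAR


def implied_four_week(annualised_bn: float) -> float:
    """Recover the four-week revenue behind an annualised figure."""
    return annualised_bn / FOUR_WEEK_PERIODS_PER_YEAR


def implied_monthly(annualised_bn: float) -> float:
    """Express an annualised figure as an average calendar month."""
    return annualised_bn / MONTHS_PER_YEAR


# Run rates quoted in the announcement, in $bn.
quoted_run_rates = {"Dec 2025": 9, "Feb 2026": 14, "Mar 2026": 19, "Now": 30}

for label, run_rate in quoted_run_rates.items():
    print(f"{label}: ${run_rate}bn annualised, "
          f"about ${implied_monthly(run_rate):.2f}bn a month")
```

The point of writing it out is that an annualised number is just one recent period multiplied up: it says nothing about what the company actually booked over the previous twelve months.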

Almost all of this is software development, where (see above and below), agentic coding uses vast amounts of tokens for vast amounts of money that software developers are willing to pay given the productivity gains. It is pretty widely reported that this inference itself is profitable, but there are lots of complaints that Anthropic is capacity-constrained (hence the infra deals) and meanwhile training the next model is a money pit (and each foundation model only remains relevant for six to nine months at best).

This revenue also puts Anthropic ahead of OpenAI, based on its last public claim of $2bn monthly revenue at the end of March ($24-25bn annualised). However, these are all rounded numbers, and more importantly, the two companies don’t recognise revenue in the same way: according to The Information a few weeks ago, when their models are resold by cloud partners, OpenAI records only its cut as revenue, whereas Anthropic records the total sale as revenue and then deducts the cloud providers’ share as a cost. LINK

For context, Amazon’s 2025 shareholder letter says AWS has $15bn in annualised AI revenue - i.e. ~$1.2bn monthly revenue. (The letter also points to Amazon’s launch of same-day delivery in broad parts of rural America, which would have been front-page news before ChatGPT.) LINK

Tokenmaxxing

The sheer amount of tokens you can consume if you’re spinning up squadrons of agents, automating dozens of tasks, can get intoxicating: someone at Meta started an internal leaderboard for which engineers were using the most - apparently over 20 days the total usage was 60tr, and the highest-ranked user had burnt 281 billion, which could cost anything from hundreds of thousands to millions of dollars. The board was taken down after it was reported. Meanwhile, Visa says its employees are using nearly 2tr a month. This reminds me a bit of showing off your mobile data bill (though at vastly greater scale), and some of it is just performative, but it’s also about how fast you are (or can say you are) adopting something that’s clearly transforming your industry, which is also a recruiting tool. LINK, VISA
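As a rough sanity check on ‘hundreds of thousands to millions of dollars’, here is a small Python sketch. The token counts come from the report above; the two prices per million tokens are purely hypothetical bounds that I have picked for illustration, since the real blended price depends on the model, the input/output mix, caching, and whatever discount Meta gets.

```python
# Back-of-envelope cost of the leaderboard numbers quoted above.
# The prices below are ASSUMPTIONS for illustration, not real list prices.

TOTAL_TOKENS = 60_000_000_000_000   # 60tr tokens over 20 days (reported)
TOP_USER_TOKENS = 281_000_000_000   # 281bn tokens for the top user (reported)

# Hypothetical blended prices in dollars per million tokens.
HYPOTHETICAL_PRICES = {"cheap blend": 1.0, "expensive blend": 15.0}


def token_cost_usd(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost in dollars of a token count at a flat per-million price."""
    return tokens / 1_000_000 * usd_per_million_tokens


top_user_share = TOP_USER_TOKENS / TOTAL_TOKENS
print(f"Top user's share of all usage: {top_user_share:.2%}")

for label, price in HYPOTHETICAL_PRICES.items():
    cost = token_cost_usd(TOP_USER_TOKENS, price)
    print(f"{label} (${price}/m tokens): ${cost:,.0f}")
```

Under those assumed prices the top user lands somewhere between roughly $281,000 and $4.2m, which is consistent with the range in the report, and the spread itself shows how little the raw token count tells you without knowing the price.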

Meta gets back into the game

It’s a measure of how much happened this week that I put this story in fourth place - Meta has released Muse Spark, the first model from the new AI lab that it built from scratch for billions of dollars last year. And, the model is pretty good - not right at the top of the league tables, but definitely in the pack with Anthropic, Google, and OpenAI (though we should be a little cautious, given that the Llama model from the previous team tried to game the benchmarks). The fact that Meta fell off the ladder and managed to get back on is a tribute to Mark Zuckerberg’s management skills, and also a rebuke to Apple, Amazon, and especially Microsoft, all of whom have failed to do the same. LINK

Sam Altman

The most terrifying sentence in the English language is “Ronan Farrow is writing a profile of you”, but this time he filled half the New Yorker with a profile of Sam Altman and didn’t find anything much that we don’t already know. Many people who know him say he’s an untrustworthy, manipulative liar, and many people who’ve worked with him quit, and whether you agree or not, all of that’s been written about at length already. Indeed, the only thing that was new to me was the note that someone with a grudge is spreading rumours that Altman pays for under-aged sex (which Farrow couldn’t find any support for). LINK

Meanwhile, someone threw a Molotov cocktail at Sam Altman’s house. The crazy fringes of the ‘doomers’ are still around, especially in the Bay Area, even as everyone else concluded they’re morons, but there’s also a growing panic about AI and data centres, sometimes rational but probably wrong (impact on employment), and sometimes based on pure misinformation (no, this isn’t ‘consuming’ lots of water). Either way, you would need to be a horrible and very stupid person to think any of that justifies violence. Sam Altman wrote about this here. LINK

PE/AI rollups

The WSJ reports that Anthropic is in talks with a range of PE funds to set up a new vehicle that would act as a consultant to help their portfolio companies deploy AI. Anthropic would invest $200m of a planned $1bn raise. There’s an obvious irony in the fact that deploying a thing that breathless Silicon Valley types think will destroy consultants turns out to need a lot of consulting. But as a company, why would you want a consultancy that’s captive to one vendor, especially when things are changing so fast? Meanwhile, Jeff Bezos’s new venture, that aims to use PE and AI to buy and transform companies, hired the last remaining co-founder out of Elon Musk’s xAI. LINK, BEZOS

In other news

The US tractor company John Deere, which has morphed into a tech company, settled a class-action ‘right to repair’ lawsuit. LINK

The New York Times did a big investigation claiming that an early cryptocurrency pioneer called Adam Back is the pseudonymous ‘Satoshi Nakamoto’ who created Bitcoin. 🤷🏻‍♂️. LINK

Anthropic’s marketing team is hiring video editors at $250k each - one of many fields where AI is an accelerant, not a replacement. LINK

If you’re an ambulance-chasing lawyer looking for clients to sue Meta, Meta won’t let you use their platform for ads. There are many layers of irony here. LINK

Ideas

All those open-source Chinese models are starting to go closed as they look for revenue. LINK

The FT suggests that the averaging inherent to LLMs might mean that where social media tends to reward strong views and polarising positions, AI might be the opposite. LINK

Building a business on Roblox. LINK

The war in the Persian Gulf means a lot of GPS jamming, which means delivery drivers are blind. LINK

An interview with the founders of prediction market / bookie Kalshi. LINK

Outside interests

If you’re in New York this week, make space to go to the Gunzberg show at the new Sotheby’s Breuer - they have the room full of mirrors that Claude Lalanne made for Yves Saint Laurent and Pierre Bergé. At auction for $10-$15m. LINK

A Chevalier Guard helmet. LINK

US public opinion in favour of Israel, once a geopolitical constant, has shifted sharply following Israel’s reaction to the October 2023 pogrom, with ‘favourable’ going from 55% in 2022 to 37% now, and ‘very unfavourable’ going from 10% to 28%. Even for Republicans, 57% of those under 50 have a negative view. LINK

Data

An estimate of how many AI chips China has, between legal imports, smuggling, remote access, and domestic chips - perhaps 15% of the global base? LINK

The FBI says online scams stole $21bn in the USA last year, with crypto and now AI at the forefront. LINK

Ramp thinks that Anthropic’s revenue from corporates will overtake OpenAI in the next month or two. This data is self-selected to the kinds of companies that use a modern SaaS payment management platform, so it’s not a neutral sample of US industry, but it’s probably directionally correct. LINK

How many books do Americans actually read? Not much change in the last decade (while ebooks and audiobooks remain much smaller than print). LINK

Some a16z charts, partly proprietary data from portfolio companies, on where enterprise AI adoption is happening. They estimate about 2/3 is coding. LINK

A batch of recent research on AI adoption: GALLUP ON US CONSUMERS, US ENTERPRISE, EU CONSUMERS, EU ENTERPRISE

Column

Price with pride

No one really knows what token capacity, token consumption, token cost, or token pricing will look like in five years.

We do know the algebra. We know that there are more and more people using this more and more, and we know that with reasoning, media, and now tool-using agentic coding, the amount of tokens that somebody can use has increased by orders of magnitude. On the other side, inference efficiency keeps pushing the numbers in the other direction, with the cost per token halving every three months. Next, new approaches will keep changing those numbers for both efficiency and usage. And then, of course, sitting behind all of this, there's a chase for the next model. Inference today is profitable, but any given model is only relevant for six to nine months, and so you have to keep chasing the frontier. And we don't know how long that will go on for or what it will end up costing.
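The ‘halving every three months’ side of that algebra compounds very quickly, and a short Python sketch shows how quickly. This assumes, purely for illustration, that the halving rate stays constant, which is exactly the thing nobody knows; the function names are mine.

```python
# Compounding of a constant cost-per-token halving rate.
# ASSUMPTION: the three-month halving holds steady, which is not a forecast.

def relative_cost(months: float, halving_period_months: float = 3.0) -> float:
    """Cost per token after `months`, as a fraction of today's cost."""
    return 0.5 ** (months / halving_period_months)


def break_even_usage_multiple(months: float,
                              halving_period_months: float = 3.0) -> float:
    """How much token usage must grow for total spend to stay flat."""
    return 1.0 / relative_cost(months, halving_period_months)


for years in (1, 2, 5):
    months = years * 12
    print(f"After {years} year(s): cost per token is "
          f"{relative_cost(months):.2e} of today's; spend stays flat if "
          f"usage grows {break_even_usage_multiple(months):,.0f}x")
```

On that assumption, one year is four halvings (a sixteenth of today's cost) and five years is twenty halvings, so usage would have to grow about a million-fold for the bill to stay the same. Change the halving period slightly and the five-year answer moves by orders of magnitude, which is why the spreadsheet has rows but no values.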

All of this means that trying to forecast five years out today is rather like trying to forecast internet bandwidth in, say, 1998. You know what all the rows in the spreadsheet are, but you don't know what the values are going to be. But I think the more interesting comparison is to look at mobile bandwidth, which was the last time that consumer usage had real marginal cost that had to be displayed to the user.

Back in the 2000s, the mobile industry had built and deployed radio networks that could give you mobile data, but they did not have enough capacity to let you use however much you wanted without the network collapsing, and so they had to work out the right way to ration usage. In doing that, they wrestled with problems we see today: usage of capacity doesn’t map well to value, uses that look similar might use very different amounts of capacity, and the base unit of capacity (bits then, tokens now) means little or nothing to a normal user.

Telcos dealt with this in two ways. First, they spent 15 to 20% of their revenue on capex every year, and deployed successive waves of new technologies that gave both more speed and, much more importantly, more capacity (5G was almost entirely about more capacity). Secondly, they tried to segment the pricing into different kinds of perceived value. The extreme example of this, that people loved to talk about, was SMS (which was actually using a signalling channel, not mobile data), where the price per bit was astronomical but the perceived value mapped pretty well to the price (proof: people paid). After that, things got more complicated: they would zero-rate their portal, and they would take payments from companies for flat-rating particular services. In principle, they tried to segment the price-elasticity curve.

We're speed-running all of those conversations today for enterprise AI. Companies don't like getting bills of tens of thousands of dollars they weren't expecting, but as a SaaS company, you got that bill yourself from your model provider, and ultimately, the model provider is incurring that cost, so how can you control that? Do you give people bundles? Do you try and segment by different use cases? Do you create complex hybrid structures with caps and tiers? In particular, the buzzy idea at the moment is outcome-based pricing, where the user pays a percentage of the actual value delivered by the software in that use case. This sounds great in theory, but how many use cases are there where you could mechanistically connect a given number of dollars to a given set of actions taken by a piece of software? Stripe can do that, obviously, and Salesforce might claim to be able to, just as an ad agency might claim to be able to (and of course, half of the ad-tech world is built around attribution). But when an HR department buys a piece of software that helps them track hiring or internal reviews, there isn't revenue or even a direct cost-saving. The opposite extreme is that security software might prevent your entire company from ceasing to exist, but what’s the ‘outcome’ value of that?

I think there are two things that we can say with some degree of certainty about what happens next. The first is that we will see a proliferation of complex and clever pricing systems, some of which will actually make sense and might endure. But the second is what happened every other time this has come up before: we will end up with flat-rate bundled pricing, where the client gets absolute predictability on what they're spending. After all, an old-fashioned seat-based model is capturing usage, just at a level of abstraction, and that HR team buying the hiring management system is thinking about a tangible value in terms of the time it saves and the number of people that represents. They're just not giving it to their SaaS provider as a percentage of salaries saved. To put that another way, the legacy pricing models are outcome-based pricing.

But the other part, of course, is how much of this will go to the edge. Foundation models today are big and heavy and need to run in the cloud, but part of the future is splitting them up into the older model and the smaller model that's good enough for your use-case, that can run much cheaper in the cloud, or, indeed, that can run on-prem or on your phone. That's what happened with mobile. In the end, Moore’s Law squashes complexity. Mobile shifted from weird complex pricing structures, bundles, caps, and segmentation into flat-rate pricing, and indeed to the edge, with Wi-Fi offload. Some of this is tech determinism - that things always get much cheaper and move to flat rate - but some of it is also that we have to get there for anyone to be able to use this. Enjoy your tokenmaxxing while it lasts.

Benedict Evans, 12 April 2026