Or Why I'm not betting on Anthropic or OpenAI
Anyone who has recently used Claude or Codex knows that these models and the tools around them are very, very good. It's hard to imagine an upstart jumping into the ring and seriously taking on these two behemoths. That's why I don't think there will be a serious competitor to these companies; rather, what we'll see is a massive shift in this industry that diminishes the relevance of these two giants and opens the field for lots of little players. That's a big prediction that I'm surely not going to be wrong on.
AI's Borland Era
Anyone near computers in the 90s and early 2000s has heard of Borland. It was the C or C++ compiler that you used professionally.
Borland is now defunct; it turns out that selling compilers, even for a lot of money, is not a viable business when GCC is free.
I look at tools like Claude Code and Codex and see Borland — a good, proprietary, expensive product that is ripe for replacement by a good, open, free one.
Of course, Claude Code and Codex are technically "free"; you just need to pay to use the model. I think even as our local agent tooling moves to a free and open source model, we're going to keep needing to pay that subscription. The difference? If I want to switch from Claude to OpenAI, I'll be able to do it mid-stream without issue. All of a sudden the models compete purely on the strength of the models and the cost of the tokens, not the tooling around them.
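With open, model-agnostic tooling, a mid-stream switch reduces to a one-line config change. A minimal sketch of what that could look like, assuming providers expose an OpenAI-compatible chat completions endpoint (which several already do); the URLs and model names here are placeholders, not real product identifiers:

```python
# Hypothetical provider registry for an open agent tool.
# Base URLs and model names are illustrative placeholders.
PROVIDERS = {
    "anthropic": {"base_url": "https://api.anthropic.example/v1", "model": "claude-model"},
    "openai": {"base_url": "https://api.openai.example/v1", "model": "gpt-model"},
}

def make_request(provider: str, messages: list) -> dict:
    """Build a chat-completions request; only the provider name changes."""
    cfg = PROVIDERS[provider]
    return {
        "url": f"{cfg['base_url']}/chat/completions",
        "json": {"model": cfg["model"], "messages": messages},
    }

# Mid-stream switch: same conversation history, different backend.
history = [{"role": "user", "content": "Refactor this function."}]
print(make_request("anthropic", history)["url"])
print(make_request("openai", history)["url"])
```

The conversation (the `messages` list) lives in the tool, not the provider, which is exactly what makes switching frictionless.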
The Rise of Proprietary Hardware
Google is already running its models on proprietary TPUs. While general-purpose NVIDIA hardware may still be essential for model training, inference will likely move to lower-power, custom hardware. These chips will likely be built for a specific model family, and the models themselves will likely be fine-tuned for the hardware. With these in place there will be a significant decrease in the cost to run inference, and so inference providers will flock to the hardware.
While OpenAI and Anthropic will likely try to play in this space, I don't think it will matter. Ultimately, if I have a choice of two inference providers that offer the same model, I'm going to choose based on speed and cost. This is going to be a commodity, and commodities trend toward their marginal cost, which means this is not going to be the money-printing machine that Anthropic and OpenAI want it to be.
Of course, we've also seen a rise in the quality of small models. Gemma 4 from Google is particularly good, and runs acceptably on my three-year-old MacBook. Adding local models to an agent, combined with a choice of inference providers and models, is going to give you the superpower of planning locally and executing with a mix of local and cheap remote resources.
The Death of Open Weight
Currently we have two categories of models: open-weight models, largely from well-funded Chinese labs, and proprietary models from companies that only allow access through their own hardware (or a limited set of vendors). I think the future lies in a third category, proprietary-weight models, where the weights themselves are for sale.
Labs will train a model, sell it, turn a small profit, and then start all over. The thing is that this is going to be incredibly risky: one bad model and you're on thin ice; two and you're probably broke. Anthropic and OpenAI are probably the big names in this area; they'll have enough cash on hand to outlive a lot of competitors, but ultimately they'll be just one player in a market of hundreds.
The "Good Enough" model
Inference providers are going to care about one thing when buying a model: can I turn a profit on this? The math is very simple: what does the model cost up front, what does it cost to run, what will a customer pay for it, and how much demand is there?
Right now there's still lots of runway for models to be better, but in the next year or two I think we'll see the rise of good enough models. These are going to trade quality for operational cost and make their money on volume. They will likely carry a higher upfront price tag (benefitting the labs that train them) and inference providers will likely spend more time performance tuning them.
Big frontier models like Opus and Mythos are likely not going to play in this space, and as this space grows the demand for those models will shrink. That's a bit of a death spiral, but ultimately one that's overdue.
The Beautiful Fragmented Future
Things move fast in this space, so I think we're probably less than two years from a future where our AI usage is fragmented.
A free open source tool will be our primary means of interacting with AI. The tooling may be standalone or built into other software.
Inference will run both locally and remotely, using a mix of models and providers. Context will live locally and move between models.
Labs will sell models, not access. Most models will be Good Enough and specially tuned to tasks at hand.
Large general models will be the exception, reserved for novel tasks and mostly used to explore a problem space and recommend specialized agents that can work in it. Honestly, most of us will never use these.