The AI industry just got a reality check. Private company defaults are hitting 9.2%, the highest in years. Venture capital firm Lux recently sent out a message that basically boils down to: stop trusting handshake deals with your compute providers. Get it in writing. Better yet, get it somewhere that doesn’t depend on a provider at all.
That’s where a Spanish startup called Multiverse Computing enters the picture. While the AI world obsesses over bigger, faster models running in sprawling data centers, Multiverse is doing something different. It’s taking those massive models, squeezing them down until they fit on your phone, and asking whether we actually need the cloud for everything.
The timing feels less like a coincidence and more like a reckoning.
When the Cloud Becomes a Liability
Here’s the problem nobody talks about enough: every AI model running in the cloud is a dependency. You’re betting your business on someone else’s infrastructure not failing, not getting hacked, and not deciding to charge you triple next quarter. The bigger your reliance on external compute, the bigger your risk.
Multiverse is positioning edge AI as an antidote to that fragility. Their new CompactifAI API portal went live recently, giving developers direct access to compressed models without having to go through AWS or any other middleman. The models are small enough to run locally, which means no data center dependency, no latency nightmares, and crucially, no counterparty risk.
The company has already proven it can compress serious models. It has shrunk systems from OpenAI, Meta, DeepSeek, and Mistral AI. Its latest creation, HyperNova 60B 2602, is built from OpenAI’s publicly available gpt-oss-120b but allegedly runs faster and cheaper than the original.
That’s not just a tech flex. That’s a fundamental shift in how we might think about deploying AI.
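The article doesn’t explain how the compression works, and Multiverse hasn’t published CompactifAI’s internals. But low-rank factorization is one standard family of techniques for shrinking model weights, and it gives a feel for why compression can preserve quality: replace a big weight matrix with two thin factors. A minimal NumPy sketch (illustrative only, not Multiverse’s actual pipeline):

```python
import numpy as np

def low_rank_compress(W, rank):
    """Approximate weight matrix W with two thin factors via truncated SVD."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]   # shape (m, rank)
    B = Vt[:rank, :]             # shape (rank, n)
    return A, B

rng = np.random.default_rng(0)
# A weight matrix whose information is concentrated in few directions
# (here: exactly rank 16) compresses almost losslessly.
W = rng.standard_normal((512, 16)) @ rng.standard_normal((16, 512))
A, B = low_rank_compress(W, rank=16)

original = W.size             # 262,144 parameters
compressed = A.size + B.size  # 16,384 parameters -> 6.25% of the original
error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(compressed / original, error)
```

Real LLM weights aren’t exactly low-rank, so production systems combine ideas like this with quantization and retraining, and the ratios are less dramatic. The point is that “smaller” doesn’t have to mean “dumber.”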
The Privacy Angle Gets Real
CompactifAI launched as a consumer chat app, kind of like ChatGPT but with a twist. It runs a model called Gilda locally on your device, no cloud required. You get to ask questions, get answers, and your data never leaves your phone. It’s genuinely appealing for anyone who’s gotten tired of wondering where their conversation data ends up.
But there’s a catch, because there always is. Your phone needs enough RAM and storage. Older iPhones? They don’t qualify. When that happens, the app (which uses a wonderfully nerdy routing system named Ash Nazg after the One Ring) switches to cloud models and you lose the privacy benefit entirely.
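The Ash Nazg router itself isn’t public, but the behavior described above amounts to a simple capability check: run the model on-device when the hardware can hold it, otherwise fall back to the cloud and forfeit the privacy guarantee. A hypothetical sketch (the threshold and names here are illustrative, not Multiverse’s):

```python
# Assumed on-device footprint of the local model, in GB (hypothetical figure).
LOCAL_MODEL_RAM_GB = 4

def route(available_ram_gb, local_model_ram_gb=LOCAL_MODEL_RAM_GB):
    """Pick an execution target for a chat request.

    Returns "local" when the device has room for the model, else "cloud".
    Only the "local" path keeps the conversation on-device; the "cloud"
    path sends it off-device and loses the privacy benefit.
    """
    if available_ram_gb >= local_model_ram_gb:
        return "local"
    return "cloud"
```

On this logic, an older phone with 2 GB free silently lands on the cloud path, which is exactly the catch: the privacy promise depends on hardware the user may not have.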
The app itself is still tiny, with fewer than 5,000 downloads in the past month. That’s not necessarily a failure though. It’s more like a proof of concept, a way to show what’s possible. The real target market is businesses.
Where the Money Actually Is
Multiverse already counts over 100 customers, including the Bank of Canada, Bosch, and Iberdrola. Not all household names, but they’re exactly the kind of organizations that care deeply about control, security, and cost efficiency. They’re also the kind that will pay for it.
And they might be paying more soon. The company raised $215 million in Series B funding last year and is now rumored to be raising around 500 million euros at a valuation exceeding 1.5 billion euros. That’s serious money for what still feels like an under-the-radar player in the AI world.
The use cases matter here. Imagine embedding AI in drones, satellites, or industrial equipment where cloud connectivity is unreliable or completely unavailable. In those environments, edge models aren’t just cost-cutting measures; they’re the only option that works.
The Bigger Picture
What’s happening with Multiverse is part of a larger trend that feels inevitable in retrospect. When you can run useful AI models locally, you stop being entirely beholden to cloud providers. When you stop being beholden to cloud providers, the entire economics of AI deployment change.
Mistral just released Mistral Small 4, optimized for everything from chat to coding to autonomous workflows. They’re also releasing a system called Forge that lets enterprises build custom smaller models and decide what tradeoffs matter for their use case. Even Apple Intelligence made this move by combining on-device and cloud models rather than going all-in on one approach.
The gap between what small models can do and what massive language models can do keeps narrowing. That’s not just good news for efficiency advocates. It’s potentially destabilizing for anyone betting their business model on the idea that bigger always wins.
The real question isn’t whether edge AI will matter. It’s whether cloud providers have already woken up to the threat.