Anthropic just wrapped up what it called Project Deal, an experiment where AI agents actually negotiated and completed real transactions for real goods using real money. On the surface, it sounds like a win. The numbers look good: 186 deals across four separate marketplaces, over $4,000 in total value. The company said it was “struck by how well Project Deal worked.”
But dig a little deeper, and the story gets weird in ways that should concern anyone paying attention to how technology is reshaping commerce and human interaction.
When Better AI Becomes Your Disadvantage
Here’s the uncomfortable part. Anthropic ran one marketplace where every participant was represented by its most advanced model, with deals actually honored after the experiment ended. In the other three, it tested different configurations, including agents of varying capability. The finding was straightforward and slightly alarming: users represented by more advanced models got objectively better outcomes.
That shouldn’t surprise anyone. Smarter negotiators win. But there’s a catch that Anthropic itself flagged: the people on the losing end often didn’t realize they were worse off.
Think about that for a second. You walk away from a negotiation feeling fine about the deal you struck. You have no idea the other person got better terms because they had access to superior AI. You didn’t feel shortchanged because you had nothing to compare it to.
Anthropic flagged this as the risk of “agent quality gaps,” and honestly, that’s a sanitized way of describing a system where invisible asymmetry becomes the norm. It’s not fraud. It’s not dishonesty. It’s just… some people don’t know they’re playing against a better hand.
The Instructions Didn’t Matter Much Either
There’s another finding worth unpacking. Anthropic gave its AI agents different initial instructions to see if they’d change how aggressively they negotiated or what prices they’d target. Turns out, the instructions barely moved the needle on either sale likelihood or final prices.
That’s notable because it suggests that once you unleash an AI agent into a marketplace, its underlying capabilities matter way more than whatever guardrails or nudges you tried to build into its prompt. The model’s sophistication, not its values, determined the outcome.
This matters for business and policy alike. If you’re designing systems where AI agents negotiate on behalf of consumers or workers, telling those agents to “be fair” might feel reassuring. But the research here hints that capability matters more than instruction.
A Pilot With Caveats
Let’s be clear about what this actually was: a pilot experiment with 69 Anthropic employees, each given $100 in gift cards to trade with coworkers. It’s not the marketplace of the future. It’s a controlled study with self-selected participants who had little personally at stake and weren’t operating in a genuine supply-and-demand ecosystem.
But pilots matter because they’re where you spot problems before they scale. And this one spotted something uncomfortable: AI agents can systematically produce unequal outcomes that users won’t even notice.
The real question isn’t whether this works. Anthropic showed it does. The question is what happens when this scales beyond 69 employees and $4,000 in gift cards to actual commerce involving actual people with real money and real consequences. When the AI agent negotiating your employment contract is better than the one your employer is using, or vice versa, will you know?