Why I Built MarginOps Entirely on Anthropic's Claude

I Tested Everything

I have been building production systems for over 25 years. CTO at IG Group. Founded DatingUK and PetMeds, both sold. Built AVORA Analytics to €7.4M raised. Co-founded Gravity Data, acquired. Co-founded Streamkap, $3.3M raised. I say this not to list credentials but to make a point: I have no loyalty to any vendor. I use whatever works.

When I started building MarginOps in late 2024, I tested every serious foundation model: GPT-4o, Gemini Pro, Llama 3, Mistral Large, and Claude. The use case was specific. I needed a model that could analyse a £6.4M transformation programme across 119 EBITDA-tracked initiatives, reason about pricing elasticity across 37,800 products, and generate actionable recommendations that a CFO would trust enough to sign off on.

GPT-4o was fast and capable, but its reasoning on multi-step financial analysis was inconsistent. It would get the direction right but miss second-order effects — the kind of thing that turns a margin improvement into a margin erosion when you miss the interaction between a markdown and an active promotion. Gemini was impressive on benchmarks but hallucinated confidently on edge cases in financial data. Open-source models were out of the question for enterprise clients handling sensitive P&L data — the compliance overhead alone would have killed the business case.

Claude was different. Not perfect. But consistently, measurably better at the specific kind of reasoning that margin analysis demands.

Reasoning Quality for Complex Business Analysis

The pricing engine is where Claude's reasoning advantage showed up most clearly. The engine monitors 37,800 products every 15 minutes, running 7 automated checks against the live catalogue. Each check requires multi-step reasoning: is this product subject to an active promotion? If so, what is the effective price after the discount code? If the engine recommends a further markdown, does the combined discount breach the margin floor for this category?

This is not pattern matching. This is chain-of-thought reasoning over structured financial data with real constraints. Claude handles it reliably. The discount conflict detection alone — powered by Claude's ability to reason about overlapping promotional mechanics — prevented an estimated £180K in annual margin leakage.
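To make the markdown-versus-promotion interaction concrete, here is a minimal Python sketch of the margin floor check described above. It is not the MarginOps engine. The `Product` fields, the function names, and the assumption that a markdown and a promotion stack multiplicatively are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Product:
    # All fields are hypothetical; a real catalogue schema will differ.
    sku: str
    list_price: float                       # current catalogue price
    unit_cost: float                        # cost per unit
    margin_floor: float                     # minimum margin as a fraction of selling price
    promo_discount: Optional[float] = None  # active promotion as a fraction, None if no promotion


def effective_price(product: Product, markdown: float = 0.0) -> float:
    """Price the customer pays after a markdown and any active promotion.

    Assumption: the two discounts stack multiplicatively.
    """
    price = product.list_price * (1 - markdown)
    if product.promo_discount is not None:
        price *= 1 - product.promo_discount
    return price


def breaches_margin_floor(product: Product, markdown: float) -> bool:
    """Return True if the combined discount pushes margin below the floor."""
    price = effective_price(product, markdown)
    if price <= 0:
        return True  # a free or negative price always breaches the floor
    margin = (price - product.unit_cost) / price
    return margin < product.margin_floor


promoted = Product("SKU-1", list_price=100.0, unit_cost=60.0,
                   margin_floor=0.25, promo_discount=0.10)
unpromoted = Product("SKU-2", list_price=100.0, unit_cost=60.0, margin_floor=0.25)

print(breaches_margin_floor(unpromoted, 0.15))  # markdown on its own
print(breaches_margin_floor(promoted, 0.15))    # same markdown plus a 10% promotion
```

In this example a 15% markdown leaves roughly a 29% margin on its own, but combined with the 10% promotion the margin falls to about 22%, below the 25% floor. That is the kind of second-order effect a check on the markdown alone would miss.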

The customer data platform is another example. Claude performs customer segmentation across behavioural, transactional, and engagement dimensions. It identifies segments like "high-value lapsing customers who respond to category-specific promotions but not site-wide discounts." That level of nuance requires genuine reasoning about customer behaviour, not just clustering algorithms.

Trust and Safety for Enterprise Clients

MarginOps works with PE-backed retailers. Our clients have boards, audit committees, and compliance teams. When I tell a CFO that an AI agent is making pricing decisions across their entire product catalogue, the first question is never "how accurate is it?" The first question is always "how do I know it won't do something catastrophic?"

Anthropic's approach to AI safety is not a marketing differentiator for me. It is a commercial requirement. Claude's Constitutional AI framework, its tendency to flag uncertainty rather than fabricate data, and its transparent reasoning process mean I can give clients audit trails they can actually review. When our pricing agent recommends a markdown, it explains why. When it flags a conflict, it shows the reasoning chain. This is not a nice-to-have. For board-level clients managing £6.4M transformation programmes with 119 EBITDA-tracked initiatives, it is table stakes.

The responsible AI angle also matters for the customer experience agent. When an AI is handling first-line support for thousands of customers, you need a model that knows what it does not know and escalates gracefully. The Claude-powered agent was part of what took CSAT from 59% to 80%.

Agentic Capabilities That Ship to Production

The gap between "impressive demo" and "production system" is where most AI projects die. I have seen it repeatedly — at AVORA, at Gravity Data, across dozens of client engagements. The model works in a notebook. It falls apart when you need it to run autonomously, handle edge cases, recover from failures, and coordinate with other systems.

Claude's agentic capabilities — particularly since Opus 4 — closed that gap. All 7 of our production agents run on Claude: the pricing agent, the CDP agent, the CX agent, the warehouse agent, the marketplace agent, the DevOps agent, and the analytics agent. Each one runs in production, autonomously, processing real transactions and making real decisions.

Claude Code transformed the development workflow. I build and deploy these agents directly from the terminal. The iteration speed is absurd compared to what I experienced building AVORA or Streamkap. A pricing model that would have taken a month of iteration now converges in a week. An agent that would have needed a team of three engineers to build and maintain runs on Claude Code with a single operator.

Why We Went All-In on Claude

The results speak for themselves. +77% revenue from AI pricing. 60% cloud cost reduction. CSAT from 59% to 80%. All powered by Claude. All in production. All generating measurable EBITDA impact.

MarginOps is proof of what Claude can do when it is deployed by operators, not theorists. We do not build demos. We build production AI agents that move P&L lines. Every one of our 119 initiatives is tracked to EBITDA. Every agent has measurable outcomes. Every recommendation has an audit trail. Everyone on the team is trained on Anthropic's tools — Agent SDK, MCP, Claude Code — and uses them daily.

If you are evaluating AI foundations for enterprise operations — pricing, supply chain, customer experience, cloud infrastructure — I would encourage you to test Claude against your actual workloads, not benchmarks. Benchmarks measure capability. Production measures reliability. Claude delivers both.

That is why MarginOps runs on Claude. Not because it is the most hyped. Because it is the most reliable when the stakes are real.
