12 March 2026

Claude Opus 4.6 Changed How I Run Consulting Engagements

Agent teams, a 1M-token context window, and finance benchmark leadership. This is the model powering MarginOps right now, and it has fundamentally changed the speed and depth of our transformation work.

Agent Teams: Splitting Complex Analysis Across Workstreams

The £6.4M transformation we deployed had 7 workstreams running simultaneously: pricing, customer data, customer experience, cloud infrastructure, warehouse operations, marketplace management, and analytics. Before Opus 4.6, coordinating analysis across these workstreams meant I was the bottleneck. I would run an analysis on pricing, switch context to cloud costs, come back to pricing with new information from the cloud migration, and manually synthesise the cross-workstream dependencies.

Opus 4.6's agent teams changed this completely. I now deploy a coordinator agent that dispatches a sub-agent to each workstream. The pricing sub-agent analyses elasticity data, the cloud sub-agent reviews infrastructure costs, and the CX sub-agent processes customer satisfaction metrics, all in parallel. The coordinator then synthesises their findings and identifies cross-workstream dependencies automatically.
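A minimal sketch of the coordinator pattern described above, in plain Python. It is an illustration only, not the MarginOps implementation: `run_subagent` and `coordinate` are hypothetical names, and the sub-agent body is a stub where a real system would call the model.

```python
import asyncio

# Hypothetical stand-in for a sub-agent. A real one would call a model
# and its tools; here it just echoes the brief it was given.
async def run_subagent(workstream: str, brief: str) -> dict:
    await asyncio.sleep(0)  # placeholder for model or tool latency
    return {"workstream": workstream, "summary": f"{brief} ({workstream})"}

async def coordinate(briefs: dict[str, str]) -> list[dict]:
    # Dispatch every workstream concurrently. asyncio.gather returns the
    # results in the same order the tasks were passed in.
    tasks = [run_subagent(ws, brief) for ws, brief in briefs.items()]
    return await asyncio.gather(*tasks)

briefs = {
    "pricing": "analyse elasticity data",
    "cloud": "review infrastructure costs",
    "cx": "process customer satisfaction metrics",
}
findings = asyncio.run(coordinate(briefs))
```

The point of the shape is that no workstream waits on another: the coordinator only starts synthesising once every sub-agent has reported back.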

Concrete example: during the initial due diligence phase, the coordinator agent identified that the client's cloud infrastructure costs were inflated partly because the pricing vendor's API was making 2.3 million calls per day to a grossly over-provisioned database cluster. The pricing sub-agent flagged the vendor for replacement. The cloud sub-agent flagged the database for optimisation. The coordinator connected the two — replacing the vendor would eliminate the API load, which would allow a more aggressive infrastructure downsizing. That connection saved an additional £40K annually that neither sub-agent would have found alone.
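One way to picture how a coordinator might connect two findings is to have each sub-agent tag the resources its finding touches, then look for resources that appear in more than one workstream. The sketch below assumes that tagging scheme; the field names and resource labels are hypothetical, chosen to mirror the example above.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical findings. The "resources" tags are an assumed schema,
# not something the sub-agents are known to emit.
findings = [
    {"agent": "pricing", "action": "replace pricing vendor",
     "resources": {"pricing_vendor_api"}},
    {"agent": "cloud", "action": "optimise database cluster",
     "resources": {"database_cluster", "pricing_vendor_api"}},
    {"agent": "cx", "action": "review satisfaction survey",
     "resources": {"survey_tool"}},
]

def link_findings(findings):
    # Index agents by the resources their findings mention.
    by_resource = defaultdict(list)
    for finding in findings:
        for resource in finding["resources"]:
            by_resource[resource].append(finding["agent"])
    # Any resource touched by two or more agents is a candidate dependency.
    links = []
    for resource, agents in sorted(by_resource.items()):
        for a, b in combinations(sorted(set(agents)), 2):
            links.append((resource, a, b))
    return links

links = link_findings(findings)
```

Here the only shared resource is the pricing vendor's API, so the coordinator would be prompted to reason about the pricing and cloud findings together.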

1M Token Context: Entire Businesses in a Single Session

A typical MarginOps engagement starts with what I call the "full load" — ingesting everything about the client's operations in a single session. Before Opus 4.6, this meant breaking the analysis into chunks. I would feed in the P&L for one quarter, analyse it, then feed in the next quarter and hope the model retained the patterns from the first. It was like trying to read a novel one chapter at a time with amnesia between chapters.

With 1M tokens, I load the lot. Two years of monthly P&L statements. The complete vendor contract portfolio. The full product catalogue with pricing history. The customer database schema and sample data. The infrastructure architecture diagrams. The team structure and salary data. All of it, in one session.
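Before a full load it is worth checking that everything will actually fit. The sketch below is a rough budget check, assuming about four characters per token (a common heuristic, not an exact figure; real tokenisers vary) and a reserve for instructions and output. The document names and sizes are placeholders.

```python
CONTEXT_LIMIT_TOKENS = 1_000_000
CHARS_PER_TOKEN = 4  # rough heuristic, an assumption; real tokenisers vary

def estimate_tokens(text: str) -> int:
    # Ceiling division, so a partial token still counts as one.
    return -(-len(text) // CHARS_PER_TOKEN)

def fits_in_context(documents: dict[str, str], reserve: int = 50_000):
    # reserve leaves headroom for the prompt and the model's reply.
    per_doc = {name: estimate_tokens(body) for name, body in documents.items()}
    total = sum(per_doc.values())
    return total + reserve <= CONTEXT_LIMIT_TOKENS, total, per_doc

# Placeholder documents; the sizes are invented for illustration.
documents = {
    "pnl_statements": "x" * 400_000,
    "vendor_contracts": "x" * 1_200_000,
    "product_catalogue": "x" * 800_000,
}
ok, total, per_doc = fits_in_context(documents)
```

The per-document breakdown shows which source to trim or summarise first if the total runs over.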

The difference is not just convenience. It is analytical quality. When the model can see the entire business at once, it finds connections that chunked analysis misses. During our fashion retailer engagement, the full-context analysis identified that the client's highest-margin product category had the lowest marketing spend and the worst search ranking on their own site. That insight — which required simultaneously understanding the P&L, the marketing budget allocation, and the site search configuration — was worth £200K+ in annualised revenue when we fixed it.

A transformation audit that used to take three weeks of iterative analysis now takes three to four days. Not because I am cutting corners. Because I am not losing context between sessions.

Finance Benchmark Leadership: Analysis You Can Trust

Opus 4.6 is ranked #1 on Finance Agent, TaxEval, and BigLaw Bench. For most people, benchmarks are abstract. For me, they are directly correlated with revenue.

When I build a pricing model that recommends markdown depths across 37,800 products, the model needs to reason correctly about gross margin impact, stock-to-sales ratios, promotional overlap, and category-level elasticity, all at once. At one recommendation per product, a model that is 90% accurate on financial reasoning will get roughly 3,780 of them wrong every cycle, which is catastrophic at that scale. A model that is 99% accurate will still get about 378 wrong. The 7-check safety monitor catches those errors, but fewer errors upstream mean fewer interventions downstream, and that means faster, cleaner pricing decisions.
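The arithmetic behind those figures, as a quick check. It assumes one recommendation per product per cycle and treats accuracy as a flat per-product rate, which is a simplification.

```python
def expected_errors(products: int, accuracy: float) -> int:
    # Expected number of wrong recommendations, rounded to whole products.
    return round(products * (1 - accuracy))

PRODUCTS = 37_800
for accuracy in (0.90, 0.99):
    errors = expected_errors(PRODUCTS, accuracy)
    print(f"{accuracy:.0%} accurate -> {errors:,} products in error")
# prints:
# 90% accurate -> 3,780 products in error
# 99% accurate -> 378 products in error
```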

The finance benchmark performance also matters for client credibility. When I present analysis to a PE firm's operating partner or a retailer's CFO, they need to trust the numbers. I can point to independent benchmark validation that Claude leads on exactly the kind of financial reasoning I am using it for. That is not a sales pitch. It is a verifiable claim backed by third-party evaluation.

In practice, the improvement from Opus 4 to Opus 4.6 on our internal financial analysis accuracy tests was approximately 14%. That translates to fewer manual corrections, faster sign-off from client finance teams, and higher confidence in the recommendations we deploy.

Claude Code: From Analysis to Deployment in Hours

Claude Code is where the consulting engagement becomes a delivery engagement. The traditional consulting model is: analyse, recommend, hand over a slide deck, and hope the client implements it. MarginOps does not work that way. We analyse, build, deploy, and measure. Same team, same tools, same week.

Opus 4.6 powers Claude Code, and the improvement in coding quality is tangible. The database optimisation that produced a 59,000x query speedup was built and deployed using Claude Code in a single session. The pricing engine's 7-check monitor was iterated from concept to production in four days. The segmentation pipeline for the customer data platform (CDP), which replaced a £40K vendor, was built in Claude Code and deployed to production in under a week.

The iteration speed changes the economics of consulting. I can test three approaches to a problem in the time it used to take to spec out one. If the first approach does not work, I have not burned a week of budget — I have burned an afternoon. This means clients get better solutions faster, at lower cost, with less risk. That is the MarginOps model, and Opus 4.6 is the engine that makes it viable.

The Compounding Effect

None of these capabilities exists in isolation. Agent teams use the 1M context window to coordinate across workstreams with full business context. The finance reasoning quality means the agent teams produce trustworthy analysis, not just fast analysis. Claude Code turns that analysis into deployed production systems within the same engagement.

The result is a consulting model that would have been impossible 18 months ago. A £6.4M transformation programme with 119 EBITDA-tracked initiatives, delivered by a lean team, with 7 production AI agents running autonomously. +77% revenue from AI pricing. 60% cloud cost reduction. CSAT from 59% to 80%. All on Opus 4.6.

If you are running transformation programmes with the previous generation of tools — slide decks, manual analysis, chunked context, single-threaded investigation — you are leaving margin on the table. Not because your team is not smart enough. Because the tools have changed, and the operators who adopt them first will compound their advantage.

Ready to see what Opus 4.6 can find in your business?

We will run a full-context analysis of your operations and show you where margin is hiding. Three to four days, not three weeks.
