90 days from assessment to self-sufficient AI operations. 12 steps. 4 phases. One outcome: your team runs AI without us.
Notice the overlaps. Enable starts before Deploy finishes. Transfer starts before Enable finishes. This is intentional. Linear sequencing is what makes traditional AI transformations take 18 months. Parallel execution is what makes ours take 90 days.
Most AI assessments take 6-12 weeks and produce a 100-page strategy document that nobody reads. Ours takes 10 days and produces a deployment blueprint with specific systems, specific agents, and specific measurable outcomes. The difference: we're not assessing whether you should use AI. We're identifying exactly where to deploy it first.
I audit three things: your data estate (where does your data live, what format, how accessible), your cloud posture (what infrastructure are you running, what does it cost, what capacity is available), and your API landscape (what systems can talk to each other, and which ones are locked behind manual processes).
The output is a "What You Have" map — a factual inventory of the infrastructure that AI agents will need to operate within. Most companies overestimate their data readiness and underestimate their API coverage. I've yet to see a company where the data estate matches what leadership thinks it is.
Every process in your business gets mapped to its AI leverage potential using a simple formula:
AI Priority Score = Manual Hours (per week) × Frequency (runs per month) × Error Rate (in percentage points)
A process that takes 10 hours per week, runs 20 times per month, and has a 5% error rate scores 10 × 20 × 5 = 1,000. A process that takes 2 hours per week, runs 4 times per month, and has a 1% error rate scores 2 × 4 × 1 = 8. The score tells you where AI delivers the most value fastest.
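As a rough illustration, the score can be computed and used to rank processes in a few lines of Python. This is a minimal sketch, not our assessment tooling: the class, field, and process names are hypothetical, and it assumes the error rate is entered in percentage points (5 for 5%), which is what reproduces the 1,000 and 8 in the examples above.

```python
from dataclasses import dataclass


@dataclass
class Process:
    name: str                     # hypothetical label for the process
    manual_hours_per_week: float
    runs_per_month: float
    error_rate_pct: float         # percentage points: 5 means 5%, not 0.05


def ai_priority_score(p: Process) -> float:
    """Manual Hours x Frequency x Error Rate (error rate in percentage points)."""
    return p.manual_hours_per_week * p.runs_per_month * p.error_rate_pct


# The two worked examples from the text; the process names are made up.
processes = [
    Process("low-volume report", 2, 4, 1),
    Process("invoice reconciliation", 10, 20, 5),
]

# Highest score first: this is the order in which to consider deployment.
ranked = sorted(processes, key=ai_priority_score, reverse=True)
for p in ranked:
    print(f"{p.name}: {ai_priority_score(p):,.0f}")
```

Sorting in descending order puts the 1,000-point process ahead of the 8-point one, which is the ranking the Process Value Map is meant to surface.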
Equally important: the kill list. Processes where AI adds complexity, not value. Not everything should be automated. A process with low volume, high ambiguity, and significant legal exposure is often better left to humans. The kill list prevents the most common AI deployment failure: automating the wrong thing.
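The kill-list rule of thumb can be written down as a simple screen. The sketch below is an assumption-laden illustration: the volume cut-off, the three-level ambiguity scale, and all names are hypothetical, and a flagged process is a candidate for the kill list that still needs human judgement, not an automatic verdict.

```python
from dataclasses import dataclass

LOW_VOLUME_THRESHOLD = 10  # hypothetical cut-off, in runs per month


@dataclass
class Candidate:
    name: str
    runs_per_month: int
    ambiguity: str        # assumed scale: "low", "medium" or "high"
    legal_exposure: bool  # True if mistakes carry significant legal risk


def kill_list_candidate(c: Candidate) -> bool:
    """Flag processes that are low volume, highly ambiguous, and legally exposed."""
    return (
        c.runs_per_month < LOW_VOLUME_THRESHOLD
        and c.ambiguity == "high"
        and c.legal_exposure
    )


# Illustrative inputs only; neither process comes from a real engagement.
dispute = Candidate("contract dispute review", 3, "high", True)
triage = Candidate("support ticket triage", 4000, "low", False)

print(kill_list_candidate(dispute))
print(kill_list_candidate(triage))
```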
This isn't "who knows Python." It's: who thinks in systems?
We assess your team against the 4 AI-Ready Archetypes:
Builders: engineers who will build and maintain agents. They need to understand the Claude API, the Claude Agent SDK, and the Model Context Protocol (MCP). They think in systems, not features.
Operators: team members who will work alongside agents daily. They need to understand agent outputs, know when to override, and recognise when something is wrong.
Translators: people who bridge technical and business domains. They can evaluate AI output quality, explain decisions to stakeholders, and identify new use cases.
Skeptics: people who question AI decisions and stress-test outputs. Often the most valuable archetype. They find failure modes before customers do. Every team needs them.
All four archetypes are valuable. A team of only Builders ships fast but misses business context. A team of only Operators knows what they need but can't build it. The Capability Assessment maps your people to archetypes and identifies gaps that will derail the programme if unfilled.
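Gap-finding is mechanical once people have been mapped to archetypes. A minimal sketch, assuming each person is assigned a single primary archetype (a simplification) and using the four archetype names this document gives: Builders, Operators, Translators, and Skeptics. The team members are invented.

```python
from collections import Counter

ARCHETYPES = ("Builder", "Operator", "Translator", "Skeptic")


def archetype_gaps(team: dict[str, str]) -> list[str]:
    """Return the archetypes that nobody on the team currently covers."""
    counts = Counter(team.values())
    return [a for a in ARCHETYPES if counts[a] == 0]


# Hypothetical team: Builder-heavy, with no Translator or Skeptic.
team = {"Asha": "Builder", "Ben": "Builder", "Chloe": "Operator"}

gaps = archetype_gaps(team)
print(gaps)
```

For this team the gaps are the Translator and Skeptic roles, which is the Builder-heavy failure pattern described above: fast shipping with missing business context and nobody stress-testing outputs.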
The blueprint isn't a strategy deck. It's a deployment plan. Specific systems. Specific agents. Specific measurable outcomes. Named owners. Week-by-week milestones.
It includes three parallel tracks (detailed in Phase 2), the first agent to ship (always the simplest, highest-ROI candidate from the Process Value Map), the enablement plan mapped to team archetypes, and the self-sufficiency criteria the team must meet by Day 75.
The blueprint gets presented to the board. They see exactly what will be deployed, when, and what it will deliver in measurable terms. No ambiguity. No "it depends." Specific numbers, specific dates.
This is where most AI programmes stall. They over-plan, over-design, and under-ship. Our approach: get the first agent into production within 15 days. A production system handling real data and making real decisions. Without momentum, AI programmes die in committee. A team that's seen AI working in their environment thinks differently about AI than a team that's only seen a presentation.
Ship the simplest, highest-ROI agent first. We call this "first blood" — the moment the team sees AI working in their own environment, with their own data, producing real outputs.
The first agent is chosen for speed and visibility. It might be a content generation agent that writes product descriptions. A triage agent that classifies support tickets. An anomaly detector that monitors key metrics. The point is: it works, it's visible, and it proves the pattern.
Rule: if it takes more than 5 days of the Deploy phase to ship the first agent (that is, later than Day 15 of the programme), the scope is wrong. Descope until it fits. You can expand later. You can't recover momentum lost to a 6-week proof of concept that nobody cares about by the time it ships.
After first blood, we expand to three parallel deployment tracks:
Agents that touch customers directly: personalisation, pricing, support triage, content generation.
Agents that automate internal operations: inventory management, vendor monitoring, reporting, data processing.
Agents that inform human decisions: anomaly detection, forecasting, competitive intelligence, market analysis.
Each track has its own team lead and its own success metric. Tracks run independently. If Track A stalls, Tracks B and C keep shipping. This parallelism is what compresses 12 months of sequential deployment into 35 days.
Individual agents help. Connected agents compound. In this step, we wire agents together: the anomaly detection agent feeds into the pricing agent. The customer support agent informs the product content agent. The inventory agent triggers the procurement agent.
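One common way to wire agents together is publish/subscribe: an upstream agent emits an event and any downstream agent that cares subscribes to it. The sketch below is an in-process illustration under that assumption, not a description of the production architecture. The `EventBus` class, topic names, and event fields are all hypothetical, and plain lists stand in for the downstream agents' input queues.

```python
from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Minimal in-process publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        self._handlers[topic].append(handler)

    def publish(self, topic: str, event: dict) -> None:
        # Deliver the event to every agent subscribed to this topic.
        for handler in self._handlers.get(topic, []):
            handler(event)


bus = EventBus()
pricing_inbox: list[dict] = []      # stands in for the pricing agent's input queue
procurement_inbox: list[dict] = []  # stands in for the procurement agent's input queue

# Anomaly detection feeds pricing; inventory triggers procurement.
bus.subscribe("anomaly.detected", pricing_inbox.append)
bus.subscribe("inventory.low", procurement_inbox.append)

bus.publish("anomaly.detected", {"sku": "SKU-1", "metric": "sell_through", "change_pct": -40})
bus.publish("inventory.low", {"sku": "SKU-1", "units_left": 12})
```

Decoupling agents through topics means a new consumer can be added without touching the producer, which keeps the parallel tracks independent.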
We build monitoring and observability across the entire agent fleet — system-level views showing how agents interact, where bottlenecks form, and which feedback loops produce the most value.
For high-stakes decisions (pricing changes above a threshold, customer-facing responses about refunds, inventory orders above a value), we establish human-in-the-loop protocols. The agent recommends. A human approves. Over time, as confidence builds, the approval threshold rises and more decisions become autonomous.
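The recommend-then-approve protocol reduces to a routing decision against a threshold that can be raised over time. A minimal sketch, assuming a single numeric measure of decision size; the names and the 5% and 10% thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    agent: str
    action: str
    magnitude: float  # e.g. absolute price change in percent, or order value


def route(rec: Recommendation, approval_threshold: float) -> str:
    """Return 'human' if the decision needs sign-off, 'auto' if the agent may act."""
    return "human" if rec.magnitude > approval_threshold else "auto"


rec = Recommendation(agent="pricing", action="reduce price", magnitude=8.0)

early = route(rec, approval_threshold=5.0)   # early in the programme: strict threshold
later = route(rec, approval_threshold=10.0)  # later, once confidence has built
print(early, later)
```

The same 8% price change goes to a human under the early threshold and runs autonomously under the later one; only the threshold changes, not the agent.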
Production traffic. We test with real data volumes, real concurrency, and real edge cases. The questions we answer: What happens when the LLM hallucinates a price? What happens when an API goes down mid-decision? What happens when two agents give conflicting recommendations? What happens at 10x normal load?
Every failure mode gets a circuit breaker and a fallback path. If the pricing agent can't reach the inventory API, it holds current prices rather than guessing. If the content agent produces output below the quality threshold, it flags for human review rather than publishing. Production AI systems must fail gracefully. This step ensures they do.
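A circuit breaker with a fallback path can be sketched as follows. This is a simplified, single-threaded illustration of the general pattern, not the production implementation: the class, its parameters, and the simulated inventory call are hypothetical. The fallback mirrors the example above by holding the current price when the inventory API is unreachable.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Stop calling a failing dependency for a cool-down period; use a fallback instead."""

    def __init__(self, max_failures: int = 3, reset_after_s: float = 30.0) -> None:
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.failures = 0
        self.opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, fn: Callable[[], float], fallback: Callable[[], float]) -> float:
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after_s:
                return fallback()  # circuit open: do not touch the dependency
            # Cool-down elapsed: allow one trial call; one more failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            return fallback()
        self.failures = 0
        return result


current_price = 49.99


def price_from_inventory() -> float:
    raise ConnectionError("inventory API unreachable")  # simulated outage


breaker = CircuitBreaker(max_failures=2)
# Every call during the outage returns the held price rather than a guess.
prices = [breaker.call(price_from_inventory, lambda: current_price) for _ in range(3)]
print(prices, breaker.opened_at is not None)
```

After two consecutive failures the breaker opens, so the third call returns the held price without attempting the inventory API at all.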
Enable starts while Deploy is still running. You can't train people on AI they haven't seen working. By Day 30, your team has been watching production agents for 15+ days. They have context. They have questions. They have opinions. Now is when training works.
Hands-on building, not slide decks.
Your engineers pair-programme with our Anthropic-trained team. They work on real production agents — the ones already running in your infrastructure. They learn by doing: building new features, debugging production issues, deploying updates, tuning prompts, adding guardrails.
The bar: each developer ships at least one agent independently before this phase ends. A production agent, deployed to your infrastructure, handling real data. If they can do that, they can build anything. If they can't, we have more work to do.
The Operators and Translators in your team learn to work WITH AI, not around it. This means: understanding agent dashboards (what do these numbers mean, when should I worry), tuning alert thresholds (this agent is too aggressive, that one is too conservative), exercising override protocols (when and how to override an agent decision), and evaluating output quality (is this agent-generated content good enough to publish?).
We build custom dashboards for each team. The merchandising team sees pricing agent decisions alongside sell-through data. The customer service team sees triage agent classifications alongside resolution times. The ops team sees the full agent fleet status. Each team gets the view they need to do their job with AI as a partner.
Now that AI is running, redesign processes around it.
This is backwards from how most consultancies do it. They redesign processes first, then build AI to fit. The problem: you can't redesign processes around AI you haven't built yet. You don't know what the AI is good at, where it fails, what it needs from humans, or how humans need to interact with it until it's running.
By Day 50, your agents have been in production for 35+ days. You know their strengths, weaknesses, and quirks. Now you can redesign processes around reality, not assumptions. The pricing team's workflow changes. The content approval process changes. The customer service escalation path changes. These are informed redesigns, not speculative ones.
The goal of the entire programme is this phase. Everything before it exists to make this phase work. If a consultancy stays forever, it's not a consultancy — it's a dependency. Our incentive is aligned: we want to leave.
Five questions. Binary answers.
If yes to all five: we leave. The programme is complete. Your team is self-sufficient.
If no to any: we extend the Enable phase. More pair programming. More hands-on building. More operational practice. We retest at Day 90. The extension is at no additional cost because our incentive is to build independence.
In practice, most teams pass by Day 75. I've seen it enough times now to trust the pattern: the Deploy-before-Enable approach means they've been working with production AI for 60 days by the time we test. They have context, muscle memory, and confidence that no training programme can provide without real systems.
Most consultancies follow a linear pattern: Assess, then Train, then Pilot, then Scale.
Our pattern is faster, but that's a side effect. The real difference is when people learn.
The traditional approach assumes you must train people before deploying AI. This sounds logical. It's wrong. Training people on AI they've never seen working is like teaching someone to swim on dry land. They learn the theory. They pass the test. They drown in the pool.
The MarginOps approach deploys AI first, then enables the team to use it. By the time training starts, the team has watched production agents for weeks. They've seen the outputs, the failures, the edge cases, and the value. They have real questions. The training sticks because it's grounded in experience.
This is why we can do in 90 days what others do in 18 months. We eliminated the months of training-before-deployment that doesn't work.
The test matters more than the agents or the architecture. It determines whether we've succeeded at the only thing that matters: making your team independent.
Passing looks like: A developer on your team takes a new use case from the Process Value Map, scopes it, builds the agent on Claude, defines guardrails, sets up monitoring, and deploys to production. We observe but don't help.
Failing looks like: The developer gets stuck on Claude API integration, can't define guardrails independently, or needs our help to configure monitoring. If this happens, we identify the specific gap and address it in extended enablement.
We break something in production (or wait for something to break on its own — it always does). Your team detects it via monitoring, diagnoses the root cause (data issue, prompt issue, guardrail gap), fixes it, and deploys the fix. We observe but don't help. If they can't diagnose it or can't ship the fix independently, that's the gap we address in extended enablement.
Passing looks like: A business stakeholder proposes a new AI use case. Your Translators assess feasibility, your Builders estimate effort, and your Skeptics identify risks. The team produces a go/no-go recommendation with reasoning.
Failing looks like: The team defaults to "we need to ask MarginOps" or produces an assessment that misses obvious risks or overstates feasibility.
This one's straightforward: are your Operators actually using the dashboards daily? Tuning alert thresholds? Exercising override protocols when appropriate? Escalating genuine issues through the correct path? I've seen teams where agents run fine but nobody's watching — that's a time bomb, not a pass.
Passing looks like: A board member asks "why did the pricing agent set this price?" Your Translator explains the inputs, the decision logic, and the expected outcome in business terms. No jargon. No hand-waving.
If the answer is either too technical ("the model used temperature 0.3 with a 200K context window") or too vague ("the AI decided it was the best price"), they're not ready yet.
The AI Mobilisation Pattern was developed during our engagement with a major UK fashion retailer, where we deployed 7 production agents across pricing, customer service, content, anomaly detection, inventory, marketing, and vendor monitoring — all built on Anthropic's Claude. The 90-day timeframe, the parallel tracks, the Deploy-before-Enable principle, and the Self-Sufficiency Test all emerged from that engagement.
It builds on our other methodologies: the Margin Audit identifies where AI should be deployed, the AI Agent Deployment pattern defines how individual agents are built, and the Transformation Programme tracks everything against EBITDA. The Mobilisation Pattern is the orchestration layer that ties them together into a 90-day programme with a single outcome: your team runs AI without us.
Most consultancies follow an Assess-Train-Pilot-Scale pattern that takes 12-18 months because they train people on AI before deploying it. We deploy first and train second. You can't teach someone to drive by showing them a PowerPoint about steering wheels. You put them in the car. Our 90-day pattern works because deployment happens in parallel with enablement, and we ship the first agent within 15 days to build momentum and prove value immediately.
The AI Priority Score is a formula we use to rank processes by AI automation potential: Manual Hours per week multiplied by Frequency per month multiplied by Error Rate, with the error rate expressed in percentage points. A process that takes 10 hours per week, runs 20 times per month, and has a 5% error rate scores 10 × 20 × 5 = 1,000. A process that takes 2 hours per week, runs 4 times per month, and has a 1% error rate scores 2 × 4 × 1 = 8. The score identifies where AI delivers the most value fastest and, equally important, where AI adds complexity without value. Those processes go on the kill list.
The four AI-Ready Archetypes are: Builders (engineers who will build and maintain agents), Operators (team members who will work alongside agents daily), Translators (people who bridge technical and business domains and can evaluate AI output quality), and Skeptics (people who question AI decisions and stress-test outputs). All four are valuable. Skeptics are often the most valuable because they find failure modes before customers do. We assess your team against these archetypes during Phase 1 to build the right enablement plan.
If your team can't pass all five questions of the Self-Sufficiency Test at Day 75, we extend the Enable phase at no additional cost. Our incentive is aligned with yours: we want to leave. A client that depends on us permanently is a client that will eventually resent us. We extend enablement, intensify pair programming, and retest at Day 90. In practice, most teams pass by Day 75 because the Deploy-before-Enable approach means they've been working with production AI for 60 days by the time we test.
The AI Agent Deployment methodology is about building a single production agent — from use case identification through guardrails, monitoring, and iteration. The AI Mobilisation Pattern is about transforming your entire organisation's AI capability in 90 days. It includes agent deployment as one component (Phase 2), but also covers assessment, team enablement, process redesign, and the self-sufficiency handover. Think of Agent Deployment as a single workstream within the broader Mobilisation programme.
90 days. 12 steps. 4 phases. One outcome: your team deploys, monitors, and improves AI agents without external help. The first step is a conversation about where you are today.