
The AI Onramp That Became The Destination

Why generic AI deployment and embedded AI deployment produce fundamentally different outcomes, and how to sequence them.

Consider two organisations, same size, same AI budget, same starting point.

The first deploys a productivity suite to every knowledge worker. Each person gets 10% more efficient. The whole organisation benefits. The efficiency gain, multiplied across thousands of employees, looks substantial in aggregate.

The second picks one workflow, one team, one measurable outcome. They redesign the process from the ground up. The targeted function becomes twice as fast, half as costly, or both. Nine in ten employees see nothing change.

On paper, the headline numbers can look identical. In practice, these are different types of outcome, and they compound differently.

This distinction runs deeper than the technology being used. Text-to-SQL is a widely available capability; dozens of tools offer it. Uber made a different decision. Rather than deploying another tool with text-to-SQL capability, they built QueryGPT, a system designed specifically for Uber's data environment. It cut query-authoring time by 70% and saves an estimated 140,000 analyst hours per month. Not because it uses a better model than any generic alternative, but because it knows Uber's schemas. Same technology available to everyone. Different contextualisation. That is the entire gap.

This is the question most AI leaders are not asking precisely enough: is a 10% efficiency gain across 100% of your workforce better than a 100% efficiency gain across 10% of it? The honest answer: it depends entirely on whether you are treating generic deployment as a destination or as a deliberate on-ramp to something embedded.
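One way to see why the headline numbers can match is to work the arithmetic. The sketch below uses a deliberately simplified model that does not come from any cited study: every employee contributes one unit of output at baseline, and an efficiency gain of g raises an affected employee's output by g. The function name and the headcount are illustrative assumptions.

```python
def aggregate_output_gain(workforce: int, share_affected: float, efficiency_gain: float) -> float:
    """Extra units of output, assuming each employee produces one unit at baseline."""
    return workforce * share_affected * efficiency_gain


WORKFORCE = 10_000  # hypothetical headcount

# Generic: everyone gets 10% more efficient.
broad = aggregate_output_gain(WORKFORCE, share_affected=1.0, efficiency_gain=0.10)

# Embedded: one tenth of the workforce becomes twice as productive.
targeted = aggregate_output_gain(WORKFORCE, share_affected=0.10, efficiency_gain=1.00)

print(broad, targeted)
```

Under these assumptions the two aggregates are the same. What the model cannot show is where the gain sits: spread thinly across every role in the first case, concentrated in one attributable process in the second.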

Most organisations are treating it as a destination. BCG's analysis of firms at the frontier found that they deliver 1.7 times the revenue growth and 3.6 times the total shareholder return of laggards. The differentiator is contextualisation into redesigned workflows, not model selection.

The sequencing question has a direction. Getting the order wrong closes options.


Generic and Embedded: What the Terms Actually Mean

Generic AI deployment means a tool that works for anyone, applied broadly. A meeting summariser. A document chatbot. A writing assistant. A code completion tool. These serve every function equally because the contextualisation responsibility stays with each individual user. The tool arrives with no opinion about your schemas, your metric definitions, or your process logic. What you bring to it determines what you get from it. Universal accessibility is the design choice, and it carries a natural ceiling.

Embedded AI deployment is what happens when contextualisation moves from the individual to the architecture. A tool built around a specific process, a specific data environment, and a specific measurable outcome. It works exceptionally well for the workflow it was designed around, because it carries the context that generic tools must guess at or leave to the user: your schemas, your metric definitions, your process logic, your edge cases, your governance rules.

Between these two poles, a third layer is worth naming. Tools like Google Gemini Gems and Anthropic's Claude Projects allow users to configure persistent instructions, upload reference materials, and shape how the tool behaves across conversations, without any coding required. For analysts, this often means uploading schema documentation, naming conventions, and metric definitions so the tool responds within the team's specific context. These tools raise the floor for motivated individuals significantly. They are, even so, generic at the organisational level. The contextualisation work rests entirely with the person who built the configuration, and the configuration itself carries no governance rules, queries no governed data layer, and does not update when your schema changes. When the person who built it leaves, the context leaves with them. Call this the configured-generic layer: more capable than off-the-shelf, less durable than anything designed at the system level.

The diagnostic that makes this concrete: if you removed the AI tomorrow, what would break? For generic tools, the honest answer is usually nothing specific. People work more slowly, but no process degrades in a bounded, attributable way. For embedded implementations, a specific business process degrades. Handle time rises. Query resolution times lengthen. A workflow that ran automatically requires humans again. The presence of attributable breakage is the signal that genuine contextualisation has occurred.

Run this test on your current AI portfolio. The distribution will tell you something worth acting on.
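A minimal sketch of running the removal test across a portfolio follows. The `Deployment` record, the `classify` helper, and the example entries are all hypothetical; the only rule encoded is the one stated above, that attributable breakage signals an embedded implementation.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deployment:
    name: str
    # The specific process that would degrade if the tool were removed,
    # or None if nothing attributable would break.
    breaks_if_removed: Optional[str]


def classify(deployment: Deployment) -> str:
    """Apply the removal test: attributable breakage means embedded."""
    return "embedded" if deployment.breaks_if_removed else "generic"


# Hypothetical portfolio for illustration only.
portfolio = [
    Deployment("meeting summariser", None),
    Deployment("writing assistant", None),
    Deployment("document chatbot", None),
    Deployment("support triage workflow", "handle time rises"),
]

distribution = Counter(classify(d) for d in portfolio)
print(dict(distribution))
```

The useful discipline is in filling the `breaks_if_removed` field honestly: if the only thing you can write is "people work more slowly", the entry stays `None`.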


What the Generic Phase Is Actually For

A broad deployment — Microsoft 365 Copilot for every knowledge worker, ChatGPT Enterprise across functions, a RAG-based knowledge assistant on the intranet — is not primarily about the productivity gain it produces directly. That gain is real. JPMorgan reports three to six hours saved per week per employee across 250,000 people. These are genuine outcomes.

The most important thing the generic phase does lies elsewhere.

The generic phase, if instrumented properly, builds five things the embedded phase requires.

AI fluency. Employees who have used a general-purpose assistant for six months approach a contextualised workflow tool differently from those encountering AI for the first time. BCG's 2025 AI at Work survey found that employees with positive AI experience are more than three times more likely to be highly productive when more demanding tools arrive. Fluency is more than familiarity with a chat interface. It is the development of prompt discipline, output evaluation habits, and the intuition to recognise when a model is confidently wrong. The generic phase builds this before the stakes are high enough for errors to matter.

Use-case discovery. With proper instrumentation, usage patterns reveal which functions are seeking AI help most, which categories of work are being substituted into AI workflows, and where the tool is consistently falling short. This is a signal, not a complete friction map. Combined with structured discovery conversations with function leads, it becomes one. Use-case logging from day one is the mechanism that makes the generic phase diagnostic. Without it, the phase produces productivity gains with no strategic intelligence attached.

Governance formation. A broad deployment is when AI governance stops being theoretical. Most organisations launch with an initial Acceptable Use Policy and data classification framework; actual usage reveals what was not anticipated — data types fed into prompts that were not classified, use cases outside the policy's scope, AI configurations built by power users with no inventory behind them. Phase 1 surfaces these gaps, forces remediation, and produces an AI asset inventory: a record of every Gem, Project, and similar configuration, including what data each connects to and who is accountable for it. That inventory is a Phase 1 output, not a Phase 2 prerequisite.

Leadership endorsement. Broad deployment moves AI from an IT experiment to a board-level priority. Without executive sponsorship, the budget and organisational mandate for the embedded phase rarely materialise. The generic phase does political groundwork as much as technical groundwork.

Champion identification. Power users emerge from every broad deployment. These are the analysts who build Gems for their team, the operations leads who configure shared Projects, the senior analysts who publish context files with metric definitions. They are running informal proofs of concept at the individual level, demonstrating that a given workflow is worth redesigning properly. The organisation's job is to instrument this, find these people, and treat their experiments as design inputs for what comes next.
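The AI asset inventory described under governance formation, and the champions it tends to reveal, can be sketched as a simple data structure. Everything here is an assumption made for illustration: the `AIAsset` fields, the two helper functions, the threshold of two configurations, and the sample entries. A real inventory would live in whatever governance tooling the organisation already uses.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AIAsset:
    name: str                      # e.g. a Gem or Project
    kind: str                      # "gem", "project", or similar
    owner: Optional[str]           # accountable person, None if unknown
    data_sources: List[str] = field(default_factory=list)


def governance_gaps(inventory: List[AIAsset]) -> List[str]:
    """Names of assets with no accountable owner or no recorded data sources."""
    return [a.name for a in inventory if a.owner is None or not a.data_sources]


def champion_candidates(inventory: List[AIAsset], min_assets: int = 2) -> List[str]:
    """Owners who have built at least `min_assets` configurations."""
    counts = Counter(a.owner for a in inventory if a.owner is not None)
    return sorted(owner for owner, n in counts.items() if n >= min_assets)


# Hypothetical entries for illustration only.
inventory = [
    AIAsset("Revenue definitions Gem", "gem", "analyst_a", ["metric_glossary.pdf"]),
    AIAsset("Churn review Project", "project", "analyst_a", ["schema_docs.md"]),
    AIAsset("Ops handover Project", "project", "ops_lead_b", []),
    AIAsset("Untitled Gem", "gem", None, ["export.csv"]),
]

print(governance_gaps(inventory))
print(champion_candidates(inventory))
```

The same record serves two purposes: entries with missing fields are the governance gaps to remediate, and owners who appear repeatedly are likely champions.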

The mistake is treating this phase as the destination. An organisation that deploys Copilot licences, measures time saved, and stops there has not completed the generic phase. It has abandoned it at the point where it becomes useful.


The Framework: Two Phases, One Critical Transition

The evidence points to a sequencing model: two phases, separated by a decision point that requires an active choice rather than passive drift. Most organisations reach the generic phase and stall there, not because the next step is unknown, but because it requires a deliberate diagnostic act that passive progression never triggers.

Phase 1: Generic Floor

Deploy broad-access tools with diagnostic intent. The objective is AI fluency, use-case discovery, data readiness assessment, governance formation, and leadership alignment. Measure the percentage of the workforce with regular AI interaction, not time saved. Instrument use-case logging from day one. Actively encourage power users to experiment with configured-generic tools — Gems, Projects — and capture what they build. These individual experiments are informal proofs of concept. They are design inputs, not solutions.

By the end of Phase 1, the organisation should have documented outputs across each dimension: an AI fluency baseline, a use-case heat map by function, a data readiness map, a champion inventory, and an AI governance baseline. These five artefacts are what Transition 1 runs on.
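Two of these measures fall straight out of a use-case log: the heat map by function and the share of the workforce with regular AI interaction. The sketch below assumes a hypothetical `UsageEvent` record and, since the article does not define "regular", treats a user as regular if they were active in at least two distinct weeks. Both are illustrative choices, not a prescribed schema.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class UsageEvent:
    user: str
    function: str    # business function, e.g. "Finance"
    use_case: str    # category logged at the time of use
    week: int        # week number within the observation window


def heat_map(events: List[UsageEvent]) -> Dict[str, Counter]:
    """Count logged use cases per business function."""
    result: Dict[str, Counter] = defaultdict(Counter)
    for e in events:
        result[e.function][e.use_case] += 1
    return dict(result)


def regular_user_share(events: List[UsageEvent], headcount: int, min_weeks: int = 2) -> float:
    """Share of the workforce active in at least `min_weeks` distinct weeks."""
    weeks_by_user: Dict[str, Set[int]] = defaultdict(set)
    for e in events:
        weeks_by_user[e.user].add(e.week)
    regular = sum(1 for weeks in weeks_by_user.values() if len(weeks) >= min_weeks)
    return regular / headcount


# Hypothetical log entries for illustration only.
events = [
    UsageEvent("u1", "Finance", "sql_drafting", 1),
    UsageEvent("u1", "Finance", "sql_drafting", 2),
    UsageEvent("u2", "Finance", "report_summary", 1),
    UsageEvent("u3", "Growth", "sql_drafting", 2),
    UsageEvent("u3", "Growth", "sql_drafting", 3),
]

print(heat_map(events))
print(regular_user_share(events, headcount=4))
```

Neither function measures time saved, which is the point: the log exists to show where demand concentrates, not to justify the licence cost.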

Transition 1: The Diagnosis

This is the step most organisations underinvest in. Done properly, it separates a generic floor from a generic ceiling.

The five Phase 1 artefacts answer most of the questions needed to select an embedded workflow. The use-case heat map tells you where Constraint is present: which workflows are active bottlenecks that people are already attempting to solve with AI. The data readiness map tells you where Context exists: which of those workflows sit on a governed data environment ready for embedded deployment. The champion inventory tells you who will lead Phase 2 and what they have already proven informally.

The one input Phase 1 cannot fully answer is Codifiability: whether the core of the workflow can be expressed as rules, patterns, or repeatable decisions a system can act on reliably. This requires a direct conversation with the process owner.

Run the three-C test against your top candidate workflows, using what Phase 1 produced as the starting point:

Constraint. Is this workflow currently limiting business performance? The use-case heat map will show where people are pressing hardest against the limits of generic tooling. Active bottlenecks deserve priority over workflows where AI would be convenient.

Context. Does sufficient governed data exist to support embedded deployment reliably? The data readiness map answers this directly. Context is a binary gate: the governed data is there, or the workflow waits.

Codifiability. Can enough of this workflow be expressed as rules, patterns, or repeatable decisions? The champion's configured-generic experiment is useful evidence here: if someone has already built a Gem or Project that partially automates the workflow, that is proof that some portion of it is codifiable. Low codifiability shapes the operating model; it does not disqualify the workflow.

A workflow that clears all three Cs, with an identified champion willing to redesign, is a Phase 2 candidate. The champion's configured-generic experiment becomes the design brief for Phase 2, not the solution. What they proved individually now gets built properly: governed, shared, and compounding across the function rather than residing in a single person's Projects folder.
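The three-C test can be sketched as a simple filter over candidate workflows. The `Workflow` fields and the example entries are hypothetical. One modelling assumption is worth stating plainly: because low codifiability shapes the operating model rather than disqualifying the workflow, the sketch treats Codifiability as cleared once it has been assessed with the process owner, whatever the rating.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Workflow:
    name: str
    is_active_bottleneck: bool         # Constraint, from the use-case heat map
    has_governed_data: bool            # Context, from the data readiness map
    codifiability: Optional[str]       # "high" or "low"; None until the process owner is consulted
    champion: Optional[str] = None     # person willing to lead the redesign


def is_phase2_candidate(w: Workflow) -> bool:
    """Three-C test plus an identified champion.

    Assumption: Codifiability counts as cleared once it has been assessed,
    because a low rating shapes the operating model rather than disqualifying.
    """
    return (
        w.is_active_bottleneck
        and w.has_governed_data
        and w.codifiability is not None
        and w.champion is not None
    )


# Hypothetical candidates for illustration only.
candidates: List[Workflow] = [
    Workflow("ad hoc revenue queries", True, True, "high", "analyst_a"),
    Workflow("contract review", True, False, "high", "counsel_b"),
    Workflow("meeting notes", False, True, "high", None),
    Workflow("forecast commentary", True, True, None, "analyst_c"),
]

shortlist = [w.name for w in candidates if is_phase2_candidate(w)]
print(shortlist)
```

In this example, contract review waits on governed data, meeting notes is convenient rather than constraining, and forecast commentary still needs the conversation with its process owner.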

JPMorgan reportedly ran 450 proofs of concept with hard KPIs before selecting which workflows to operationalise. Morgan Stanley identified wealth management as the entry point not because it was the largest function, but because advisors had a measurable time cost in document retrieval, clean data in a RAG-compatible format, and leadership willing to change the workflow. The diagnosis is the architecture.

Decision: What role should AI play?

Selecting the workflow is one decision. The second is what role AI should play once deployed there. Two variables determine this: how codifiable the output is, and what the cost of being wrong looks like. Where codifiability is high and error cost is low, AI can own execution, with humans handling exceptions. Where codifiability is high but error cost is significant, governance rails are needed before automation proceeds: AI does the work, humans approve and audit. The operating model question matters as much as the workflow selection; getting the workflow right and deploying the wrong operating model is how technically sound implementations produce poor outcomes. There is a fuller treatment of this decision framework in "GenAI does not disrupt uniformly" (https://abhinavchouhan.com/blog/genai-does-not-disrupt-uniformly).
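As a rough sketch, the two cases described here can be written as a lookup. The function name and the "high"/"low" labels are assumptions ("significant" error cost is mapped to "high"), and the low-codifiability cases are left undecided because this article does not specify them.

```python
from typing import Optional


def ai_role(codifiability: str, error_cost: str) -> Optional[str]:
    """Map the two variables to the operating models named in the text.

    Both arguments take "high" or "low". Only the high-codifiability
    cases are described in this article; the others return None.
    """
    for value in (codifiability, error_cost):
        if value not in ("high", "low"):
            raise ValueError(f"expected 'high' or 'low', got {value!r}")

    if codifiability == "high" and error_cost == "low":
        return "AI owns execution; humans handle exceptions"
    if codifiability == "high" and error_cost == "high":
        return "AI does the work; humans approve and audit"
    return None  # low codifiability: see the linked article


print(ai_role("high", "low"))
print(ai_role("high", "high"))
print(ai_role("low", "high"))
```

Returning `None` rather than guessing keeps the sketch honest about which quadrants the text actually covers.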

Phase 2: Embedded Spine

Deploy AI against end-to-end redesigned workflows in the selected functions. Accept that this is slower and more expensive per workflow than the generic phase. The return per workflow can be an order of magnitude higher. Measure EBIT contribution and cycle-time reduction per workflow, not adoption rates.

The phrase "end-to-end redesigned" carries weight. McKinsey's 2025 State of AI survey tested 25 attributes across nearly 2,000 organisations and found that workflow redesign had the single strongest correlation with EBIT impact. Adding a capability to an existing process does not qualify. The process must change. The champion's informal experiment from Phase 1 provides the hypothesis. Phase 2 proves it at scale, with governance, and without the fragility of a single person's configuration.


The Pattern Recurs at Every Level

Consider a simple question: "What was revenue last quarter?" An analyst in Finance asks their configured AI assistant and receives one answer. An analyst in Growth asks the same question and receives another. Both answers are internally consistent. Both are supported by the context each person provided. Neither is universally correct.

The problem is not the model. The problem is that the organisation never fully agreed on what revenue means.

For years, these inconsistencies could remain hidden because each team lived inside its own dashboards and reports. AI changes the dynamic. The moment hundreds of employees begin asking natural-language questions, every unresolved metric definition, undocumented business rule, and piece of tribal knowledge becomes visible. This is why analytics teams often misdiagnose the challenge. They assume the bottleneck is query generation. In reality, generic AI quickly reveals that the bottleneck is organisational agreement.

Generic AI surfaces a truth that data and analytics leaders have been wrestling with for years: the organisation knows more than its architecture does.

Every time an AI system produces an answer that looks reasonable but cannot be trusted, it is revealing knowledge that exists somewhere in the organisation but nowhere in a governed system.

This is where the transition to embedded AI begins. The purpose of the generic phase is not simply to make analysts more productive. Its real value is diagnostic. It reveals where contextualisation is missing, identifies which business definitions matter, and surfaces which pieces of institutional knowledge must be codified if AI is to operate reliably.

The embedded phase resolves these constraints at the architectural level. A governed semantic layer becomes the source of truth for metrics. Business definitions move out of individual Projects and into shared infrastructure. Data lineage, transformation logic, quality rules, and governance controls become accessible to both humans and AI systems.

This is also where the accuracy difference becomes measurable. Against raw enterprise schemas, state-of-the-art LLMs achieve 10 to 20% accuracy on realistic production questions. The same models querying against a well-built semantic layer reach 85 to 95%. Snowflake's Cortex Analyst benchmarks at approximately 90% against governed semantic models versus 51% for a standard LLM baseline on the same questions. The model does not close that gap. The contextualisation does.

This changes the role of analytics architecture fundamentally. For two decades, semantic layers, metric stores, and governance frameworks existed primarily to help humans consume data consistently. In the AI era, they serve a second purpose: helping machines reason consistently. Without that layer, every assistant, copilot, and agent develops its own interpretation of the business. With it, every system inherits the same definitions, calculations, and logic.
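A toy version of "every system inherits the same definitions" might look like the following. This is a sketch under stated assumptions, not how any particular semantic-layer product works: the `MetricDefinition` type, the `SEMANTIC_LAYER` registry, the `resolve_metric` helper, and the table and column names in the SQL are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    sql: str      # the single agreed calculation
    owner: str    # accountable team


# Hypothetical governed registry; table and column names are invented.
SEMANTIC_LAYER: Dict[str, MetricDefinition] = {
    "revenue": MetricDefinition(
        name="revenue",
        sql="SELECT SUM(net_amount) FROM finance.recognised_revenue WHERE quarter = :quarter",
        owner="Finance",
    ),
}


def resolve_metric(name: str) -> MetricDefinition:
    """Return the governed definition, or fail loudly if none exists."""
    key = name.strip().lower()
    if key not in SEMANTIC_LAYER:
        raise KeyError(f"no governed definition for metric {name!r}")
    return SEMANTIC_LAYER[key]


# Two assistants configured by different teams resolve the same definition.
finance_view = resolve_metric("Revenue")
growth_view = resolve_metric("revenue")
print(finance_view == growth_view)
```

The design choice that matters is the failure mode: a metric with no governed definition raises an error instead of letting each assistant improvise its own interpretation.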

The semantic layer is no longer just reporting infrastructure. It becomes AI infrastructure.

The future bottleneck is not dashboard creation. It is decision codification.

Most analytics functions have spent years codifying data, transformations, and metrics. AI introduces a new requirement: organisations must now codify the decisions that sit on top of them. When a business user asks why churn increased, what explanation should be prioritised? When a forecast falls outside tolerance, what action should occur next? Historically, these decisions lived with experienced managers and analysts. In an AI-enabled workflow, they must increasingly live inside systems. The organisations that succeed will not be those with the most AI tools. They will be those that convert institutional knowledge into operational architecture.
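To make "decision codification" concrete, here is a minimal sketch of the forecast-tolerance question expressed as a rule a system can act on. The `ToleranceRule` type, the 5% threshold, the metric name, and the action text are hypothetical; in practice these would be agreed with the managers who currently make the call.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToleranceRule:
    metric: str
    tolerance: float         # allowed relative deviation, e.g. 0.05 for 5%
    action_if_breached: str  # what should happen next


def next_action(rule: ToleranceRule, forecast: float, actual: float) -> str:
    """Return the codified action when the forecast misses by more than the tolerance."""
    if actual == 0:
        raise ValueError("actual must be non-zero to compute relative deviation")
    deviation = abs(forecast - actual) / abs(actual)
    return rule.action_if_breached if deviation > rule.tolerance else "no action"


# Hypothetical rule; the threshold and action are illustrative.
rule = ToleranceRule("weekly_demand", tolerance=0.05, action_if_breached="notify demand planner")

print(next_action(rule, forecast=120.0, actual=100.0))
print(next_action(rule, forecast=102.0, actual=100.0))
```

The code is trivial; the hard part is the organisational agreement on the threshold and the action, which is exactly the knowledge that previously lived only with experienced people.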

Generic analytics AI makes analysts faster. Embedded analytics AI makes answers trustworthy.


What to Do on Monday Morning

The organisations pulling ahead are not the ones that deployed the most AI. They are the ones that picked one workflow, contextualised it properly, measured it against a specific outcome, and used that proof to fund the next one.

If your organisation is still in the generic phase, the question worth asking is not "are we getting value?" It is: what is this phase teaching us about where the contextualisation investment belongs? Build the use-case heat map. Look at who is already experimenting with configured-generic tools and what problems they are trying to solve. Run the three-C test against your top candidate workflows. Find the two leaders willing to redesign, not merely augment.

If you lead data and analytics: do you have a semantic layer, meaning a governed, code-managed, metric-consistent layer that your top enterprise KPIs pass through? Without it, every natural-language analytics tool, every conversational BI interface, every text-to-SQL system you deploy is answering the generic question with generic accuracy. The contextualisation is the investment. Everything else follows from it.

The generic phase was never the destination. It was the first active decision. The second one, the diagnosis that converts floor into spine, is where most organisations are currently parked.

That decision will not make itself.


