
The first wave of AI in customer experience was about conversation. Chatbots that could answer questions. Virtual assistants that could understand intent. Natural language interfaces that made self-service feel less like filling in a form.

It was genuinely impressive. It was also not what customers actually wanted.

Customers do not call because they want to talk. They call because something needs to happen: a booking needs to change, a payment needs to be processed, a problem needs to be resolved. The conversation is the means. The outcome is the point. And for most of the conversational AI era, the AI handled the conversation but someone - or something - else still had to handle the outcome.

That gap is closing. And it changes everything about how CX programmes should be designed.


From responses to execution

The distinction that matters is between AI that responds and AI that acts. A model that generates a helpful answer to a customer's question is performing one function. A model that can then access the reservation system, modify the booking, trigger the confirmation, and update the CRM record is performing a fundamentally different one.

The second type - increasingly referred to as action-oriented or agentic AI - does not just produce output for a human to act on. It executes multi-step tasks across systems, adapts when conditions change mid-task, and completes the outcome without requiring a handoff. The customer's problem is resolved, not just acknowledged.

This shift is significant for two reasons. First, it changes what a successful AI deployment looks like - the measure is no longer containment rate or deflection, but resolution rate and outcome quality. Second, it raises the stakes considerably. An AI that responds incorrectly produces a bad answer. An AI that acts incorrectly can make changes in live systems, trigger real transactions, and create problems that are considerably harder to undo.

Which is why the most important property of agentic AI in CX is not capability. It is reliability.


Why reliability is the hard problem

General-purpose language models are probabilistic. Ask the same question twice and you may get subtly different answers. That is acceptable - often even desirable - when the output is information. It is not acceptable when the output is an action in a live system.

A customer who requests a refund expects that request to be handled the same way every time it is made. A business that uses AI to process claims, update accounts, or schedule services needs those actions to be repeatable, consistent, and auditable. Probabilistic variation is not a feature in this context. It is a liability.

The models designed specifically for CX execution address this differently from general-purpose models. They are trained on purpose-built, structured data from CX workflows rather than the open web. They reason over defined states, tools, and policies rather than open-ended language. They operate within constrained, validated action spaces where unsafe or off-policy steps are blocked by design rather than left to the model's judgement. The output is not the most plausible next token - it is the correct next action according to the workflow.
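To make the idea of a constrained, validated action space concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular vendor implements it: the state names, the action names, and the refund limit are all hypothetical. The point it shows is that the check sits outside the model, so a proposed step that is off-policy never reaches the live system.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Hypothetical actions for a booking workflow.
    MODIFY_BOOKING = "modify_booking"
    SEND_CONFIRMATION = "send_confirmation"
    ISSUE_REFUND = "issue_refund"


# Hypothetical workflow definition: the actions permitted from each state.
ALLOWED = {
    "booking_located": {Action.MODIFY_BOOKING, Action.ISSUE_REFUND},
    "booking_modified": {Action.SEND_CONFIRMATION},
}

REFUND_LIMIT = 200.0  # illustrative policy threshold, not a real figure


@dataclass
class Decision:
    allowed: bool
    reason: str


def validate(state: str, action: Action, amount: float = 0.0) -> Decision:
    """Check a proposed action against the workflow before it is executed."""
    if action not in ALLOWED.get(state, set()):
        return Decision(False, f"{action.value} is not permitted in state {state}")
    if action is Action.ISSUE_REFUND and amount > REFUND_LIMIT:
        return Decision(False, "refund exceeds policy limit; escalate to a human")
    return Decision(True, "within policy")


if __name__ == "__main__":
    print(validate("booking_located", Action.MODIFY_BOOKING))
    print(validate("booking_located", Action.SEND_CONFIRMATION))
    print(validate("booking_located", Action.ISSUE_REFUND, amount=500.0))
```

Whatever the model proposes, only actions that pass `validate` would be executed. The blocked cases return a reason, which is what an escalation path or an audit trail would pick up.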

The practical consequence is a material reduction in the hallucination risk that makes general-purpose models unsuitable for autonomous execution. Within a constrained action space, the AI cannot act on a policy that does not exist or execute an action it has not been authorised to take, and every outcome it produces can be traced back to a defined workflow step and explained to a regulator.


Governance is not optional

The question organisations most often skip when evaluating agentic AI is: what happens when it goes wrong?

Every autonomous system will eventually encounter a scenario it was not designed for. The difference between a well-governed deployment and a liability is whether the system knows what to do in that moment - whether it escalates appropriately, whether the decision trail is visible, whether the action can be reversed.

Explainability and auditability are not compliance features to be bolted on after the fact. They are design requirements that determine whether the deployment is trustworthy enough to operate at scale. An AI that can take actions in live systems but cannot explain why it took them is not ready for production, regardless of how impressive the demo was.
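As an illustration of what designing for auditability can look like, the sketch below records every change with its reason and the values before and after, and treats a reversal as a new logged action rather than a deletion from the trail. It is a minimal sketch in Python. `AuditedStore`, `AuditRecord`, and the field names are hypothetical, and a real deployment would write to durable, tamper-evident storage rather than an in-memory list.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class AuditRecord:
    action: str
    reason: str   # why the action was taken, in plain language
    before: dict  # prior values of the fields that already existed
    after: dict   # values written by the action
    added: list   # fields that did not exist before the action
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class AuditedStore:
    """Hypothetical record store that logs every change it applies."""

    def __init__(self, data: dict):
        self.data = dict(data)
        self.log: list[AuditRecord] = []

    def apply(self, action: str, reason: str, changes: dict) -> None:
        before = {k: self.data[k] for k in changes if k in self.data}
        added = [k for k in changes if k not in self.data]
        self.data.update(changes)
        self.log.append(AuditRecord(action, reason, before, dict(changes), added))

    def reverse_last(self, reason: str) -> None:
        """Undo the most recent action and log the reversal itself."""
        last = self.log[-1]
        for key in last.added:
            self.data.pop(key, None)
        self.data.update(last.before)
        self.log.append(
            AuditRecord(f"reverse:{last.action}", reason, last.after, last.before, [])
        )

    def export(self) -> str:
        # Serialise the decision trail for a reviewer or regulator.
        return json.dumps([asdict(r) for r in self.log], indent=2)


if __name__ == "__main__":
    store = AuditedStore({"date": "2024-05-01"})
    store.apply("modify_booking", "customer requested a later date",
                {"date": "2024-05-03"})
    store.reverse_last("customer withdrew the request")
    print(store.export())
```

The design choice worth noting is that the log is append-only: undoing an action adds a record, so the trail shows both the original decision and its reversal.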

Across the UAE and Saudi Arabia, this is becoming a regulatory reality rather than a theoretical concern. Both markets are developing AI governance frameworks that place explicit requirements on the auditability of automated decisions, particularly in financial services, government services, and healthcare. Organisations deploying agentic AI in the GCC that have not designed for explainability from day one will face a retrofit problem - and retrofitting governance into a running system is considerably more expensive than building it in from the start.

The same applies to data residency. Agentic AI accesses and modifies customer data in real time, across systems. In a region where data sovereignty requirements are tightening, the question of where that data flows during execution - and whether it crosses borders - is not a detail to resolve after go-live.


What this means for CX strategy now

Stop measuring what the AI said. Start measuring what it achieved.

If your AI programme is still reporting on deflection rates and CSAT scores for contained interactions, you are measuring an execution-era programme with conversational-era metrics. The question that matters is resolution rate: what percentage of customers who interacted with the AI had their problem fully resolved without human intervention? That is the outcome that drives cost, satisfaction, and loyalty.
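The difference between the two measures is easy to see with a small worked example. The Python sketch below assumes each interaction is tagged with whether it was escalated to a human and whether the customer's problem was actually resolved. The `Interaction` fields and the sample data are hypothetical, and how "resolved" is determined in practice (follow-up contact, system state, survey) is left open.

```python
from dataclasses import dataclass


@dataclass
class Interaction:
    escalated: bool  # handed off to a human agent
    resolved: bool   # the customer's problem was actually fixed


def containment_rate(items: list[Interaction]) -> float:
    """Share of AI interactions that never reached a human."""
    if not items:
        return 0.0
    return sum(not i.escalated for i in items) / len(items)


def resolution_rate(items: list[Interaction]) -> float:
    """Share of AI interactions fully resolved without human intervention."""
    if not items:
        return 0.0
    return sum(i.resolved and not i.escalated for i in items) / len(items)


if __name__ == "__main__":
    # Illustrative sample only, not real data.
    sample = [
        Interaction(escalated=False, resolved=True),
        Interaction(escalated=False, resolved=False),  # contained, not resolved
        Interaction(escalated=True, resolved=True),    # resolved by a human
        Interaction(escalated=False, resolved=True),
    ]
    print(f"containment: {containment_rate(sample):.0%}")  # 75%
    print(f"resolution:  {resolution_rate(sample):.0%}")   # 50%
```

In this sample, three of four interactions were contained but only two were resolved by the AI. The gap is the customer who was kept away from a human agent without having their problem fixed, which a containment or deflection figure counts as a success.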

Evaluate AI vendors on their action architecture, not their language model.

The model underneath matters less than how it is constrained. A general-purpose model with no action guardrails is a higher-risk deployment than a purpose-built execution model with validated action spaces, even if the general-purpose model scores better on benchmarks. Ask vendors specifically how unsafe actions are blocked, how decisions are logged, and what happens when the AI encounters a scenario outside its defined scope.

Design the governance layer before you design the use cases.

Decide what the AI is and is not authorised to do before you build the capability to do it. Define the escalation triggers. Define what auditability means in your regulatory context. Define who is accountable when the AI makes a mistake. These are not questions to answer in the post-launch review. They are questions that determine whether the deployment is viable at all.


The bottom line

The next era of CX is not about AI that converses more naturally. It is about AI that executes more reliably - completing outcomes, not just generating responses, within governance structures that make autonomous action trustworthy enough to scale.

The organisations that get this right will deliver customer experiences that were previously impossible at any cost: genuinely seamless, genuinely fast, genuinely resolved. The organisations that get it wrong will discover that autonomous AI acting incorrectly in live systems is a more serious problem than a chatbot giving a bad answer.

Capability is no longer the constraint. Reliability, governance, and the willingness to design for them from the start - that is where the work is.