Invisible interfaces, visible consequences
Editor's note: Arnie Guha is partner and head of experience design at Phase 5. He can be reached at arnieg@phase-5.com.
In 1770, a Hungarian nobleman, Wolfgang von Kempelen, unveiled The Turk, an automaton that appeared to play flawless chess. Crowds marveled at the mechanical genius, convinced that the machine's moves emerged from pure mechanism. In truth, a human chess master sat hidden inside, guiding every move. Those who lost to the machine had no way of knowing how they had been beaten; they could judge only whether the moves seemed sensible and the loss felt legitimate.
Today, AI has replaced The Turk's hidden master with code, but the customer's position is much the same. The process is invisible. The decision arrives, unadorned, and the only points of contact are trust, clarity and the emotional residue left behind.
The disappearing interface
The history of user experience is largely the history of the interface. Vannevar Bush's 1945 essay “As We May Think” envisioned the "memex," a conceptual forerunner to hypertext that would allow users to navigate complex data intuitively. This launched a multi-decade quest in human-machine interaction (HMI) to perfect the bridge between human intent and machine computation.
Early command-line interfaces demanded users learn a machine's language. The revolutionary work at Xerox PARC in the 1970s inverted this, creating the graphical user interface (GUI) where the machine learned to speak ours. The interface became the locus of design, the tangible plane where function was made accessible.
As digital services moved from workplace to home, brands came to live and die on these interfaces. Consider America Online in the 1990s: the friendly layout, distinct folders and iconic "You've got mail!" were inseparable from the AOL brand itself. Designers iterated endlessly on flows, buttons and page load times. We measured satisfaction with the journey.
But in AI-driven service environments, this painstakingly constructed interface often dissolves. The customer no longer navigates a process; they receive a verdict.
In banking, a user applying for a credit card provides information through a simple form and, within seconds, receives a decision. The complex, AI-driven underwriting process – weighing hundreds of variables in a model – is entirely invisible. The user doesn't experience a journey; they are handed a decision: approved, with a specific credit limit and interest rate, or denied.
In health care, Google's DeepMind AI can analyze retinal scans to detect diabetic retinopathy with accuracy matching or exceeding human ophthalmologists. The system's internal process of analyzing millions of pixels is a black box. The clinician and patient receive a verdict: a probability score or classification that informs the medical decision.
In retail, a customer visiting Amazon sees a specific price generated by a dynamic pricing algorithm that has instantly weighed purchase history, current demand, competitors' prices and time of day. The "products recommended for you" list is a verdict from a sophisticated engine that has already decided what you are most likely to buy. The customer does not see the calculation; they only see the final, authoritative result.
These contexts expose a blind spot in the traditional customer experience (CX) toolkit. Net Promoter Score (NPS), Customer Effort Score (CES) and customer satisfaction (CSAT) were all designed to evaluate an observable interaction with an interface. They cannot explain why two customers receiving identical outcomes might diverge sharply in trust, loyalty and advocacy when the journey itself has become invisible.
Why traditional CX metrics fail
Procedural justice research has long shown that people's perception of the fairness of a process is often more critical to their acceptance of a decision than the outcome itself. In a courtroom, a defendant who believes they had a fair trial is more likely to accept a guilty verdict than one who feels the system was rigged. The perceived legitimacy of the process validates the result.
In invisible-interface environments, however, the customer is denied a view of the "trial." The AI's process is inaccessible, so fairness can only be inferred from the final decision and whatever explanation accompanies it. This creates a critical vulnerability for brands.
Finance – the opaque verdict: The Consumer Financial Protection Bureau's 2022 guidance mandates specific, comprehensible reasons for adverse AI-driven credit decisions. Yet consider an applicant for a small business loan who is rejected with a notice stating, "Your profile did not meet the profitability threshold of our proprietary model." This opaque verdict is functionally useless. The applicant can't learn from it, correct a potential error or understand the bank's logic. This absence of an understandable process invites suspicion that the "black box" is arbitrary or biased, destroying trust.
Retail – the betrayal of the algorithm: Wu et al. (2022) found that algorithmic price discrimination directly harmed loyalty by increasing customers' feelings of betrayal. Two loyal customers browse the same airline's website for the same flight. One, whose browsing history suggests price sensitivity, sees a fare of $350. The other, whose history includes expensive hotel bookings, sees $425. If they discover this, the damage isn't the $75 difference; it's the violation of trust. The invisible process treated them unequally, turning their loyalty into a variable to be exploited.
Gig platforms – judgment without appeal: Kellogg et al. (2020) documented how opaque algorithmic deactivation on gig platforms destroyed worker trust. A food delivery driver with a 4.9-star rating over thousands of deliveries wakes up to find they can't log in. They receive an automated email: "Your account has been deactivated for fraudulent activity." No specifics are given. The driver is locked out of their livelihood by a secret judgment. The absence of a visible process for review or meaningful appeal is devastating, removing any sense of recourse and making the entire platform feel illegitimate.
The psychological dynamics of outcome-only experiences
The behavioral mechanics of how we react to decisions are well understood. In AI-driven contexts, these dynamics are dangerously amplified due to the inherent opacity of the systems.
Fairness heuristics: People use mental shortcuts to judge fairness and will accept negative outcomes if they believe the process was fair. A driver is more likely to accept a speeding ticket if the officer was polite and clearly explained the radar reading. When an AI simply triples a ride-sharing fare due to "surge pricing," the user gets bad news without any visible process, making the outcome feel arbitrary and exploitative.
Transparency and legitimacy: When a platform like YouTube removes a video with a vague notice like, "This violates our community standards," it feels like illegitimate censorship. A transparent reason – "This video was removed because it contains copyrighted audio from 'Song X'" – makes the platform's authority feel legitimate and gives the user a clear path to correction.
Recourse confidence: The mere belief that one could challenge a decision significantly reduces dissatisfaction, even if never used. When an AI system denies a business loan, the presence of a clear "Appeal This Decision" button that leads to human review provides psychological safety, signaling that the system is not an unchallengeable dictatorship.
Emotional immediacy: First emotional reactions are powerful predictors of future behavior. When a credit card is unexpectedly declined at a busy checkout counter, the immediate feeling is embarrassment and anger. This gut reaction is a stronger predictor of switching banks than any subsequent rational analysis. Because AI verdicts are delivered instantly in high-stakes moments, they are potent triggers for churn-driving emotions.
In AI-driven contexts, these dynamics are compounded by severe information asymmetry. The decision logic is inaccessible, the criteria are opaque and human dialogue is often entirely absent. This creates a perfect storm where an invisible process feels inherently unfair, unexplained decisions feel illegitimate and the lack of recourse leaves the customer with nothing but their immediate, negative emotional reaction.
The case for new metrics – and the TAR framework
If AI is restructuring the customer experience from a journey into a verdict, then our methods for measuring that experience must evolve. Continuing to rely solely on traditional CX metrics like NPS, CSAT and CES in these new contexts is a critical error. It's like judging a chef's cooking based on the cleanliness of the menu.
These legacy metrics were designed for a world of visible processes. NPS asks if a customer would recommend your brand, but that willingness now hinges on the perceived justice of an AI's verdict. CSAT measures satisfaction, but satisfaction with what? The seamless app that delivered a life-altering loan denial? CES measures effort, but in an AI-driven world, effort can be near-zero while emotional impact is sky-high.
These metrics fail because they cannot see or measure the judgment moment – that instant the verdict is delivered and trust is either forged or shattered. Without outcome-level metrics, companies are flying blind.
At Phase 5, we developed the TAR framework – trust, alignment, recourse – as a governance and design model built for outcome-driven experiences, where the decision is the interface. TAR was designed for contexts where AI acts with institutional authority but without the natural feedback loops of human interaction.
Trust means users believe the outcome was reached fairly and transparently and can follow the reasoning behind it.
Alignment means the AI's decision-making remains faithful to the institution's stated role, policies and values, even as the system adapts over time.
Recourse means users retain agency: the ability to question, appeal or override an AI-driven decision. A system that cannot be challenged is not just unaccountable, it is unsafe.
TAR differs from existing AI governance frameworks because it is rooted in the user's lived experience of an AI verdict. Where compliance models focus on documentation and audits, TAR operationalizes fairness, purpose and agency into the real-time moment of delivery.
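To make that idea concrete, here is a minimal sketch, in Python, of what a TAR-aware decision payload might look like at the moment of delivery. The field names (explanation, policy_reference, appeal_url) and the is_deliverable check are illustrative assumptions, not a published Phase 5 specification; the point is simply that a verdict should not ship without its reasoning, its anchoring policy and its appeal path attached.

```python
# Illustrative sketch only: a hypothetical "decision record" that carries the
# TAR signals alongside the verdict itself. Field names are assumptions.
from dataclasses import dataclass, field


@dataclass
class DecisionRecord:
    decision_id: str
    outcome: str                      # e.g., "approved", "denied"
    # Trust: a plain-language explanation plus machine-readable reason codes
    explanation: str
    reason_codes: list = field(default_factory=list)
    # Alignment: the stated policy or value the decision is anchored to
    policy_reference: str = ""
    # Recourse: an explicit, user-visible path to human review
    appeal_url: str = ""
    human_review_available: bool = True

    def is_deliverable(self) -> bool:
        """A decision should not ship without all three TAR elements."""
        return bool(self.explanation) and bool(self.policy_reference) and bool(self.appeal_url)


record = DecisionRecord(
    decision_id="D-1001",
    outcome="denied",
    explanation="Reported income is below the minimum for this credit product.",
    reason_codes=["income_below_threshold"],
    policy_reference="Consumer Credit Policy, section 4.2",
    appeal_url="https://example.com/appeals/D-1001",
)
print(record.is_deliverable())  # True: the verdict carries trust, alignment and recourse
```

The design choice is deliberate: the same object that carries the outcome carries the trust, alignment and recourse signals, so none of them can be stripped out downstream.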
Metrics for the AI outcome era
To measure and govern CX meaningfully, organizations must adopt a new suite of metrics designed specifically to capture the customer's perception of an AI-driven outcome (a sketch of how these might be rolled up together follows the list):
Perceived fairness (trust)
Definition: The customer's belief that the outcome, regardless of whether it was favorable, was just, unbiased and equitable.
Measurement: Five-point agreement scale on "The decision I received was fair," paired with mandatory open-text follow-up: "Why do you feel this way?"
Decision clarity (trust + alignment)
Definition: How well the customer understands why the outcome occurred.
Measurement: "Do you understand the reason(s) for the decision you received?" (Yes/No/I'm not sure)
Brand alignment (alignment)
Definition: The degree to which the AI-delivered outcome feels consistent with the brand's established values and tone.
Measurement: Five-point agreement scale on "This interaction reflected what I expect from [your brand name]."
Recourse confidence (recourse)
Definition: The customer's belief that they could effectively challenge or get a human review of the decision.
Measurement: Five-point agreement scale on "I am confident I would know how to get this decision reviewed by a person if I disagreed with it."
Outcome emotional response (trust + recourse)
Definition: The immediate, unfiltered emotional reaction to the decision.
Measurement: Emotion-tagging question with context-specific options (Angry/Disappointed/Confused/Calm/Relieved/Hopeful/Delighted).
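As a rough illustration of how these five measures might be rolled up, the following Python sketch aggregates a batch of post-outcome micro-survey responses into trust, alignment and recourse scores. The data shapes, the equal weighting and the treatment of decision clarity as a simple yes-rate are assumptions made for clarity; the emotional-response tags are categorical and would typically be reported separately rather than averaged.

```python
# Illustrative only: collapse the five outcome metrics into 0-1 TAR pillar
# scores. Question wording follows the article; weights and shapes are assumed.
from statistics import mean

responses = [
    # Each dict is one completed micro-survey (5-point scales; clarity is yes/no/unsure)
    {"fairness": 4, "clarity": "yes", "brand_alignment": 4,
     "recourse_confidence": 2, "emotion": "confused"},
    {"fairness": 2, "clarity": "no", "brand_alignment": 3,
     "recourse_confidence": 1, "emotion": "angry"},
]


def pillar_scores(batch):
    """Aggregate a batch of post-outcome responses into TAR pillar scores."""
    clarity_rate = mean(1.0 if r["clarity"] == "yes" else 0.0 for r in batch)
    fairness = mean(r["fairness"] for r in batch) / 5
    brand = mean(r["brand_alignment"] for r in batch) / 5
    recourse = mean(r["recourse_confidence"] for r in batch) / 5
    return {
        "trust": mean([fairness, clarity_rate]),   # perceived fairness + decision clarity
        "alignment": mean([brand, clarity_rate]),  # brand alignment + decision clarity
        "recourse": recourse,                      # recourse confidence
    }


print(pillar_scores(responses))
```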
Implementation roadmap
Adopting these metrics requires deliberate integration into organizational operations:
Post-outcome micro-surveys: Deploy lightweight, often single-question surveys within minutes of a decision, in the same channel as the decision – a pop-up in the app, a link in the email, an SMS message. Capture the emotional reaction before it cools.
Passive sentiment tracking: Use natural language processing (NLP) to analyze unstructured feedback. Systematically scan support chat logs, call transcripts and social media for keywords related to AI decisions (e.g., "algorithm," "automated," "unfair," "confusing").
Data linkage to business outcomes: Connect these metrics to hard business data. Does a one-point drop in perceived fairness correlate to increased churn over 90 days? Does low decision clarity predict higher call volumes? This linkage proves the tangible ROI of fairness and transparency.
Governance integration as early-warning system: A sudden dip in perceived fairness among a specific demographic can be the first signal of unintentional algorithmic bias, triggering a model review before the issue escalates into a regulatory fine or public relations crisis (a minimal monitoring sketch follows this list).
Cross-sector benchmarking: Compare your metrics against direct competitors or best-in-class examples from other industries. This provides crucial context, sets meaningful improvement targets and helps identify emerging best practices.
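The early-warning sketch referenced above, in Python: it compares the latest week's perceived-fairness scores against a trailing baseline for each customer segment and flags any dip large enough to warrant a model review. The segment labels, the weekly grain and the 0.5-point trigger are illustrative assumptions rather than calibrated thresholds.

```python
# Sketch of a segment-level fairness dip monitor. Data, labels and the
# 0.5-point threshold are stand-ins, not calibrated values.
from collections import defaultdict
from statistics import mean

# (segment, ISO week, perceived-fairness score on a 1-5 scale)
scores = [
    ("segment_a", "2024-W01", 4.1), ("segment_a", "2024-W02", 4.0), ("segment_a", "2024-W03", 4.2),
    ("segment_b", "2024-W01", 4.0), ("segment_b", "2024-W02", 3.9), ("segment_b", "2024-W03", 3.1),
]

DIP_THRESHOLD = 0.5  # assumed: how far below baseline counts as a warning

by_segment = defaultdict(lambda: defaultdict(list))
for segment, week, score in scores:
    by_segment[segment][week].append(score)

for segment, weeks in by_segment.items():
    ordered = sorted(weeks)          # ISO week labels sort chronologically
    if len(ordered) < 2:
        continue                     # need at least one baseline week
    latest = ordered[-1]
    baseline = mean(s for w in ordered[:-1] for s in weeks[w])
    current = mean(weeks[latest])
    if baseline - current >= DIP_THRESHOLD:
        print(f"ALERT: fairness dip in {segment} ({baseline:.2f} -> {current:.2f}); trigger model review")
```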
Governance and brand resilience
These metrics are not just diagnostic tools, they are governance instruments. TAR-based metrics are intended to form an early-warning system, giving leadership a live dashboard of trust. They can show in real time whether outcomes are experienced as fair, clear and challengeable.
While direct quantitative impact is an emerging area of study, the business logic is clear: Since brand trust is a known driver of retention, it stands to reason that a decline in perceived fairness – a direct measure of that trust – would serve as a powerful leading indicator for future customer churn. Following the same logic, low decision clarity scores are a likely proxy for customer confusion, which often translates directly into higher call volumes. Gaps in recourse confidence signal a growing sense of user powerlessness that serves as an early warning for reputational and regulatory risk.
The underlying principle is this: Outcome-level perception metrics are leading indicators, offering executives a chance to detect and address the erosion of trust before it materializes in lost customers or public backlash. By integrating these signals into board-level dashboards alongside financial KPIs, organizations can transform "soft" perception data into a quantifiable control system for risk management.
Beyond internal monitoring, cross-sector benchmarking gives TAR metrics additional weight, creating reputational benchmarks. Just as firms once competed on NPS, the next decade will likely see competition on legitimacy, with metrics like TAR forming a potential new scorecard.
These instruments do more than quantify user sentiment or reduce support costs. They are the levers by which institutions preserve legitimacy in an era when decisions arrive without process and outcomes speak louder than any interface. Governance through metrics like TAR is the operating system that can keep authority explainable, accountable and aligned with purpose. With measurement in place, organizations can manage risk and protect reputation – but without it, every verdict risks becoming indistinguishable from arbitrariness.
Hiding the process inside the code
In the 18th century, The Turk fooled audiences by hiding the human inside. Today, AI hides the process inside the code. In both cases, the visible element – the outcome – determines belief, trust and loyalty.
The decision has become the customer experience. In a world where interfaces vanish, the only way to protect trust is to measure – with unflinching precision – the moment the verdict arrives. But measurement alone is not the shield; it is the alarm.
Without trust, decisions are never legitimate. Without alignment, systems drift from institutional purpose into quiet betrayal. Without recourse, mistakes calcify into injustice. Our belief is that TAR is not an accessory to CX – it is the operating system for legitimacy in the age of invisible interfaces.
The organizations that embed TAR will define what it means to be worth trusting: every verdict explainable, every decision anchored in purpose, every user empowered to challenge the machine. Those that do not will learn, often too late, that in an outcome-only world, trust lost at the moment of decision is almost impossible to regain.
References
Ansems, T. G. C., van de Schoot, R., and van der Helm, P. (2021). “The importance of perceived procedural justice among detained youths: a multilevel meta-analysis.” Frontiers in Psychology, 12. https://doi.org/10.3389/fpsyg.2021.753697
Bush, V. (1945, July). “As We May Think.” The Atlantic Monthly. https://www.theatlantic.com/magazine/archive/1945/07/as-we-may-think/303881/
Consumer Financial Protection Bureau. (2022, May). Consumer Financial Protection Circular 2022-03: Adverse action notification requirements in connection with credit decisions based on complex algorithms. https://www.consumerfinance.gov/compliance/circulars/circular-2022-03-adverse-action-notification-requirements-in-connection-with-credit-decisions-based-on-complex-algorithms/
Jarrahi, M. H., and Newlands, G. (2021). “Algorithmic management in a work context.” Big Data & Society, 8(2). https://doi.org/10.1177/20539517211053049
Kahneman, D. (2011). “Thinking, Fast and Slow.” Farrar, Straus and Giroux.
Kellogg, K. C., Valentine, M. A., and Christin, A. (2020). “Algorithms at work: The new contested terrain of control.” Academy of Management Annals, 14(1), 366–410. https://doi.org/10.5465/annals.2018.0174
Kihwa, A. (2022). “The gig economy and algorithmic management: A study of Foodora riders' perceptions of algorithmic management.” Master's thesis, Stockholm University. DiVA portal. http://su.diva-portal.org/smash/get/diva2:1695427/FULLTEXT01.pdf
Lind, E. A., and Tyler, T. R. (1988). “The Social Psychology of Procedural Justice.” Plenum Press.
Standage, T. (2002). “The Turk: The Life and Times of the Famous Eighteenth-Century Chess-Playing Machine.” Walker & Company.
Wu, Z., Yang, Y., Zhao, J., and Wu, Y. (2022). “The impact of algorithmic price discrimination on consumers' perceived betrayal.” Frontiers in Psychology, 13, 825420. https://doi.org/10.3389/fpsyg.2022.825420
