Escalation Decision Making

Knowing when to escalate to a human is a core agentic skill. The right triggers come from explicit signals, not from heuristics like confidence scores or sentiment analysis. This lesson covers the always-escalate cases, the never-escalate cases, and the trap of using fuzzy proxies as escalation triggers.

Always Escalate

Customer explicitly requests a human. Immediately. Do not attempt to resolve first. The customer's autonomy overrides resolution efficiency.
Policy is silent on the specific request. If the policy doesn't address the case, the agent shouldn't invent. Escalate.
Agent cannot make meaningful progress. If two consecutive turns produce no movement, the agent is stuck. Escalate.

Do NOT Use as Escalation Triggers

Self-reported confidence below a threshold. The model's confidence is poorly calibrated. "Below 70%" means different things in different contexts.
Negative sentiment / customer frustration. Frustrated customers can still want the agent to keep trying. Escalating because someone sounds frustrated overrides their actual preference.
Case complexity estimates. Complexity is in the eye of the agent. The agent's complexity estimate doesn't predict whether escalation is the right call.
Number of messages in conversation. A 20-message conversation might be a simple back-and-forth. A 3-message conversation might be a critical case requiring escalation. Message count is noise.

Customer Explicit Request Is the Hard Override

If the customer says "I want to talk to a human right now," escalate immediately even if you could resolve the issue in the next two turns. Trying to resolve first ignores customer autonomy. The exception is none — there is no "but I'm so close to fixing it" clause.

Policy Gaps Are Different From Complexity

If the customer's request is in scope but complex, the agent should attempt resolution. If the customer's request is outside policy — something the policy doesn't cover at all — that's an escalation, regardless of complexity. The trigger is gap, not complexity.

Multiple Customer Matches

If get_customer returns multiple matches for the search criteria, do NOT pick one based on heuristics (most recent, alphabetically first, etc.). Ask the customer for additional identifiers — order number, ZIP code, account email. Acting on the wrong customer is a worse outcome than asking one extra question.

Escalation Criteria with Few-Shot Examples

For ambiguous cases, define escalation criteria explicitly in the system prompt with few-shot examples:

## Escalation Criteria

Escalate to human when:
- Customer requests a human directly (immediately, do not attempt resolution)
- Refund amount > $500
- Policy doesn't cover the request

Do NOT escalate when:
- Customer is frustrated but the issue is in scope
- Multiple message turns have elapsed
- The agent "feels" uncertain (no confidence-based escalation)

Examples:
1. Customer: "Can I return this?" → Try to resolve.
2. Customer: "I want to talk to a manager NOW." → Escalate immediately.
3. Customer: "The shipping is wrong." → In scope, attempt resolution.

Skills to Develop

Escalate immediately on explicit customer request — do not attempt resolution first.
Escalate on policy gaps; attempt resolution on policy-covered complexity.
Reject confidence/sentiment/complexity/turn-count as escalation triggers.
Ask for clarifying identifiers when multiple customer matches; do not heuristic-pick.

Exam tip: "Customer says they want a human right now" → escalate immediately, even if the issue is within agent capability. Customer autonomy overrides resolution efficiency. Always.