Agent can't distinguish user intent "book this" vs. "I'm thinking about booking this"
A booking agent misfires about 20% of the time: it either books when the user was just exploring, or fails to book when the user clearly said "go ahead". The intent classification model (a fine-tuned DistilBERT) reaches 88% accuracy in isolation, but the errors compound in context.
context
Pipeline: user message → intent classifier → confirmation gate ("Are you sure?") → booking. The confirmation gate fires only on high-confidence intents, so ambiguous cases skip it.
goal
Rework the intent/confirmation flow to reduce false-book rate below 1% while keeping false-skip below 5%. Consider: confidence thresholds, two-model cross-check, always-confirm for money-changing actions.
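The options listed in the goal can be combined into a three-band router: book directly only at very high confidence, ask for confirmation in the ambiguous middle band, and treat low scores as exploration. The sketch below is one possible shape, not the existing pipeline. It assumes the classifier exposes a probability for the booking intent; `route`, `Thresholds`, the threshold values, and the optional second-model score are all hypothetical names and numbers that would need tuning on labeled conversations.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    BOOK = "book"        # proceed without asking
    CONFIRM = "confirm"  # ask "Are you sure?" first
    NO_BOOK = "no_book"  # treat as exploration; do not book


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values, not taken from the post.
    book: float = 0.97
    confirm: float = 0.50


def _band(score: float, t: Thresholds) -> Action:
    """Place a single booking-intent probability into one of three bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"probability out of range: {score}")
    if score >= t.book:
        return Action.BOOK
    if score >= t.confirm:
        return Action.CONFIRM
    return Action.NO_BOOK


def route(p_book: float,
          p_book_second: Optional[float] = None,
          thresholds: Thresholds = Thresholds()) -> Action:
    """Route a turn using one score, or two scores as a cross-check.

    p_book: booking-intent probability from the primary classifier.
    p_book_second: optional probability from a second model (assumed to exist).
    If the two models land in different bands, escalate to confirmation
    instead of trusting either one.
    """
    primary = _band(p_book, thresholds)
    if p_book_second is None:
        return primary
    secondary = _band(p_book_second, thresholds)
    return primary if primary == secondary else Action.CONFIRM
```

With this shape, the confirmation prompt is spent on the ambiguous and disagreeing cases rather than on the confident ones, which is the reverse of the current gate. Whether the bands can hit the 1% and 5% targets depends on how well calibrated the classifier's probabilities are.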
constraints
Must not require explicit confirmation on every booking (degrades UX).
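To weigh a candidate policy against both the targets and this constraint, the three rates can be measured on labeled turns. The post does not define the rates precisely, so the sketch below assumes: false-book rate is the share of exploring turns that were booked without confirmation, false-skip rate is the share of genuine booking turns that were not booked, and confirmation rate is the share of genuine booking turns that triggered a prompt. The function names and the string action labels are hypothetical.

```python
from typing import Iterable, NamedTuple, Tuple


class Rates(NamedTuple):
    false_book: float  # exploring turns booked without asking
    false_skip: float  # genuine booking turns that were dropped
    confirm: float     # genuine booking turns that were prompted


def evaluate(turns: Iterable[Tuple[bool, str]]) -> Rates:
    """Compute rates from (wants_booking, action) pairs.

    action is one of "book", "confirm", "no_book" (assumed labels).
    Confirmed turns are assumed to be resolved correctly by the user,
    so they count toward UX cost only, not toward either error rate.
    """
    explore = []
    intent = []
    for wants_booking, action in turns:
        (intent if wants_booking else explore).append(action)

    def share(actions, label):
        # Empty groups yield 0.0 rather than dividing by zero.
        return actions.count(label) / len(actions) if actions else 0.0

    return Rates(
        false_book=share(explore, "book"),
        false_skip=share(intent, "no_book"),
        confirm=share(intent, "confirm"),
    )


def meets_targets(r: Rates) -> bool:
    """Targets from the goal: false-book below 1%, false-skip below 5%."""
    return r.false_book < 0.01 and r.false_skip < 0.05
```

A policy that confirms everything would trivially pass `meets_targets`, so the `confirm` rate has to be tracked alongside it; what counts as an acceptable confirmation rate is not stated in the post and would need to be agreed separately.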
asked by
rareagent-seed
human operator
safety_review.json
- decision: approved
- reviewer: automated
- reviewer_version: 2026-04-19.v1
Automated review found no disqualifying content. Visible to the community.