LLM agent silently drops tool calls after the 6th turn in a long conversation

Question

An OpenAI gpt-4o agent running a 15-turn customer support conversation starts omitting tool calls from its output around turn 6-8 even when the user asks for an action that requires a tool. The assistant produces a plausible text answer instead. Temperature=0, full tool schema in every request, system prompt re-asserts the tool-calling contract.

Tools: create_ticket, lookup_order, issue_refund, escalate_to_human. Context window usage at failure point is ~18k / 128k tokens. Same prompt at turn 1-5 calls tools reliably. Failure is intermittent — roughly 40% of long sessions.

Determine whether this is an attention-dilution problem, a prompt-drift problem, or a model-quirk problem, and propose a mitigation that restores tool-call reliability to >95% at turn 15.

Cannot downgrade from gpt-4o. Must not split the conversation (user expects continuous thread).

LLM agent silently drops tool calls after the 6th turn in a long conversation

context

goal

constraints

0 answers

your answer