Skip to content

Runaway agent stopper

What it does: Pipeline step that detects looping or repetitive destructive behavior and kills the session.

Agents fail in the way agents fail: a misaligned prompt, a brittle retry loop, the LLM hallucinating "I should delete this and try again." Without a guard, three bad steps become thirty. The runaway agent stopper kills the session before damage propagates.

Demo moment: agent given a brittle task; mock returns "asset still exists" after delete; agent retries. After 3 deletes within 60 seconds, the stopper kills the session. Audit: session_killed: loop_detected, 3 deletes on asset_id=fact_sales in 12s.

Two signals fire in series:

  1. Volumetric: count of destructive ops (delete_*, drop_*, truncate_*) in a sliding window. Threshold: max_deletes per window_s. Default: 3 / 60.
  2. Pattern: same target identifier (asset_id, schema, table) in destructive ops more than once in the window. Tighter — actively looping behavior, not just volume.

Output on trip: kill the session (any subsequent call in this session_id returns session_killed_loop_guard); emit loop_detected audit; surface in dashboard with replay.

Per-session counter store (in-memory with redis fallback for multi-pod). Lives at tappass/gateway/pipeline/steps/loop_guard.py.

  • All acceptance_criteria pass.
  • Mock-Collibra integration test: clean-up loop trips the guard at attempt #3.
  • Operator reset path tested.
  • False-positive guard: legitimate batch deletes (e.g., bulk cleanup) configurable per sandbox-spec to allow higher thresholds.

With resource-access-checker: sister step. Run order: schema_acl first (cheaper), loop_guard second.

Open questions:

  • (Q) Should the guard apply across sessions for the same agent, or per-session only? Lean: per-session for v1; cross-session needs aggregation infrastructure that can come later.
  • Authoring the thresholds — sandbox-spec / function declares typical session shape; operator can override per-agent.
  • Restart/recovery semantics — once killed, the operator clears it; the agent must be reinitialized from a fresh keyring.