Runaway agent stopper
What it does: Pipeline step that detects looping or repetitive destructive behavior and kills the session.
1. Vision context
Agents fail in the way agents fail: a misaligned prompt, a brittle retry loop, the LLM hallucinating "I should delete this and try again." Without a guard, three bad steps become thirty. The runaway agent stopper kills the session before damage propagates.
Demo moment: agent given a brittle task; mock returns "asset still exists" after delete; agent retries. After 3 deletes within 60 seconds, the stopper kills the session. Audit: session_killed: loop_detected, 3 deletes on asset_id=fact_sales in 12s.
2. Functional specification
Two signals fire in series:
- Volumetric: count of destructive ops (delete_*, drop_*, truncate_*) in a sliding window. Threshold: max_deletes per window_s. Default: 3 / 60.
- Pattern: same target identifier (asset_id, schema, table) in destructive ops more than once in the window. Tighter: actively looping behavior, not just volume.
Output on trip: kill the session (any subsequent call in this session_id returns session_killed_loop_guard); emit loop_detected audit; surface in dashboard with replay.
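The two signals and the kill behavior can be sketched as a small in-process class. This is a minimal illustration under stated assumptions, not the actual contents of loop_guard.py: the names LoopGuard, SessionKilled, check, and max_same_target are hypothetical, the two signals are treated as independently sufficient, and audit emission and dashboard replay are left out. The literal reading of "more than once" makes the pattern signal trip on the second delete of a target, whereas the demo and the integration test describe a trip at the third attempt, so the pattern threshold is kept as a parameter here.

```python
import time
from collections import defaultdict, deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, Optional, Set, Tuple

DESTRUCTIVE_PREFIXES = ("delete_", "drop_", "truncate_")


class SessionKilled(Exception):
    """Raised for any call made in a session the guard has already killed."""


@dataclass
class LoopGuard:
    max_deletes: int = 3       # volumetric threshold (spec default)
    window_s: float = 60.0     # sliding window in seconds (spec default)
    max_same_target: int = 1   # assumption: "more than once" means count > 1
    clock: Callable[[], float] = time.monotonic
    _ops: Dict[str, Deque[Tuple[float, str]]] = field(
        init=False, default_factory=lambda: defaultdict(deque)
    )
    _killed: Set[str] = field(init=False, default_factory=set)

    def check(self, session_id: str, op: str, target: str) -> Optional[str]:
        """Record one call and return the trip reason, or None if allowed."""
        if session_id in self._killed:
            raise SessionKilled("session_killed_loop_guard")
        if not op.startswith(DESTRUCTIVE_PREFIXES):
            return None  # non-destructive calls are not counted

        now = self.clock()
        ops = self._ops[session_id]
        ops.append((now, target))
        # Evict events that have fallen out of the sliding window.
        while ops and now - ops[0][0] > self.window_s:
            ops.popleft()

        # Signal 1 (volumetric) is evaluated first, then signal 2 (pattern).
        if len(ops) >= self.max_deletes:
            reason = "volumetric"
        elif sum(1 for _, t in ops if t == target) > self.max_same_target:
            reason = "pattern"
        else:
            return None

        self._killed.add(session_id)  # every later call in this session fails
        return reason
```

A caller would invoke check once per tool call and, on a non-None result, emit the loop_detected audit with the returned reason. Passing max_same_target=2 would align the pattern signal with the "attempt #3" behavior described in the definition of done.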
3. Technical design
Per-session counter store (in-memory, with a Redis fallback for multi-pod deployments). The step lives at tappass/gateway/pipeline/steps/loop_guard.py.
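One way to keep the guard independent of where counters live is a narrow store interface with an in-memory implementation. The sketch below is an assumption about shape only: CounterStore, InMemoryCounterStore, record, and recent are hypothetical names, and no Redis code is shown. A Redis-backed class for multi-pod deployments could satisfy the same two methods, for example with one sorted set per session scored by timestamp.

```python
from collections import defaultdict
from typing import Dict, List, Protocol, Tuple

Event = Tuple[float, str]  # (timestamp, target identifier)


class CounterStore(Protocol):
    """Hypothetical interface the loop guard could depend on."""

    def record(self, session_id: str, ts: float, target: str) -> None: ...

    def recent(self, session_id: str, since: float) -> List[Event]: ...


class InMemoryCounterStore:
    """Single-process store; state is lost on restart and not shared across pods."""

    def __init__(self) -> None:
        self._events: Dict[str, List[Event]] = defaultdict(list)

    def record(self, session_id: str, ts: float, target: str) -> None:
        self._events[session_id].append((ts, target))

    def recent(self, session_id: str, since: float) -> List[Event]:
        # Drop events older than `since` so per-session memory stays bounded.
        kept = [e for e in self._events[session_id] if e[0] >= since]
        self._events[session_id] = kept
        return list(kept)
```

With this split, the guard would call record on each destructive op and recent(session_id, now - window_s) to evaluate both signals.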
4. Definition of done
- All acceptance_criteria pass.
- Mock-Collibra integration test: clean-up loop trips the guard at attempt #3.
- Operator reset path tested.
- False-positive guard: legitimate batch deletes (e.g., bulk cleanup) configurable per sandbox-spec to allow higher thresholds.
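The false-positive guard implies some way of resolving thresholds per sandbox, with operator overrides per agent. A minimal sketch of that resolution order follows. The loop_guard key, the override mapping, and the function name are assumptions made for illustration; the real sandbox-spec schema is not defined in this document.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping


@dataclass(frozen=True)
class LoopGuardThresholds:
    max_deletes: int = 3   # spec default
    window_s: int = 60     # spec default


def resolve_thresholds(
    sandbox_spec: Mapping[str, Any],
    operator_overrides: Mapping[str, Mapping[str, int]],
    agent_id: str,
) -> LoopGuardThresholds:
    """Defaults, then sandbox-spec values, then per-agent operator overrides."""
    merged: Dict[str, int] = {}
    merged.update(sandbox_spec.get("loop_guard", {}))     # hypothetical key
    merged.update(operator_overrides.get(agent_id, {}))   # operator wins
    return LoopGuardThresholds(**merged)


# A bulk-cleanup sandbox that legitimately deletes many assets per session.
bulk_cleanup_spec = {"loop_guard": {"max_deletes": 500, "window_s": 300}}
thresholds = resolve_thresholds(bulk_cleanup_spec, {}, "agent-1")
```

Unknown keys raise a TypeError at construction time, which is one way to catch misspelled threshold names early.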
5. Coordination notes
With resource-access-checker: sister step. Run order: schema_acl first (cheaper), loop_guard second.
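The run order can be pictured as an ordered list of steps that stops at the first denial. This is a sketch only: run_pipeline, the step signature, and the stub steps are hypothetical and are not taken from the gateway code.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A step inspects a call and returns a denial reason, or None to allow it.
Step = Tuple[str, Callable[[Dict[str, str]], Optional[str]]]


def run_pipeline(steps: List[Step], call: Dict[str, str]) -> Optional[str]:
    """Run steps in order and short-circuit on the first denial."""
    for name, step in steps:
        denial = step(call)
        if denial is not None:
            return f"{name}: {denial}"
    return None


def schema_acl(call: Dict[str, str]) -> Optional[str]:
    # Stub: a cheap lookup that denies access to one hypothetical schema.
    return "access_denied" if call["target"].startswith("restricted.") else None


def loop_guard(call: Dict[str, str]) -> Optional[str]:
    # Stub: the real step would consult the per-session counters.
    return None


PIPELINE: List[Step] = [("schema_acl", schema_acl), ("loop_guard", loop_guard)]
result = run_pipeline(PIPELINE, {"op": "delete_asset", "target": "restricted.hr"})
```

One consequence of short-circuiting in this sketch is that calls denied by schema_acl never reach the loop guard's counters.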
Open questions:
- (Q) Should the guard apply across sessions for the same agent, or per-session only? Lean: per-session for v1; cross-session needs aggregation infrastructure that can come later.
6. Out of scope
- Authoring the thresholds: the sandbox-spec / function declares the typical session shape; the operator can override per agent.
- Restart/recovery semantics — once killed, the operator clears it; the agent must be reinitialized from a fresh keyring.