
Live policy push channel

What it does: Notifies running agents within seconds when policy changes — signed, monotonic, replay-resistant.

This is the "live" in "live policy → every running agent." Without this channel, policy changes take effect only on the next call (correct, but invisible, with no demo moment). With it, the operator changes a Rego rule on stage and watches the agent's tool list shrink in real time.

The channel is also the substrate for behavior-drift-monitor — same delivery infrastructure carries metrics back to TapPass.

See architecture §11 for full sync flow and §10 for the privsep contract that makes the channel safe (unidirectional, signed, no upward channel).

The push payload structure is documented in architecture §11.2; the validation rules are in §11.3.

On policy change (any cascade level):

  1. Builder re-derives stale keyrings.
  2. For each, sign payload with TapPass's Ed25519 key.
  3. Push over the per-sandbox WS to tappass-host.
  4. On host ack, mark applied_at. On no-ack within timeout, retry.
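Steps 2 to 4 above can be sketched roughly as follows. This is a minimal illustration, not the code in tappass/sync/: `PushRecord`, `make_envelope`, `push_with_retry`, the envelope fields, and `max_attempts` are all hypothetical names. The signer is injected as a callable, and the demo uses an HMAC-SHA256 stand-in only so the sketch runs on the standard library; the real channel signs with TapPass's Ed25519 key. The `send` callable is assumed to return False when no ack arrives within the timeout.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PushRecord:
    """Delivery state for one (sandbox, policy version) push. Hypothetical shape."""
    sandbox_id: str
    policy_version: int
    applied_at: Optional[float] = None
    attempts: int = 0


def make_envelope(payload: dict, sign: Callable[[bytes], bytes]) -> dict:
    """Serialize the payload deterministically and sign the exact bytes sent."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()
    return {"body": body.decode(), "sig": sign(body).hex()}


def push_with_retry(
    record: PushRecord,
    envelope: dict,
    send: Callable[[dict], bool],
    now: Callable[[], float],
    max_attempts: int = 3,
) -> bool:
    """Send until the host acks; mark applied_at only on ack."""
    for _ in range(max_attempts):
        record.attempts += 1
        if send(envelope):  # False means no ack within the timeout
            record.applied_at = now()
            return True
    return False  # retries exhausted; the caller decides what happens next


def stand_in_sign(body: bytes) -> bytes:
    # ASSUMPTION: HMAC-SHA256 stand-in, NOT Ed25519. Swap in a real
    # Ed25519 signer in any actual deployment.
    return hmac.new(b"demo-key", body, hashlib.sha256).digest()


record = PushRecord("sb-1", 7)
envelope = make_envelope(
    {"sandbox_id": "sb-1", "policy_version": 7, "keyring": {}}, stand_in_sign
)
acks = iter([False, True])  # first attempt times out, second is acked
push_with_retry(record, envelope, lambda env: next(acks), now=lambda: 1000.0)
```

Injecting `send` and `now` keeps the retry logic testable without a live WS connection.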

On sync miss (host offline / network):

  • Sandbox's tokens fail closed at TTL expiry; agent stops being able to act.
  • On reconnect, host requests latest by (sandbox_id, last_policy_version); channel replays from that version forward.
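The two sync-miss behaviors above might look like the sketch below. The names `replay_from` and `token_usable` and the in-memory log shape are hypothetical. It also assumes that "from that version forward" means strictly newer than the host's `last_policy_version`; architecture §11 is the authority on whether the boundary is inclusive.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory log: sandbox_id -> list of (policy_version, payload).
PolicyLog = Dict[str, List[Tuple[int, dict]]]


def replay_from(
    log: PolicyLog, sandbox_id: str, last_policy_version: int
) -> List[Tuple[int, dict]]:
    """Return the pushes the host missed, in ascending version order."""
    entries = log.get(sandbox_id, [])
    # ASSUMPTION: exclusive lower bound (only versions the host has not seen).
    newer = [entry for entry in entries if entry[0] > last_policy_version]
    return sorted(newer, key=lambda entry: entry[0])


def token_usable(issued_at: float, ttl_seconds: float, now: float) -> bool:
    """Fail closed: a token is unusable once its TTL has elapsed."""
    return now < issued_at + ttl_seconds


log: PolicyLog = {"sb-1": [(3, {"v": 3}), (1, {"v": 1}), (2, {"v": 2})]}
missed = replay_from(log, "sb-1", last_policy_version=1)
```

Sorting by version before replay means the host applies missed pushes in the same monotonic order it would have seen them live.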

Lives at tappass/sync/. Reuses TapPass's existing WS infrastructure; adds Ed25519 signing for push payloads.

  • All acceptance_criteria pass.
  • Latency benchmarks met.
  • Replay test: host disconnects 1 hour, reconnects, replays correctly.
  • Anti-replay test: malicious actor with old payload cannot apply older keyring.
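The anti-replay criterion above could be exercised against a host-side check along these lines. This is a sketch under assumptions: `HostPolicyState` and the `policy_version` and `keyring` field names are hypothetical (the actual payload structure and validation rules are in architecture §11.2 and §11.3), and the verifier is an HMAC-SHA256 stand-in for Ed25519 verification so that the example needs only the standard library.

```python
import hashlib
import hmac
import json
from typing import Callable


class HostPolicyState:
    """Tracks the last applied policy version and rejects anything older."""

    def __init__(self, verify: Callable[[bytes, bytes], bool]) -> None:
        self.verify = verify
        self.last_version = 0
        self.keyring: dict = {}

    def apply(self, body: bytes, sig: bytes) -> bool:
        # Check the signature before trusting any field in the payload.
        if not self.verify(body, sig):
            return False
        payload = json.loads(body)
        version = payload["policy_version"]
        # Monotonic check: a validly signed but old payload is still refused.
        if version <= self.last_version:
            return False
        self.last_version = version
        self.keyring = payload["keyring"]
        return True


DEMO_KEY = b"demo-key"  # ASSUMPTION: stand-in secret, not a real TapPass key


def stand_in_sign(body: bytes) -> bytes:
    # ASSUMPTION: HMAC-SHA256 stand-in, NOT Ed25519.
    return hmac.new(DEMO_KEY, body, hashlib.sha256).digest()


def stand_in_verify(body: bytes, sig: bytes) -> bool:
    return hmac.compare_digest(stand_in_sign(body), sig)


host = HostPolicyState(stand_in_verify)
current = json.dumps({"policy_version": 5, "keyring": {"tools": ["read"]}}).encode()
stale = json.dumps({"policy_version": 4, "keyring": {"tools": ["read", "write"]}}).encode()

accepted = host.apply(current, stand_in_sign(current))
replayed = host.apply(stale, stand_in_sign(stale))  # old but validly signed
```

The point of the design is that a valid signature alone is not sufficient: the version comparison is what stops an attacker from reinstating a broader, older keyring.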

With policy-to-sandbox-config-builder: builder triggers pushes; channel owns delivery. Builder never pushes directly.

With host-runtime-cli: host is the only valid receiver. Host's mTLS cert is what gates connection.

With behavior-drift-monitor: monitor reads audit; channel ensures audit gets to TapPass.

Open questions:

  • (Q) WebSocket vs. SSE vs. long-poll? Lean: WS for bidi (host can ack, channel can push) — avoids the operational complexity of split protocols.
  • (Q) Replay window — keep last N versions per sandbox indefinitely, or TTL? Lean: TTL of 24 hours; longer disconnects require re-enrollment.
  • Policy authoring (intent-to-policy + cascade).
  • Keyring derivation (policy-to-sandbox-config-builder).
  • Layer application (q09 components).
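If the replay-window question above settles on the 24-hour TTL lean, the reconnect decision might reduce to something like this sketch. `REPLAY_WINDOW_SECONDS`, `reconnect_action`, and `prune_log` are hypothetical names, and measuring the window from the time a push was recorded is an assumption; the source does not say what the TTL is measured from.

```python
from typing import List, Tuple

REPLAY_WINDOW_SECONDS = 24 * 60 * 60  # the 24-hour lean, not a settled value


def reconnect_action(disconnected_at: float, now: float) -> str:
    """Replay if the gap fits in the window, otherwise require re-enrollment."""
    if now - disconnected_at <= REPLAY_WINDOW_SECONDS:
        return "replay"
    return "re-enroll"


def prune_log(
    entries: List[Tuple[int, float]], now: float
) -> List[Tuple[int, float]]:
    """Drop (policy_version, pushed_at) entries older than the window."""
    return [e for e in entries if now - e[1] <= REPLAY_WINDOW_SECONDS]


one_hour = reconnect_action(disconnected_at=0.0, now=3600.0)
two_days = reconnect_action(disconnected_at=0.0, now=2 * 24 * 3600.0)
```

Under this sketch, the one-hour disconnect from the replay test falls well inside the window, while a two-day gap would force re-enrollment.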