LLM provider
LLM provider
Section titled “LLM provider”An LLM provider is an HTTP client to one LLM API.
It speaks the API's wire protocol (Anthropic Messages, OpenAI Chat Completions, Bedrock InvokeModel, …), handles auth, streaming, retries, and surfaces a uniform call surface to the gateway pipeline.
An LLM provider executes. It does not enforce. Enforcement happens around it — in the Pipeline before/during/after each call.
Note on naming. The earlier
Providercard conflated this concept (HTTP client) with the Policy provider concept (translator from Compiled Policy → runtime config). They are now split into two cards.
At a glance
Section titled “At a glance”| Takes | a normalized chat/completion request from the gateway |
| Outputs | a normalized response stream (chunks, tool-use blocks, errors) |
| Speaks | one LLM vendor's wire protocol |
| Where it lives | tappass/gateway/<vendor>.py (anthropic.py, openai.py, …) |
| Status | shipped (Anthropic, OpenAI, LiteLLM); planned (Vertex / Gemini, direct Bedrock) |
What each LLM provider does
Section titled “What each LLM provider does”| LLM provider | API | Notes |
|---|---|---|
anthropic | Anthropic Messages | streaming SSE; tool use; thinking blocks; cache_control |
openai | OpenAI Chat Completions | streaming; function calling; reasoning models |
litellm | 100+ providers via LiteLLM proxy | covers long tail; already shipped |
vertex | Google Vertex AI / Gemini | planned |
bedrock | AWS Bedrock InvokeModel | planned (direct, beyond LiteLLM) |
azure-openai | Azure OpenAI | via openai client with base-URL override |
Adding a new vendor is mostly: implement the wire protocol, normalize the response shape, register the client.
What it is, concretely
Section titled “What it is, concretely”When a customer agent calls TapPass with POST /v1/messages, the gateway:
- Authenticates the
tp_key, identifies the agent + tenant. - Resolves the customer's chosen vendor (and credentials from the Vault).
- Selects the matching LLM provider module (
gateway/anthropic.py, …). - Runs before-the-call Pipeline steps.
- Hands the request to the LLM provider, which calls the vendor's API and streams chunks back.
- Runs after-the-call Pipeline steps on each chunk.
- Returns the streamed response to the agent.
The LLM provider is the only component that talks to the vendor. Everything else operates on normalized request/response shapes.
Why this concept exists separately from Policy provider
Section titled “Why this concept exists separately from Policy provider”Both concepts were originally collapsed into a single "Provider" card. They are structurally different and shipping cadences differ:
| Policy provider | LLM provider | |
|---|---|---|
| Concept | translator (Compiled Policy → target config) | HTTP client (TapPass → LLM API) |
| Pure function? | yes — deterministic, no side effects in compile | no — makes outbound HTTP calls |
| Where it runs | host runtime / control plane on policy update | server-side, on every governed LLM call |
| Status | concept (most), partial (openshell, gateway) | shipped (Anthropic, OpenAI, LiteLLM) |
| Adding one | ~2-week translator project | ~1-week wire-protocol integration |
The LLM gateway internally uses both:
- A Policy provider (
anthropic-gateway) translates the Compiled Policy into gateway configuration (budget caps, base-URL redirect, redaction policy). - An LLM provider (
anthropic) executes the actual API calls under that configuration.
Lifecycle
Section titled “Lifecycle”[implement] Wire-protocol client written; response normalized to TapPass shape ↓[register] Module registered under gateway/<vendor>.py; client picked by vendor name ↓[credential] Customer adds vendor credentials to Vault (BYOK or managed) ↓[serve] Gateway routes calls through this LLM provider for matching agents ↓[evolve] Vendor API versions bumped; client follows; old versions kept until customers migrateEngines that operate on LLM providers
Section titled “Engines that operate on LLM providers”| Engine | What it does | Status |
|---|---|---|
| Gateway pipeline | Runs before/during/after steps around the LLM call | shipped |
| Vault | Resolves vendor credentials per tenant | shipped |
| Provider router | Picks the right LLM provider per agent based on customer config | shipped |
| Cost telemetry | Records token + latency + cost per call into Metering | partial |
Surfaces
Section titled “Surfaces”| Persona | Surface | What you do |
|---|---|---|
| Operator | Admin UI → Settings → LLM Providers | Configure which vendors are available + BYOK credentials |
| Operator | tappass llm-provider list / test <id> | Verify connectivity / credentials |
| Customer agent | POST /v1/messages (or /v1/chat/completions) | Issue a governed LLM call; gateway picks the right LLM provider |
Related concepts
Section titled “Related concepts”- distinct from ↔ Policy provider — translator, not client
- wraps ↑ Pipeline — every call runs through the engine
- uses → Vault — for vendor credentials (BYOK or managed)
- emits → Audit log — every call lands as an audit row
- feeds → Metering — cost / token / latency rollups
Status snapshot
Section titled “Status snapshot”| LLM provider | Status | Notes |
|---|---|---|
anthropic | shipped | streaming, tool-use, thinking, prompt caching |
openai | shipped | streaming, function calling, reasoning models |
litellm | shipped | 100+ vendors via LiteLLM proxy |
vertex (Gemini) | planned | direct Vertex client |
bedrock | planned | direct AWS Bedrock client (current path is via LiteLLM) |
azure-openai | shipped | via openai client + base-URL override |