The snippet
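Copy this into your agent. It sends each event from a background daemon thread, so it never blocks your agent. Because the thread is a daemon, an event still in flight when the process exits may be dropped.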
import threading
import time

import httpx

def muster_emit(job_id, checks, token_input=None, token_output=None, model=None, latency_ms=None):
    """Fire-and-forget. Covers: correctness checks + cost + latency in one call."""
    def _send():
        try:
            httpx.post(
                f"https://backend.getmuster.io/api/v1/jobs/{job_id}/quality",
                json={
                    "agent_id": "your-agent-name",  # ← replace with your agent name
                    "job_id": job_id,
                    "overall_passed": all(c["passed"] for c in checks),
                    "checks": checks,
                    "token_input": token_input,  # from LLM response.usage
                    "token_output": token_output,
                    "model": model,
                    "latency_ms": latency_ms,
                },
                timeout=2.0,
            )
        except Exception:
            pass  # never block your agent
    threading.Thread(target=_send, daemon=True).start()
Usage
start = time.time()
result = your_agent.run(input)  # ← your existing code, unchanged
muster_emit(
    job_id=job_id,
    checks=[
        {"check_id": "output_not_empty", "severity": "HIGH", "passed": bool(result)},
        # computed / declared are placeholders: the subtotal you recompute vs. the one in the agent's output
        {"check_id": "subtotal_arithmetic", "severity": "HIGH",
         "passed": abs(computed - declared) < 0.01,
         "expected": str(declared), "actual": str(computed)},
    ],
    token_input=result.usage.prompt_tokens,  # OpenAI field names; Anthropic uses input_tokens / output_tokens
    token_output=result.usage.completion_tokens,
    model="gpt-4o",
    latency_ms=int((time.time() - start) * 1000),
)
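The usage example measures latency with `time.time()`, which follows the wall clock and can jump if the system time is adjusted. A minimal sketch of the same measurement with a monotonic clock is shown here; `timed_call` and `fake_agent` are hypothetical names for illustration and are not part of Muster.

```python
import time


def timed_call(fn, *args, **kwargs):
    """Run fn and return (result, latency_ms) measured with a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    latency_ms = int((time.perf_counter() - start) * 1000)
    return result, latency_ms


# Hypothetical stand-in for your_agent.run
def fake_agent(prompt):
    time.sleep(0.05)
    return f"echo: {prompt}"


result, latency_ms = timed_call(fake_agent, "hello")
print(result, latency_ms)
```

The resulting `latency_ms` can be passed to `muster_emit` as-is.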
What each field powers
| Field | Powers in dashboard |
|---|---|
| checks | Health Heatmap pass rates, anomaly detection |
| token_input + token_output + model | Cost Dashboard ($ calculated automatically) |
| latency_ms | SLA monitoring, latency trends |
| overall_passed: false | Anomaly detection (failure rate spike) |
Recommended check IDs
| check_id | What to check |
|---|---|
| output_not_empty | Agent produced a non-empty response |
| subtotal_arithmetic | Numeric totals add up correctly |
| required_fields_present | All required output fields are present |
| no_refusal_in_output | Agent didn’t say it can’t help |
| decision_is_valid_enum | Decision is one of the expected values |
| source_cited | Claims include source references |
| grand_total_arithmetic | Subtotal + tax = grand total |
| latency_within_sla | Response time under threshold |
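As a rough illustration, several of the recommended checks could be built like this. This is a sketch under assumptions, not Muster's implementation: the helpers `check`, `totals_match`, and `build_checks` are hypothetical, the agent output is assumed to be a dict with `decision`, `subtotal`, `tax`, and `grand_total` keys, and every check uses the `"HIGH"` severity because that is the only value the usage example shows.

```python
def check(check_id, passed, expected=None, actual=None, severity="HIGH"):
    """Build one check dict in the shape used by the usage example."""
    item = {"check_id": check_id, "severity": severity, "passed": bool(passed)}
    if expected is not None:
        item["expected"] = str(expected)
    if actual is not None:
        item["actual"] = str(actual)
    return item


def totals_match(output, tol=0.01):
    """True if subtotal + tax equals grand_total within tol; False if a field is missing."""
    try:
        return abs(output["subtotal"] + output["tax"] - output["grand_total"]) < tol
    except (KeyError, TypeError):
        return False


def build_checks(output, required_fields, allowed_decisions, latency_ms, sla_ms):
    """output is assumed to be a dict parsed from the agent's response."""
    missing = [f for f in required_fields if f not in output]
    decision = output.get("decision")
    return [
        check("output_not_empty", bool(output)),
        check("required_fields_present", not missing,
              expected=required_fields, actual=sorted(output)),
        check("decision_is_valid_enum", decision in allowed_decisions,
              expected=sorted(allowed_decisions), actual=decision),
        check("grand_total_arithmetic", totals_match(output)),
        check("latency_within_sla", latency_ms <= sla_ms,
              expected=f"<= {sla_ms} ms", actual=f"{latency_ms} ms"),
    ]


good = {"decision": "approve", "subtotal": 100.0, "tax": 8.25, "grand_total": 108.25}
checks = build_checks(good, ["decision", "grand_total"], {"approve", "reject"},
                      latency_ms=850, sla_ms=2000)
overall_passed = all(c["passed"] for c in checks)
```

The resulting `checks` list can be passed straight to `muster_emit`, which derives `overall_passed` the same way.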
Getting token counts
# OpenAI
token_input = response.usage.prompt_tokens
token_output = response.usage.completion_tokens
# Anthropic
token_input = response.usage.input_tokens
token_output = response.usage.output_tokens
# LangChain (recent releases expose this helper from langchain_community.callbacks)
from langchain.callbacks import get_openai_callback
with get_openai_callback() as cb:
    result = chain.invoke({"input": "..."})
    token_input = cb.prompt_tokens
    token_output = cb.completion_tokens
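If the same agent can call more than one provider, the two naming schemes above can be handled in one place. The sketch below is an illustration only: `extract_token_counts` is a hypothetical helper, and the `SimpleNamespace` objects stand in for real SDK responses, assuming only the `usage` attribute names listed above.

```python
from types import SimpleNamespace


def extract_token_counts(response):
    """Return (token_input, token_output), or (None, None) if usage is missing."""
    usage = getattr(response, "usage", None)
    if usage is None:
        return None, None
    # Try Anthropic-style names first, then fall back to OpenAI-style names.
    token_input = getattr(usage, "input_tokens", None)
    if token_input is None:
        token_input = getattr(usage, "prompt_tokens", None)
    token_output = getattr(usage, "output_tokens", None)
    if token_output is None:
        token_output = getattr(usage, "completion_tokens", None)
    return token_input, token_output


# Stand-ins for SDK response objects (assumed shapes, not real API calls)
openai_like = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=120, completion_tokens=30))
anthropic_like = SimpleNamespace(usage=SimpleNamespace(input_tokens=90, output_tokens=45))

print(extract_token_counts(openai_like))
print(extract_token_counts(anthropic_like))
```

Returning `None` for missing counts matches the defaults of `muster_emit`, so the values can be forwarded without extra handling.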