Python snippet

The snippet

Copy this into your agent. It runs in a background thread and never blocks your agent.

import httpx, threading, time

def muster_emit(job_id, checks, token_input=None, token_output=None, model=None, latency_ms=None):
    """Fire-and-forget. Covers: correctness checks + cost + latency in one call."""
    def _send():
        try:
            httpx.post(
                f"https://backend.getmuster.io/api/v1/jobs/{job_id}/quality",
                json={
                    "agent_id":      "your-agent-name",   # ← replace with your agent name
                    "job_id":        job_id,
                    "overall_passed": all(c["passed"] for c in checks),
                    "checks":        checks,
                    "token_input":   token_input,          # from LLM response.usage
                    "token_output":  token_output,
                    "model":         model,
                    "latency_ms":    latency_ms,
                },
                timeout=2.0,
            )
        except Exception:
            pass  # never block your agent
    threading.Thread(target=_send, daemon=True).start()

Usage

start = time.time()
result = your_agent.run(input)   # ← your existing code, unchanged

muster_emit(
    job_id=job_id,
    checks=[
        {"check_id": "output_not_empty",    "severity": "HIGH",   "passed": bool(result)},
        {"check_id": "subtotal_arithmetic", "severity": "HIGH",
         "passed": abs(computed - declared) < 0.01,
         "expected": str(declared), "actual": str(computed)},
    ],
    token_input=result.usage.prompt_tokens,    # OpenAI / Anthropic response
    token_output=result.usage.completion_tokens,
    model="gpt-4o",
    latency_ms=int((time.time() - start) * 1000),
)

What each field powers

Field	Powers in dashboard
`checks`	Health Heatmap pass rates, anomaly detection
`token_input` + `token_output` + `model`	Cost Dashboard — $ calculated automatically
`latency_ms`	SLA monitoring, latency trends
`overall_passed: false`	Anomaly detection — failure rate spike

Recommended check IDs

check_id	What to check
`output_not_empty`	Agent produced a non-empty response
`subtotal_arithmetic`	Numeric totals add up correctly
`required_fields_present`	All required output fields are present
`no_refusal_in_output`	Agent didn’t say it can’t help
`decision_is_valid_enum`	Decision is one of the expected values
`source_cited`	Claims include source references
`grand_total_arithmetic`	Subtotal + tax = grand total
`latency_within_sla`	Response time under threshold

Getting token counts

# OpenAI
token_input  = response.usage.prompt_tokens
token_output = response.usage.completion_tokens

# Anthropic
token_input  = response.usage.input_tokens
token_output = response.usage.output_tokens

# LangChain
from langchain.callbacks import get_openai_callback
with get_openai_callback() as cb:
    result = chain.invoke({"input": "..."})
token_input  = cb.prompt_tokens
token_output = cb.completion_tokens

Getting Started

Integration

Integrations

Connectors

Concepts

Python snippet

The snippet

Usage

What each field powers

Recommended check IDs

Getting token counts

Getting Started

Integration

Integrations

Connectors

Concepts

Documentation Index

​The snippet

​Usage

​What each field powers

​Recommended check IDs

​Getting token counts

The snippet

Usage

What each field powers

Recommended check IDs

Getting token counts