The snippet

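Copy this into your agent. The request is sent from a background daemon thread, so the call returns immediately and never blocks your agent. Because the thread is a daemon, a request still in flight when the process exits may be dropped.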
import httpx, threading, time

def muster_emit(job_id, checks, token_input=None, token_output=None, model=None, latency_ms=None):
    """Fire-and-forget. Covers: correctness checks + cost + latency in one call."""
    def _send():
        try:
            httpx.post(
                f"https://backend.getmuster.io/api/v1/jobs/{job_id}/quality",
                json={
                    "agent_id":      "your-agent-name",   # ← replace with your agent name
                    "job_id":        job_id,
                    "overall_passed": all(c["passed"] for c in checks),
                    "checks":        checks,
                    "token_input":   token_input,          # from LLM response.usage
                    "token_output":  token_output,
                    "model":         model,
                    "latency_ms":    latency_ms,
                },
                timeout=2.0,
            )
        except Exception:
            pass  # never block your agent
    threading.Thread(target=_send, daemon=True).start()
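
To see what the snippet sends, the request body can be built as a plain dictionary. The sketch below mirrors the JSON in `muster_emit`; `build_quality_payload` is a hypothetical helper name used for illustration, not part of any Muster SDK. It also shows one edge case of `all()`: an empty `checks` list yields `overall_passed = True`.

```python
def build_quality_payload(job_id, checks, agent_id="your-agent-name",
                          token_input=None, token_output=None,
                          model=None, latency_ms=None):
    """Build the same JSON body that muster_emit posts (hypothetical helper)."""
    return {
        "agent_id": agent_id,
        "job_id": job_id,
        # True only if every check passed; note that all([]) is also True
        "overall_passed": all(c["passed"] for c in checks),
        "checks": checks,
        "token_input": token_input,
        "token_output": token_output,
        "model": model,
        "latency_ms": latency_ms,
    }


payload = build_quality_payload(
    "job-123",
    [
        {"check_id": "output_not_empty", "severity": "HIGH", "passed": True},
        {"check_id": "source_cited", "severity": "HIGH", "passed": False},
    ],
)
print(payload["overall_passed"])  # False: one check failed
```

If an empty check list should not count as a pass, guard for it before calling `muster_emit`.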

Usage

start = time.time()
result = your_agent.run(input)   # ← your existing code, unchanged

muster_emit(
    job_id=job_id,
    checks=[
        {"check_id": "output_not_empty",    "severity": "HIGH",   "passed": bool(result)},
        {"check_id": "subtotal_arithmetic", "severity": "HIGH",
         "passed": abs(computed - declared) < 0.01,
         "expected": str(declared), "actual": str(computed)},
    ],
    token_input=result.usage.prompt_tokens,    # OpenAI field names; Anthropic uses input_tokens / output_tokens
    token_output=result.usage.completion_tokens,
    model="gpt-4o",
    latency_ms=int((time.time() - start) * 1000),
)
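
The usage example refers to `computed` and `declared` without defining them. Below is a minimal sketch of where they might come from, assuming the agent returns an invoice-like dictionary. The `line_items`, `amount`, and `subtotal` field names and the `subtotal_check` helper are hypothetical.

```python
def subtotal_check(invoice, tolerance=0.01):
    """Build a subtotal_arithmetic check dict from an invoice-like dict."""
    # Recompute the subtotal from the line items (assumed field names)
    computed = sum(item["amount"] for item in invoice["line_items"])
    declared = invoice["subtotal"]
    return {
        "check_id": "subtotal_arithmetic",
        "severity": "HIGH",
        # Compare with a tolerance to absorb floating-point rounding
        "passed": abs(computed - declared) < tolerance,
        "expected": str(declared),
        "actual": str(computed),
    }


invoice = {
    "line_items": [{"amount": 20.0}, {"amount": 5.5}, {"amount": 10.0}],
    "subtotal": 35.5,
}
check = subtotal_check(invoice)
print(check["passed"])  # True: the items sum to the declared 35.5
```

The resulting dictionary can be placed directly in the `checks` list passed to `muster_emit`.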

What each field powers

| Field | Powers in dashboard |
| --- | --- |
| checks | Health Heatmap pass rates, anomaly detection |
| token_input + token_output + model | Cost Dashboard ($ calculated automatically) |
| latency_ms | SLA monitoring, latency trends |
| overall_passed: false | Anomaly detection (failure rate spike) |

| check_id | What to check |
| --- | --- |
| output_not_empty | Agent produced a non-empty response |
| subtotal_arithmetic | Numeric totals add up correctly |
| required_fields_present | All required output fields are present |
| no_refusal_in_output | Agent didn’t say it can’t help |
| decision_is_valid_enum | Decision is one of the expected values |
| source_cited | Claims include source references |
| grand_total_arithmetic | Subtotal + tax = grand total |
| latency_within_sla | Response time under threshold |
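
Several of the checks listed above reduce to one-line predicates. The sketch below shows one possible way to evaluate a few of them. The `check` and `run_checks` helpers, the refusal phrase list, and the structure of `output` are assumptions made for illustration, not a Muster API.

```python
# Hypothetical phrases treated as refusals; adjust for your agent
REFUSAL_PHRASES = ("i can't help", "i cannot help", "i'm unable to")


def check(check_id, passed, severity="HIGH"):
    """Wrap a boolean result in the check dict shape used by muster_emit."""
    return {"check_id": check_id, "severity": severity, "passed": bool(passed)}


def run_checks(output, text, latency_ms, required_fields, valid_decisions, sla_ms):
    """Evaluate several of the listed checks against one agent result."""
    lowered = text.lower()
    return [
        check("output_not_empty", text.strip()),
        check("required_fields_present", all(f in output for f in required_fields)),
        check("no_refusal_in_output", not any(p in lowered for p in REFUSAL_PHRASES)),
        check("decision_is_valid_enum", output.get("decision") in valid_decisions),
        check("latency_within_sla", latency_ms < sla_ms),
    ]


checks = run_checks(
    output={"decision": "approve", "amount": 42},
    text="Approved: amount is within policy.",
    latency_ms=850,
    required_fields=["decision", "amount"],
    valid_decisions={"approve", "reject", "escalate"},
    sla_ms=2000,
)
print(all(c["passed"] for c in checks))  # True: every check passes for this input
```

Keeping each check a pure function of the agent's output makes the checks easy to unit test separately from the network call.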

Getting token counts

# OpenAI
token_input  = response.usage.prompt_tokens
token_output = response.usage.completion_tokens

# Anthropic
token_input  = response.usage.input_tokens
token_output = response.usage.output_tokens

# LangChain (OpenAI models; older releases import from langchain.callbacks)
from langchain_community.callbacks import get_openai_callback
with get_openai_callback() as cb:
    result = chain.invoke({"input": "..."})
token_input  = cb.prompt_tokens
token_output = cb.completion_tokens
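
Since OpenAI and Anthropic name these fields differently, a small adapter can keep the `muster_emit` call site provider-agnostic. This is a sketch only: `extract_token_counts` is a hypothetical helper, and `SimpleNamespace` objects stand in for real SDK responses.

```python
from types import SimpleNamespace


def extract_token_counts(usage):
    """Return (token_input, token_output) from an OpenAI- or Anthropic-style usage object."""
    # Try OpenAI names first, then fall back to Anthropic names
    token_input = getattr(usage, "prompt_tokens", None)
    if token_input is None:
        token_input = getattr(usage, "input_tokens", None)
    token_output = getattr(usage, "completion_tokens", None)
    if token_output is None:
        token_output = getattr(usage, "output_tokens", None)
    return token_input, token_output


# Stand-ins for response.usage from each provider
openai_style = SimpleNamespace(prompt_tokens=120, completion_tokens=45)
anthropic_style = SimpleNamespace(input_tokens=120, output_tokens=45)

print(extract_token_counts(openai_style))     # (120, 45)
print(extract_token_counts(anthropic_style))  # (120, 45)
```

If neither set of attributes is present, the helper returns `None` values, which matches the defaults of `muster_emit`.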