Topics
Browse posts by category and tag — every topic we cover, with the latest pieces under each.
Tags
- #observability 10
- #drift-detection 6
- #monitoring 4
- #llm-ops 3
- #mlops 3
- #opentelemetry 3
- #production-ml 2
- #tooling 2
- #alerting 1
- #cost 1
- #debugging 1
- #embeddings 1
- #evaluation 1
- #experiment-tracking 1
- #latency 1
- #llm-monitoring 1
- #ml-ops 1
- #mlflow 1
- #model-monitoring 1
- #open-source 1
- #production 1
- #rag 1
- #statistical-tests 1
- #tracing 1
- #vector-store 1
Categories
ops 5 posts
- Alerting for ML Model Drift: A Practical Setup
  Drift alerting fails in one of two ways: it never fires, or it fires constantly until everyone mutes it. A concrete setup for alerts that fire when…
- LLM Cost & Latency Observability with OpenTelemetry
  Token spend and tail latency are the two metrics that decide whether an LLM feature ships or gets killed. How to instrument both with OpenTelemetry so you…
- Closing the Eval-Prod Gap: Online Evaluation as Observability
  Offline eval scores are green and production is worse. The gap is not a measurement error; it is structural. Here is how to instrument online evaluation…
- Embedding and Vector-Store Observability: The Unwatched Layer
  RAG systems fail at the embedding and index layer long before the LLM does. Here is what to actually monitor: embedding drift, index staleness, recall…
- End-to-End Tracing for LLM Applications: What Belongs in a Span
  Production LLM apps span multiple model calls, tool invocations, retrieval steps, and retries. A complete trace makes them debuggable; a sparse one…
monitoring 2 posts
- How to Detect Data Drift: Statistical Tests, Thresholds, and Production Wiring
  A practitioner's guide to detecting data drift: PSI, KS, Wasserstein, and Jensen-Shannon compared, with Evidently code, threshold guidance, and real production caveats.
- How to Monitor LLM in Production: Metrics, Drift, and Alerting
  A practitioner's guide to production LLM monitoring, covering TTFT, token throughput, output quality drift, hallucination signals, and alerting with…
tooling 2 posts
- Weights & Biases vs MLflow vs Comet (2026): Choosing by Constraint, Not Hype
  Three tools that look interchangeable in their marketing solve subtly different problems. An honest breakdown of W&B, MLflow, and Comet: what each owns…
- The Open-Source ML Observability Stack: Evidently to Phoenix
  An honest breakdown of the three open-source tools most teams reach for: what problem each was built for, where they overlap, where they don't, and how…