Work

Production systems with architecture decisions, trade-offs, and results.

embedding-drift-monitor

Production ML monitoring system that detects embedding drift and model degradation in real time.

llm-inference-router

Multi-model LLM router that reduces cost and latency by routing each query to a local or cloud model based on complexity analysis.

vector-cache-optimizer

Embedding cache with ML-driven eviction policies and real-time performance optimization.