Work
Production systems with architecture decisions, trade-offs, and results.
embedding-drift-monitor
Production ML monitoring system that detects embedding drift and model degradation in real time
llm-inference-router
Multi-model LLM router that optimizes cost and latency by routing each query to a local or cloud model based on complexity analysis
vector-cache-optimizer
Embedding cache with ML-driven eviction policies and real-time performance optimization