Eugene Chernenko

Optimizing Token Efficiency for Agentic Inference Systems 2026-06-24
Strategies to optimize token spend using three levers – prompt caching, tool search, and WebSocket transport. Measured across OpenAI (GPT) and Anthropic (Opus/Sonnet) models, with double-digit token and latency reductions per session.
Extremely Reliable API - 99.9995%, lessons from Stripe 2026-05-18
Process: 1 - Practice your worst day every day; 2 - Never send a human to do a machine's job; 3 - Exercise extreme ownership. Tech: 1 - Cell-based architecture; 2 - Chaos testing and fault injection.
From SELECT LIKE to RAG Search 2026-05-18
How search evolved from SQL substring matching to production RAG: BM25 inverted indexes for lexical match, hybrid retrieval and reranking for semantic precision, and grounded LLM synthesis with citations.
Spawning an Agentic Team in Claude Code 2026-05-18
A lab walkthrough of bootstrapping a small agentic team in Claude Code – one lead Claude that decomposes work, plus backend and frontend specialists running in parallel tmux panes, each in its own git worktree, talking peer-to-peer through a file-based mailbox. Includes the full bootstrap, a small POC task on this Flask blog, and the OOM lesson learned.
ML Models for Website Optimization and Personalization 2026-05-18
Top 5 ML models that earn their keep on a typical website – semantic search embeddings, gradient boosting for prediction, collaborative filtering for recs, density clustering for segmentation, and time-series forecasting for capacity – with notes on where to get them, how to install, and which knobs actually matter when tuning.
Load balancing web APIs vs LLM APIs 2026-05-08
Similarities and differences, touching on the hardware both kinds of API run on and strategies for load balancing (and fallback).
Running MCP Servers (Stdio, Streamable HTTP) 2026-05-04
MCP is the standard protocol for getting insights across systems in AI-driven development. Which server flavor works best (stdio or Streamable HTTP), how to call it (from the main thread or via an agent), and the pros and cons of each approach.
Blue/Green vs Canary Deployments 2026-05-01
Insights into cost, practical implementation, and when to pick which: blue/green at the infrastructure/cluster level (big platform moves, K8s upgrades, AZ shifts), canary at the service/app level for everyday deploys, and feature flags layered on top to decouple deploy from release.
Skills vs Agents 2026-05-01
When to use a skill and when to go the agent route? The two often get mixed up. Agents run in an isolated context (they can't access your conversation history), while skills can access it, and that affects output, isolation (or not) from the main thread, and token usage.