Engineering blog

How we build eval infrastructure for AI-generated code. Architecture decisions, benchmarks, and lessons from production.