Decision briefs
Lock-In, Then Meter
A build-versus-buy decision brief for AI inference after the flat-rate era. The real hedge against token metering is switchability, an OpenAI-compatible interface against interchangeable open-weight providers with a frontier model in reserve, not owning GPUs. Managed open-model inference usually wins on cost; owned hardware only pays past roughly 100B tokens/month with a platform team. Includes a downloadable PDF, an editable cost model, and the model source in Python.