Prototype fast, ship safe
We’re in an upgrade cycle. New models, new specs, new primitives every quarter. In this environment, speed is table stakes — but speed without safety is just churn.
Here’s the pattern that worked repeatedly.
1) Prototype in days, not weeks
- Use the smallest workable scope: a single path, a single data source, a single user action.
- Build the “ugly vertical”: logging, toggles, metrics from day one so the prototype is demo‑able and measurable.
- Keep the code disposable. You’re testing a behavior, not writing a framework.
Outcome: product can react to a real thing, not a deck.
2) Evaluate like a product
- Define success criteria you can measure this week: CTR, dwell time, task completion, quality rubrics.
- Run side‑by‑side with a baseline; capture deltas on real‑ish traffic.
- Keep a curated eval set of 100–200 examples that reflect the problem slice.
Outcome: decisions are evidence‑based and legible to non‑engineers.
3) Add guardrails as soon as it’s interesting
- Refusals for sensitive content; rate limits and timeouts; circuit breakers for dependencies.
- Caching: product cache (repeat answers) + outage cache (stale‑acceptable answers).
- Rollout plan with a crisp kill switch.
Outcome: you’re fast and trustworthy.
4) Productionize the parts that survive
- Separate the prompt/policy layer from the application logic and UI style.
- Turn one‑off scripts into versioned components with changelogs.
- Invest in clear oncall docs: alerts, playbooks, owners.
Outcome: iteration gets faster over time instead of slower.
5) Culture: celebrate boring wins
Boring engineering is how you go fast on purpose. Launches that don’t wake anyone up at 3am are a competitive advantage.
A checklist I keep around
- Scope is one flow, one persona
- Metrics + logs from day one
- Eval set + baseline comparison
- Guardrails: refusals, limits, cache, kill switch
- Rollout plan in the PR description
- Ownership, runbook, and post‑launch bake time
The tech will change. The feedback loop is the product.