insights

Notes on building AI that ships.

Our point of view on the engineering behind production AI — written from the work, not the hype cycle.

May 12, 2026 · 10 min read

Production over demos: shipping LLM features that survive real users

A prompt that works in a demo is a hypothesis. A feature that holds up under real users is an engineered system. The gap between the two is concrete and closeable — here is how.

GenAI, Engineering

April 3, 2026 · 9 min read

Evals as a first-class artifact

Teams that test their conventional code but run AI features on intuition are operating a double standard with expensive consequences. Evaluations need the same status as tests: versioned, automated, and treated as a gate — not an afterthought.

Evals, Quality

March 15, 2026 · 8 min read

The context window is not memory: designing stateful AI systems

Passing a growing conversation transcript into every inference call is the simplest possible state-management strategy, and also the most brittle. Here is what a serious approach to context actually looks like.

Architecture, GenAI

February 18, 2026 · 9 min read

Why we build AI-native, not AI-bolted-on

Bolting a chat box onto an existing product is not an AI strategy — it is a feature flag with extra latency. Building AI-native means letting model capabilities and constraints shape architecture, data design, interface, and team structure from the start.

Strategy, GenAI

January 20, 2026 · 9 min read

Hardware-software co-design for inference at the edge

Running inference close to the data — on embedded hardware, in a factory, on a vehicle — forces every assumption about cloud-native AI architecture to be re-examined. The constraints are real and the engineering is interesting.

Hardware, Inference, Engineering

Nelfet.