How we built the AI interview agent — Orvine Blog

The hardest part was not the model. It was the question policy — knowing when to push, when to reflect, and when to stop.

An LLM with a microphone is not an interview agent. The model is a small part of the system. Most of our engineering effort went into the question policy: the choreography of asking, probing, reflecting, and stopping.

Our first prototype was naive. The model held one system prompt, took a turn of audio, transcribed it, and produced the next question. It worked for two minutes. After that the interview drifted, the questions became generic, and the operator on the other end politely said 'we have already covered that.' The model had no scaffolding to remember what it had asked, what was still open, or what the operator had implied that needed to be verified.

The rewrite split the agent into a planner and an executor. The planner holds an explicit topic tree, derived from the entity graph for the operator's owned scope, and tracks coverage. The executor takes one topic at a time, runs a structured probe (specific example, edge case, breakage), and writes the result back to the planner. When the planner sees that a topic is sufficiently covered, it advances. When it sees that the operator is tiring, it offers to pause.
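To make the planner/executor split concrete, here is a minimal sketch in Python. It is not our production code. The names (Topic, Planner, execute, interview), the rule that a topic counts as covered once all three probes have an answer, and the fatigue heuristic (a run of very short answers) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# The three parts of the structured probe described above.
PROBES = ("specific example", "edge case", "breakage")


@dataclass
class Topic:
    name: str
    children: list["Topic"] = field(default_factory=list)
    answers: dict[str, str] = field(default_factory=dict)

    def covered(self) -> bool:
        # Assumption: inner nodes are covered when all children are;
        # leaves are covered once every probe has an answer.
        if self.children:
            return all(child.covered() for child in self.children)
        return all(probe in self.answers for probe in PROBES)


class Planner:
    def __init__(self, root: Topic, fatigue_limit: int = 3, min_words: int = 4):
        self.root = root
        self.fatigue_limit = fatigue_limit  # hypothetical threshold
        self.min_words = min_words          # hypothetical threshold
        self.short_streak = 0

    def next_topic(self) -> Optional[Topic]:
        # Depth-first walk that returns the first uncovered leaf.
        stack = [self.root]
        while stack:
            topic = stack.pop()
            if topic.children:
                stack.extend(reversed(topic.children))
            elif not topic.covered():
                return topic
        return None

    def record(self, topic: Topic, probe: str, answer: str) -> None:
        # The executor writes each result back here.
        topic.answers[probe] = answer
        if len(answer.split()) < self.min_words:
            self.short_streak += 1
        else:
            self.short_streak = 0

    def should_pause(self) -> bool:
        # Assumed fatigue signal: several very short answers in a row.
        return self.short_streak >= self.fatigue_limit


def execute(topic: Topic, planner: Planner, ask: Callable[[str], str]) -> bool:
    """Probe one topic. Returns False if the planner wants to pause."""
    for probe in PROBES:
        if probe in topic.answers:
            continue
        answer = ask(f"About '{topic.name}': can you give me a {probe}?")
        planner.record(topic, probe, answer)
        if planner.should_pause():
            return False
    return True


def interview(planner: Planner, ask: Callable[[str], str]) -> str:
    while (topic := planner.next_topic()) is not None:
        if not execute(topic, planner, ask):
            return "paused"
    return "complete"
```

In this sketch, ask stands in for the whole audio round trip (speak the question, transcribe the reply). Keeping coverage state on the tree rather than in the prompt is what lets the planner know what has already been asked and what is still open.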

The single most useful behavior we added was reflection. Before extracting a procedure, the agent reads back what it heard in plain language and asks the operator to confirm. It feels conversational. It is also the highest-precision step in the pipeline. Operators correct themselves at reflection roughly 30% of the time, and those corrections are gold for the downstream standard operating procedure (SOP).
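The reflection step can be sketched as a small confirm-or-correct loop. This is an illustrative Python sketch under stated assumptions, not the agent's actual implementation: the names reflect and ReflectionResult, the list of accepted confirmations, and the rule that any other reply replaces the summary verbatim are all hypothetical. A real agent would more likely merge the correction into the summary with the model.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: a small fixed set of replies counts as confirmation.
CONFIRMATIONS = {"yes", "y", "correct", "that's right"}


@dataclass
class ReflectionResult:
    summary: str      # the wording that goes on to procedure extraction
    corrected: bool   # True if the operator changed anything
    confirmed: bool   # False if we ran out of rounds without a "yes"


def reflect(
    summary: str,
    ask: Callable[[str], str],
    max_rounds: int = 3,
) -> ReflectionResult:
    corrected = False
    for _ in range(max_rounds):
        reply = ask(f"Here is what I heard: {summary} Did I get that right?")
        reply = reply.strip()
        if reply.lower() in CONFIRMATIONS:
            return ReflectionResult(summary, corrected, confirmed=True)
        # Assumption: anything else is taken as the corrected wording
        # and read back again on the next round.
        summary = reply
        corrected = True
    return ReflectionResult(summary, corrected, confirmed=False)


# Usage with scripted replies standing in for the operator.
replies = iter(["No, we restart the pump first, then open the valve.", "yes"])
result = reflect(
    "You open the valve, then restart the pump.",
    lambda question: next(replies),
)
print(result.corrected, result.confirmed)  # True True
```

Recording the corrected flag separately is a deliberate choice in this sketch: it is what would let a pipeline measure how often operators correct themselves at reflection.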