Observe the Run
Reproduce an untraceable coding-agent patch, then add Anvia observers and pipeline events.
Failure pattern
The patch reaches review, but nobody can reconstruct the run. Which files did the agent inspect? Which command failed before the patch? Which checks were skipped? Which tool output became evidence?
Reproduce the failure
```typescript
const result = await codingPipeline.run(task);
console.log(result.summary);
```
The summary exists, but the path is gone.
Successful Anvia pattern
Attach an Anvia observer to the agent and a pipeline observer to the workflow.
```typescript
import { AgentBuilder, createObserver } from "@anvia/core";

// runStore and model are assumed to be defined elsewhere in the module.
const observer = createObserver({
  startRun(args) {
    const runId = crypto.randomUUID();
    runStore.start({ runId, prompt: args.prompt, history: args.history });
    return {
      startTool(toolArgs) {
        runStore.toolStarted({
          runId,
          toolName: toolArgs.toolName,
          args: toolArgs.args,
        });
        return {
          end(endArgs) {
            runStore.toolEnded({
              runId,
              toolName: endArgs.toolName,
              result: endArgs.result,
              skipped: endArgs.skipped,
            });
          },
        };
      },
      end(endArgs) {
        runStore.end({ runId, output: endArgs.output, usage: endArgs.usage });
      },
    };
  },
});

const observedAgent = new AgentBuilder("repo-coding-agent", model)
  .observe(observer)
  .build();
```
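The `runStore` itself is left open here. A minimal in-memory sketch, assuming method names that match the observer callbacks (this shape is an assumption, not part of the Anvia API):

```typescript
// Hypothetical in-memory run store; a real store would likely persist
// to a database or append-only log. Field names are illustrative.
type ToolCall = { toolName: string; args?: unknown; result?: unknown; skipped?: boolean };

type RunRecord = {
  runId: string;
  prompt: string;
  history: unknown[];
  toolCalls: ToolCall[];
  events: unknown[];
  output?: unknown;
  usage?: unknown;
};

const runs = new Map<string, RunRecord>();

const runStore = {
  start({ runId, prompt, history }: { runId: string; prompt: string; history: unknown[] }) {
    runs.set(runId, { runId, prompt, history, toolCalls: [], events: [] });
  },
  toolStarted({ runId, toolName, args }: { runId: string; toolName: string; args: unknown }) {
    runs.get(runId)?.toolCalls.push({ toolName, args });
  },
  toolEnded({ runId, toolName, result, skipped }: { runId: string; toolName: string; result: unknown; skipped?: boolean }) {
    // Close out the most recent open call with this tool name.
    const calls = runs.get(runId)?.toolCalls ?? [];
    for (let i = calls.length - 1; i >= 0; i--) {
      if (calls[i].toolName === toolName) {
        Object.assign(calls[i], { result, skipped });
        break;
      }
    }
  },
  end({ runId, output, usage }: { runId: string; output: unknown; usage: unknown }) {
    const run = runs.get(runId);
    if (run) Object.assign(run, { output, usage });
  },
  pipelineEvent(event: { runId?: string }) {
    if (event.runId) runs.get(event.runId)?.events.push(event);
  },
  get(runId: string) {
    return runs.get(runId);
  },
};
```

Keeping one record per `runId` means every callback appends to the same timeline, so the full path from prompt to patch can be read back in order.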
Pipeline stages should also emit events:
```typescript
await codingPipeline.run(task, {
  observer: {
    onEvent(event) {
      runStore.pipelineEvent(event);
    },
  },
});
```
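The event payload is whatever the pipeline emits. A sketch of a handler that timestamps events at receipt, assuming a hypothetical event shape (not the Anvia API):

```typescript
// Hypothetical pipeline event; the real payload depends on the pipeline.
type PipelineEvent = { stage: string; kind: "started" | "finished" | "skipped"; detail?: string };

type StoredEvent = PipelineEvent & { at: string };

const pipelineLog: StoredEvent[] = [];

function recordPipelineEvent(event: PipelineEvent): void {
  // Timestamp on arrival so stage ordering survives even if the
  // pipeline reports no timing information of its own.
  pipelineLog.push({ ...event, at: new Date().toISOString() });
}
```

Normalizing events at the boundary keeps the stored record uniform even when individual stages emit different payloads.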
Why it succeeds
The run record now captures process evidence: prompt, stages, tool calls, command results, skipped checks, and final output. Review can start from facts instead of memory.
Success check
The run record should answer:
- Which behavior was active?
- Which files were read?
- Which checks failed before the patch?
- Which checks passed after the patch?
- Which checks were skipped?
- What risk remains?
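These questions can be answered mechanically from the run record. A sketch of such a query, assuming a hypothetical record shape that aggregates what the observer callbacks emit (field names are illustrative, not an Anvia API):

```typescript
// Hypothetical check result and run-record shapes.
type CheckResult = { name: string; phase: "before" | "after"; passed: boolean; skipped: boolean };

type RunRecord = {
  behavior: string;
  filesRead: string[];
  checks: CheckResult[];
  output: string;
};

// Derives the review answers from a single run record.
function reviewSummary(run: RunRecord) {
  return {
    behavior: run.behavior,
    filesRead: run.filesRead,
    failedBefore: run.checks
      .filter((c) => c.phase === "before" && !c.skipped && !c.passed)
      .map((c) => c.name),
    passedAfter: run.checks
      .filter((c) => c.phase === "after" && !c.skipped && c.passed)
      .map((c) => c.name),
    skippedChecks: run.checks.filter((c) => c.skipped).map((c) => c.name),
  };
}
```

If any of these fields comes back empty when it shouldn't, the observation layer has a gap worth closing before review.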
Next move
Observation tells you what happened. Verification decides whether that is enough.