Equip Safe Tools
Reproduce unsafe repo access, then expose a typed Anvia tool surface for reading files, running focused checks, and requesting review.
Failure pattern
The coding agent has a generic shell. It can run any command, mutate anything, and decide for itself which output counts as evidence. That is convenient, but it is not a harness: nothing constrains what runs or what gets reported as verified.
Reproduce the failure
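// Anti-pattern: one unrestricted tool. `exec` stands in for any shell runner (for example, a promisified child_process.exec).
const runShell = async (command: string) => exec(command);
// A single call can wipe local state, run the full test suite, and stage every file in the working tree.
await runShell("pnpm db:reset && pnpm test && git add .");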
The model may be trying to help, but the tool surface is too broad. In one call it can destroy local state (pnpm db:reset), run unfocused checks (pnpm test), and stage unrelated files (git add .).
Successful Anvia pattern
Expose typed tools that match safe coding-agent operations. Use createTool to make each action explicit.
import { z } from "zod";
import { AgentBuilder, createTool } from "@anvia/core";

// `repo`, `checks`, `reviewQueue`, and `model` are assumed to be defined elsewhere in the project.

// Read-only access: the agent can inspect files but not modify them.
const readRepoFile = createTool({
  name: "read_repo_file",
  description: "Read a repository file by project-relative path.",
  input: z.object({
    path: z.string(),
  }),
  output: z.object({
    path: z.string(),
    text: z.string(),
  }),
  execute: async ({ path }) => repo.readFile(path),
});

// Verification is limited to an explicit allowlist; any other string fails input validation.
const runFocusedCheck = createTool({
  name: "run_focused_check",
  description: "Run an approved focused verification command.",
  input: z.object({
    command: z.enum(["pnpm test invite", "pnpm typecheck", "pnpm lint"]),
  }),
  output: z.object({
    command: z.string(),
    exitCode: z.number(),
    output: z.string(),
  }),
  execute: async ({ command }) => checks.run(command),
});

// The agent can ask for review, but every submission is gated behind human approval.
const requestPatchReview = createTool({
  name: "request_patch_review",
  description: "Submit a bounded patch summary for human review.",
  input: z.object({
    behavior: z.string(),
    changedFiles: z.array(z.string()),
    evidence: z.array(z.string()),
    risks: z.array(z.string()),
  }),
  approval: {
    when: () => true,
    reason: "Human review is required before a coding-agent patch is accepted.",
    rejectMessage: "Review request was not submitted. Human approval is required.",
  },
  execute: async (input) => reviewQueue.submit(input),
});

const agent = new AgentBuilder("repo-coding-agent", model)
  .tool(readRepoFile)
  .tool(runFocusedCheck)
  .tool(requestPatchReview)
  .defaultMaxTurns(6)
  .build();
Why it succeeds
The agent can inspect the repo and produce evidence, but it cannot run arbitrary shell commands or claim approval. The safe path is now the easy path.
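One way to back run_focused_check is a runner that maps each approved string to a fixed program and argument list and never invokes a shell. The sketch below is an assumption about what a helper like checks.run could look like, not Anvia's implementation. APPROVED_CHECKS, isApproved, and runApprovedCheck are hypothetical names, and the two-minute timeout is an arbitrary choice.

```typescript
import { execFile } from "node:child_process";

// Mirrors the z.enum in run_focused_check. Each approved string maps to a
// fixed program and argument list, so no shell ever parses model output.
const APPROVED_CHECKS = {
  "pnpm test invite": { file: "pnpm", args: ["test", "invite"] },
  "pnpm typecheck": { file: "pnpm", args: ["typecheck"] },
  "pnpm lint": { file: "pnpm", args: ["lint"] },
} as const;

type ApprovedCommand = keyof typeof APPROVED_CHECKS;

interface CheckResult {
  command: string;
  exitCode: number;
  output: string;
}

// Type guard: true only for the exact strings listed above.
function isApproved(command: string): command is ApprovedCommand {
  return Object.prototype.hasOwnProperty.call(APPROVED_CHECKS, command);
}

// Hypothetical stand-in for checks.run: rejects anything off the allowlist,
// otherwise runs the mapped program and reports a typed result.
function runApprovedCheck(
  command: string,
  cwd: string = process.cwd(),
): Promise<CheckResult> {
  if (!isApproved(command)) {
    return Promise.reject(new Error(`Unsupported check: ${command}`));
  }
  const { file, args } = APPROVED_CHECKS[command];
  return new Promise((resolve) => {
    // execFile does not spawn a shell by default, so "&&" and ";" carry no special meaning.
    execFile(file, [...args], { cwd, timeout: 120_000 }, (error, stdout, stderr) => {
      // A failing check is still evidence, so resolve with a non-zero exit code.
      const exitCode = error ? (typeof error.code === "number" ? error.code : 1) : 0;
      resolve({ command, exitCode, output: `${stdout}${stderr}` });
    });
  });
}
```

With this shape, a string such as "pnpm lint && git add ." fails the allowlist lookup instead of being interpreted. On Windows, pnpm may be installed as a .cmd shim, which execFile cannot launch without a shell, so the sketch would need adjusting there.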
Success check
The successful tool surface has:
- no generic shell tool for normal runs
- focused commands, each with a named purpose
- typed outputs that become evidence
- approval for review submission
- refusal paths for unsafe or unsupported actions
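As an illustration of a refusal path, the sketch below shows how a helper behind read_repo_file might reject paths that leave the repository instead of reading them. It is a hedged example, not part of Anvia: resolveRepoPath is a hypothetical name, the error messages are invented, and the check does not follow symlinks, which a real implementation would likely need to consider.

```typescript
import * as path from "node:path";

// Hypothetical helper: resolve a model-supplied path against the repo root,
// throwing instead of reading when the path would leave the repository.
function resolveRepoPath(repoRoot: string, requested: string): string {
  if (requested.length === 0 || requested.includes("\0")) {
    throw new Error("Refused: empty or malformed path.");
  }
  if (path.isAbsolute(requested)) {
    throw new Error(`Refused: "${requested}" is absolute; use a project-relative path.`);
  }

  const root = path.resolve(repoRoot);
  const resolved = path.resolve(root, requested);
  const relative = path.relative(root, resolved);

  // A relative path that is ".." or begins with "../" points outside the root.
  // An absolute result can occur on Windows when the drive differs.
  const escapesRoot =
    relative === ".." ||
    relative.startsWith(`..${path.sep}`) ||
    path.isAbsolute(relative);
  if (escapesRoot) {
    throw new Error(`Refused: "${requested}" resolves outside the repository.`);
  }
  return resolved;
}
```

A tool's execute function could call this before touching the filesystem, so that a request for "../secrets.env" produces an explicit refusal the agent can report rather than a silent read.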
Next move
After tools are safe, compose them into a pipeline so the agent cannot skip preflight or verification.