Leave a Clean Handoff
End each quant-agent session in a state another analyst or agent can verify, understand, and continue.
Failure pattern
The research session ends with a draft memo and a confident summary, but the next analyst cannot tell what is final, what is temporary, what is blocked, or what must happen before review.
The agent may have useful work in the workspace: charts, screens, temporary filters, half-run backtests, and caveats. If the handoff does not make state explicit, the next session begins with cleanup and archaeology.
Incident: pre-committee research handoff
Agent task
The Quant Analyst AI Agent is preparing a semiconductor revision-momentum packet before an investment committee meeting:
Finish the research packet or leave it ready for the morning analyst to continue.
The session ends before all checks are complete.
Available surface
The agent has created or touched:
| Surface | State |
|---|---|
| Research memo | Drafted with thesis and charts |
| Backtest output | Net-cost table complete |
| Risk analysis | Beta and sector exposure complete |
| Borrow-cost check | Missing for two short candidates |
| Universe filter | Temporary exclusion for three low-liquidity names |
| Review board | Status still draft |
| Run record | Partially written |
Bad run
The agent ends with:
The memo is mostly ready. Remaining work is minor. The next analyst can review the draft and finalize.
The next analyst opens the workspace and finds:
- The memo says “review-ready” in the conclusion, but the board status is draft.
- Borrow-cost checks are missing for two short candidates.
- A temporary universe exclusion is still active.
- The risk chart uses an older risk snapshot than the memo table.
- The next action is unclear.
The summary was optimistic. The state was not restartable.
Why the harness failed
The handoff did not classify state.
| Handoff gap | Consequence |
|---|---|
| Draft vs review-ready unclear | Analyst may advance incomplete memo |
| Temporary filter not marked | Future runs may inherit hidden exclusion |
| Missing borrow check vague | Reviewer cannot tell if blocker matters |
| Snapshot mismatch hidden | Memo has inconsistent evidence |
| No first action | Next session starts by rediscovering state |
Clean handoff is a harness behavior, not a courtesy.
Why it happens
Agents often summarize effort rather than state. “Mostly ready” may be conversationally useful, but it is operationally weak. Quant research needs exact status because small differences matter: draft versus review-requested, exploratory versus advisory, gross versus net, old snapshot versus current snapshot.
Handoff quality matters most when work is incomplete. A truthful incomplete handoff is better than a polished summary that hides blockers.
Harness principle
Clean handoff is part of done.
Every session should end with:
- Current artifact status.
- Verified evidence.
- Missing or failed checks.
- Temporary assumptions or filters.
- Snapshot and version IDs.
- Next action.
- Approval boundary.
flowchart LR
    A["Session artifacts"] --> B["Verify status"]
    B --> C["Mark temporary state"]
    C --> D["List blockers"]
    D --> E["Write next action"]
    E --> F["Restartable handoff"]
The goal is not to make unfinished work look finished. The goal is to make unfinished work safe to continue.
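The seven items listed above can be enforced mechanically before a session is allowed to close. The following is a minimal sketch, assuming the handoff is held as a plain dictionary; the field names and the `missing_handoff_fields` helper are hypothetical and not part of any existing harness.

```python
# Hypothetical field names mirroring the seven items every session should end with.
REQUIRED_FIELDS = (
    "artifact_status",
    "verified_evidence",
    "missing_checks",
    "temporary_state",
    "snapshot_ids",
    "next_action",
    "approval_boundary",
)


def missing_handoff_fields(handoff: dict) -> list:
    """Return the required fields that are absent, None, or blank strings.

    An explicit empty list (for example, no missing checks) counts as stated:
    the author looked and recorded that there is nothing to report.
    """
    missing = []
    for name in REQUIRED_FIELDS:
        value = handoff.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(name)
    return missing
```

A harness could refuse to end the session while this list is non-empty, which turns "clean handoff is part of done" into a check rather than a habit.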
Operating practice
Use a session-exit record:
| Field | Handoff state |
|---|---|
| Research object | Semiconductor revision-momentum packet |
| Current status | Draft with blockers; not review-ready |
| Verified | Net-cost backtest complete; beta and sector risk complete |
| Missing | Borrow-cost data for short candidates D and F |
| Temporary | Universe excludes three low-liquidity names for exploratory run only |
| Snapshot mismatch | Risk chart uses RISK-2026-05-15; memo table uses RISK-2026-05-16 |
| Next action | Refresh risk chart, resolve borrow data, then rerun evidence gate |
| Approval boundary | Do not move to review_requested until gate passes |
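The snapshot-mismatch row in the record above is the kind of finding a harness can detect on its own. As a rough sketch, assuming each artifact records the snapshot ID it was built from (the `snapshot_mismatches` function and the artifact keys are illustrative names only):

```python
from collections import defaultdict


def snapshot_mismatches(artifact_snapshots: dict) -> dict:
    """Group artifacts by the snapshot ID they were built from.

    Returns an empty dict when every artifact shares one snapshot,
    otherwise a mapping of snapshot ID -> artifacts using it.
    """
    by_snapshot = defaultdict(list)
    for artifact, snapshot_id in artifact_snapshots.items():
        by_snapshot[snapshot_id].append(artifact)
    return dict(by_snapshot) if len(by_snapshot) > 1 else {}


# Illustrative input based on the incident described in this section.
packet = {"risk_chart": "RISK-2026-05-15", "memo_table": "RISK-2026-05-16"}
mismatch = snapshot_mismatches(packet)
if mismatch:
    print("Snapshot mismatch:", mismatch)
```

Any non-empty result belongs in the handoff as a blocker, not as a footnote.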
Harnessed handoff
Status: draft_with_blockers
Do not present as review-ready.
Verified:
- Backtest net of 20 bps costs completed.
- Beta and sector exposure within limits on RISK-2026-05-16.
Blockers:
- Borrow cost missing for short candidates D and F.
- Risk chart in memo must be regenerated from RISK-2026-05-16.
Temporary state:
- Universe exclusion TEMP-LIQ-03 used only for exploratory run.
Next action:
1. Resolve borrow data.
2. Regenerate risk chart.
3. Rerun evidence gate.
Now a fresh analyst or agent can continue without guessing.
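A handoff like the one above is easier to keep consistent when it is rendered from structured data rather than typed by hand. One possible sketch, assuming a small `Handoff` dataclass (a hypothetical name) whose section titles follow the example:

```python
from dataclasses import dataclass, field


@dataclass
class Handoff:
    status: str
    verified: list = field(default_factory=list)
    blockers: list = field(default_factory=list)
    temporary: list = field(default_factory=list)
    next_actions: list = field(default_factory=list)

    def render(self) -> str:
        """Render the handoff as plain text; refuse to render without a next action."""
        if not self.next_actions:
            raise ValueError("A handoff needs at least one concrete next action.")
        lines = [f"Status: {self.status}"]
        if self.blockers:
            lines.append("Do not present as review-ready.")
        sections = (
            ("Verified", self.verified),
            ("Blockers", self.blockers),
            ("Temporary state", self.temporary),
        )
        for title, items in sections:
            lines.append(f"{title}:")
            # Print an explicit "none" so an empty section is visibly a statement,
            # not an omission.
            lines.extend(f"- {item}" for item in (items or ["none"]))
        lines.append("Next action:")
        lines.extend(f"{i}. {step}" for i, step in enumerate(self.next_actions, start=1))
        return "\n".join(lines)
```

Raising on an empty `next_actions` list is a deliberate design choice in this sketch: a handoff with no first step is treated as malformed rather than merely incomplete.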
The handoff should also protect against accidental promotion. If the memo conclusion says “review-ready” but the evidence gate has not passed, the handoff should call that mismatch out directly. A clean handoff is allowed to be uncomfortable. It should make hidden inconsistency visible before the next person relies on it.
For quant research, the safest handoff often includes a “do not use for” line. Example: “Do not use this memo for committee discussion until borrow data and risk chart refresh are complete.” That sentence prevents an unfinished artifact from becoming decision material.
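Both ideas, flagging a "review-ready" claim that the gate does not support and emitting a "do not use for" line, are simple enough to automate. The sketch below is illustrative only: the function names are hypothetical, and the text match is deliberately crude, so a memo that says "not review-ready" would also be flagged and should simply prompt a human look.

```python
def promotion_conflicts(memo_text: str, board_status: str, gate_passed: bool) -> list:
    """List contradictions between what the memo claims and the recorded state."""
    # Crude match: treats "review-ready" and "review ready" the same way.
    claims_ready = "review ready" in memo_text.lower().replace("-", " ")
    conflicts = []
    if claims_ready and not gate_passed:
        conflicts.append("Memo claims review-ready but the evidence gate has not passed.")
    if claims_ready and board_status == "draft":
        conflicts.append("Memo claims review-ready but the board status is draft.")
    return conflicts


def do_not_use_line(blockers: list) -> str:
    """Build a 'do not use for' sentence from open blockers; empty string if none."""
    if not blockers:
        return ""
    return (
        "Do not use this memo for committee discussion until these are complete: "
        + "; ".join(blockers)
        + "."
    )
```

For the incident in this section, `promotion_conflicts` would report both conflicts, and `do_not_use_line(["borrow data", "risk chart refresh"])` would produce a sentence close to the example quoted above.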
Product-agent example
Use state categories:
| Category | Examples |
|---|---|
| Complete | Net-cost table generated and linked |
| Pending | Borrow check waiting on data |
| Temporary | Exploratory universe exclusion |
| Broken | Risk chart snapshot mismatch |
| Unknown | Capacity estimate not reviewed |
| Next | Resolve borrow, regenerate chart, run gate |
State categories make the handoff scannable.
The categories should be consistent across sessions. If one agent uses pending and another uses blocked for the same state, the next run has to interpret vocabulary. A small fixed state model makes handoff machine-readable and human-readable at the same time.
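A fixed state model can be as small as one enumeration plus a strict parser. This is a minimal sketch, assuming the six categories from the table above; `HandoffState` and `parse_state` are hypothetical names.

```python
from enum import Enum


class HandoffState(Enum):
    COMPLETE = "complete"
    PENDING = "pending"
    TEMPORARY = "temporary"
    BROKEN = "broken"
    UNKNOWN = "unknown"
    NEXT = "next"


def parse_state(label: str) -> HandoffState:
    """Accept only the fixed vocabulary; reject synonyms such as 'blocked'."""
    try:
        return HandoffState(label.strip().lower())
    except ValueError:
        allowed = ", ".join(state.value for state in HandoffState)
        raise ValueError(
            f"Unknown handoff state {label!r}; use one of: {allowed}"
        ) from None
```

Rejecting `"blocked"` outright, instead of silently mapping it to `pending`, forces the vocabulary question to be settled once rather than reinterpreted in every session.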
The handoff should be located somewhere predictable. A perfect note is much less useful if the next analyst does not know where to find it.
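One way to make the location predictable is to fix a single filename at the workspace root. The sketch below assumes a `HANDOFF.json` convention, which is purely hypothetical; any agreed path works as long as every session uses the same one.

```python
import json
from pathlib import Path

HANDOFF_FILENAME = "HANDOFF.json"  # hypothetical convention, not an established standard


def write_handoff(workspace, handoff: dict) -> Path:
    """Write the handoff to the one agreed location in the workspace."""
    path = Path(workspace) / HANDOFF_FILENAME
    path.write_text(json.dumps(handoff, indent=2, sort_keys=True), encoding="utf-8")
    return path


def read_handoff(workspace) -> dict:
    """Load the handoff, failing loudly if the previous session left none."""
    path = Path(workspace) / HANDOFF_FILENAME
    if not path.exists():
        raise FileNotFoundError(
            f"No handoff found at {path}; treat workspace state as unverified."
        )
    return json.loads(path.read_text(encoding="utf-8"))
```

A missing file is treated as an error here so that the next session cannot mistake "no handoff" for "nothing to report".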
Common mistakes
The first mistake is writing “mostly ready.” That is not a state.
The second mistake is leaving temporary filters active without naming them. Hidden assumptions contaminate future runs.
The third mistake is failing to distinguish draft, review-requested, and approved. These are different product states.
The fourth mistake is omitting the first concrete next step. A handoff should reduce startup cost.
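The second mistake, unnamed temporary filters, can be caught by comparing what is active in the workspace with what the handoff declares. A small sketch, assuming filters carry string IDs such as `TEMP-LIQ-03`; the `audit_temporary_filters` helper is a hypothetical name.

```python
def audit_temporary_filters(active, declared) -> dict:
    """Compare filters active in the workspace with those named in the handoff.

    'undeclared' filters are active but unnamed (hidden assumptions);
    'stale' filters are named in the handoff but no longer active.
    """
    active, declared = set(active), set(declared)
    return {
        "undeclared": sorted(active - declared),
        "stale": sorted(declared - active),
    }
```

An empty `undeclared` list is the condition worth gating on; `stale` entries are less dangerous but still make the handoff untruthful.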
Practical exercise
Create a handoff checklist for one quant-agent research workflow.
Include artifact status, verified evidence, missing checks, temporary assumptions, snapshot IDs, next action, and approval boundary. Use it after one session and see whether a fresh session can resume without verbal explanation.
Key takeaways
- A session is not complete if the next session cannot safely continue.
- Clean handoff means truthful, restartable state.
- Temporary assumptions must be removed or named.
- Draft, review-requested, and approved are different states.
- The next action should be concrete.
Further reading / source notes
- Anthropic, “Effective harnesses for long-running agents” for restartable progress and handoff practices.
- Google SRE, “Managing Incidents” for operational handoff concepts that transfer well to agent workflows.
- OpenAI, “Harness engineering: leveraging Codex in an agent-first world” for designing feedback loops around agent work.