Leave a Clean Handoff

Failure pattern

The agent ends a session with a confident summary, but the repo is not restartable. Tests are failing, debug logs remain, generated files are dirty, branch state is unclear, or the next action is vague.

The next engineer starts by cleaning up instead of continuing.

Incident: messy PR handoff

Agent task

The agent is asked:

Prepare the SSO invite fix for review, or leave a clean handoff if it is not ready.

The session ends before e2e verification is complete.

Available surface

The repo state includes:

Surface	State
Code diff	Auth guard and invite route changed
Tests	Unit and API pass, e2e not run
Debug artifacts	Console logs added during investigation
Generated files	API types regenerated
Branch	Uncommitted migration scratch file
Progress note	Missing

Bad run

The agent says:

Mostly ready. Just run the final tests.

The next engineer finds debug logs, a scratch migration, no exact command list, and no explanation for why e2e did not run. They do not know whether generated files are intentional.

Why the harness failed

The session ended with unclear state.

Handoff gap	Consequence
Test status vague	Next engineer repeats all checks
Debug artifacts hidden	Review sees noisy diff
Generated files unexplained	Reviewer questions diff
Scratch file present	Branch state is unsafe
Next action vague	Continuation starts with archaeology

Clean handoff was not part of the definition of done.

Why it happens

Agents often optimize for the final answer, not the next session. But coding work happens in a shared repo, and a session can be valuable and still incomplete. The harness should force an honest account of state at the end.

Clean does not mean finished. It means understandable, verifiable, and restartable.

Harness principle

Clean handoff is part of the definition of done.

flowchart LR
  A["Session work"] --> B["Verify status"]
  B --> C["Clean artifacts"]
  C --> D["Record blockers"]
  D --> E["Write next action"]
  E --> F["Restartable repo state"]

A clean handoff converts coding session residue into restartable state.

The handoff should say what is complete, what is not, and what to do first.

Operating practice

Use an exit checklist:

Check	Handoff state
Verification	Unit/API passed; e2e not run
Debug artifacts	Removed
Generated files	API types updated intentionally
Dirty files	Scratch migration removed
Known blockers	Preview email service unavailable
Next action	Run the e2e invite flow, or confirm the CI equivalent passes
Approval boundary	Do not mark PR ready until e2e passes
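
One way to make the checklist enforceable is to hold it as data and derive the status from it, rather than letting the agent write the status freehand. The sketch below is a minimal illustration, not a prescribed implementation. The class name `ExitChecklist`, its field names, and the `review_ready` and `incomplete_handoff` labels are assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ExitChecklist:
    """Hypothetical exit checklist; fields mirror the table above."""
    verification: dict[str, str]  # suite name -> "pass" | "fail" | "not_run"
    debug_artifacts_removed: bool
    generated_files_explained: bool
    dirty_files: list[str] = field(default_factory=list)
    blockers: list[str] = field(default_factory=list)
    next_action: str = ""

    def problems(self) -> list[str]:
        """Return every reason the branch must not be marked review-ready."""
        found = []
        for suite, result in self.verification.items():
            if result != "pass":
                found.append(f"{suite}: {result}")
        if not self.debug_artifacts_removed:
            found.append("debug artifacts still present")
        if not self.generated_files_explained:
            found.append("generated files unexplained")
        found.extend(f"dirty file: {path}" for path in self.dirty_files)
        found.extend(f"blocker: {b}" for b in self.blockers)
        return found

    def status(self) -> str:
        # A handoff without a next action is not a handoff at all.
        if not self.next_action.strip():
            return "incomplete_handoff"
        return "review_ready" if not self.problems() else "patch_with_blocker"


checklist = ExitChecklist(
    verification={"unit": "pass", "api": "pass", "e2e": "not_run"},
    debug_artifacts_removed=True,
    generated_files_explained=True,
    blockers=["preview email service unavailable"],
    next_action="run e2e or CI equivalent",
)
print(checklist.status())  # patch_with_blocker
```

The approval boundary falls out of the data: the status cannot read as review-ready while any suite is unrun or any blocker is open.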

Harnessed handoff

Status: patch_with_blocker, not review-ready.

Changed:
- auth callback waits for membership hydration
- invite route preserves onboarding redirect
- API types regenerated intentionally

Verified:
- pnpm test auth-guard.spec.ts: pass
- pnpm test invite-api.spec.ts: pass

Not verified:
- e2e invite flow; preview email service unavailable locally

Cleaned:
- removed console logs
- removed scratch migration

Next:
- run pnpm exec playwright test invite-flow.spec.ts, or rely on the CI equivalent

The next session can start immediately.
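
Because the note above has a regular shape, a harness could parse it and refuse to close the session when a section is missing or empty. The following is a rough sketch under that assumption. The function names and the choice of which sections are required are hypothetical; "Cleaned" is left optional here because a session may have nothing to clean.

```python
REQUIRED_SECTIONS = ("Status", "Changed", "Verified", "Not verified", "Next")


def parse_handoff(text: str) -> dict[str, list[str]]:
    """Split a handoff note into sections keyed by their header."""
    sections: dict[str, list[str]] = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("- "):
            # Bullet: belongs to the most recent header, if any.
            if current is not None:
                sections[current].append(line[2:])
        elif ":" in line:
            # Header line, optionally with inline content ("Status: ...").
            header, _, rest = line.partition(":")
            current = header.strip()
            sections[current] = [rest.strip()] if rest.strip() else []
    return sections


def missing_sections(text: str) -> list[str]:
    """List required sections that are absent or have no content."""
    parsed = parse_handoff(text)
    return [name for name in REQUIRED_SECTIONS if not parsed.get(name)]


bad_run = "Mostly ready. Just run the final tests."
print(missing_sections(bad_run))
```

Run against the bad-run summary, every required section comes back missing, which is the point: a confident sentence carries none of the state a restart needs.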

Coding-agent example

State categories:

Category	Examples
Complete	Unit/API tests passed
Pending	E2E verification
Temporary	Debug logs, scratch files
Broken	Known failing check
Unknown	Untested browser flow
Next	Exact command or reviewer action

Consistent categories make handoff readable.
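
To keep the categories consistent across sessions, they can be fixed in code so every handoff lists them in the same order. This is a small sketch, assuming a hypothetical `State` enum and `render` helper; printing "none" for empty categories is a design choice of this example, made so the reader can tell "nothing broken" apart from "nobody checked".

```python
from enum import Enum


class State(Enum):
    """The six state categories from the table above."""
    COMPLETE = "Complete"
    PENDING = "Pending"
    TEMPORARY = "Temporary"
    BROKEN = "Broken"
    UNKNOWN = "Unknown"
    NEXT = "Next"


def render(items: list[tuple[State, str]]) -> str:
    """Group items under category headers, always in enum order."""
    lines = []
    for state in State:
        entries = [text for s, text in items if s is state]
        lines.append(f"{state.value}:")
        # Empty categories are stated explicitly rather than omitted.
        lines.extend(f"- {e}" for e in entries or ["none"])
    return "\n".join(lines)


print(render([
    (State.PENDING, "E2E verification"),
    (State.COMPLETE, "Unit/API tests passed"),
    (State.NEXT, "Run the e2e invite flow"),
]))
```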

Review artifact

A clean coding-agent handoff is a restart document. It should be possible to open it tomorrow and know exactly what is true.

Handoff: invite acceptance redirect fix

Current state:
- Patch updates redirect after membership event creation.
- Focused invite acceptance test passes locally.
- Expired-token regression passes locally.

Not done:
- Full auth e2e suite not run.
- Analytics first-login event not verified.
- Generated client diff remains in the branch; it looks unrelated to this patch, but its origin is unconfirmed. (Temporary debug log already removed.)

Changed files:
- src/routes/invite/[token].ts
- tests/e2e/invite-acceptance.spec.ts

Known risks:
- Auth callback timing may differ in staging.
- Workspace setup redirect depends on feature flag `new_onboarding_flow`.

Next action:
- Run full auth e2e suite in staging-like env.
- Confirm generated client diff is pre-existing or regenerate cleanly.

This handoff is honest. It does not pretend the task is fully finished, and it does not bury risk in prose. It gives the next person a first action and a reason.

The handoff should also clean the workspace where possible. Remove debug logs, temporary scripts, unused fixtures, and dead experiment files. If something cannot be cleaned safely, name it. A dirty branch is acceptable only when the dirt is explained.

For coding agents, the final state is part of the product. A brilliant patch with a confusing handoff still costs the team time. The harness should define the exit criteria as clearly as it defines the start criteria: evidence attached, scope status truthful, next action named, and no unexplained residue.

Harnessed version

The harnessed run cannot close while the branch state is ambiguous. It must classify every piece of residue as one of: intentional change, generated artifact, temporary file removed, known unrelated diff, or unresolved blocker. If the agent cannot classify something, that uncertainty belongs in the handoff.
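
The classification step can be sketched against the output of `git status --porcelain`, whose v1 format puts a two-character status code and a space before each path. The rule table and the file paths below are hypothetical, chosen only to illustrate the idea; renames and quoted paths are not handled in this sketch.

```python
from fnmatch import fnmatchcase

# Hypothetical per-repo rules: glob pattern -> residue class.
RULES = {
    "src/generated/*": "generated artifact",
    "src/routes/*": "intentional change",
    "tests/*": "intentional change",
}


def classify(porcelain: str, rules: dict[str, str] = RULES) -> dict[str, str]:
    """Map each path in `git status --porcelain` (v1) output to a residue class."""
    result = {}
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]  # columns 0-1 are status codes, column 2 is a space
        result[path] = next(
            (label for pattern, label in rules.items() if fnmatchcase(path, pattern)),
            "unclassified",
        )
    return result


def unclassified(porcelain: str) -> list[str]:
    """Paths the harness cannot explain; these belong in the handoff."""
    return [p for p, label in classify(porcelain).items() if label == "unclassified"]


# Illustrative status output, not taken from a real repo.
status = (
    " M src/routes/invite/[token].ts\n"
    " M src/generated/api-types.ts\n"
    "?? migrations/scratch.sql\n"
)
print(unclassified(status))  # ['migrations/scratch.sql']
```

Anything the rules cannot place is not silently dropped. It is surfaced so the handoff can name it.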

This is where coding harnesses differ from ordinary documentation. The handoff is tied to the actual workspace. It should match the diff, the test evidence, and the next command. If it says “debug logs removed,” the diff should support that. If it says “full auth suite not run,” the reviewer should not need to discover that by asking later.
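
A claim such as "debug logs removed" can be checked against the diff instead of taken on trust. The sketch below scans a unified diff for added lines that still contain a console call. The function names and the regular expression are assumptions for illustration, and the check is deliberately narrow: it treats any line starting with "+++" as a file header and knows nothing about other logging styles.

```python
import re

# Assumed definition of a "debug log" for this example.
DEBUG_PATTERN = re.compile(r"\bconsole\.(log|debug)\(")


def added_debug_lines(diff: str) -> list[str]:
    """Return lines a unified diff adds that still contain a debug call."""
    hits = []
    for line in diff.splitlines():
        # "+++ b/path" is a file header, not an added line.
        if line.startswith("+") and not line.startswith("+++"):
            if DEBUG_PATTERN.search(line):
                hits.append(line[1:].strip())
    return hits


def claim_supported(handoff: str, diff: str) -> bool:
    """A 'removed console logs' claim holds only if the diff adds none."""
    if "removed console logs" not in handoff.lower():
        return True  # no claim made, nothing to contradict
    return not added_debug_lines(diff)


# Illustrative diff, not taken from a real repo.
diff = (
    "--- a/src/auth.ts\n"
    "+++ b/src/auth.ts\n"
    "@@ -1,2 +1,3 @@\n"
    " const x = 1;\n"
    "+console.log('membership', x);\n"
    "+await hydrate();\n"
)
print(claim_supported("Cleaned:\n- removed console logs", diff))  # False
```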

Clean handoff also protects future agents. The next session can begin from the named next action instead of reconstructing the previous session’s intent from partial edits. That makes long-running coding work practical without pretending that one agent session will finish everything.

The standard is simple: a fresh engineer or agent should be able to resume without asking what happened.

Common mistakes

The first mistake is writing “mostly done.” That is not a state.

The second mistake is leaving debug artifacts for reviewers to find.

The third mistake is failing to explain generated files.

The fourth mistake is hiding missing tests behind confident prose.
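
The first and fourth mistakes can be caught mechanically, at least in their most common wording. Below is a minimal lint, assuming a hypothetical list of vague phrases; the word lists are illustrative and would need tuning per team.

```python
import re

# Assumed vocabulary of non-states; extend per team.
VAGUE = re.compile(
    r"\b(mostly|almost|nearly|basically)\s+(done|ready|finished|working)\b",
    re.IGNORECASE,
)


def vague_phrases(text: str) -> list[str]:
    """Return phrases that sound like a status but do not name a state."""
    return [match.group(0) for match in VAGUE.finditer(text)]


print(vague_phrases("Mostly ready. Just run the final tests."))  # ['Mostly ready']
```

A hit does not prove the handoff is wrong, only that a sentence is standing in for a state and should be replaced by one.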

Practical exercise

Create a session-exit checklist for one repo. Include verification, debug artifacts, generated files, dirty state, blockers, next action, and approval boundary.

Use it after five agent sessions and track which item fails most often.
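
Tracking can be as simple as recording which checklist items failed in each session and counting them. The sketch below assumes hypothetical item names matching the exercise, and the session data is made up purely to show the shape of the tally.

```python
from collections import Counter

CHECKLIST_ITEMS = (
    "verification", "debug_artifacts", "generated_files",
    "dirty_state", "blockers", "next_action", "approval_boundary",
)


def failure_tally(sessions: list[list[str]]) -> list[tuple[str, int]]:
    """Count how often each checklist item failed, most frequent first."""
    counts = Counter()
    for failed in sessions:
        for item in failed:
            if item not in CHECKLIST_ITEMS:
                raise ValueError(f"unknown checklist item: {item}")
            counts[item] += 1
    return counts.most_common()


# Made-up results for five sessions; each list holds the items that failed.
sessions = [
    ["next_action"],
    ["debug_artifacts", "next_action"],
    [],
    ["generated_files"],
    ["next_action", "dirty_state"],
]
print(failure_tally(sessions)[0])  # ('next_action', 3)
```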

Key takeaways

A coding session is not done if the repo is not restartable.
Handoff should be truthful, not optimistic.
Temporary artifacts must be removed or named.
Missing verification should block PR readiness.
The next action should be exact.