Make Context Discoverable
Expose quant research knowledge so agents can find the right facts without drowning in context.
Failure pattern
The right research knowledge exists, but the agent reaches stale factor definitions, old investment notes, or outdated risk methodology first and turns them into confident analysis.
In quant workflows, a real source can still be the wrong source. A factor note from last quarter may describe a signal before a data vendor change. A committee memo may reflect a rejected assumption. A benchmark methodology may have been revised after a sector reclassification. The agent needs a route through context, not just access to documents.
Incident: stale quality factor definition
Agent task
A researcher asks:
Explain why the quality score for our semiconductor screen changed this month and whether it affects the long/short shortlist.
The question requires current factor definitions, data lineage, benchmark changes, and prior committee context.
Available surface
The agent can search:
| Source | Date | What it says | Authority |
|---|---|---|---|
| Factor note: quality_v2 | Jan 10 | Quality combines ROIC, gross margin stability, leverage | Historical |
| Data dictionary | May 14 | Quality uses quality_v3, adds accruals and excludes restated quarters | Authoritative |
| Risk model release note | May 12 | Semiconductor industry bucket changed after taxonomy update | Authoritative for risk |
| IC memo | Mar 2 | Committee rejected quality-only thesis in analog semis | Historical context |
| Portfolio constraints | May 15 | Max single-name active risk and sector exposure limits | Binding |
| Research Slack summary | May 16 | Mentions “quality got noisy” without details | Low authority |
The agent sees all of this through search, but the search ranking favors the January note because it has the exact phrase “quality score.”
Bad run
The agent answers:
The quality score changed because ROIC and margin stability deteriorated for several names.
The shortlist is still valid because leverage remains stable.
No major methodology issue detected.
The explanation is wrong. The score changed mostly because quality_v3 added accruals and changed restatement handling. The agent answered from an old factor definition.
Why the harness failed
The harness provided context access but not context priority.
| Missing routing rule | Consequence |
|---|---|
| Source authority | Historical factor note outranked current data dictionary |
| Freshness policy | January definition was not marked stale |
| Methodology bridge | Agent did not connect quality_v2 to quality_v3 |
| Binding constraint path | Portfolio constraints were not checked before shortlist impact |
| Conflict handling | Agent did not report that sources disagreed |
The problem was not that the agent lacked documents. It lacked a context ladder.
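The first rung of that ladder can be sketched in a few lines: re-rank retrieved sources by authority and freshness before relevance, so the May data dictionary beats the January note even though the note matches the query text better. The records, field names, and scores below are hypothetical, not a real retrieval API.

```python
from dataclasses import dataclass

# Hypothetical source records; authority tiers and dates mirror the table above
# (years are illustrative). This shows the re-ranking idea, not a real system.
AUTHORITY_RANK = {"authoritative": 3, "binding": 3, "historical": 1, "low": 0}

@dataclass
class Hit:
    name: str
    authority: str    # "authoritative", "binding", "historical", "low"
    as_of: str        # ISO date, so lexicographic order is chronological
    relevance: float  # what keyword search alone would rank by

def rerank(hits: list[Hit]) -> list[Hit]:
    # Authority first, then freshness, then search relevance.
    return sorted(
        hits,
        key=lambda h: (AUTHORITY_RANK[h.authority], h.as_of, h.relevance),
        reverse=True,
    )

hits = [
    Hit("factor_note_quality_v2", "historical", "2025-01-10", 0.95),
    Hit("data_dictionary", "authoritative", "2025-05-14", 0.60),
]
print(rerank(hits)[0].name)  # -> data_dictionary, despite lower search relevance
```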
Why it happens
Long-context systems can hold more information, but more information is not the same as usable context. Research on long-context retrieval shows that models may struggle when relevant information is buried or placed among many distractors. In a quant research environment, the issue is sharper because documents have different authority levels.
A Slack summary, factor library, data dictionary, risk model note, and investment memo should not all answer the same question with equal weight. The harness needs to tell the agent which source is allowed to answer which claim.
Harness principle
Context should be discoverable, layered, and claim-specific.
For a quant analyst agent, context routing should answer:
- Which source defines the current factor?
- Which source defines the current data snapshot?
- Which source defines binding portfolio constraints?
- Which notes are historical context only?
- What should happen when definitions conflict?
```mermaid
flowchart TD
    A["Research question"] --> B["Context index"]
    B --> C["Claim type"]
    C --> D["Authoritative source"]
    D --> E["Freshness check"]
    E --> F["Supporting context"]
    F --> G["Answer with source judgment"]
    E --> H["Conflict rule"]
    H --> G
```
The map matters more than the pile.
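A minimal sketch of that routing flow, assuming a ladder mapping like the one in the next section and hypothetical metadata records; the point is that the freshness check and the conflict rule run before any answer is drafted.

```python
from datetime import date, timedelta

def route_claim(claim_type, ladder, sources, max_age_days=90):
    """Resolve one claim type to its authoritative source, with freshness and conflict checks.

    `ladder` maps claim types to source names (see the context ladder below);
    `sources` holds hypothetical metadata records keyed by source name.
    """
    primary_name = ladder[claim_type]["authoritative"]
    primary = sources[primary_name]

    # Freshness check: a stale authoritative source becomes a flagged answer, not a silent one.
    stale = (date.today() - primary["as_of"]) > timedelta(days=max_age_days)

    # Conflict rule: other sources that answer the same claim type but disagree
    # are reported alongside the answer, never dropped.
    conflicts = [
        name for name, rec in sources.items()
        if name != primary_name
        and claim_type in rec.get("answers", [])
        and rec.get("value") != primary.get("value")
    ]
    return {"source": primary_name, "stale": stale, "conflicts": conflicts}
```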
Operating practice
Build a context ladder for quant research:
| Claim type | Authoritative source | Supporting source |
|---|---|---|
| Factor definition | Data dictionary / factor registry | Research notes |
| Factor rationale | Factor research memo | IC discussion |
| Data freshness | Data snapshot manifest | Pipeline logs |
| Portfolio limit | Constraint registry | PM notes |
| Risk exposure | Current risk model | Risk release notes |
| Prior decision | IC memo archive | Analyst comments |
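The same ladder can live as a small piece of data the harness loads; source identifiers below are hypothetical, and this is the `ladder` argument the routing sketch above expects.

```python
# The context ladder above as data. Source identifiers are hypothetical.
CONTEXT_LADDER = {
    "factor_definition": {"authoritative": "factor_registry",        "supporting": ["research_notes"]},
    "factor_rationale":  {"authoritative": "factor_research_memo",   "supporting": ["ic_discussion"]},
    "data_freshness":    {"authoritative": "data_snapshot_manifest", "supporting": ["pipeline_logs"]},
    "portfolio_limit":   {"authoritative": "constraint_registry",    "supporting": ["pm_notes"]},
    "risk_exposure":     {"authoritative": "risk_model_current",     "supporting": ["risk_release_notes"]},
    "prior_decision":    {"authoritative": "ic_memo_archive",        "supporting": ["analyst_comments"]},
}
```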
For the quality-score case, the harnessed answer should begin with source judgment:
Current definition:
- Data dictionary May 14 defines active factor as quality_v3.
- January quality_v2 note is historical and should not govern current scores.
Reason for score change:
- quality_v3 added accruals and excludes restated quarters.
- Semiconductor risk bucket also changed after May 12 taxonomy update.
Impact:
- Prior shortlist must be rerun under quality_v3.
- Do not compare current scores to March IC memo without normalization.
The answer is not just cited. It explains which source won.
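One way to make that judgment explicit is to carry it in the answer object itself. The field names below are illustrative, populated from the quality-score case, and do not describe a fixed schema.

```python
from dataclasses import dataclass

# Sketch of an answer that carries source judgment, not just citations.
@dataclass
class HarnessedAnswer:
    governing_sources: list[str]   # sources allowed to answer each claim
    superseded_sources: list[str]  # real but stale or historical sources
    conflicts_reported: list[str]  # disagreements surfaced to the reader
    answer: str

quality_case = HarnessedAnswer(
    governing_sources=["data_dictionary (May 14)", "risk_model_release_note (May 12)"],
    superseded_sources=["factor_note_quality_v2 (Jan 10)"],
    conflicts_reported=["quality_v2 and quality_v3 disagree on accruals and restatement handling"],
    answer="Score change is driven by quality_v3; rerun the shortlist before comparing to prior scores.",
)
```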
Product-agent example
A quant context index should be small and explicit:
| Entry | Points to | Rule |
|---|---|---|
| factor.current | Factor registry | Highest authority for active definitions |
| risk.current | Risk model manifest | Required for exposure claims |
| data.snapshot | Dataset manifest | Required for backtest reproducibility |
| ic.history | Committee archive | Historical context, never current truth |
| constraints.current | Portfolio constraint registry | Binding for advisory outputs |
The agent should read the index first, then retrieve sources.
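As a sketch, the same index as a structure loaded before any retrieval call; entries match the table above, and the targets are hypothetical names rather than real paths.

```python
# The context index above as data the agent reads first. Targets are hypothetical.
CONTEXT_INDEX = {
    "factor.current":      {"points_to": "factor_registry",               "rule": "highest authority for active definitions"},
    "risk.current":        {"points_to": "risk_model_manifest",           "rule": "required for exposure claims"},
    "data.snapshot":       {"points_to": "dataset_manifest",              "rule": "required for backtest reproducibility"},
    "ic.history":          {"points_to": "committee_archive",             "rule": "historical context, never current truth"},
    "constraints.current": {"points_to": "portfolio_constraint_registry", "rule": "binding for advisory outputs"},
}
```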
Common mistakes
The first mistake is treating search rank as authority. Search finds relevance, not truth.
The second mistake is mixing historical committee reasoning with current methodology. Old memos explain decisions; they do not define current factors.
The third mistake is citing sources without judging them. A stale source can be cited accurately and still mislead.
The fourth mistake is failing to bridge names. If quality_v2, quality_v3, and “quality score” all appear, the harness should expose the mapping.
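A minimal sketch of such a bridge, with hypothetical identifiers: every alias resolves to one factor family, and the family record says which version currently governs.

```python
# Hypothetical name bridge: aliases resolve to one factor family,
# and the family record names the governing version.
FACTOR_ALIASES = {"quality score": "quality", "quality_v2": "quality", "quality_v3": "quality"}

FACTOR_FAMILIES = {
    "quality": {
        "current": "quality_v3",
        "superseded": ["quality_v2"],
        "change_notes": "v3 adds accruals and excludes restated quarters",
    },
}

def resolve_factor(term: str) -> dict:
    family = FACTOR_ALIASES.get(term.strip().lower(), term)
    return FACTOR_FAMILIES.get(family, {})

resolve_factor("quality_v2")["current"]  # -> "quality_v3"
```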
Practical exercise
Choose one quant research question and list every source an agent might use. For each source, mark authority, freshness, claim types it may answer, and conflict behavior.
Then create a tricky question with one stale note and one current registry entry. The harnessed answer should say which source wins and why.
Key takeaways
- Context access is not context design.
- Quant research needs source authority by claim type.
- Freshness and methodology versioning are part of the answer.
- Historical notes should not silently govern current analysis.
- A reliable answer explains source judgment, not just citations.
Further reading / source notes
- Liu et al., “Lost in the Middle: How Language Models Use Long Contexts” for evidence that relevant information can become harder to use when buried in long contexts.
- Model Context Protocol architecture overview for a vocabulary around resources, tools, and prompts as separate primitives.
- OpenAI, “Harness engineering: leveraging Codex in an agent-first world” for environment and feedback-loop framing around agent work.