What this page covers
This deep dive focuses on the technical delivery patterns that made the workflow production-grade:
- Decomposing investment analysis into smaller, auditable tasks
- Using strict schemas and structured evidence capture
- Running evals (rubrics + LLM-as-a-judge) to validate quality
- Building for human review (HITL), traceability, and governance
Key patterns
1) Atomic agents with strict contracts
Rather than one monolithic “do the whole thing” prompt, we used multiple agents with narrow responsibilities and strict I/O schemas.
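As an illustration only, a strict contract for one narrow agent might be sketched as follows. This assumes plain Python dataclasses; the names RevenueQuery, RevenueFinding, and parse_finding, and the revenue-lookup task itself, are hypothetical and not taken from the actual workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RevenueQuery:
    """Input contract (hypothetical): one company, one fiscal year."""
    ticker: str
    fiscal_year: int

    def __post_init__(self):
        if not (self.ticker.isalpha() and self.ticker.isupper()):
            raise ValueError(f"ticker must be uppercase letters: {self.ticker!r}")
        if not 1900 <= self.fiscal_year <= 2100:
            raise ValueError(f"fiscal_year out of range: {self.fiscal_year}")


@dataclass(frozen=True)
class RevenueFinding:
    """Output contract (hypothetical): a single figure with an explicit unit."""
    ticker: str
    fiscal_year: int
    revenue_usd_millions: float

    def __post_init__(self):
        if self.revenue_usd_millions < 0:
            raise ValueError("revenue cannot be negative")


def parse_finding(query, raw):
    """Validate an agent's raw dict output against the output contract."""
    expected = {"ticker", "fiscal_year", "revenue_usd_millions"}
    if set(raw) != expected:
        # Reject both missing and extra keys so schema drift fails loudly.
        raise ValueError(f"unexpected or missing keys: {sorted(set(raw) ^ expected)}")
    finding = RevenueFinding(
        ticker=str(raw["ticker"]),
        fiscal_year=int(raw["fiscal_year"]),
        revenue_usd_millions=float(raw["revenue_usd_millions"]),
    )
    if (finding.ticker, finding.fiscal_year) != (query.ticker, query.fiscal_year):
        raise ValueError("finding does not answer the query it was given")
    return finding
```

Rejecting malformed output at the boundary means a downstream agent never has to guess what an upstream agent meant, which is what keeps each step auditable on its own.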
2) Evidence-first synthesis
Every claim in the output should map back to an evidence object (source, excerpt, timestamp, etc.).
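One way to make the claim-to-evidence mapping checkable is sketched below. The field names follow the ones listed above (source, excerpt, timestamp); everything else, including Evidence, Claim, evidence_id, and unsupported_claims, is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Evidence:
    """One captured piece of evidence (hypothetical shape)."""
    evidence_id: str
    source: str
    excerpt: str
    retrieved_at: datetime


@dataclass(frozen=True)
class Claim:
    """A statement in the output, with the evidence ids it relies on."""
    text: str
    evidence_ids: tuple


def unsupported_claims(claims, evidence):
    """Return claims that cite no evidence or cite an unknown evidence id."""
    known = {e.evidence_id for e in evidence}
    return [
        c for c in claims
        if not c.evidence_ids or not set(c.evidence_ids) <= known
    ]


evidence = [
    Evidence(
        evidence_id="ev-1",
        source="annual-report.pdf",  # hypothetical file name
        excerpt="Revenue grew year over year.",
        retrieved_at=datetime(2024, 1, 15, tzinfo=timezone.utc),
    )
]
claims = [
    Claim("Revenue is growing.", ("ev-1",)),
    Claim("Margins will expand.", ()),          # no evidence cited
    Claim("Churn is falling.", ("ev-404",)),    # cites unknown evidence
]
flagged = unsupported_claims(claims, evidence)
```

Here the second and third claims would be flagged, so a synthesis step could refuse to emit them or route them to a reviewer.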
3) Eval-driven iteration
We used evaluation loops to compare outputs against rubrics co-developed with the investment team.
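A rubric-based evaluation loop of this kind can be sketched as a weighted score over criteria, with the judge passed in as a function. This is a sketch under stated assumptions: Criterion, score_output, and the weights are hypothetical, and keyword_judge is a deliberately trivial stand-in for an LLM-as-a-judge call, not a real judge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One rubric line (hypothetical shape)."""
    name: str
    description: str
    weight: float


def score_output(output, rubric, judge):
    """Return (weighted score in [0, 1], per-criterion scores).

    `judge(output, criterion)` must return a float in [0, 1]. In practice
    it would wrap a model call; here it is any callable.
    """
    total_weight = sum(c.weight for c in rubric)
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive number")
    per_criterion = {}
    for criterion in rubric:
        score = judge(output, criterion)
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"judge returned out-of-range score for {criterion.name}")
        per_criterion[criterion.name] = score
    weighted = sum(per_criterion[c.name] * c.weight for c in rubric) / total_weight
    return weighted, per_criterion


def keyword_judge(output, criterion):
    """Stand-in judge: 1.0 if the criterion name appears in the output, else 0.0."""
    return 1.0 if criterion.name.lower() in output.lower() else 0.0


rubric = [
    Criterion("risks", "Identifies material risks", weight=2.0),
    Criterion("valuation", "Discusses valuation", weight=1.0),
]
overall, details = score_output("Key risks: customer concentration.", rubric, keyword_judge)
```

Keeping per-criterion scores alongside the overall number makes it easier to see which rubric line regressed between iterations.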
4) Human-in-the-loop by default
Review gates are part of the design, not an afterthought.
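A review gate can be modeled as a small state machine in which nothing is published until a named reviewer has approved it. The sketch below is illustrative only; Status, ReviewItem, decide, and publish are hypothetical names, and a real system would persist the audit log rather than hold it in memory.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING_REVIEW = "pending_review"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ReviewItem:
    """A draft output waiting at a review gate (hypothetical shape)."""
    item_id: str
    draft: str
    status: Status = Status.PENDING_REVIEW
    audit_log: list = field(default_factory=list)

    def decide(self, reviewer, approve, note=""):
        """Record a human decision; each item can be decided only once."""
        if self.status is not Status.PENDING_REVIEW:
            raise RuntimeError(f"item {self.item_id} was already decided")
        self.status = Status.APPROVED if approve else Status.REJECTED
        # Keep who decided, what they decided, and why, for traceability.
        self.audit_log.append((reviewer, self.status.value, note))


def publish(item):
    """Release the draft only if a human has approved it."""
    if item.status is not Status.APPROVED:
        raise PermissionError("item has not passed human review")
    return item.draft


item = ReviewItem(item_id="memo-1", draft="Draft investment memo")
item.decide(reviewer="analyst_a", approve=True, note="Figures checked")
released = publish(item)
```

Because publish refuses anything that is not approved, skipping the gate requires changing code rather than simply forgetting a step.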