Multi-Agent AI Workflows (Technical Deep Dive)

What this page covers

This deep dive covers the technical delivery patterns that made the workflow production-grade:

  • Decomposing investment analysis into smaller, auditable tasks
  • Using strict schemas and structured evidence capture
  • Running evals (rubrics + LLM-as-a-judge) to validate quality
  • Building for human review (HITL), traceability, and governance

Key patterns

1) Atomic agents with strict contracts

Rather than one monolithic “do the whole thing” prompt, we used multiple agents with narrow responsibilities and strict I/O schemas.
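
A minimal sketch of what a strict output contract for one narrow agent could look like. The agent, the RiskSummary type, its field names, and the allowed risk levels are all hypothetical and chosen for illustration; the actual schemas used in the workflow are not described on this page. The idea is that raw model output is parsed and rejected outright if it deviates from the contract, rather than being passed downstream.

```python
import json
from dataclasses import dataclass

# Hypothetical contract for a single narrow agent (names are illustrative only).
ALLOWED_LEVELS = {"low", "medium", "high"}
REQUIRED_FIELDS = {"company", "risk_level", "rationale"}


@dataclass(frozen=True)
class RiskSummary:
    company: str
    risk_level: str  # must be one of ALLOWED_LEVELS
    rationale: str


def parse_risk_summary(raw: str) -> RiskSummary:
    """Parse raw model output; raise ValueError if it violates the contract."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != REQUIRED_FIELDS:
        # Reject both missing and extra keys so the schema stays strict.
        diff = sorted(set(data) ^ REQUIRED_FIELDS)
        raise ValueError(f"unexpected or missing fields: {diff}")
    for name in REQUIRED_FIELDS:
        if not isinstance(data[name], str) or not data[name].strip():
            raise ValueError(f"{name} must be a non-empty string")
    if data["risk_level"] not in ALLOWED_LEVELS:
        raise ValueError(f"risk_level must be one of {sorted(ALLOWED_LEVELS)}")
    return RiskSummary(**data)
```

Because the parser fails loudly on any deviation, a downstream agent can rely on the shape of its input without re-validating it.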

2) Evidence-first synthesis

Every claim in the output should map back to an evidence object (source, excerpt, timestamp, and similar fields), so a reviewer can trace it to its origin.
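
One way to enforce this mapping is a check that flags any claim without valid evidence before synthesis is accepted. The sketch below assumes a simple id-based link between claims and evidence; the Evidence and Claim types, the evidence_id field, and the unsupported_claims helper are illustrative names, not the workflow's actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    evidence_id: str  # hypothetical identifier used to link claims to evidence
    source: str
    excerpt: str
    timestamp: str    # ISO 8601 string


@dataclass(frozen=True)
class Claim:
    text: str
    evidence_ids: tuple[str, ...]


def unsupported_claims(claims: list[Claim], evidence: list[Evidence]) -> list[Claim]:
    """Return claims that cite no evidence or cite an unknown evidence id."""
    known = {e.evidence_id for e in evidence}
    return [
        c for c in claims
        if not c.evidence_ids or not set(c.evidence_ids) <= known
    ]
```

An empty result means every claim is backed by at least one known evidence object; anything else can be sent back to the agent or surfaced to a reviewer.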

3) Eval-driven iteration

We ran evaluation loops, combining rubric scoring with LLM-as-a-judge, to compare outputs against rubrics co-developed with the investment team.
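
A rubric-based scoring loop can be sketched as a weighted average over per-criterion judge scores. Everything here is an assumption for illustration: the Criterion type, the weights, and the [0, 1] score range are hypothetical, and keyword_judge is a deterministic stand-in so the example runs offline. In practice the judge callable would wrap an LLM call.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Criterion:
    name: str
    description: str
    weight: float


# A judge takes (output_text, criterion) and returns a score in [0, 1].
Judge = Callable[[str, Criterion], float]


def score_output(output: str, rubric: list[Criterion], judge: Judge) -> float:
    """Return the weighted average of per-criterion judge scores."""
    total_weight = sum(c.weight for c in rubric)
    if total_weight <= 0:
        raise ValueError("rubric weights must sum to a positive number")
    weighted = 0.0
    for criterion in rubric:
        score = judge(output, criterion)
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"judge returned out-of-range score for {criterion.name}")
        weighted += criterion.weight * score
    return weighted / total_weight


def keyword_judge(output: str, criterion: Criterion) -> float:
    """Stand-in judge (not an LLM): 1.0 if the criterion name appears in the output."""
    return 1.0 if criterion.name.lower() in output.lower() else 0.0
```

Keeping the judge behind a plain callable makes it easy to swap a stub for a model-backed judge and to re-run the same rubric after each prompt or schema change.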

4) Human-in-the-loop by default

Review gates are part of the design, not an afterthought.
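
A review gate can be modelled as a small state machine in which nothing is released until a human has approved it, and every decision is logged. This is a sketch under stated assumptions: ReviewStatus, ReviewItem, and release are hypothetical names, and the real workflow's review tooling is not described on this page.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class ReviewStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ReviewItem:
    draft: str
    status: ReviewStatus = ReviewStatus.PENDING
    # Each entry is (reviewer, decision, note), kept for traceability.
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def decide(self, reviewer: str, approve: bool, note: str = "") -> None:
        """Record a single human decision; a decided item cannot be re-decided."""
        if self.status is not ReviewStatus.PENDING:
            raise RuntimeError("item has already been reviewed")
        self.status = ReviewStatus.APPROVED if approve else ReviewStatus.REJECTED
        self.audit_log.append((reviewer, self.status.value, note))


def release(item: ReviewItem) -> str:
    """Return the draft only if a reviewer has approved it."""
    if item.status is not ReviewStatus.APPROVED:
        raise PermissionError("draft has not been approved by a reviewer")
    return item.draft
```

Making release fail by default means an unreviewed draft cannot reach the next stage by accident, and the audit log gives governance a record of who approved what.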