Building a Hierarchical Supervisor Team for AI-Driven Financial-Crime Investigations

A practical, modular architecture using Python and local LLMs

Financial-crime investigations are complex, multi-step processes. Human analysts don’t simply look at a transaction and declare it suspicious—they build context, map behaviour to known typologies, and then make a recommendation. Modern agentic AI mirrors this flow well, but only when the architecture is structured, explainable, and easy to extend.

One of the most effective patterns for this is the hierarchical supervisor team: a central supervisor orchestrating a set of small, specialized worker agents. In this post, we walk through a complete Python implementation of that pattern, powered by local LLMs served through Ollama (e.g., gemma3:4b) and designed specifically for financial-crime workflows.

1. Why a Hierarchical Supervisor Team?

A hierarchical supervisor team is a multi-agent architecture built around three principles:

A single supervisor orchestrates a deterministic workflow.
Specialized worker agents—each with its own prompts, tools, and logic—perform narrow tasks.
A shared state object flows through the system, rather than agents calling each other.

At runtime, the supervisor:

Receives an investigation request.
Enriches it with metadata.
Passes it through a sequence (or small DAG) of agents.
Aggregates all outputs into a final investigation report.

Each worker implements a simple interface:


run(state) -> state

No peer-to-peer agent calls. No hidden orchestration. The supervisor decides the ordering, branching, and error handling.
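As a rough illustration of that interface, here is a minimal sketch. The names InvestigationState, WorkerAgent, and EchoAgent are assumptions made for this example and are not taken from the repository; the only thing the pattern requires is a run(state) method that returns the state.

```python
from dataclasses import dataclass, field
from typing import Any, Protocol


@dataclass
class InvestigationState:
    """Shared state passed from agent to agent (field names are illustrative)."""
    alert: dict[str, Any]
    customer: dict[str, Any]
    outputs: dict[str, str] = field(default_factory=dict)


class WorkerAgent(Protocol):
    """Any object with a run(state) -> state method counts as a worker."""
    def run(self, state: InvestigationState) -> InvestigationState: ...


class EchoAgent:
    """Trivial worker that records the alert ID without calling an LLM."""
    def run(self, state: InvestigationState) -> InvestigationState:
        state.outputs["echo"] = f"Alert {state.alert['id']} received"
        return state


state = EchoAgent().run(InvestigationState(alert={"id": "A-1"}, customer={}))
print(state.outputs["echo"])  # Alert A-1 received
```

Because the protocol is structural, a worker does not need to inherit from anything; it only needs the right method.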

This structure gives you:

Testability — unit tests for each agent, end-to-end tests for the supervisor graph.
Modularity — swap or add agents without touching the core pipeline.
Deployment simplicity — deploy as a Python service in front of local or remote LLMs.

In financial-crime investigations—where explainability, determinism, and controlled workflows matter—this pattern is a natural fit.
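To make the testability point concrete, here is a sketch of a unit test for a single agent. It assumes the agent receives its LLM call as a constructor argument, which is a design choice made for this example rather than something the repository necessarily does; the simplified ContextAgent and the fake_llm stub are illustrative.

```python
from typing import Callable


class ContextAgent:
    """Simplified stand-in for the real agent; the LLM call is injected."""
    def __init__(self, llm: Callable[[str], str]):
        self.llm = llm

    def run(self, state: dict) -> dict:
        prompt = f"Summarise this alert: {state['alert']}"
        return {**state, "context": self.llm(prompt)}


def test_context_agent_adds_summary():
    prompts = []

    def fake_llm(prompt: str) -> str:
        prompts.append(prompt)  # capture the prompt for inspection
        return "stub summary"

    result = ContextAgent(fake_llm).run({"alert": {"id": "A-1"}})
    assert result["context"] == "stub summary"
    assert "A-1" in prompts[0]  # the alert data reached the prompt


test_context_agent_adds_summary()
```

With the model stubbed out, the test runs in milliseconds and stays deterministic, which is what makes per-agent testing practical.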

2. Architecture Overview

Our example system uses four main components:

Supervisor

The central coordinator. Knows which agents exist and the order they should run in.

ContextAgent

Builds a natural-language summary of the alert and customer behaviour.

TypologyAgent

Maps behaviour to AML typologies and red flags.

RecommendationAgent

Proposes a preliminary risk rating and a high-level next step.

A small driver script (main.py) stitches everything together:

Creates the supervisor.
Iterates over sample alerts and customer profiles.
Calls supervisor.investigate(alert, customer) and prints the results.

This mirrors the hierarchical team patterns described in modern agent frameworks: a planner directing a handful of focused workers.

The full code is available in the GitHub repository below. Note that the entire codebase was developed using Antigravity.

https://github.com/sujitpatange/Multiagent/tree/0f0593c011a5836f8ae40b43a2cac529ba5066bd/fincrime_investigation

3. The Context Agent: Building the Story

The first step in any investigation is understanding the situation.

ContextAgent turns raw alert and customer data into a narrative:

A concise summary of what triggered the alert.
Key entities, amounts, and time frames.
Notable behavioural patterns.

It wraps a local Ollama model via a small _call_llm() helper. Prompt design is explicit and task-oriented—structured inputs, clear output requirements, and stable formatting. The agent returns plain text, keeping the state interface clean and predictable.

4. The Typology Agent: Interpreting Behaviour

Once the story is clear, the system needs to map behaviours to financial-crime typologies such as structuring, layering, mule activity, sanctions exposure, and fraud patterns.

TypologyAgent receives:

The raw alert metadata.
The context summary produced by ContextAgent.

It then asks the LLM to classify what is happening using the language of AML typologies, producing an explanation analysts can verify.

This is where domain knowledge shows up: regulators expect that investigations refer to red flags and typologies, not just narrative summaries. The agentic pattern helps enforce this discipline.
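One way this could look in code is sketched below. The TYPOLOGIES list, the prompt wording, and the keyword-matching step are assumptions added for illustration; the agent described above only needs to pass the alert metadata and context summary to the LLM and store the explanation.

```python
from typing import Callable

# Illustrative subset; a real deployment would maintain its own typology list.
TYPOLOGIES = ["structuring", "layering", "mule activity", "sanctions exposure", "fraud"]


class TypologyAgent:
    def __init__(self, llm: Callable[[str], str]):
        self.llm = llm

    def run(self, state: dict) -> dict:
        prompt = (
            f"Alert metadata: {state['alert']}\n"
            f"Context summary: {state['context']}\n"
            f"Known typologies: {', '.join(TYPOLOGIES)}.\n"
            "Name the typologies that apply and cite the red flag supporting each."
        )
        analysis = self.llm(prompt)
        state["typology_analysis"] = analysis
        # Naive keyword match so analysts can filter cases by typology.
        state["typologies"] = [t for t in TYPOLOGIES if t in analysis.lower()]
        return state


stub = lambda prompt: "Structuring: repeated deposits under the threshold. Possible layering."
state = TypologyAgent(stub).run({"alert": {"id": "A-1"}, "context": "summary text"})
print(state["typologies"])  # ['structuring', 'layering']
```

The keyword match is deliberately crude: a sentence such as "no evidence of layering" would still be tagged, so the full explanation should remain the artifact analysts review.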

5. The Recommendation Agent: Risk and Next Steps

With context and typologies in hand, the final worker agent—RecommendationAgent—proposes:

An initial risk rating (e.g., low / medium / high).
A recommended investigative action (e.g., escalate, request documentation, close with monitoring).

This agent is intentionally conservative and human-centric. It is not making a final decision: it produces a structured recommendation a human investigator can accept or adapt. This aligns with current thinking in regulated environments where AI acts as a copilot, not an autonomous decision maker.
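The sketch below shows one possible way to keep the recommendation structured and conservative. The two-line RISK/ACTION reply format, the Recommendation dataclass, and the fallback values are all assumptions for this example. If the model's reply cannot be parsed, the result falls back to an unrated case routed to manual review rather than guessing.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Recommendation:
    risk_rating: str
    action: str
    requires_human_review: bool = True  # a proposal, never a final decision


def parse_recommendation(text: str) -> Recommendation:
    """Extract rating and action; fall back to manual review if either is missing."""
    risk = re.search(r"RISK:\s*(low|medium|high)\b", text, re.IGNORECASE)
    action = re.search(r"ACTION:\s*(.+)", text, re.IGNORECASE)
    return Recommendation(
        risk_rating=risk.group(1).lower() if risk else "unrated",
        action=action.group(1).strip() if action else "manual review",
    )


class RecommendationAgent:
    def __init__(self, llm: Callable[[str], str]):
        self.llm = llm

    def run(self, state: dict) -> dict:
        prompt = (
            f"Context: {state['context']}\n"
            f"Typology analysis: {state['typology_analysis']}\n"
            "Reply with exactly two lines:\n"
            "RISK: <low|medium|high>\n"
            "ACTION: <one recommended next step>"
        )
        state["recommendation"] = parse_recommendation(self.llm(prompt))
        return state


rec = parse_recommendation("RISK: High\nACTION: Escalate to a senior analyst")
print(rec.risk_rating, "|", rec.action)  # high | Escalate to a senior analyst
```

Keeping requires_human_review fixed to True encodes the copilot stance in the data itself, so downstream systems cannot mistake the output for a decision.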

6. The Supervisor: Orchestrating the Investigation

The Supervisor is the heart of the system. It knows:

Which agents exist.
The order in which they should run.
How to merge their outputs into a structured investigation state.

The workflow is explicit and deterministic:


alert → ContextAgent → TypologyAgent → RecommendationAgent → report

Worker agents stay intentionally simple—they take input, produce output, and return control. They do not orchestrate or communicate with each other directly. This separation of concerns makes the architecture predictable, testable, and easy to evolve.
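A minimal sketch of such a supervisor is shown below. It uses a plain dict for the state and a small FunctionAgent adapter so the example runs without an LLM; both are assumptions for illustration, as is the stop-on-first-error policy. The investigate(alert, customer) method name follows the driver description earlier in the post.

```python
from typing import Callable, Protocol


class Worker(Protocol):
    def run(self, state: dict) -> dict: ...


class FunctionAgent:
    """Adapter that stores one function's result under one state key."""
    def __init__(self, key: str, fn: Callable[[dict], str]):
        self.key, self.fn = key, fn

    def run(self, state: dict) -> dict:
        state[self.key] = self.fn(state)
        return state


class Supervisor:
    def __init__(self, agents: list[tuple[str, Worker]]):
        self.agents = agents  # fixed, ordered pipeline

    def investigate(self, alert: dict, customer: dict) -> dict:
        state = {"alert": alert, "customer": customer, "errors": []}
        for name, agent in self.agents:
            try:
                state = agent.run(state)
            except Exception as exc:  # the supervisor owns error handling
                state["errors"].append(f"{name}: {exc}")
                break  # later steps depend on earlier output, so stop here
        return state


supervisor = Supervisor([
    ("context", FunctionAgent("context", lambda s: f"Alert {s['alert']['id']} summary")),
    ("typology", FunctionAgent("typology", lambda s: "structuring")),
    ("recommendation", FunctionAgent("recommendation", lambda s: "medium / request documentation")),
])
report = supervisor.investigate({"id": "A-1"}, {"name": "Example Ltd"})
print(report["context"], "|", report["typology"], "|", report["recommendation"])
```

Because ordering and error handling live in one loop, changing the workflow means editing the agent list, not the agents.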

Adding a new capability—say, an AdverseMediaAgent—means:

Instantiate it in the supervisor.
Call it at the appropriate point.
Add its output to the final state.

No redesign required.
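Those three steps might look like the sketch below. The AdverseMediaAgent internals, the injected search function, and the StubAgent placeholders for the existing workers are all hypothetical. The point is only that the new agent slots into the ordered list without changes to its neighbours.

```python
from typing import Callable


class StubAgent:
    """Stands in for an existing worker such as ContextAgent."""
    def __init__(self, key: str, value: str):
        self.key, self.value = key, value

    def run(self, state: dict) -> dict:
        state[self.key] = self.value
        return state


class AdverseMediaAgent:
    def __init__(self, search: Callable[[str], list]):
        self.search = search  # hypothetical news-search function, injected

    def run(self, state: dict) -> dict:
        state["adverse_media"] = self.search(state["customer"]["name"])
        return state


def investigate(agents: list, alert: dict, customer: dict) -> dict:
    state = {"alert": alert, "customer": customer}
    for agent in agents:
        state = agent.run(state)
    return state


agents = [
    StubAgent("context", "context summary"),
    AdverseMediaAgent(search=lambda name: [f"No adverse media found for {name}"]),  # new step
    StubAgent("typology_analysis", "typology notes"),
    StubAgent("recommendation", "medium / request documentation"),
]
report = investigate(agents, {"id": "A-1"}, {"name": "Example Ltd"})
print(report["adverse_media"])  # ['No adverse media found for Example Ltd']
```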

7. The Driver Script: Running Investigations

main.py acts as the operational entry point:

Initializes the supervisor (and fails fast if Ollama or the model is unavailable).
Loops through example alerts and customer profiles.
Prints context, typology analysis, and recommended action in a clear format.

It contains no business logic; its only job is inputs and outputs. In a real system, this might be a CLI tool, REST API entry point, or message-queue consumer.
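The fail-fast check could be sketched as follows, assuming Ollama's default local address and its /api/tags endpoint, which lists installed models. The function names and the injectable fetch argument are assumptions for this example; the supervisor wiring is left as a comment because it depends on the rest of the project.

```python
import json
import sys
import urllib.request
from typing import Callable

OLLAMA_BASE_URL = "http://localhost:11434"  # Ollama's default local address


def fetch_tags() -> dict:
    """Ask the local Ollama server which models are installed."""
    with urllib.request.urlopen(f"{OLLAMA_BASE_URL}/api/tags", timeout=5) as response:
        return json.loads(response.read())


def model_available(model: str, tags: dict) -> bool:
    return any(entry.get("name") == model for entry in tags.get("models", []))


def main(model: str = "gemma3:4b", fetch: Callable[[], dict] = fetch_tags) -> int:
    try:
        tags = fetch()
    except OSError as exc:  # urllib.error.URLError is a subclass of OSError
        print(f"Ollama is not reachable: {exc}", file=sys.stderr)
        return 1
    if not model_available(model, tags):
        print(f"Model {model!r} is not installed; try: ollama pull {model}", file=sys.stderr)
        return 1
    # Build the supervisor here, loop over alerts and customer profiles,
    # call supervisor.investigate(alert, customer), and print each result.
    return 0
```

In an actual script, the last line would typically be sys.exit(main()) under an if __name__ == "__main__": guard, so a missing server or model produces a non-zero exit code before any alert is processed.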

8. Why This Pattern Works Well for Financial-Crime Teams

Financial-crime investigation is one of the clearest real-world fits for a hierarchical agent architecture:

1. Multi-step reasoning fits naturally

Investigations follow a repeatable pipeline: context → typologies → recommendation.

2. Explainability is built in

Each agent produces a human-readable artifact a regulator or reviewer can inspect.

3. Modular components evolve independently

Swap LLMs, add rule-based enrichment, or insert new steps like sanctions checks or customer-risk scoring.

4. On-prem and data-residency friendly

Using local models via Ollama keeps sensitive data inside your environment—an increasingly important requirement in banking.

9. Evolving the System

Once the supervisor pattern is in place, enhancements become straightforward.

You could add:

CaseWriterAgent — auto-generates SAR/STR draft narratives.
Scoring logic — determines when to escalate vs. auto-close.
Integrations — push results into case-management systems.
Metrics and logging — track summary quality, override rates, drift, and more.
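As one example from that list, an override rate can be computed with a few lines of code. The sketch below assumes each closed case records the agent's proposed rating and the analyst's final rating under the hypothetical keys agent_rating and analyst_rating; cases not yet reviewed are skipped.

```python
def override_rate(cases: list[dict]) -> float:
    """Share of reviewed cases where the analyst changed the agent's rating."""
    reviewed = [c for c in cases if c.get("analyst_rating") is not None]
    if not reviewed:
        return 0.0
    overridden = sum(c["agent_rating"] != c["analyst_rating"] for c in reviewed)
    return overridden / len(reviewed)


cases = [
    {"agent_rating": "high", "analyst_rating": "high"},
    {"agent_rating": "medium", "analyst_rating": "high"},  # analyst overrode
    {"agent_rating": "low", "analyst_rating": "low"},
    {"agent_rating": "low", "analyst_rating": "low"},
    {"agent_rating": "medium", "analyst_rating": None},  # not yet reviewed
]
print(override_rate(cases))  # 0.25
```

A rising override rate is an early signal that prompts, models, or typology definitions have drifted away from how analysts actually rate cases.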

The architecture scales because the mental model stays stable. Every change reduces to one question: is this a new worker agent, a change in the supervisor’s graph, or both?

That clarity pays dividends as your financial-crime program matures.

Final Thoughts

Hierarchical supervisor teams bring order, explainability, and modularity to LLM-powered investigations. In financial-crime work, where human oversight is essential and workflows must be auditable, this pattern provides a safe, practical way to benefit from LLMs while maintaining control.

If you’re exploring AI for AML, fraud, or financial-crime operations, this architecture is a reliable starting point: simple enough to implement today, flexible enough to grow with your program, and compatible with the workflows investigators already use.

✍️ Author’s Note

This blog reflects the author’s personal point of view — shaped by 25+ years of industry experience, along with a deep passion for continuous learning and teaching.
The content has been phrased and structured using Generative AI tools, with the intent to make it engaging, accessible, and insightful for a broader audience.
