What it is
The autonomous diagnostic core of the broader network investigation platform. Given a natural-language troubleshooting request, the agent decomposes it into a diagnostic plan, executes multi-step workflows across 19 telemetry / database clusters and 200+ MCP tools, correlates findings, follows institutional runbook logic, and emits a structured triage summary the on-call engineer can act on.
What makes it work
- Cluster-aware tool routing. The agent has structured knowledge of which database cluster owns which signal, so it doesn't waste tokens on exploration.
- MCP-native tooling. Diagnostic tools are exposed via Model Context Protocol so the agent can chain them like a senior on-call would.
- Runbook logic encoded as workflows. Institutional troubleshooting guides are first-class citizens, not afterthoughts.
- Dual model. GPT-5 for reasoning over noisy telemetry; Claude Opus for long-context narrative summarization.
Outcome
Investigations that used to take 30–90 minutes of expert pivoting now routinely complete in 1–3 minutes. Junior engineers reach senior-grade diagnostics without years of operational tenure.