Incident Response Orchestrator

| Field | Value |
| --- | --- |
| Name | incident-response-orchestrator |
| Category | automation |
| Complexity | advanced |
| Tags | incident-response, double-diamond-dag, monitoring, llm-classification, nats-events, s3-storage, postgres-state |
| Author | randybias |
| Min Version | 0.1.0 |

Receives monitoring alerts, gathers context from incident history, runbooks, and recent deploys, classifies severity with AI, and orchestrates response actions. Uses a double-diamond DAG pattern: the first fan-out gathers context (history, runbooks, deploys) and fans in to classification; the second fan-out dispatches actions (ticket, Slack, event, page) and fans in to logging.

                 ┌→ query-history ──┐                      ┌→ create-ticket ──┐
receive-alert ───┼→ fetch-runbook ──┼→ classify-and-brief ─┼→ notify-slack ───┼→ log-actions
                 └→ query-deploys ──┘                      ├→ publish-event ──┤
                                                           └→ page-oncall ────┘
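
The double-diamond shape can be checked by writing the diagram's edges as a dependency map and grouping nodes into stages. This is a minimal sketch in Python using the standard-library `graphlib` module; it is not the scaffold's own workflow definition, and the names `DEPENDS_ON` and `execution_stages` are hypothetical.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on (edges read off the diagram).
DEPENDS_ON = {
    "receive-alert": set(),
    "query-history": {"receive-alert"},
    "fetch-runbook": {"receive-alert"},
    "query-deploys": {"receive-alert"},
    "classify-and-brief": {"query-history", "fetch-runbook", "query-deploys"},
    "create-ticket": {"classify-and-brief"},
    "notify-slack": {"classify-and-brief"},
    "publish-event": {"classify-and-brief"},
    "page-oncall": {"classify-and-brief"},
    "log-actions": {"create-ticket", "notify-slack", "publish-event", "page-oncall"},
}


def execution_stages(depends_on):
    """Group nodes into stages; nodes within one stage have no mutual dependencies."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only to make the output stable
        stages.append(ready)
        sorter.done(*ready)
    return stages


STAGES = execution_stages(DEPENDS_ON)
for index, stage in enumerate(STAGES, start=1):
    print(index, stage)
```

The five stages alternate between a single node and a parallel group, which is the fan-out, fan-in, fan-out, fan-in sequence the diagram draws.
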
| Node | Purpose |
| --- | --- |
| receive-alert | Receive and parse the incoming monitoring alert |
| query-history | Query Postgres for past incidents on the same service |
| fetch-runbook | Retrieve the relevant runbook from S3 |
| query-deploys | Check recent deploys via GitHub API |
| classify-and-brief | AI-powered severity classification and incident brief |
| create-ticket | Create a tracking ticket in GitHub Issues |
| notify-slack | Post incident brief to Slack |
| publish-event | Publish structured event to NATS |
| page-oncall | Page on-call via PagerDuty (for critical severity) |
| log-actions | Log all response actions to Postgres |
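
The node list says paging applies only to critical severity. One way to picture that gating is a small selector over the second fan-out. This is a sketch under assumptions: the severity label `"critical"` as a plain string, the idea that the other three actions always run, and the helper name `select_actions` are all hypothetical and not taken from the scaffold.

```python
# Actions assumed (hypothetically) to run for every classified incident.
ALWAYS_RUN = ["create-ticket", "notify-slack", "publish-event"]


def select_actions(severity: str) -> list[str]:
    """Return the action nodes to dispatch for a given severity label."""
    actions = list(ALWAYS_RUN)
    # page-oncall is added only for critical severity, per the node table.
    if severity.strip().lower() == "critical":
        actions.append("page-oncall")
    return actions


print(select_actions("warning"))
print(select_actions("Critical"))
```

The scaffold may instead run `page-oncall` on every alert and have the node return early; the table does not say which approach it takes.
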
  • Trigger: manual only
| Service | Type | Required |
| --- | --- | --- |
| GitHub API | External | Yes |
| Anthropic API | External | Yes |
| PagerDuty API | External | Yes (for paging) |
| Slack webhook | External | Yes |
| tentacular-postgres | Exoskeleton | Yes |
| tentacular-rustfs | Exoskeleton | Yes |
| tentacular-nats | Exoskeleton | Yes |
| Key | Default | Description |
| --- | --- | --- |
| timeout | 180s | Per-node timeout |
| retries | 1 | Retry count per node |
| alert_service | (empty) | Service name from the alert |
| alert_metric | (empty) | Metric name from the alert |
| alert_value | (empty) | Current metric value |
| alert_threshold | (empty) | Threshold that was breached |
| github_org | my-org | GitHub organization for deploy history |
| service_repos | (example map) | Mapping of service names to repo names |
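
To illustrate how a per-node `timeout` and `retries` setting could interact, here is a minimal Python sketch. It assumes that `retries: 1` means one extra attempt after the first failure and that the timeout applies to each attempt separately; the engine's actual semantics are not stated above. The helpers `parse_seconds` and `run_node` are hypothetical.

```python
import asyncio


def parse_seconds(value: str) -> float:
    """Parse a duration such as '180s' into seconds (only a plain 's' suffix is handled)."""
    if not value.endswith("s"):
        raise ValueError(f"unsupported duration: {value!r}")
    return float(value[:-1])


async def run_node(node, *, timeout: str = "180s", retries: int = 1):
    """Await node() with a per-attempt timeout, allowing `retries` extra attempts."""
    limit = parse_seconds(timeout)
    last_error = None
    for _ in range(retries + 1):
        try:
            return await asyncio.wait_for(node(), timeout=limit)
        except Exception as error:  # a timeout is also raised as an exception
            last_error = error
    raise last_error


# Usage: a node that fails once, then succeeds on the retry.
calls = {"count": 0}


async def flaky_node():
    calls["count"] += 1
    if calls["count"] == 1:
        raise RuntimeError("transient failure")
    return "ok"


RESULT = asyncio.run(run_node(flaky_node))
print(RESULT, calls["count"])
```

With the defaults from the table, a node that keeps timing out would occupy its slot for up to two attempts of 180 seconds each under these assumptions.
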
  • github.token — GitHub token for deploy history and issue creation
  • anthropic.api_key — Claude API key for severity classification
  • pagerduty.api_key — PagerDuty API key for on-call paging
  • slack.webhook_url — Slack webhook for incident notifications
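
The dotted secret names above suggest a two-level layout (service, then key). A pre-flight check for missing secrets might look like the following sketch. The nested-dictionary layout and the helpers `lookup` and `missing_secrets` are assumptions for illustration; how the scaffold actually loads secrets is not described here.

```python
# Secret names copied from the list above.
REQUIRED_SECRETS = [
    "github.token",
    "anthropic.api_key",
    "pagerduty.api_key",
    "slack.webhook_url",
]


def lookup(secrets: dict, dotted_key: str):
    """Walk a nested dict along a dotted key; return None if any part is missing."""
    current = secrets
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


def missing_secrets(secrets: dict) -> list[str]:
    """List required secret names that are absent or empty."""
    return [key for key in REQUIRED_SECRETS if not lookup(secrets, key)]


# Placeholder values only; never commit real credentials.
example = {
    "github": {"token": "placeholder"},
    "anthropic": {"api_key": "placeholder"},
    "slack": {"webhook_url": ""},
}
print(missing_secrets(example))
```
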
tntc scaffold init incident-response-orchestrator
tntc scaffold init incident-response-orchestrator my-custom-name
tntc scaffold info incident-response-orchestrator

Scaffold source: quickstarts/incident-response-orchestrator/