Real-Time Alerting Without the Infrastructure: Why We Built Native Webhooks
BetterDB Monitor 0.3.0 introduces native webhook notifications: real-time alerts for your Valkey or Redis databases without requiring Prometheus or Alertmanager.
When your Valkey (or Redis) instance goes down at 3 AM, you need to know immediately, not when someone notices their app is broken the next morning. But getting real-time alerts traditionally meant setting up Prometheus, configuring Alertmanager, writing YAML rules, and managing yet another piece of infrastructure.
BetterDB Monitor 0.3.0 introduces native webhook notifications, giving you real-time alerts with none of that overhead.
The Problem with Traditional Monitoring Stacks
The typical path to database alerting looks like this:
Your Database → Prometheus → Alertmanager → Slack/PagerDuty
That's three separate systems to configure, maintain, and debug when things go wrong. For teams already running Prometheus, this works fine. But for smaller teams, individual developers, or organizations that don't want to run Prometheus just for database alerts, it's overkill.
We already export 99 Prometheus metrics for teams that want that integration. But the reality is that most incident management and on-call tools (PagerDuty, Opsgenie, Incident Management and Response (IMR), Rootly, and dozens of others) use webhooks as their primary integration mechanism. The same goes for communication platforms like Slack, Discord, and Microsoft Teams. Webhooks are the universal language of alerting infrastructure.
Native webhook support means BetterDB can plug directly into whatever incident response tooling your team already uses.
How BetterDB Webhooks Work
Configure a webhook endpoint in the BetterDB UI or via API, select the events you care about, and you're done. When those events occur, BetterDB sends an HTTP POST to your endpoint with full event details.
Here's what your endpoint receives when memory usage goes critical:
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"event": "memory.critical",
"timestamp": 1706457600000,
"instance": {
"host": "valkey.example.com",
"port": 6379
},
"data": {
"metric": "memory_used_percent",
"value": 92.5,
"threshold": 90,
"maxmemory": 1073741824,
"usedMemory": 993001472,
"message": "Memory usage critical: 92.5% (threshold: 90%)"
}
}
Along with headers for signature verification, delivery tracking, and event routing:
X-Webhook-Signature: <hmac-sha256-signature>
X-Webhook-Timestamp: 1706457600000
X-Webhook-Id: <webhook-id>
X-Webhook-Delivery-Id: <delivery-id>
X-Webhook-Event: memory.critical
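Verifying the signature takes only a few lines. Here's a minimal Python sketch; it assumes the signature is an HMAC-SHA256 hex digest of the raw request body keyed with your webhook secret (check the webhooks guide for the exact signing scheme, including whether the timestamp is part of the signed input):

import hashlib
import hmac

def verify_webhook(secret: str, raw_body: bytes, signature_header: str) -> bool:
    # Assumption: the signature is an HMAC-SHA256 hex digest of the raw
    # request body, keyed with the webhook secret. See the webhooks guide
    # for the exact scheme your version uses.
    expected = hmac.new(secret.encode(), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking information via timing.
    return hmac.compare_digest(expected, signature_header)

Run this against the raw request bytes and the X-Webhook-Signature header before parsing the payload, and reject anything that fails the check. Checking X-Webhook-Timestamp for staleness is a reasonable extra guard against replays.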
No Prometheus. No Alertmanager. No YAML files to manage.
Events That Matter
We structured webhook events around three questions that different personas ask:
"Is it broken?" - Core health events available on all tiers: instance.down, instance.up, memory.critical, connection.critical, and client.blocked. These tell you when something needs immediate attention.
"Why is it slow?" - Operational insight events for Pro users: anomaly.detected, slowlog.threshold, latency.spike, connection.spike, and replication.lag. These help you understand performance degradation before it becomes an outage.
"Who did what?" - Compliance and audit events for Enterprise users: acl.violation, acl.modified, config.changed, and audit.policy.violation. These provide the audit trail security teams require.
Built for Production
Webhooks in a monitoring system need to be reliable. Here's what we built:
HMAC-SHA256 signatures - Every webhook payload is signed so you can verify it came from BetterDB. Reject spoofed requests with a few lines of code, as in the verification sketch above.
Automatic retries with exponential backoff - Network hiccups happen. Failed deliveries retry automatically, backing off to avoid overwhelming recovering endpoints.
Delivery history - Every webhook delivery is logged with status, response time, and response body. When something isn't working, you can see exactly what happened.
Hysteresis - Metrics that oscillate around a threshold won't spam you with alerts. A 90% memory alert fires once and clears only when memory drops below 81%, preventing alert fatigue from noisy metrics (a rough sketch of this logic follows below).
Dead letter queue - After retries are exhausted, failed deliveries move to a dead letter status where you can manually retry them or investigate patterns.
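To illustrate the hysteresis behavior, here's a rough sketch of the fire/clear state machine. The 10% clear band below matches the 90%-fires, 81%-clears example above; it's an illustration of the idea, not necessarily the exact internals:

def update_alert_state(value: float, threshold: float, firing: bool):
    # Fire once when the metric crosses the threshold; clear only after it
    # drops 10% below the threshold (e.g. fire at 90%, clear below 81%).
    clear_below = threshold * 0.9
    if not firing and value >= threshold:
        return True, "fire"
    if firing and value < clear_below:
        return False, "clear"
    return firing, None  # no event: value is inside the hysteresis band

A metric bouncing between 82% and 95% produces a single fire event instead of a webhook on every crossing.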
Integration Examples
BetterDB webhooks work with any system that accepts HTTP POST requests. Here are common integrations:
Slack - Point your webhook URL at a Slack incoming webhook. Transform payloads with a small adapter if you want formatted messages (a sketch follows below). Native Slack integration is coming later this year.
PagerDuty - Send directly to PagerDuty's Events API for incident creation.
Discord - Works out of the box with Discord webhook URLs.
Custom endpoints - Build your own handler to create tickets, trigger auto-remediation, or integrate with internal systems.
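As a concrete example of a custom handler, here's a small Python (Flask) adapter that verifies the signature using the scheme assumed earlier, reformats the payload, and forwards it to a Slack incoming webhook. The route, environment variable names, and message format are placeholders; treat it as a sketch rather than a reference implementation:

import hashlib
import hmac
import os

import requests
from flask import Flask, abort, request

app = Flask(__name__)
SLACK_URL = os.environ["SLACK_WEBHOOK_URL"]      # your Slack incoming webhook URL
SECRET = os.environ["BETTERDB_WEBHOOK_SECRET"]   # secret configured for the webhook

@app.route("/betterdb-webhook", methods=["POST"])
def handle_alert():
    # Verify the delivery before trusting it (assumed HMAC-SHA256 of the body).
    raw_body = request.get_data()
    signature = request.headers.get("X-Webhook-Signature", "")
    expected = hmac.new(SECRET.encode(), raw_body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        abort(401)

    # Reformat the event into a simple Slack message and forward it.
    payload = request.get_json()
    instance = payload["instance"]
    text = (f"{payload['event']} on {instance['host']}:{instance['port']} "
            f"- {payload['data']['message']}")
    requests.post(SLACK_URL, json={"text": text}, timeout=5)
    return "", 204

The same shape works for other targets: swap the outgoing request for a PagerDuty Events API call or a ticketing API and keep the verification step unchanged.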
When to Use Webhooks vs. Prometheus
Both approaches have their place:
Use Prometheus/Alertmanager when:
- You already have a Prometheus stack (BetterDB exports 99 metrics to /prometheus/metrics, both native Valkey data and BetterDB-specific insights)
- You need complex alerting rules with multiple conditions
- You want to correlate BetterDB metrics with other systems in the same pipeline
- You need Alertmanager's advanced features like silencing and inhibition
Use native webhooks when:
- You don't want to run Prometheus infrastructure
- You need quick setup with minimal configuration
- You want to configure alerting from the BetterDB UI
- You need delivery history and debugging built in
You can also use both. Export metrics to Prometheus for dashboards and long-term storage, while using native webhooks for simple real-time alerts.
Getting Started
Webhooks are available in BetterDB Monitor 0.3.0 and later. Pull the latest image:
docker pull betterdb/monitor:latest
Navigate to Settings → Webhooks in the UI to create your first webhook, or use the API directly. Test connectivity with the built-in test button before enabling production alerts.
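If you just want to watch deliveries arrive while you experiment, a throwaway local receiver is enough. This sketch prints the event header and body of each POST; how you expose port 8080 to BetterDB is up to your setup:

import http.server

class EchoHandler(http.server.BaseHTTPRequestHandler):
    def do_POST(self):
        # Print the event type and raw payload, then acknowledge the delivery.
        length = int(self.headers.get("Content-Length", 0))
        print(self.headers.get("X-Webhook-Event"), self.rfile.read(length).decode())
        self.send_response(204)
        self.end_headers()

http.server.HTTPServer(("", 8080), EchoHandler).serve_forever()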
Full documentation is available in the webhooks guide.