
Adopt CrowdSec for intrusion detection

ADR-0040 ACCEPTED · 2026-01-30
CrowdSec for Intrusion Detection

Context

The application runs on a public-facing Hetzner cluster with SSH and Traefik exposed to the internet. Any public endpoint receives constant automated scanning from bots probing for vulnerabilities (WordPress, phpMyAdmin, command injection, etc.).

Suricata, a deep-packet-inspection IDS, was evaluated previously but provided limited value:

  • Most traffic is TLS-encrypted (can't inspect payload)
  • High false positives from TCP stream reassembly
  • Heavy resource usage and ongoing rule tuning required

A lightweight, log-based detection system is a better fit for our architecture, where application logs are the source of truth for request content.

Decision

Adopt CrowdSec for intrusion detection using log-based analysis with community threat intelligence.

Architecture

┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│  Application     │────▶│    CrowdSec      │────▶│    Bouncer       │
│  Logs (Traefik,  │     │    Engine        │     │   (nftables)     │
│  SSH)            │     │                  │     │                  │
└──────────────────┘     └──────────────────┘     └──────────────────┘
                                │
                                ▼
                       ┌──────────────────┐
                       │  CrowdSec Cloud  │
                       │  (CAPI - shared  │
                       │   blocklists)    │
                       └──────────────────┘

Detection Approach: Log-Based (Reactive)

CrowdSec reads application logs after requests are processed:

  1. Request hits Traefik → processed → logged
  2. CrowdSec parses log → matches scenario (e.g., SQL injection pattern)
  3. Creates ban decision → nftables blocks future requests from that IP
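
The three steps above can be sketched in a few lines of Python. This is an illustration of the flow only, not CrowdSec's implementation: the log format, the probe patterns, the `Detector` class, and the threshold of three hits are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical, simplified access-log line: "<client_ip> <method> <path> <status>".
# Real Traefik logs carry more fields; this format is assumed for illustration.
LOG_LINE = re.compile(r"^(?P<ip>\S+) (?P<method>[A-Z]+) (?P<path>\S+) (?P<status>\d{3})$")

# Illustrative probe patterns, not CrowdSec's actual scenario definitions.
PROBE_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"/wp-login\.php", r"/phpmyadmin", r"union(\s|%20)+select")
]

BAN_THRESHOLD = 3  # assumed number of suspicious requests before a ban


class Detector:
    """Reads already-written log lines and turns repeated probes into bans."""

    def __init__(self):
        self.hits = defaultdict(int)  # suspicious request count per IP
        self.banned = set()           # stands in for the firewall block set

    def process(self, line):
        """Parse one log line; return True if it caused a new ban."""
        match = LOG_LINE.match(line)
        if match is None:
            return False
        ip, path = match["ip"], match["path"]
        if ip in self.banned or not any(p.search(path) for p in PROBE_PATTERNS):
            return False
        self.hits[ip] += 1
        if self.hits[ip] >= BAN_THRESHOLD:
            self.banned.add(ip)
            return True
        return False

    def is_blocked(self, ip):
        """What the bouncer would consult for future packets from this IP."""
        return ip in self.banned
```

Every request in this sketch has already been served by the time `process` sees its log line, which is why the approach is called reactive.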

Trade-off: The first malicious request (or the series of requests that triggers a scenario) gets through. This is acceptable because:

  • Single probes returning 404 are reconnaissance, not successful attacks
  • Most scenarios require multiple bad requests before triggering (reduces false positives)
  • Attackers learn nothing useful from error responses
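
To make "multiple bad requests before triggering" concrete, here is a minimal leaky-bucket sketch. CrowdSec scenarios are typically expressed as leaky buckets with a capacity and a leak speed, but the code below is an independent simplification; the class name, the default capacity of 5, and the 10-second leak interval are assumed values, not our configured scenarios.

```python
from dataclasses import dataclass


@dataclass
class LeakyBucket:
    capacity: int         # events the bucket holds before it overflows
    leak_seconds: float   # one event drains out every leak_seconds
    level: float = 0.0
    last_seen: float = 0.0

    def add(self, now: float) -> bool:
        """Record one suspicious event at time `now`; return True on overflow."""
        elapsed = now - self.last_seen
        # Drain the bucket for the time that passed since the previous event.
        self.level = max(0.0, self.level - elapsed / self.leak_seconds)
        self.last_seen = now
        self.level += 1
        return self.level > self.capacity


def first_overflow(timestamps, capacity=5, leak_seconds=10.0):
    """Return the 1-based index of the event that triggers a ban, or None."""
    bucket = LeakyBucket(capacity, leak_seconds)
    for index, timestamp in enumerate(timestamps, start=1):
        if bucket.add(timestamp):
            return index
    return None
```

With these assumed values, a burst of one probe per second overflows on the sixth event, so five requests are served before the ban. A lone probe, or one probe per minute, never overflows, which is the false-positive protection described above.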

Proactive Blocking: CAPI Blocklists

CrowdSec's Central API (CAPI) provides crowd-sourced threat intelligence:

  • IPs attacking other CrowdSec users are shared (anonymized)
  • Known-bad IPs are blocked before they reach our servers
  • A large user base contributes to the shared blocklist

This provides proactive protection without inline inspection overhead.
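
As a rough sketch of how local decisions and a community blocklist combine, the snippet below keeps both in one lookup structure and reports which source caused a block. The `DecisionStore` class and the origin labels `"local"` and `"capi"` are hypothetical names for this example; the real bouncer loads decisions into nftables sets rather than scanning a Python list.

```python
import ipaddress


class DecisionStore:
    """Holds ban entries from several sources and answers lookups by IP."""

    def __init__(self):
        self._entries = []  # list of (network, origin) pairs

    def add(self, ip_or_cidr, origin):
        # strict=False lets a bare host address become a /32 or /128 network.
        network = ipaddress.ip_network(ip_or_cidr, strict=False)
        self._entries.append((network, origin))

    def lookup(self, ip):
        """Return the origin of the first matching ban, or None if allowed."""
        address = ipaddress.ip_address(ip)
        for network, origin in self._entries:
            if address.version == network.version and address in network:
                return origin
        return None
```

Usage, with documentation-range addresses:

```python
store = DecisionStore()
store.add("203.0.113.7", "local")     # banned after probing our own logs
store.add("198.51.100.0/24", "capi")  # never seen by us, reported by others
store.lookup("198.51.100.23")         # blocked on its first packet
```

The second entry is the proactive case: the address is blocked without ever having produced a log line on our side.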

Future Consideration: AppSec WAF (Inline)

For applications requiring inline blocking (e.g., payment processing, PII handling), CrowdSec offers an AppSec component:

┌──────────┐     ┌──────────────┐     ┌──────────┐
│  Client  │────▶│   Traefik    │────▶│   App    │
└──────────┘     │  + CrowdSec  │     └──────────┘
                 │   Plugin     │
                 └──────┬───────┘
                        │ (inspect before routing)
                        ▼
                 ┌──────────────┐
                 │  CrowdSec    │
                 │  AppSec      │
                 │  (port 7422) │
                 └──────────────┘

Requirements:

  • Traefik bouncer plugin v1.2.0+
  • crowdsecurity/appsec-virtual-patching collection
  • AppSec acquisition config

Trade-offs:

  • Adds latency to every request (extra hop)
  • More complex configuration
  • Higher resource usage
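
The inline flow and its extra hop can be sketched generically. This does not use the Traefik plugin or the AppSec protocol: `inline_gateway`, `inspect`, and `app` are invented names, with `inspect` standing in for a round trip to an inspection service.

```python
import time


def inline_gateway(inspect, app):
    """Wrap `app` so every request is inspected before it is routed.

    inspect(request) -> bool   True means the request may proceed.
    app(request) -> (status, body)
    """

    def handle(request):
        started = time.perf_counter()
        allowed = inspect(request)  # the extra hop paid by every request
        overhead = time.perf_counter() - started
        if not allowed:
            # Blocked inline: the application never sees the request.
            return 403, "blocked", overhead
        status, body = app(request)
        return status, body, overhead

    return handle
```

Unlike the log-based path, the first malicious request is stopped here, at the cost of running `inspect` on legitimate traffic as well. The returned `overhead` makes that per-request cost visible.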

Current stance: Not needed for a pre-launch travel app. Log-based detection plus CAPI is sufficient. Revisit if we start handling sensitive data or experience targeted attacks.

Rationale

Why log-based over inline WAF?

  • Simpler architecture, fewer failure modes
  • No latency added to legitimate requests
  • CAPI blocklists provide most of the proactive value anyway
  • Our attack surface is small (single GraphQL endpoint, no legacy surface area)

Why CrowdSec over alternatives?

  • Lightweight compared to Suricata/Snort
  • Works with encrypted traffic (reads logs written after TLS termination)
  • Community blocklists use collective threat data
  • Good NixOS module support

Why per-node rather than centralized?

  • Each node receives the same CAPI blocklists independently
  • Avoids a single point of failure for security
  • Simpler than running a shared LAPI for a 3-node cluster

Consequences

Positive

  • Low overhead: Reads logs asynchronously and adds no request latency
  • Community intelligence: Benefits from threat data shared by 70k+ users
  • Works with TLS: Inspects application logs written after TLS termination
  • Automatic updates: Hub collections are updated daily

Negative

  • Reactive detection: First few malicious requests get through
  • Log dependency: Only sees what applications log
  • No payload inspection: Can't detect attacks carried in request bodies, including zero-days, unless that content appears in the logs

Monitoring

Metrics exported to Prometheus/Grafana:

  • Active bans and alerts
  • Packets blocked (local vs CAPI)
  • Log lines parsed and scenario matches
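
For ad-hoc checks outside Grafana, the local-versus-CAPI split can be read straight from a Prometheus text-format scrape. The sketch below parses that format by hand; the metric names and the `origin` label in `SAMPLE` are made up for the example and should be replaced with whatever the CrowdSec engine and bouncer actually export on our nodes.

```python
import re
from collections import defaultdict

# Invented sample scrape; metric and label names are placeholders.
SAMPLE = '''\
# HELP example_dropped_packets Packets dropped by the firewall bouncer.
# TYPE example_dropped_packets counter
example_dropped_packets{origin="CAPI"} 1200
example_dropped_packets{origin="local"} 45
example_active_decisions{origin="CAPI"} 15000
example_active_decisions{origin="local"} 3
'''

SAMPLE_LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_PAIR = re.compile(r'(\w+)="([^"]*)"')


def sum_by_label(text, metric, label):
    """Sum the samples of `metric`, grouped by the value of `label`."""
    totals = defaultdict(float)
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_LINE.match(line)
        if match is None or match["name"] != metric:
            continue
        labels = dict(LABEL_PAIR.findall(match["labels"] or ""))
        totals[labels.get(label, "")] += float(match["value"])
    return dict(totals)
```

Calling `sum_by_label(SAMPLE, "example_dropped_packets", "origin")` groups the dropped-packet counter by origin, which is the same breakdown the "local vs CAPI" panel shows.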

References