Adopt CrowdSec for intrusion detection
Context
The application runs on a public-facing Hetzner cluster with SSH and Traefik exposed to the internet. Any public endpoint receives constant automated scanning from bots probing for vulnerabilities (WordPress, phpMyAdmin, command injection, etc.).
Suricata (a deep packet inspection IDS) was evaluated previously but provided limited value:
- Most traffic is TLS-encrypted (can't inspect payload)
- High false positives from TCP stream reassembly
- Heavy resource usage and ongoing rule tuning required
A lightweight, log-based detection system better fits our architecture where application logs are the source of truth for request content.
Decision
Adopt CrowdSec for intrusion detection using log-based analysis with community threat intelligence.
Architecture
┌──────────────────┐     ┌──────────────────┐     ┌──────────────────┐
│ Application      │────▶│ CrowdSec         │────▶│ Bouncer          │
│ Logs (Traefik,   │     │ Engine           │     │ (nftables)       │
│ SSH)             │     │                  │     │                  │
└──────────────────┘     └──────────────────┘     └──────────────────┘
                                  │
                                  ▼
                         ┌──────────────────┐
                         │ CrowdSec Cloud   │
                         │ (CAPI - shared   │
                         │ blocklists)      │
                         └──────────────────┘
Detection Approach: Log-Based (Reactive)
CrowdSec reads application logs after requests are processed:
- Request hits Traefik → processed → logged
- CrowdSec parses log → matches scenario (e.g., SQL injection pattern)
- Creates ban decision → nftables blocks future requests from that IP
Trade-off: The first malicious request (or series triggering a scenario) gets through. This is acceptable because:
- Single probes returning 404 are reconnaissance, not successful attacks
- Most scenarios require multiple bad requests before triggering (reduces false positives)
- Attackers learn nothing useful from error responses
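The parse, match, and ban flow above, together with the "multiple bad requests before triggering" behaviour, can be sketched roughly as follows. This is an illustrative model only, not CrowdSec's implementation: real scenarios are YAML definitions from the Hub, and the log format, probe patterns, `ProbeDetector` class, and threshold/window values here are all assumptions made for the example.

```python
import re
from collections import defaultdict, deque

# Hypothetical probe patterns, chosen only for illustration.
PROBE_PATTERNS = re.compile(
    r"(wp-login\.php|phpmyadmin|\.env|union\s+select)", re.IGNORECASE
)

# Assumed simplified log line: "<ip> <epoch_seconds> <status> <path>".
LOG_LINE = re.compile(r"^(?P<ip>\S+) (?P<ts>\d+) (?P<status>\d{3}) (?P<path>.+)$")


class ProbeDetector:
    """Bans an IP after `threshold` suspicious requests within `window` seconds."""

    def __init__(self, threshold=3, window=60):
        self.threshold = threshold
        self.window = window
        self.hits = defaultdict(deque)  # ip -> timestamps of suspicious requests
        self.banned = set()

    def feed(self, line):
        """Process one log line; return the IP if this line triggers a new ban."""
        m = LOG_LINE.match(line)
        if not m:
            return None  # unparseable lines are ignored
        ip, ts, path = m["ip"], int(m["ts"]), m["path"]
        if ip in self.banned or not PROBE_PATTERNS.search(path):
            return None
        q = self.hits[ip]
        q.append(ts)
        # Drop hits that have fallen out of the sliding window.
        while q and ts - q[0] > self.window:
            q.popleft()
        if len(q) >= self.threshold:
            self.banned.add(ip)  # a real bouncer would add an nftables rule here
            return ip
        return None
```

Note that the requests which fill the window have already been served by the time the ban is created, which is exactly the reactive trade-off described above.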
Proactive Blocking: CAPI Blocklists
CrowdSec's Central API (CAPI) provides crowd-sourced threat intelligence:
- IPs attacking other CrowdSec users are shared (anonymized)
- Known-bad IPs are blocked before they reach our servers
- A large user base (70k+ users) contributes to the shared blocklist
This provides proactive protection without inline inspection overhead.
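To illustrate how locally generated bans and community blocklist entries can coexist, the sketch below keeps both in one lookup table and tags each entry with where it came from. The `DecisionStore` class and the `"local"` / `"capi"` origin labels are hypothetical names for this example; the real bouncer enforces decisions through nftables sets, not Python.

```python
import ipaddress


class DecisionStore:
    """Holds banned addresses/ranges, each tagged with an origin label."""

    def __init__(self):
        self._entries = []  # list of (network, origin) tuples

    def add(self, cidr, origin):
        # strict=False lets a bare address such as "203.0.113.7" act as a /32.
        self._entries.append((ipaddress.ip_network(cidr, strict=False), origin))

    def lookup(self, ip):
        """Return the origin of the first matching entry, or None if not banned."""
        addr = ipaddress.ip_address(ip)
        for net, origin in self._entries:
            if addr.version == net.version and addr in net:
                return origin
        return None


store = DecisionStore()
store.add("203.0.113.7", "local")      # ban produced by a local scenario
store.add("198.51.100.0/24", "capi")   # range received from the community list
```

Keeping the origin with each entry is what makes it possible to report blocked traffic split by local versus community decisions.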
Future Consideration: AppSec WAF (Inline)
For applications requiring inline blocking (e.g., payment processing, PII handling), CrowdSec offers an AppSec component:
┌──────────┐     ┌──────────────┐     ┌──────────┐
│ Client   │────▶│ Traefik      │────▶│ App      │
└──────────┘     │ + CrowdSec   │     └──────────┘
                 │ Plugin       │
                 └──────┬───────┘
                        │ (inspect before routing)
                        ▼
                 ┌──────────────┐
                 │ CrowdSec     │
                 │ AppSec       │
                 │ (port 7422)  │
                 └──────────────┘
Requirements:
- Traefik bouncer plugin v1.2.0+
- crowdsecurity/appsec-virtual-patching collection
- AppSec acquisition config
Trade-offs:
- Adds latency to every request (extra hop)
- More complex configuration
- Higher resource usage
Current stance: Not needed for a pre-launch travel app. Log-based detection plus CAPI is sufficient. Revisit if we start handling sensitive data or experience targeted attacks.
Rationale
Why log-based over inline WAF?
- Simpler architecture, fewer failure modes
- No latency added to legitimate requests
- CAPI blocklists already provide most of the proactive value
- Our attack surface is small (single GraphQL endpoint, no legacy surface area)
Why CrowdSec over alternatives?
- Lightweight compared to Suricata/Snort
- Works with encrypted traffic (reads decrypted logs)
- Community blocklists use collective threat data
- Good NixOS module support
Why per-node rather than centralized?
- Each node receives the same CAPI blocklists independently
- Avoids a single point of failure for security
- Simpler than running a shared LAPI for a 3-node cluster
Consequences
Positive
- Low overhead: Reads logs asynchronously, no request latency
- Community intelligence: Benefits from threat data shared by 70k+ users
- Works with TLS: Inspects decrypted application logs
- Automatic updates: Hub collections updated daily
Negative
- Reactive detection: First few malicious requests get through
- Log dependency: Only sees what applications log
- No payload inspection: Can't detect zero-days in request bodies until logged
Monitoring
Metrics exported to Prometheus/Grafana:
- Active bans and alerts
- Packets blocked (local vs CAPI)
- Log lines parsed and scenario matches
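As a rough illustration of how the "local vs CAPI" split could be derived from scraped metrics, the sketch below parses a small sample in the Prometheus text exposition format and sums one metric by label. The metric names (`example_blocked_packets`, `example_active_bans`) and label names are placeholders invented for this example, not the names CrowdSec actually exports; the parser is deliberately minimal and ignores optional timestamps and escaped characters in label values.

```python
import re

# One sample per line: name, optional {labels}, then a numeric value.
SAMPLE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')

# Hypothetical scrape output, for illustration only.
EXAMPLE_TEXT = """\
# HELP example_blocked_packets Packets dropped by the bouncer.
# TYPE example_blocked_packets counter
example_blocked_packets{origin="local",ip_type="ipv4"} 12
example_blocked_packets{origin="capi",ip_type="ipv4"} 300
example_blocked_packets{origin="capi",ip_type="ipv6"} 40
example_active_bans 57
"""


def parse_metrics(text):
    """Return a list of (name, labels_dict, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        m = SAMPLE.match(line)
        if m:
            labels = dict(LABEL.findall(m["labels"] or ""))
            samples.append((m["name"], labels, float(m["value"])))
    return samples


def total_by_label(samples, metric, label):
    """Sum the values of `metric`, grouped by the value of `label`."""
    totals = {}
    for name, labels, value in samples:
        if name == metric and label in labels:
            key = labels[label]
            totals[key] = totals.get(key, 0.0) + value
    return totals
```

In practice the same grouping would normally be done with a query in Prometheus or Grafana; the sketch only shows what that aggregation computes.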
References
- CrowdSec Documentation
- CrowdSec AppSec Quickstart
- Traefik Bouncer Plugin
- GitHub Issue #76: Replace Suricata with CrowdSec