Adopt pgBackRest GFS backup strategy

ADR-0039 ACCEPTED · 2026-01-29
pgBackRest GFS Backup Strategy

Context

The application uses PostgreSQL as its primary data store with pgBackRest for backup and point-in-time recovery (PITR) to Cloudflare R2. The initial configuration used daily full backups with 30-day retention. This provides adequate short-term recovery but doesn't follow common practice for long-term retention: nothing older than 30 days can be restored, and every backup is a complete copy of the database.

A proper backup strategy must balance:

  • Recovery Point Objective (RPO): Maximum acceptable data loss
  • Recovery Time Objective (RTO): Maximum acceptable recovery time
  • Storage efficiency: Cost-effective use of backup storage
  • Recovery chain reliability: Minimizing dependencies between backups

The Grandfather-Father-Son (GFS) pattern is the industry standard for balancing these concerns, using a hierarchy of backup frequencies with different retention periods.

Decision

Adopt a GFS-style backup strategy using pgBackRest's full, differential, and incremental backup types with tiered retention.

Backup Types

Full Backup (Weekly - Sunday 03:00 UTC)

  • Complete copy of entire database
  • Self-contained, restores independently
  • Anchor point for all other backups that week

Differential Backup (Daily - Mon-Sat 03:00 UTC)

  • All changes since last full backup
  • Restores with: full + diff
  • Grows larger through the week but limits restore chain to 2 backups

Incremental Backup (Every 6 hours between the 03:00 backups - 09:00, 15:00, 21:00 UTC)

  • Only changes since last backup (full, diff, or incr)
  • Smallest size, provides frequent restore points
  • Restores with: full + diff + incr chain (on Sundays: full + incr chain)

Retention Policy

repo1-retention-full=52      # 1 year of weekly fulls
repo1-retention-diff=7       # keep the 7 most recent diffs (6 are taken per week)
repo1-retention-archive=1    # number of backups' worth of continuous WAL to keep
repo1-retention-archive-type=diff    # count diff (and full) backups for WAL retention

Weekly Schedule

         Sun     Mon     Tue     Wed     Thu     Fri     Sat
03:00    Full    Diff    Diff    Diff    Diff    Diff    Diff
           |       |       |       |       |       |       |
09:00    Incr    Incr    Incr    Incr    Incr    Incr    Incr
15:00    Incr    Incr    Incr    Incr    Incr    Incr    Incr
21:00    Incr    Incr    Incr    Incr    Incr    Incr    Incr
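
The schedule above can be expressed as a small dispatch function. This is a minimal sketch rather than the project's actual scheduler: `backup_type_for` and `run_scheduled_backup` are hypothetical names, and the only thing taken from this document is the `main` stanza. The `--type` option of `pgbackrest backup` accepts `full`, `diff`, and `incr`.

```shell
#!/bin/sh
# Map a UTC weekday (1=Mon .. 7=Sun, as printed by `date -u +%u`) and a
# zero-padded hour (00-23, as printed by `date -u +%H`) to the backup type
# in the weekly schedule. Prints "none" when no backup is due at that hour.
backup_type_for() {
    dow=$1
    hour=$2
    case "$hour" in
        03)
            # Sunday gets the weekly full, every other day a differential.
            if [ "$dow" -eq 7 ]; then echo full; else echo diff; fi
            ;;
        09|15|21)
            echo incr
            ;;
        *)
            echo none
            ;;
    esac
}

# Hypothetical wrapper: run whichever backup is due right now, if any.
# Not called here, so the sketch can be loaded without pgbackrest installed.
run_scheduled_backup() {
    type=$(backup_type_for "$(date -u +%u)" "$(date -u +%H)")
    if [ "$type" = none ]; then
        return 0
    fi
    pgbackrest --stanza=main --type="$type" backup
}
```

Keeping the type decision in one function means a single hourly timer can drive all three backup types, instead of three separately maintained schedule entries.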

Example Restore Scenarios

Restore Thursday 17:00:

  1. Sunday Full
  2. Thursday Diff
  3. Thursday 09:00 and 15:00 Incr (each incremental builds on the previous backup)
  4. WAL replay to 17:00

Restore Tuesday 10:00:

  1. Sunday Full
  2. Tuesday Diff
  3. Tuesday 09:00 Incr
  4. WAL replay to 10:00

Restore 3 months ago (any Sunday):

  1. That Sunday's Full backup
  2. No WAL replay: PITR is not available, so only the Sunday 03:00 state can be restored (weekly granularity)
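
The recent-restore scenarios follow a mechanical rule, which the sketch below spells out. `restore_chain` is a hypothetical helper, not part of pgBackRest, and it assumes every scheduled backup has finished before the target time. Because an incremental builds on whatever backup preceded it, the chain lists every incremental taken that day before the target, not only the latest one. pgBackRest resolves this chain automatically during a restore; the function only makes the dependency visible.

```shell
#!/bin/sh
# Print the backups needed to restore to DAY (Sun..Sat) at HOUR (integer
# 4-23 without leading zero, UTC). Targets at or before 03:00 belong to the
# previous day's chain and are rejected to keep the sketch short.
restore_chain() {
    day=$1
    hour=$2
    if [ "$hour" -le 3 ] || [ "$hour" -gt 23 ]; then
        echo "hour must be between 4 and 23" >&2
        return 1
    fi
    echo "Sunday Full"
    # Mon-Sat add that day's differential on top of Sunday's full.
    if [ "$day" != Sun ]; then
        echo "$day Diff"
    fi
    # Every incremental started before the target hour is part of the chain.
    for h in 9 15 21; do
        if [ "$hour" -gt "$h" ]; then
            printf '%s %02d:00 Incr\n' "$day" "$h"
        fi
    done
    printf 'WAL replay to %02d:00\n' "$hour"
}

restore_chain Thu 17
```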

Recovery Capability

Time Range           Granularity              Method
Last 7 days          Any point in time        WAL PITR
Last 7 days          6-hour checkpoints       Incremental backups
8 days - 52 weeks    Weekly (Sundays only)    Full backups

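Note: The PITR window is set by WAL archive retention. pgBackRest's repo1-retention-archive counts backups of the type given by repo1-retention-archive-type, so a value of 1 with type=diff keeps continuous WAL only back to the most recent differential (or full) backup. Keeping continuous WAL for the whole 7-day window requires a value that covers every retained differential (for example 7). Beyond the PITR window, restore is limited to weekly full backup points.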
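
The two recovery modes map onto two forms of the `pgbackrest restore` command. The sketch below only prints the commands so they can be reviewed first; the function names, timestamp, and backup label are made-up examples. `--type=time` with `--target` replays WAL to a timestamp, while `--set` with `--type=immediate` stops as soon as the chosen backup is consistent.

```shell
#!/bin/sh
# Print (not run) a restore command for a point in time inside the PITR window.
# PostgreSQL must be stopped before the printed command is executed.
pitr_restore_cmd() {
    printf 'pgbackrest --stanza=main --delta --type=time --target="%s" --target-action=promote restore\n' "$1"
}

# Print (not run) a restore command for one specific backup, such as an
# older weekly full. The argument is a backup label as shown by `pgbackrest info`.
full_restore_cmd() {
    printf 'pgbackrest --stanza=main --delta --set=%s --type=immediate --target-action=promote restore\n' "$1"
}

pitr_restore_cmd "2026-01-29 17:00:00+00"   # example timestamp
full_restore_cmd "20251102-030002F"         # example label of a Sunday full
```

`--delta` reuses files already present in the data directory, which can shorten a restore onto an existing host. On an empty directory it makes no difference.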

Rationale

Why differential over pure incremental? With pure incrementals, a corrupted Monday backup would invalidate all subsequent backups that week. Differentials limit the blast radius - each day's backup only depends on Sunday's full.

Why weekly fulls instead of daily? Daily fulls with 52-week retention would require about 365 full backups. Weekly fulls reduce this to 52 over the same 52-week window, cutting full-backup storage by ~85%, at the cost of weekly rather than daily granularity for older data.
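
The ~85% figure can be checked with integer arithmetic. This rough sketch counts full backups only and ignores the much smaller differentials and incrementals:

```shell
#!/bin/sh
# Compare the number of full backups retained over one year.
daily_fulls=365
weekly_fulls=52
# Integer division rounds down, so 85.75 is reported as 85.
saved_pct=$(( (daily_fulls - weekly_fulls) * 100 / daily_fulls ))
echo "fulls retained: $weekly_fulls instead of $daily_fulls (${saved_pct}% fewer)"
```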

Why 6-hour incrementals? A 6-hour interval balances restore chain length against backup frequency. Running more often than daily gives more restore points and a better RPO, while running less often than hourly keeps the incremental chain short and restore operations manageable.

Why R2 storage? Cloudflare R2 provides S3-compatible storage with free egress, which matters in disaster recovery scenarios where large data transfers are needed. The 10GB free tier covers our needs until the database grows to roughly 1GB (see Storage Projections).

Consequences

Positive

  • Industry-standard approach: Follows proven GFS methodology
  • Storage efficient: ~430MB for 1 year of backups at current DB size (50MB)
  • Reliable recovery: Limited backup chain dependencies
  • Cost effective: Stays within R2 free tier until DB exceeds ~1GB
  • Flexible recovery: Choose between speed (recent backup) or granularity (PITR)

Negative

  • Increased complexity: Three backup types vs. single type
  • Coarser recovery for old data: Only weekly restore points beyond 7 days
  • WAL dependency: PITR requires continuous WAL archiving

Storage Projections

DB Size    Compressed    Annual Storage    Monthly Cost
50MB       ~7MB          ~430MB            Free
1GB        ~140MB        ~8GB              Free
10GB       ~1.4GB        ~80GB             ~$1/mo
100GB      ~14GB         ~800GB            ~$12/mo
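
The monthly cost column can be approximated from the stored volume. The 10GB free allowance comes from this document; the rate of $0.015 per GB-month beyond it is an assumption about R2 pricing that should be checked against the current price list. Request charges are ignored, and `r2_monthly_cents` is a hypothetical helper.

```shell
#!/bin/sh
# Estimate the monthly storage bill in cents for STORED_GB of backups.
# ASSUMPTION: first 10 GB free, then 1.5 cents per GB-month.
r2_monthly_cents() {
    stored_gb=$1
    free_gb=10
    if [ "$stored_gb" -le "$free_gb" ]; then
        echo 0
    else
        # 1.5 cents per GB, written as *15/10 to stay in integer arithmetic.
        echo $(( (stored_gb - free_gb) * 15 / 10 ))
    fi
}

for gb in 8 80 800; do
    echo "${gb}GB stored: $(r2_monthly_cents "$gb") cents/month"
done
```

Under these assumptions, 80GB comes to about 105 cents and 800GB to about 1185 cents per month, in line with the ~$1 and ~$12 rows.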

Verification

The backup strategy can be verified with:

# Check backup status and retention
just backup-status

# List all backups
pgbackrest info --stanza=main

# Verify backup integrity (runs weekly)
pgbackrest verify --stanza=main
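
A freshness check can complement these commands by flagging a schedule that has silently stopped. The sketch below is one possible approach, not the project's tooling: `backup_is_fresh` is a hypothetical helper, and the commented wiring assumes `jq` is installed and that the JSON from `pgbackrest info --output=json` exposes `.backup[].timestamp.stop` as a Unix timestamp.

```shell
#!/bin/sh
# Succeed if the newest backup finished no more than MAX_AGE_HOURS ago.
# LAST_STOP and NOW are Unix timestamps in seconds.
backup_is_fresh() {
    last_stop=$1
    now=$2
    max_age_hours=$3
    [ $(( now - last_stop )) -le $(( max_age_hours * 3600 )) ]
}

# Possible wiring (assumptions noted above). With backups every 6 hours,
# a 7-hour limit leaves one hour of slack:
#   last=$(pgbackrest info --stanza=main --output=json | jq '.[0].backup[-1].timestamp.stop')
#   backup_is_fresh "$last" "$(date +%s)" 7 || echo "backups are stale" >&2
```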

Related Decisions

  • Infrastructure: pgBackRest configuration in infra/modules/platform/nixos/pgbackrest.nix
  • Storage: Cloudflare R2

References