Horizontal Scaling Guide

Controller (Django)

The Controller is stateless — all state lives in Postgres, Redis, and MinIO.

Scaling strategy

  • Run multiple Controller instances behind a load balancer
  • Session affinity not required (sessions are stored in Redis/DB; see the settings sketch below)
  • Each instance runs its own gunicorn workers
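
Because sessions live in Redis, any instance can serve any request. A minimal settings sketch, assuming Django 4+ and a placeholder Redis URL:

# settings.py (sketch; the Redis URL is a placeholder)
CACHES = {
    "default": {
        "BACKEND": "django.core.cache.backends.redis.RedisCache",
        "LOCATION": "redis://redis:6379/1",
    }
}

# Keep sessions in the Redis-backed cache so no instance holds local state
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
SESSION_CACHE_ALIAS = "default"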

Configuration

# Production: 4 workers per instance, 2-4 instances
gunicorn config.wsgi:application \
  --bind 0.0.0.0:8000 \
  --workers 4 \
  --threads 2 \
  --timeout 120 \
  --max-requests 1000 \
  --max-requests-jitter 50

Bottlenecks

  • Database connections: Each worker holds a connection. Use PgBouncer for connection pooling at scale (see the settings sketch after this list).
  • Brief assembly: CPU-bound (JSON serialization + hashing). Scales linearly with workers.
  • Celery workers: Scale independently. Add more workers for brief lifecycle, health checks, and failure reports.
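
A minimal sketch of pointing Django at PgBouncer in transaction mode; the host, port, database name, and credentials are placeholders:

# settings.py (sketch; host, port, and credentials are placeholders)
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "app",
        "USER": "app",
        "PASSWORD": "change-me",
        "HOST": "pgbouncer",
        "PORT": "6432",
        # Transaction pooling: no persistent connections, no server-side cursors
        "CONN_MAX_AGE": 0,
        "DISABLE_SERVER_SIDE_CURSORS": True,
    }
}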

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller
spec:
  replicas: 3
  selector:
    matchLabels:
      app: controller
  template:
    metadata:
      labels:
        app: controller
    spec:
      containers:
        - name: controller
          image: controller:latest  # placeholder; use your registry image
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"

Dispatcher (Go)

The Dispatcher manages container lifecycle — scaling depends on the queue source.

Internal queue (Redis list)

  • Only ONE Dispatcher instance can consume from a Redis list (BRPOP is exclusive)
  • Scale by increasing NUM_CONSUMERS (parallel goroutines within one instance)
  • For multi-instance: switch to redis-stream queue source

Redis Stream queue

  • Multiple Dispatchers can consume from the same stream via consumer groups
  • Each instance joins the dispatchers consumer group
  • Messages are distributed across instances automatically (see the sketch after this list)
  • Set QUEUE_SOURCE=redis-stream on all instances
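
The Dispatcher's consumer is written in Go; the redis-py sketch below only illustrates the consumer-group semantics, using the stream and group names from the configuration that follows:

# Illustration only (the real consumer is Go): each message goes to exactly one consumer
import socket
import redis

r = redis.Redis(host="redis", port=6379)
STREAM, GROUP = "kohakku:tasks", "dispatchers"

try:
    r.xgroup_create(STREAM, GROUP, id="0", mkstream=True)  # create the group once
except redis.ResponseError:
    pass  # group already exists

consumer = socket.gethostname()  # unique consumer name per instance
while True:
    replies = r.xreadgroup(GROUP, consumer, {STREAM: ">"}, count=10, block=5000)
    for _stream, messages in replies or []:
        for msg_id, fields in messages:
            print("processing", msg_id, fields)  # dispatch the task here
            r.xack(STREAM, GROUP, msg_id)        # ack so it is not redelivered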

Configuration

# Single instance, high parallelism
NUM_CONSUMERS=10
MAX_CONCURRENT_PULLS=4

# Multi-instance with Redis Streams
QUEUE_SOURCE=redis-stream
QUEUE_STREAM_NAME=kohakku:tasks
QUEUE_CONSUMER_GROUP=dispatchers
NUM_CONSUMERS=5

Bottlenecks

  • Image pulls: Gated by MAX_CONCURRENT_PULLS. Cold pulls dominate latency.
  • Docker socket: Local backend shares one Docker daemon. For higher throughput, use K8s/ECS backends.
  • Redis: Single Redis handles queue + state. Separate Redis instances for queue vs state at high scale.

Temporal Worker

  • Stateless — run multiple workers on the same task queue
  • Temporal server distributes workflow executions across workers
  • Scale workers independently of Controller instances

# Run 3 worker instances
for i in 1 2 3; do
  python temporal_worker.py &
done
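
A minimal sketch of what temporal_worker.py could look like with the temporalio Python SDK; the server address, task queue name, workflow, and activity are placeholders:

# temporal_worker.py (sketch; workflow, activity, and task queue names are placeholders)
import asyncio
from temporalio.client import Client
from temporalio.worker import Worker

from workflows import BriefWorkflow    # hypothetical workflow class
from activities import assemble_brief  # hypothetical activity function

async def main():
    client = await Client.connect("temporal:7233")
    worker = Worker(
        client,
        task_queue="briefs",            # all instances share this queue
        workflows=[BriefWorkflow],
        activities=[assemble_brief],
    )
    await worker.run()                  # blocks; run several copies to scale out

if __name__ == "__main__":
    asyncio.run(main())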

Celery Workers

  • Scale independently from the Controller
  • Separate queues for different task types if needed (see the routing sketch below)

# High-priority queue for dispatch
celery -A config worker -l info -Q celery,dispatch -c 4

# Background queue for cleanup
celery -A config worker -l info -Q cleanup -c 2
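
For the queue split above to take effect, tasks need to be routed to those queues. A minimal sketch, assuming Celery reads its config from Django settings with the CELERY_ namespace and using placeholder task paths:

# settings.py (sketch; the task module paths are placeholders)
CELERY_TASK_ROUTES = {
    "briefs.tasks.dispatch_brief": {"queue": "dispatch"},
    "briefs.tasks.cleanup_expired": {"queue": "cleanup"},
    # Unrouted tasks fall through to the default "celery" queue
}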

Database

  • Read replicas: Django supports database routers for read-heavy loads (see the router sketch after this list)
  • Connection pooling: PgBouncer in transaction mode
  • Indexes: Run manage.py dbshell and check EXPLAIN ANALYZE on slow queries
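
A minimal router sketch for sending reads to a replica, assuming a "replica" alias is defined in DATABASES alongside "default":

# routers.py (sketch; assumes DATABASES defines "default" and "replica" aliases)
class ReadReplicaRouter:
    def db_for_read(self, model, **hints):
        return "replica"   # all reads go to the replica

    def db_for_write(self, model, **hints):
        return "default"   # writes always hit the primary

    def allow_relation(self, obj1, obj2, **hints):
        return True        # both aliases point at the same data

# settings.py
DATABASE_ROUTERS = ["routers.ReadReplicaRouter"]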

Redis

  • Persistence: AOF enabled by default in docker-compose (appendonly yes)
  • Maxmemory: Set to prevent OOM. LRU eviction for cache, noeviction for queue
  • Separate instances: One for cache/sessions, one for the task queue, one for the Celery broker (see the settings sketch below)
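
A minimal settings sketch of that split; CELERY_BROKER_URL is the standard Celery setting, while the other two names and all hostnames are placeholders:

# settings.py (sketch; REDIS_CACHE_URL and REDIS_QUEUE_URL are hypothetical names)
REDIS_CACHE_URL = "redis://redis-cache:6379/0"     # cache + sessions (feed this into CACHES)
REDIS_QUEUE_URL = "redis://redis-queue:6379/0"     # task queue consumed by the Dispatcher
CELERY_BROKER_URL = "redis://redis-broker:6379/0"  # Celery broker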