# Horizontal Scaling Guide
## Controller (Django)
The Controller is stateless — all state lives in Postgres, Redis, and MinIO.
### Scaling strategy
- Run multiple Controller instances behind a load balancer
- Session affinity not required (sessions stored in Redis/DB)
- Each instance runs its own gunicorn workers
### Configuration
```bash
# Production: 4 workers per instance, 2-4 instances
gunicorn config.wsgi:application \
  --bind 0.0.0.0:8000 \
  --workers 4 \
  --threads 2 \
  --timeout 120 \
  --max-requests 1000 \
  --max-requests-jitter 50
```
### Bottlenecks
- Database connections: Each worker holds a connection. Use PgBouncer for connection pooling at scale.
- Brief assembly: CPU-bound (JSON serialization + hashing). Scales linearly with workers.
- Celery workers: Scale independently. Add more workers for brief lifecycle, health checks, failure reports.
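Since each worker thread can hold its own Postgres connection, it is worth doing the arithmetic before sizing PgBouncer's pool. A minimal sketch (the numbers mirror the production config above; the helper name is illustrative):

```python
# Back-of-the-envelope connection budget: each gunicorn worker thread may
# hold one Postgres connection, so the Controller alone needs
# instances * workers * threads connections (Celery workers come on top).
def pg_connections_needed(instances: int, workers: int, threads: int) -> int:
    return instances * workers * threads

# 4 instances x 4 workers x 2 threads:
print(pg_connections_needed(4, 4, 2))  # 32 connections, before Celery
```

If that total approaches Postgres's `max_connections`, front it with PgBouncer rather than raising the server limit.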
### Kubernetes
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller
spec:
  replicas: 3
  template:
    spec:
      containers:
        - name: controller
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"
```
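Because the Controller is stateless, the Deployment also pairs naturally with a HorizontalPodAutoscaler instead of a fixed replica count. A sketch (the replica bounds and the 70% CPU target are illustrative, not a recommendation):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: controller
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: controller
  minReplicas: 2
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

CPU utilization is a reasonable signal here because brief assembly is CPU-bound.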
## Dispatcher (Go)
The Dispatcher manages container lifecycle — scaling depends on the queue source.
### Internal queue (Redis list)
- Only ONE Dispatcher instance can consume from a Redis list (`BRPOP` is exclusive)
- Scale by increasing `NUM_CONSUMERS` (parallel goroutines within one instance)
- For multi-instance: switch to the `redis-stream` queue source
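The single-instance model can be pictured as one shared queue drained by `NUM_CONSUMERS` parallel consumers, each item going to exactly one of them. A toy sketch (this is not the Dispatcher's actual code; threads stand in for goroutines and an in-process queue stands in for the Redis list):

```python
import queue
import threading

NUM_CONSUMERS = 10

# Pre-fill the queue, as if tasks were already pushed to the Redis list.
tasks = queue.Queue()
for i in range(100):
    tasks.put(i)

seen = []
lock = threading.Lock()

def consume() -> None:
    # Each pop hands the item to exactly one consumer -- the property
    # that makes the list queue exclusive to a single Dispatcher instance.
    while True:
        try:
            task = tasks.get_nowait()
        except queue.Empty:
            return
        with lock:
            seen.append(task)

consumers = [threading.Thread(target=consume) for _ in range(NUM_CONSUMERS)]
for c in consumers:
    c.start()
for c in consumers:
    c.join()

assert sorted(seen) == list(range(100))  # every task consumed exactly once
```

Raising `NUM_CONSUMERS` adds parallelism inside the one instance, but a second instance would compete for the same pops, which is why multi-instance deployments use streams instead.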
### Redis Stream queue
- Multiple Dispatchers can consume from the same stream via consumer groups
- Each instance joins the `dispatchers` consumer group
- Messages are distributed across instances automatically
- Set `QUEUE_SOURCE=redis-stream` on all instances
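The consumer-group guarantee — each stream entry delivered to exactly one member of the group — can be illustrated with a toy model (real Redis delivery depends on which consumer reads first; round-robin here is purely for illustration):

```python
from collections import defaultdict
from itertools import cycle

# Three Dispatcher instances, all members of the "dispatchers" group.
instances = ["dispatcher-1", "dispatcher-2", "dispatcher-3"]
assign = cycle(instances)

delivered = defaultdict(list)
for msg_id in range(9):
    # Each message goes to exactly one group member, never to all of them.
    delivered[next(assign)].append(msg_id)

all_ids = sorted(i for msgs in delivered.values() for i in msgs)
assert all_ids == list(range(9))  # no message duplicated or dropped
```

Contrast this with a pub/sub fan-out, where every subscriber would see every message — consumer groups are what make it safe to add Dispatcher instances.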
### Configuration
```bash
# Single instance, high parallelism
NUM_CONSUMERS=10
MAX_CONCURRENT_PULLS=4

# Multi-instance with Redis Streams
QUEUE_SOURCE=redis-stream
QUEUE_STREAM_NAME=kohakku:tasks
QUEUE_CONSUMER_GROUP=dispatchers
NUM_CONSUMERS=5
```
### Bottlenecks
- Image pulls: Gated by `MAX_CONCURRENT_PULLS`. Cold pulls dominate latency.
- Docker socket: The local backend shares one Docker daemon. For higher throughput, use the K8s/ECS backends.
- Redis: A single Redis handles both queue and state. Separate Redis instances for queue vs state at high scale.
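The `MAX_CONCURRENT_PULLS` gate is, in essence, a counting semaphore around the pull step. A sketch of the idea (threads and a sleep stand in for the Dispatcher's goroutines and the actual `docker pull`):

```python
import threading
import time

MAX_CONCURRENT_PULLS = 4
pull_gate = threading.BoundedSemaphore(MAX_CONCURRENT_PULLS)

in_flight = 0
peak = 0
lock = threading.Lock()

def pull_image(name: str) -> None:
    global in_flight, peak
    with pull_gate:  # blocks once MAX_CONCURRENT_PULLS pulls are running
        with lock:
            in_flight += 1
            peak = max(peak, in_flight)
        time.sleep(0.01)  # stand-in for the actual image pull
        with lock:
            in_flight -= 1

threads = [
    threading.Thread(target=pull_image, args=(f"img-{i}",)) for i in range(20)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert peak <= MAX_CONCURRENT_PULLS  # the gate was never exceeded
```

This is why raising `NUM_CONSUMERS` alone does not speed up cold starts: consumers queue up at the gate, so `MAX_CONCURRENT_PULLS` has to rise with them (within what the Docker daemon and registry can absorb).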
## Temporal Worker
- Stateless — run multiple workers on the same task queue
- Temporal server distributes workflow executions across workers
- Scale workers independently of Controller instances
## Celery Workers
- Scale independently from the Controller
- Separate queues for different task types if needed
```bash
# High-priority queue for dispatch
celery -A config worker -l info -Q celery,dispatch -c 4

# Background queue for cleanup
celery -A config worker -l info -Q cleanup -c 2
```
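For the split queues above to receive anything, tasks must be routed to them. A minimal sketch using Celery's task-routes setting (the task paths below are placeholders, not this project's actual task names — substitute your own):

```python
# Route tasks to the queues the worker commands above consume.
# Task module paths here are hypothetical examples.
CELERY_TASK_ROUTES = {
    "briefs.tasks.dispatch_brief": {"queue": "dispatch"},
    "maintenance.tasks.cleanup_expired": {"queue": "cleanup"},
}
# Anything without an explicit route falls back to the default "celery" queue.
```

Keeping dispatch on its own queue means a backlog of cleanup work can never delay brief dispatch, and each queue's concurrency can be tuned independently.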
## Database
- Read replicas: Django supports database routers for read-heavy loads
- Connection pooling: PgBouncer in transaction mode
- Indexes: Run `manage.py dbshell` and check `EXPLAIN ANALYZE` on slow queries
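A database router for read replicas is a small plain-Python class. A minimal sketch (the alias names `replica1`, `replica2`, and `default` are assumptions — they must match the `DATABASES` aliases in settings):

```python
import random

class PrimaryReplicaRouter:
    """Send reads to a replica, writes to the primary (alias names assumed)."""

    def db_for_read(self, model, **hints):
        # Naive load spreading; replication lag handling is out of scope here.
        return random.choice(["replica1", "replica2"])

    def db_for_write(self, model, **hints):
        return "default"

    def allow_relation(self, obj1, obj2, **hints):
        # All aliases point at the same logical database.
        return True
```

Register it via `DATABASE_ROUTERS = ["path.to.PrimaryReplicaRouter"]` in settings; note that a read issued immediately after a write may hit a lagging replica, so latency-sensitive read-after-write paths should pin to `default`.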
## Redis
- Persistence: AOF enabled by default in docker-compose (`appendonly yes`)
- Maxmemory: Set to prevent OOM. LRU eviction for cache, noeviction for queue
- Separate instances: One for cache/sessions, one for task queue, one for Celery broker
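One way to express that three-instance split is via separate connection URLs, each instance tuned for its role. A sketch (hostnames and variable names are illustrative — match them to your own compose file and settings):

```bash
# Illustrative split across three Redis instances:
CACHE_REDIS_URL=redis://redis-cache:6379/0     # cache/sessions: allkeys-lru eviction
QUEUE_REDIS_URL=redis://redis-queue:6379/0     # dispatcher queue: noeviction
CELERY_BROKER_URL=redis://redis-broker:6379/0  # Celery broker: noeviction
```

The key point is the eviction policy per role: evicting from a cache is harmless, while evicting from a queue or broker silently drops work — hence `noeviction` (and a sized `maxmemory`) on the latter two.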