Scaling¶
Controller (Django)¶
The Controller is stateless -- all state lives in Postgres, Redis, and MinIO.
Strategy¶
- Run multiple Controller instances behind a load balancer
- Session affinity not required (sessions stored in Redis/DB)
- Each instance runs its own gunicorn workers
Configuration¶
```bash
# Production: 4 workers per instance, 2-4 instances
gunicorn config.wsgi:application \
  --bind 0.0.0.0:8000 \
  --workers 4 \
  --threads 2 \
  --timeout 120 \
  --max-requests 1000 \
  --max-requests-jitter 50
```
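The worker count above follows the sizing heuristic from the gunicorn docs of roughly (2 × CPU cores) + 1 workers. A quick sizing sketch (the helper name is ours, not from this codebase):

```python
import multiprocessing

def suggested_workers(cores: int) -> int:
    """Gunicorn's documented starting point: (2 x cores) + 1."""
    return 2 * cores + 1

if __name__ == "__main__":
    # On a 2-core instance this suggests 5 workers; tune from there
    # based on observed latency and memory per worker.
    print(suggested_workers(multiprocessing.cpu_count()))
```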
Bottlenecks¶
- Database connections -- Each worker holds a connection. Use PgBouncer for connection pooling at scale.
- Brief assembly -- CPU-bound (JSON serialization + hashing). Scales linearly with workers.
- Celery workers -- Scale independently. Add more workers for brief lifecycle, health checks, failure reports.
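Wiring Django through PgBouncer is mostly a settings change. A hedged sketch, assuming a PgBouncer service listening on its default port 6432 (the host and database names are illustrative, not from this repo):

```python
# settings.py -- pointing Django at PgBouncer instead of Postgres directly.
# Host, port, and database name are illustrative.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "controller",
        "HOST": "pgbouncer",   # PgBouncer sits in front of Postgres
        "PORT": "6432",        # PgBouncer's default listen port
        "CONN_MAX_AGE": 0,     # let PgBouncer own connection reuse
    }
}

# Required for PgBouncer in transaction pooling mode: a server-side
# cursor can outlive the transaction its connection was borrowed for.
DISABLE_SERVER_SIDE_CURSORS = True
```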
Kubernetes¶
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: controller
spec:
  replicas: 3
  selector:
    matchLabels:
      app: controller
  template:
    metadata:
      labels:
        app: controller
    spec:
      containers:
        - name: controller
          resources:
            requests:
              cpu: "500m"
              memory: "512Mi"
            limits:
              cpu: "1"
              memory: "1Gi"
```
Dispatcher (Go)¶
The Dispatcher manages container lifecycle -- scaling depends on the queue source.
Internal Queue (Redis List)¶
- Only one Dispatcher instance can consume from a Redis list (BRPOP is exclusive)
- Scale by increasing `NUM_CONSUMERS` (parallel goroutines within one instance)
- For multi-instance deployments, switch to the `redis-stream` queue source
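The single-instance model can be illustrated with a stdlib stand-in for the Redis list: like `BRPOP`, a blocking pop hands each item to exactly one consumer, so raising the consumer count adds parallelism without duplicating work. A sketch (all names are ours, not from the Dispatcher):

```python
import queue
import threading

# Stand-in for the Redis list. queue.Queue.get, like BRPOP, delivers
# each item to exactly one consumer -- the idea behind NUM_CONSUMERS.
NUM_CONSUMERS = 4
jobs: "queue.Queue[str]" = queue.Queue()
done: list = []
lock = threading.Lock()

def consume() -> None:
    while True:
        try:
            job = jobs.get(timeout=0.1)  # BRPOP-like blocking pop
        except queue.Empty:
            return  # queue drained; consumer exits
        with lock:
            done.append(job)
        jobs.task_done()

for i in range(100):
    jobs.put(f"brief-{i}")
threads = [threading.Thread(target=consume) for _ in range(NUM_CONSUMERS)]
for t in threads: t.start()
for t in threads: t.join()
# Every job was consumed exactly once across the pool.
assert sorted(done) == sorted(f"brief-{i}" for i in range(100))
```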
Redis Stream Queue¶
- Multiple Dispatchers can consume from the same stream via consumer groups
- Each instance joins the `dispatchers` consumer group
- Messages are distributed across instances automatically
- Set `QUEUE_SOURCE=redis-stream` on all instances
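A hedged sketch of what one instance's consume loop might look like with redis-py; only the `dispatchers` group name comes from above, while the stream key and handler are illustrative (the import is deferred so the sketch loads without a Redis server):

```python
GROUP = "dispatchers"      # consumer group named in the docs above
STREAM = "dispatch:queue"  # illustrative stream key, not from this repo

def handle(fields: dict) -> None:
    print("processing", fields)  # placeholder for real brief handling

def consume(consumer_name: str) -> None:
    """One Dispatcher instance reading via a consumer group: each
    message is delivered to exactly one consumer in the group."""
    import redis  # deferred so the sketch is importable without a server

    r = redis.Redis()
    try:
        # Create the group (and stream) once; later calls raise BUSYGROUP.
        r.xgroup_create(STREAM, GROUP, id="$", mkstream=True)
    except redis.ResponseError:
        pass  # group already exists
    while True:
        # ">" = only messages never delivered to this group before
        for _stream, messages in r.xreadgroup(
            GROUP, consumer_name, {STREAM: ">"}, count=10, block=5000
        ):
            for msg_id, fields in messages:
                handle(fields)
                r.xack(STREAM, GROUP, msg_id)  # mark done for the group
```

Unacked messages stay in the group's pending list, so a crashed instance's work can be reclaimed rather than lost.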
Configuration¶
Bottlenecks¶
- Image pulls -- Gated by `MAX_CONCURRENT_PULLS`. Cold pulls dominate latency.
- Docker socket -- The local backend shares one Docker daemon. For higher throughput, use the K8s/ECS backends.
- Redis -- Single Redis handles queue + state. Separate Redis instances for queue vs state at high scale.
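The pull gate is a counting semaphore; the Dispatcher is Go, but the mechanism is the same in any language. A Python illustration (the limit and names are ours):

```python
import threading
import time

MAX_CONCURRENT_PULLS = 2  # illustrative; mirrors the Dispatcher setting
pull_gate = threading.Semaphore(MAX_CONCURRENT_PULLS)

peak = 0    # highest number of simultaneous "pulls" observed
active = 0
lock = threading.Lock()

def pull_image(ref: str) -> None:
    global peak, active
    with pull_gate:  # at most MAX_CONCURRENT_PULLS pulls in flight
        with lock:
            active += 1
            peak = max(peak, active)
        time.sleep(0.01)  # stand-in for the actual image pull
        with lock:
            active -= 1

threads = [
    threading.Thread(target=pull_image, args=(f"img:{i}",)) for i in range(8)
]
for t in threads: t.start()
for t in threads: t.join()
assert peak <= MAX_CONCURRENT_PULLS  # the gate held
```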
Temporal Worker¶
Stateless -- run multiple workers on the same task queue; the Temporal server distributes workflow executions across them.
Celery Workers¶
Scale independently from the Controller. Separate queues for different task types if needed.
```bash
# High-priority queue for dispatch
celery -A config worker -l info -Q celery,dispatch -c 4

# Background queue for cleanup
celery -A config worker -l info -Q cleanup -c 2
```
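To pin task types to those queues from the application side, Celery's `task_routes` setting maps task names to queues. A sketch with hypothetical task paths (substitute this project's actual task modules):

```python
# Celery configuration: route tasks to the queues the worker commands
# above consume with -Q. Task paths here are illustrative.
task_routes = {
    "briefs.tasks.dispatch_brief": {"queue": "dispatch"},  # high priority
    "briefs.tasks.cleanup_expired": {"queue": "cleanup"},  # background
    # Anything unrouted falls through to the default "celery" queue.
}
```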
Database¶
| Strategy | When |
|---|---|
| Read replicas | Read-heavy loads. Django supports database routers. |
| Connection pooling | High worker count. PgBouncer in transaction mode. |
| Index audit | Slow queries. EXPLAIN ANALYZE on hot paths. |
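The read-replica row relies on a Django database router. A minimal sketch, assuming `DATABASES` defines `default` (primary) and `replica` aliases -- the alias names are illustrative:

```python
# routers.py -- minimal read/write split. Alias names are illustrative.
class ReadReplicaRouter:
    def db_for_read(self, model, **hints):
        return "replica"   # SELECTs go to the replica

    def db_for_write(self, model, **hints):
        return "default"   # writes always hit the primary

    def allow_relation(self, obj1, obj2, **hints):
        return True        # both aliases point at the same data

# settings.py would then set:
# DATABASE_ROUTERS = ["path.to.ReadReplicaRouter"]
```

Note that replication lag means a read immediately after a write may not see it; latency-sensitive read-after-write paths should pin to `default`.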
Redis¶
| Setting | Recommendation |
|---|---|
| Persistence | AOF enabled by default in docker-compose (appendonly yes) |
| Maxmemory | Set to prevent OOM. LRU eviction for cache, noeviction for queue |
| Separate instances | One for cache/sessions, one for task queue, one for Celery broker |