Scaling n8n for production demands three interconnected layers working
together under a single concurrency governor. Queue mode
splits n8n into a main process — handling the editor UI, REST API, and
trigger listening — and independent worker processes that pull workflow
jobs from a BullMQ queue backed by Redis (version 6.0 or later) for
execution. PostgreSQL is the required database: SQLite
cannot handle concurrent multi-process writes and risks "database is
locked" errors and data corruption. A single
environment variable — N8N_CONCURRENCY_PRODUCTION_LIMIT —
caps production execution concurrency in both regular and queue modes,
globally overriding per-worker --concurrency flags when set
to any value other than -1. This guide provides concrete, tested
configurations for each layer — from Redis Cluster to PostgreSQL pool
sizing, from worker count to I/O-vs-CPU concurrency tuning — informed
by production n8n deployments [1] [2].
How does queue mode distribute workflow jobs across Redis, workers, and PostgreSQL?
Queue mode splits n8n into three roles: the main process (editor UI, REST API, trigger listening — webhook reception, cron timer evaluation, polling cycles), Redis (BullMQ job broker receiving execution IDs from the main process and serving them to idle workers), and worker processes (each running its own Node.js instance, polling Redis for jobs, retrieving full workflow definitions and credentials from PostgreSQL, executing via WorkflowExecute, and writing results back) [1] [4].
The complete execution follows a six-step flow: (1) the main process receives a trigger event and generates an execution; (2) it stores the initial execution record in PostgreSQL; (3) the execution ID is pushed into the Redis BullMQ queue; (4) the next available worker pops the job from Redis; (5) the worker fetches the full workflow definition and credentials from PostgreSQL; (6) the worker executes every node via WorkflowExecute and writes results back to PostgreSQL, then signals completion to Redis [4]. Critically, the main process never executes workflows directly in queue mode — it only enqueues jobs. Workers are stateless, meaning you can add or remove them without downtime, and a single worker failure never loses queued jobs — BullMQ redistributes them automatically [4]. An optional fourth role — the webhook processor — can be deployed independently to scale incoming webhook handling. For the complete production deployment blueprint, see the n8n Docker Compose production stack guide.
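The role split above could be wired up roughly as follows. This is a minimal sketch, not a complete deployment: `EXECUTIONS_MODE=queue` is n8n's documented switch for queue mode (it is not named in the text above), and `start_main` and `start_worker` are hypothetical wrapper functions around the `n8n start` and `n8n worker` commands.

```shell
# Set on the main process and on every worker.
export EXECUTIONS_MODE=queue   # queue mode instead of regular mode

# The Redis (QUEUE_BULL_REDIS_*) and PostgreSQL (DB_POSTGRESDB_*) settings
# must also be present, and identical, on every process.

# Hypothetical helpers; defined here but not invoked.
start_main()   { n8n start; }    # editor UI, REST API, trigger listening
start_worker() { n8n worker; }   # pulls jobs from Redis and executes them
```

Keeping the two start commands behind separate helpers mirrors the point that the main process only enqueues work while workers execute it.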
How does N8N_CONCURRENCY_PRODUCTION_LIMIT control concurrency in both regular and queue mode?
N8N_CONCURRENCY_PRODUCTION_LIMIT is a single environment
variable that caps the number of production executions running
simultaneously — in both regular and queue modes. It
defaults to -1 (disabled). When set to any value other
than -1, n8n uses this value as the global ceiling for the entire
deployment, overriding the per-worker --concurrency flag in
queue mode [2].
This limit applies only to production executions —
those started from webhooks or trigger nodes. Manual executions (from
the editor UI), sub-workflow calls, error executions, and CLI executions
(n8n execute --id=N) are exempt [2].
In regular mode, set it to a reasonable value (e.g.,
export N8N_CONCURRENCY_PRODUCTION_LIMIT=20) to prevent
event-loop thrashing caused by too many simultaneous production
executions. In queue mode, when this variable is set to any value other
than -1, n8n takes the limit from this variable and ignores
the per-worker flag entirely [3].
Concurrency less than 5 can lead to an unstable environment — consider
setting it to at least 5 for best performance [3].
Executions beyond the limit are held in queue and processed in
FIFO order as slots free up. The environment variable’s
value is consumed once at startup; runtime changes require a restart
[2].
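The precedence rule described above can be modelled as a small shell function. This is only a sketch of the documented behaviour, not n8n's own code; the function name `effective_concurrency` and its two arguments are hypothetical, and 20 is a placeholder value.

```shell
# Cap production executions (placeholder value).
export N8N_CONCURRENCY_PRODUCTION_LIMIT=20

# Model of the precedence rule: a limit other than -1 wins over the
# worker's --concurrency flag, which itself defaults to 10.
effective_concurrency() (
  limit="${1:--1}"
  worker_flag="${2:-10}"
  if [ "$limit" -ne -1 ]; then
    echo "$limit"
  else
    echo "$worker_flag"
  fi
)

effective_concurrency "$N8N_CONCURRENCY_PRODUCTION_LIMIT" 10   # prints 20
```

Because the variable is read once at startup, changing the exported value only takes effect after the n8n process is restarted.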
For the complete execution lifecycle and how concurrency interacts with
memory per execution, see the
n8n Node Execution Engine guide.
(1) Set N8N_CONCURRENCY_PRODUCTION_LIMIT=20 for a safe
starting ceiling. (2) In queue mode, this globally overrides
per-worker --concurrency. (3) Default per-worker
concurrency is 10; minimum recommended is 5. (4) Excess executions
queue in FIFO order. (5) Applies only to production executions;
manual, sub-workflow, error, and CLI executions are exempt
[2].
How do you configure Redis as the BullMQ message broker for queue mode?
Redis is the mandatory message broker for queue mode and must be
version ≥6.0. It is configured via
QUEUE_BULL_REDIS_HOST and
QUEUE_BULL_REDIS_PORT (default 6379), plus optional
QUEUE_BULL_REDIS_DB (default 0),
QUEUE_BULL_REDIS_USERNAME, and
QUEUE_BULL_REDIS_PASSWORD. These variables must be
identical across the main process and every worker [1]
[5].
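As an illustration, the Redis variables listed above might be set like this on the main process and on each worker. All values are placeholders; `REDIS_PASSWORD` is a hypothetical variable standing in for wherever the secret is actually stored.

```shell
# Identical on the main process and every worker (placeholder values).
export QUEUE_BULL_REDIS_HOST=redis
export QUEUE_BULL_REDIS_PORT=6379        # default
export QUEUE_BULL_REDIS_DB=0             # default
export QUEUE_BULL_REDIS_USERNAME=n8n     # optional; hypothetical user name
export QUEUE_BULL_REDIS_PASSWORD="${REDIS_PASSWORD:-change-me}"   # optional
```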
For Redis Cluster deployments, set
QUEUE_BULL_REDIS_CLUSTER_NODES as a comma‑separated list
of host:port entries. When this variable is set, n8n
creates a Redis Cluster client instead of a standalone client, and
n8n will ignore QUEUE_BULL_REDIS_HOST
and QUEUE_BULL_REDIS_PORT [1].
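A cluster value in the format described above could look like the following sketch. The three hostnames are placeholders, and the final line is just a quick way to eyeball the list.

```shell
# Comma-separated host:port pairs; hostnames are placeholders.
export QUEUE_BULL_REDIS_CLUSTER_NODES="redis-0:6379,redis-1:6379,redis-2:6379"

# List the nodes one per line to sanity-check the value.
echo "$QUEUE_BULL_REDIS_CLUSTER_NODES" | tr ',' '\n'
```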
For high-availability Redis, use Redis Sentinel or
a cloud-managed Redis service. In Kubernetes environments, set
QUEUE_BULL_REDIS_HOST to the Redis service name (e.g.,
redis-service). n8n uses the BullMQ library with Redis
for queue orchestration — it relies on Redis as a fast in-memory data
store for job metadata, while workflow definitions, execution state,
and results live in PostgreSQL [6].
For troubleshooting: if workers aren’t picking up jobs, the cause is,
by one community estimate, incorrect Redis configuration 99% of the
time. Verify that all QUEUE_BULL_REDIS_* environment variables match
across every container and that each container has network
connectivity to the Redis instance [5].
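One way to carry out that comparison is sketched below. Both function names are hypothetical helpers; `check_redis` assumes `redis-cli` is installed in the container and that Redis does not require authentication (add the appropriate `redis-cli` authentication options if it does).

```shell
# Print this container's queue Redis settings in a stable order, with the
# password masked, so output from the main process and workers can be diffed.
print_redis_env() {
  env | grep '^QUEUE_BULL_REDIS_' | sort \
    | sed 's/^\(QUEUE_BULL_REDIS_PASSWORD\)=.*/\1=***/'
}

# Connectivity check; a healthy, reachable Redis replies PONG.
check_redis() {
  redis-cli -h "${QUEUE_BULL_REDIS_HOST:-localhost}" \
            -p "${QUEUE_BULL_REDIS_PORT:-6379}" ping
}
```

Running `print_redis_env` inside the main container and inside each worker, then comparing the output, surfaces mismatched hosts, ports, or database indexes quickly.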
How do you configure PostgreSQL as the production database for queue mode scaling?
PostgreSQL is required for queue mode. SQLite cannot handle concurrent multi-process writes — when multiple worker processes write simultaneously, SQLite’s file-level write lock causes “database is locked” errors and progressive database corruption [1]. PostgreSQL handles this natively through MVCC (Multi-Version Concurrency Control) with row-level locking — multiple workers can read and write simultaneously without blocking.
Configure PostgreSQL by setting DB_TYPE=postgresdb along
with DB_POSTGRESDB_HOST,
DB_POSTGRESDB_PORT (default 5432),
DB_POSTGRESDB_DATABASE,
DB_POSTGRESDB_USER, and
DB_POSTGRESDB_PASSWORD [1].
For SSL-encrypted connections, set
DB_POSTGRESDB_SSL_ENABLED=true along with optional
DB_POSTGRESDB_SSL_CA,
DB_POSTGRESDB_SSL_CERT, and
DB_POSTGRESDB_SSL_KEY. For connection pooling,
configure DB_POSTGRESDB_POOL_SIZE (default: 2). Increase
to 10–20 for single-instance production, and 20–50 for queue mode
with 3+ workers — each worker opens multiple connections, and
insufficient pool size causes queued connection requests. For
high-availability PostgreSQL, deploy with primary-replica
replication and automated failover via Patroni or a
cloud-managed PostgreSQL service [4].
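The pool-size guidance can be checked with simple arithmetic. The sketch below assumes that each n8n process (main or worker) keeps its own pool of up to DB_POSTGRESDB_POOL_SIZE connections, and it uses PostgreSQL's default `max_connections` of 100 as the budget; `total_pg_connections` is a hypothetical helper.

```shell
# Worst-case connections = number of n8n processes x pool size per process.
total_pg_connections() (
  processes="$1"
  pool_size="$2"
  echo $((processes * pool_size))
)

# Example: 1 main + 3 workers, each with DB_POSTGRESDB_POOL_SIZE=20.
export DB_POSTGRESDB_POOL_SIZE=20
needed=$(total_pg_connections 4 "$DB_POSTGRESDB_POOL_SIZE")
max_connections=100   # PostgreSQL default; check your server's setting

if [ "$needed" -gt "$max_connections" ]; then
  echo "pool demand ($needed) exceeds max_connections ($max_connections)"
else
  echo "pool demand ($needed) fits within max_connections ($max_connections)"
fi
```

With these example numbers the demand is 80 connections, which leaves little headroom for adding a fifth process without raising `max_connections` or lowering the pool size.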
For the complete database selection guide covering SQLite vs
PostgreSQL benchmarks and migration procedures, see the
n8n Database: SQLite vs PostgreSQL guide.
Example settings: DB_TYPE=postgresdb,
DB_POSTGRESDB_HOST=postgres,
DB_POSTGRESDB_PORT=5432,
DB_POSTGRESDB_POOL_SIZE=20.
SSL: DB_POSTGRESDB_SSL_ENABLED=true.
The main process writes initial execution records and workers write
results back, so both must reach the same database instance [1].
How many workers do you need, and how do you tune concurrency for I/O-bound vs CPU-bound workflows?
Workers are launched via the n8n worker CLI command with
an optional --concurrency flag (default: 10
per worker). The optimal worker count and concurrency depend entirely
on workload type. For I/O‑bound workflows (HTTP calls,
database queries, webhook forwarding — where workers spend most time
waiting on external services): set concurrency to 10–20 per
worker. For CPU‑bound workflows (large data
transformations, image processing, heavy Code nodes — where the
bottleneck is the CPU itself): keep concurrency at 2–5 per
worker and scale by adding more workers instead [7].
A pragmatic starting rule: one worker per available vCPU
core for CPU‑bound workloads, and two to three
workers per core for I/O‑bound workloads — then benchmark
your actual workflow mix [8].
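The starting rule above can be turned into a small calculator. This is a sketch only: `suggest_workers` is a hypothetical helper, and for I/O-bound workloads it uses the lower end (two per core) of the two-to-three range as its assumption.

```shell
# Suggest a starting worker count from vCPU cores and workload type.
suggest_workers() (
  vcpus="$1"
  workload="$2"
  case "$workload" in
    cpu) echo "$vcpus" ;;            # one worker per core
    io)  echo $((vcpus * 2)) ;;      # lower end of two to three per core
    *)   echo "unknown workload: $workload" >&2; exit 1 ;;
  esac
)

suggest_workers 4 cpu   # prints 4
suggest_workers 4 io    # prints 8

# Each worker would then be started with, for example:
#   n8n worker --concurrency=5     (CPU-bound)
#   n8n worker --concurrency=10    (I/O-bound)
```

The result is a starting point for benchmarking, not a target; the per-worker concurrency values in the comments come from the ranges given earlier in this section.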
Set N8N_CONCURRENCY_PRODUCTION_LIMIT as the global
ceiling, and let per-worker concurrency handle distribution within
that cap. In Kubernetes deployments, n8n workers are I/O-bound by
nature — they wait for external responses, not CPU — so a Horizontal
Pod Autoscaler (HPA) monitoring CPU may never trigger scaling. Scale
instead by queue-length metrics [9].
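A queue-length scaling decision can be sketched as ceiling division with bounds. Everything here is an assumption for illustration: `desired_workers` is a hypothetical helper, the jobs-per-worker target and the minimum and maximum are tuning choices, and how the queue depth is actually read (from Redis or from a metrics system) is deliberately left out.

```shell
# desired_workers <queue_depth> <jobs_per_worker> <min> <max>
desired_workers() (
  depth="$1"
  per_worker="$2"
  min="$3"
  max="$4"
  want=$(( (depth + per_worker - 1) / per_worker ))   # ceiling division
  if [ "$want" -lt "$min" ]; then want="$min"; fi
  if [ "$want" -gt "$max" ]; then want="$max"; fi
  echo "$want"
)

desired_workers 35 10 2 8   # prints 4
desired_workers 0 10 2 8    # prints 2 (never below the minimum)
desired_workers 95 10 2 8   # prints 8 (capped at the maximum)
```

Unlike a CPU-based autoscaler, this responds to waiting jobs directly, which is the signal that matters when workers spend most of their time waiting on external services.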
Watch for database connection pool exhaustion:
extremely low concurrency (--concurrency=1) combined
with a large number of workers can exhaust PostgreSQL’s connection
pool, because each worker opens multiple connections. The result is
processing delays and execution failures [10].
For the complete worker lifecycle guide including graceful shutdown
(N8N_GRACEFUL_SHUTDOWN_TIMEOUT, default 30 seconds), see
the n8n Node Execution Engine guide.
| Workload Type | Concurrency per Worker | Worker Scaling Strategy | Bottleneck |
|---|---|---|---|
| I/O‑bound (APIs, webhooks, database queries) | 10–20 | 2–3 workers per vCPU core | External service latency |
| CPU‑bound (data transforms, image processing, Code nodes) | 2–5 | 1 worker per vCPU core; add more workers | CPU/memory; isolate heavy workloads |
| Mixed (typical production) | 5–10 (start here, then measure) | Start 2 workers, scale by queue depth | Varies; benchmark your actual workflow mix |
How do multi-main HA setups provide high availability in n8n queue mode?
Multi-main mode — an Enterprise feature for Self-hosted plans — enables running multiple main n8n instances that share the same PostgreSQL database and Redis instance. Each main process serves the UI and API independently, and webhooks can be load‑balanced across them. This provides high availability: if one main process fails, the others continue serving users and enqueueing jobs without interruption [10].
To configure multi-main, deploy multiple main instances in queue mode
(EXECUTIONS_MODE=queue) behind a load balancer, and set
N8N_MULTI_MAIN_SETUP_ENABLED=true on all of them.
This uses Redis-based leader election to designate
one instance as the leader and the others as followers, ensuring that
cron schedules, polling triggers, and other singleton tasks execute
only once even with multiple mains present [10].
All instances must share the same N8N_ENCRYPTION_KEY so
workers can decrypt credentials regardless of which main process
enqueued the job. Database migrations must be run from a single main
instance — running them concurrently from multiple instances causes
race conditions [1].
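A minimal environment for each main instance might look like the sketch below. `EXECUTIONS_MODE=queue` is n8n's documented queue-mode switch, which multi-main builds on; the fallback key `change-me` is a placeholder that must be replaced with the real shared key.

```shell
# Set on every main instance.
export EXECUTIONS_MODE=queue
export N8N_MULTI_MAIN_SETUP_ENABLED=true

# Same key on all mains and all workers so any worker can decrypt
# credentials; "change-me" is a placeholder, not a usable key.
export N8N_ENCRYPTION_KEY="${N8N_ENCRYPTION_KEY:-change-me}"
```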
For high-availability Redis, use Sentinel or a cloud-managed service;
for high-availability PostgreSQL, deploy primary-replica with
automated failover. For the complete production hardening blueprint
covering Nginx reverse proxy, SSL termination, and firewall rules,
see the n8n Node Security Hardening guide.
| HA Component | Configuration | Key Requirement |
|---|---|---|
| Multi‑Main | 2+ main instances in queue mode (EXECUTIONS_MODE=queue) behind load balancer; N8N_MULTI_MAIN_SETUP_ENABLED=true | Same N8N_ENCRYPTION_KEY across all instances; Enterprise license required |
| Redis HA | Redis Sentinel or Cluster with 3+ nodes | Automatic failover; configure QUEUE_BULL_REDIS_SENTINEL or QUEUE_BULL_REDIS_CLUSTER_NODES |
| PostgreSQL HA | Primary + 1–2 replicas with streaming replication; DB_POSTGRESDB_POOL_SIZE=20–50 | Automated failover via Patroni or cloud-managed; connection pooling via PgBouncer |
| Worker Redundancy | 2+ workers with identical config; BullMQ auto-redistributes failed jobs | Stateless; any worker can execute any workflow; N8N_GRACEFUL_SHUTDOWN_TIMEOUT=45 |
References
- n8n Documentation — Queue Mode: horizontal scaling, Redis setup, PostgreSQL requirement, encryption key sharing, S3 binary data storage, multi-main HA (2026)
- n8n Documentation — Self-hosted Concurrency Control: N8N_CONCURRENCY_PRODUCTION_LIMIT (-1 default), production-only scope, FIFO queue, override of per-worker --concurrency (2026)
- Mintlify — n8n CLI Commands: worker --concurrency flag (default 10, minimum 5 for stability), N8N_CONCURRENCY_PRODUCTION_LIMIT override, N8N_GRACEFUL_SHUTDOWN_TIMEOUT (Feb 2026)
- DeepWiki — Workflow Execution System: regular vs queue mode comparison, six-step queue mode flow, configuration requirements, webhook processors (Apr 2026)
- n8n Community — How to properly scale n8n with heavy compute workflows: 99% of worker issues are bad Redis config, verify QUEUE_BULL_REDIS_* variables across all containers (Sep 2025)
- DeepWiki — Runtime Architecture and Process Models: main/worker/webhook process types, Bull queue library, Redis dependency, process lifecycle (Apr 2026)
- Elest.io — How to Scale n8n with Redis Queue Mode: I/O‑bound concurrency 10–20 per worker, CPU‑bound concurrency 2–5 per worker with more workers (Apr 2026)
- Toolient — Workers and Concurrency in n8n: concurrency controls simultaneous workflows per worker, higher helps I/O‑bound but harms CPU‑bound performance (Dec 2025)
- n8n Community — K8s Queue Mode Jobs Not Distributed: n8n workers are I/O‑bound by nature, HPA on CPU won’t trigger scaling; scale by queue-length metrics instead (Feb 2026)
- DeepWiki — Queue Mode and Horizontal Scaling: multi-main setup for HA, Redis-based leader election, database connection pool exhaustion with low concurrency + many workers (Mar 2026)
- Hostinger — How to configure n8n queue mode on VPS: Redis container, PostgreSQL, environment variables (QUEUE_BULL_REDIS_HOST, DB_POSTGRESDB_*), main process and worker deployment (Mar 2026)
- Azguards Technolabs — Solving the N8N Orphaned Job Trap in K8s Clusters: BullMQ stalled-job mechanics, N8N_GRACEFUL_SHUTDOWN_TIMEOUT default 30s, SIGKILL risk at 45s (Mar 2026)

