n8n for Hackers: Turn Your Canvas Into a Local, LLM-Powered, Multi-Agent OS



Published: November 10, 2025
Updated: November 30, 2025

Low-code is essentially traditional code dressed up in a tuxedo.


0. TL;DR

If you can code, you can ship production-grade automations today—without waiting for SaaS feature roadmaps.

Below is the cheat sheet I wish I had when I first opened n8n: how to embed JS/Python, self-host LLMs (Llama) or call hosted ones (Claude, GPT), orchestrate multiple agents, drive 50 Chrome tabs, and still keep everything in Git.


1. The Node Model: A 3-Stage Execution Pipeline

Pre-execution → Core Function → Post-execution (retry/catch/rollback)

  • JavaScript runs on Node 18, full require() power: Puppeteer, tfjs, axios, whatever.
  • Python is sandbox-spawned; pip packages must be baked into your image (see §8).
  • Return values are auto-wrapped as json[] for downstream nodes—no boilerplate (sketch below).
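
A minimal Code-node sketch of that wrapping (the endpoint and field names are placeholders; require('axios') assumes your image allows external modules, per the first bullet):

// Hypothetical endpoint; each returned object becomes one n8n item.
const axios = require('axios');
const res = await axios.get('https://api.example.com/users');

// Plain objects are fine: n8n auto-wraps each as { json: {...} }
// before handing it downstream.
return res.data.map((u) => ({ id: u.id, name: u.name }));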

2. Roll Your Own API Node in 5 Minutes

  1. Export OpenAPI/Swagger JSON.
  2. npx n8n-node-dev new → pick “From OpenAPI” → paste URL.
  3. npm pack → you now have a custom node with auth, pagination, and schema validation.
  4. Push it to a private GitHub Package; consume it via docker build (§8).

3. Local-First LLM Integration

Architecture
┌─ n8n main (TS)
├─ queue (Bull + Redis)
└─ executor containers (scalable)

Self-hosted LLM paths
A. llama.cpp + OpenAI-compatible proxy

   ./server -m llama-3-8b.q4_0.gguf --port 11434 -np 8 -c 4096

In n8n, choose the “OpenAI” node and set baseURL = http://llama:11434/v1. Done.
B. Ollama/LocalAI one-liner

   docker run -d -p 11434:11434 ollama/ollama

C. Offline cache layer
A Code node writes every prompt/response pair to SQLite; identical prompts hit the cache and replay at zero token cost.
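
A minimal sketch of that cache, assuming better-sqlite3 is baked into your image (table name, DB path, and the miss-handling split are illustrative):

const Database = require('better-sqlite3');
const crypto = require('crypto');

const db = new Database('/data/llm-cache.db');
db.exec('CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, response TEXT)');

// Hash the prompt so identical prompts collide on purpose.
const prompt = $input.first().json.prompt;
const key = crypto.createHash('sha256').update(prompt).digest('hex');

const hit = db.prepare('SELECT response FROM cache WHERE key = ?').get(key);
if (hit) return [{ json: { response: hit.response, cached: true } }];

// Miss: pass through so the downstream LLM node makes the real call;
// a sibling Code node INSERTs the fresh response under the same key.
return [{ json: { prompt, key, cached: false } }];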

Benchmark (M2-Ubuntu, 32 GB RAM)
Llama-3-8B with 20 parallel threads sustains ~1,200 tokens/s while the n8n executor stays below 15% CPU.


4. Multi-Agent Design Pattern

  • Each agent is a standalone workflow with declared JSON schema I/O.
  • Master flow uses the “Execute Workflow” node; pass state via a Redis stream (sketch after this list).
  • Errors bubble up → master catches → Slack “human-in-the-loop” approval.
  • Version each sub-workflow separately; Git tags map to Docker image tags.
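
A sketch of the hand-off, assuming ioredis is baked into your image (the stream key and field layout are illustrative; $execution.id is n8n's built-in execution ID):

const Redis = require('ioredis');
const redis = new Redis(process.env.REDIS_URL);

// One stream entry per agent hop; sub-workflows XREAD from the same key.
await redis.xadd(
  'agent:state', '*',                  // '*' lets Redis assign the entry ID
  'runId', $execution.id,
  'payload', JSON.stringify($input.first().json)  // must satisfy the agent's declared schema
);
await redis.quit();
return [{ json: { streamed: true } }];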

5. Browser Automation at Scale

The community Puppeteer node covers ~20% of the API. Real hackers do:

const { Cluster } = require('puppeteer-cluster');

// One Chromium process, up to 50 isolated browser contexts (“tabs”).
const cluster = await Cluster.launch({
  concurrency: Cluster.CONCURRENCY_CONTEXT,
  maxConcurrency: 50
});

// Each queued job gets its own context-scoped page.
await cluster.task(async ({ page, data: url }) => {
  await page.goto(url);
  return page.pdf({ format: 'A4' });
});

const pdf = await cluster.execute('https://...');
await cluster.idle();
await cluster.close();
return [{ json: {}, binary: { data: await this.helpers.prepareBinaryData(pdf, 'report.pdf') } }];
  • Pipe the binary property to the “Write Binary File” or S3 node.
  • Use “Wait for Webhook” to pause for manual CAPTCHA solves; the callback resumes the flow.
  • Memory footprint ≈ 300 MB per tab; 2 vCPUs handle 50 concurrent pages.

6. Hardening & Performance

  • Split large files: toggle “Split into items” → 200 rows/item keeps heap <512 MB.
  • Move to external Postgres; SQLite’s WAL starts locking up past ~5 GB.
  • Redis queue + executor replicas (compose sketch below); rule of thumb: 1 CPU core per 5 concurrent jobs.
  • Export logs as JSON → Loki/Grafana; alert if node runtime >30 s.
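
What that queue setup looks like in Compose form: a sketch, not a hardened config (service names and replica count are illustrative; EXECUTIONS_MODE and QUEUE_BULL_REDIS_HOST are n8n’s queue-mode variables):

# docker-compose.yml sketch
services:
  redis:
    image: redis:7
  n8n:
    image: n8nio/n8n
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
      - DB_TYPE=postgresdb        # external Postgres per the bullet above
  worker:
    image: n8nio/n8n
    command: worker               # runs `n8n worker`, pulling jobs from Redis
    environment:
      - EXECUTIONS_MODE=queue
      - QUEUE_BULL_REDIS_HOST=redis
    deploy:
      replicas: 4                 # sized by the 1-core-per-5-jobs rule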

7. CI/CD Your Workflows

Workflows are plain JSON—treat them as code:

# .github/workflows/e2e.yml
- name: Import workflows
  run: n8n import:workflow --separate --input ./workflows
- name: Integration test
  run: n8n execute --id=wf_price_alert --dataFile=test/payload.json

PR fails? Workflow never reaches main.
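
The reverse direction, snapshotting the live canvas back into the repo, is one CLI call (paths here are illustrative):

n8n export:workflow --all --separate --output=./workflows
git add workflows && git commit -m "sync workflows from canvas"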


8. Packaging Custom Nodes

npm init @n8n/nodes-module mynodes

Push to a private GitHub package; in the Dockerfile:

ENV N8N_CUSTOM_EXTENSIONS=/data/node_modules/@yourorg/mynodes

No fork, no rebuild of core.
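
A fuller Dockerfile sketch around that line (assumes the package above is published to your private registry and an .npmrc read token is available at build time):

FROM n8nio/n8n
USER root
# Install the private package where N8N_CUSTOM_EXTENSIONS will look for it
RUN mkdir -p /data && cd /data && npm install @yourorg/mynodes
ENV N8N_CUSTOM_EXTENSIONS=/data/node_modules/@yourorg/mynodes
USER node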


9. What You Still Shouldn’t Do

  • Sub-millisecond streaming: minimum poll interval is 1 s.
  • 50 MB workflow JSON: the canvas becomes a slide show; split it into sub-flows.
  • Exactly-once semantics: node executions are at-least-once; design for idempotency or the Saga pattern (guard sketch below).
  • True multi-tenant isolation: the community edition shares one DB; SaaS needs source mods.
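
A cheap idempotency guard for the at-least-once bullet above: a Code-node sketch using Redis SET ... NX (the eventId field and key naming are illustrative; assumes ioredis in your image):

const Redis = require('ioredis');
const redis = new Redis(process.env.REDIS_URL);

// At-least-once delivery means this node may fire twice for one event.
// SET ... NX claims the key exactly once; duplicates short-circuit.
const evt = $input.first().json;
const claimed = await redis.set(`seen:${evt.eventId}`, 1, 'EX', 86400, 'NX');
await redis.quit();

if (!claimed) return [];   // duplicate delivery: drop it
return [{ json: evt }];    // first delivery: proceed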

10. One-Liner Recap

n8n = JSON describing your pipeline + Node runtime + NPM ecosystem.
Wrap it in Docker; bolt on Redis, Postgres, a self-hosted LLM, and a browser pool,
and you’ve got a version-controlled, horizontally scalable, air-gapped automation OS—
all without waiting for the next SaaS release train.

Happy hacking, and may your queues always drain.
