{"slug":"self-hosted-websockets-workflows-multiple-vps-postgres","url":"https://tako.sh/blog/self-hosted-websockets-workflows-multiple-vps-postgres/","canonical":"https://tako.sh/blog/self-hosted-websockets-workflows-multiple-vps-postgres/","title":"Self-Hosted WebSockets and Workflows Across Multiple VPS Servers with Postgres","date":"2026-06-07T00:27","description":"Use postgres_url to share Tako channel replay and workflow state across multiple VPS servers instead of per-server SQLite.","author":null,"image":"ec7ad3d89cdd","imageAlt":null,"headings":[{"depth":2,"slug":"the-setup","text":"The setup"},{"depth":2,"slug":"why-channels-need-shared-replay","text":"Why channels need shared replay"},{"depth":2,"slug":"why-workflows-get-a-choice","text":"Why workflows get a choice"},{"depth":2,"slug":"what-changes-for-app-code","text":"What changes for app code?"}],"markdown":"Single-server state is easy to reason about. One app, one proxy, one SQLite file, one place where channel replay and workflow runs live.\n\nMulti-server state is where the footguns start. If a user opens a WebSocket connection to the Tokyo VPS and your checkout route publishes from the Los Angeles VPS, that publish still has to arrive. If a scheduled workflow runs on three servers, it should not accidentally send the same reminder three times unless that is exactly what you asked for.\n\nTako now has the missing switch for that shape: set the environment credential `postgres_url`, and durable channels plus workflows move from per-server SQLite to shared Postgres runtime state.\n\nThis is not a new app database abstraction. Your product data still belongs in your app database. This is Tako-owned runtime state: channel replay, workflow runs, workflow steps, event waiters, schedules, and cron coordination. The full config surface lives in [`tako.toml`](/docs/tako-toml/), the deploy checks are covered in [Deployment](/docs/deployment/), and the command lives in the [CLI reference](/docs/cli/).\n\n## The setup\n\nStart with a normal multi-server environment:\n\n```toml\nruntime = \"bun\"\npreset = \"nextjs\"\napp_root = \".\"\n\n[envs.production]\nroutes = [\"app.example.com\"]\nservers = [\"lax\", \"nrt\", \"fra\"]\n```\n\nThen store the shared runtime database URL as a provider credential:\n\n```bash\ntako credentials set postgres_url --env production\n```\n\nThat is intentionally not a top-level `postgres_url` field in `tako.toml`. Provider credentials are encrypted in `.tako/secrets.json`, scoped to an environment, and sent only through the deployment binding that needs them. They are not exposed to app code, not included in generated secret types, and not pushed by `tako secrets sync`.\n\nWith that one credential set, Tako chooses shared storage for the runtime pieces:\n\n| Runtime state     | Single-server default                               | With `postgres_url`                                     |\n| ----------------- | --------------------------------------------------- | ------------------------------------------------------- |\n| Channel replay    | Local SQLite at `data/tako/channels.sqlite`         | Postgres schema `tako_channels`, keyed by deployed app  |\n| Workflow runs     | Local SQLite at `data/tako/workflows.sqlite`        | Postgres schema `tako_workflows`, keyed by deployed app |\n| Channel publish   | Store before fanout on the local server             | Store before fanout in shared replay                    |\n| Channel reconnect | Replay from the local server's retained rows        | Replay from the shared retained rows                    |\n| Workflow cron     | Local schedule set                                  | Shared workflow storage and coordination                |\n| SDK access        | SDK talks to `tako-server` over the internal socket | Same SDK path; `tako-server` owns the database writes   |\n\nThe deployed app id matters here. Tako scopes runtime state to `{name}/{env}`, not to a release or one process. A rolling deploy can replace instances without making old channel cursors or workflow runs belong to the wrong build.\n\n```d2\ndirection: right\n\nbrowser: \"Browsers\\nWS / SSE\"\nlax: \"LAX VPS\\ntako-server\"\nnrt: \"NRT VPS\\ntako-server\"\nfra: \"FRA VPS\\ntako-server\"\npg: \"Postgres\\nschemas:\\ntako_channels\\ntako_workflows\"\napp: \"App code\\npublish / enqueue\"\n\nbrowser -> lax: \"connect\"\nbrowser -> nrt: \"connect\"\napp -> fra: \"HTTP route publishes\"\nfra -> pg: \"append channel message\"\nlax -> pg: \"poll replay\"\nnrt -> pg: \"poll replay\"\nlax -> browser: \"fanout + replay\"\nnrt -> browser: \"fanout + replay\"\napp -> lax: \"enqueue workflow\"\nlax -> pg: \"insert run + steps\"\n```\n\n## Why channels need shared replay\n\nTako channels are durable WebSocket/SSE endpoints under `/_tako/channels/<name>`. A publish is inserted before delivery, and reconnecting clients can replay retained messages from a bounded window. The default replay window is 10 minutes, which is meant for browser reloads, laptop sleep, short network drops, and rolling deploys.\n\nOn one server, SQLite is perfect for that. It is local, fast, and private to the app. On multiple servers, local SQLite would split the replay log into islands. A subscriber connected to one server would only see messages that landed on that same server.\n\nShared Postgres fixes the shape. A publish on any server writes to `tako_channels`; subscribers on every server poll the same replay store, fan out new retained rows, and can reconnect against the same cursor space.\n\nThat is why channels do not have a \"local multi-server\" opt-out. Channel delivery is inherently cross-server once traffic can land on more than one machine. If your environment has `<app_root>/channels/` and more than one target server, deploy requires `postgres_url`.\n\n## Why workflows get a choice\n\nWorkflows are different. Many workflows should be global: send one receipt, charge one card, run one daily digest, process one webhook. For those, shared Postgres is the right default in a multi-server environment. The workflow engine stores runs, completed step results, waits, schedules, and leader leases in `tako_workflows`, while workers still run as supervised app-adjacent processes.\n\nBut some workflows are intentionally local. A cache warmer that runs once per server is local. A regional health sampler is local. A cleanup task for files on that VPS is local. Those should not need a global database.\n\nFor that case, set `local: true` in every workflow that should stay per-server:\n\n```ts\nimport { defineWorkflow } from \"tako.sh\";\n\nexport default defineWorkflow(\"warm-local-cache\", {\n  local: true,\n  schedule: \"*/10 * * * *\",\n  async handler(payload, ctx) {\n    await ctx.run(\"warm\", async () => {\n      ctx.logger.info(\"warming this server\");\n    });\n  },\n});\n```\n\nThe safety rule is simple:\n\n| Project shape                                      | Deploy behavior                                                      |\n| -------------------------------------------------- | -------------------------------------------------------------------- |\n| One server, channels or workflows                  | SQLite is allowed                                                    |\n| Multiple servers, channels                         | `postgres_url` is required                                           |\n| Multiple servers, workflows with no `local: true`  | `postgres_url` is required                                           |\n| Multiple servers, every workflow has `local: true` | Per-server SQLite is allowed                                         |\n| Multiple servers, channels plus local workflows    | `postgres_url` is still required because channels need shared replay |\n\nTako checks this before build/deploy work starts. That matters. The failure happens while you are still at the CLI, with an action like:\n\n```bash\ntako credentials set postgres_url --env production\n```\n\nNo half-deployed release, no accidental split-brain runtime, no learning from duplicated emails.\n\n## What changes for app code?\n\nAlmost nothing.\n\nYour channel definitions still live in `<app_root>/channels/`:\n\n```ts\nimport { defineChannel } from \"tako.sh\";\n\nexport default defineChannel(\"orders\", {\n  auth: \"public\",\n}).$messageTypes<{\n  updated: { orderId: string; status: string };\n}>();\n```\n\nYour workflows still live in `<app_root>/workflows/`:\n\n```ts\nimport { defineWorkflow } from \"tako.sh\";\n\nexport default defineWorkflow<{ orderId: string }>(\"send-receipt\", {\n  retries: 4,\n  async handler(payload, ctx) {\n    await ctx.run(\"send\", async () => {\n      ctx.logger.info(\"sending receipt\", { orderId: payload.orderId });\n    });\n  },\n});\n```\n\nAnd your app still publishes or enqueues through the SDK:\n\n```ts\nimport orders from \"@/channels/orders\";\nimport sendReceipt from \"@/workflows/send-receipt\";\n\nawait orders().publish({\n  type: \"updated\",\n  data: { orderId: \"ord_123\", status: \"paid\" },\n});\n\nawait sendReceipt.enqueue({ orderId: \"ord_123\" });\n```\n\nThe storage backend is a deployment decision, not a call-site decision. SDKs do not open SQLite or Postgres directly; they talk to `tako-server` over the internal socket, and `tako-server` owns the selected backend.\n\nThat is the point of the feature. Multi-server self-hosting should feel like adding capacity, not rebuilding your app around a queue service and a WebSocket gateway. Add servers, set `postgres_url`, deploy, and the runtime state follows the environment.\n\nRead the [How Tako Works](/docs/how-tako-works/) runtime section, the [`tako credentials`](/docs/cli/#tako-credentials) command docs, or the [multi-server deployment guide](/docs/deployment/) to wire it up. The app can stay boring. The state is finally shared where it needs to be."}