{"slug":"scale-to-zero-without-containers","url":"https://tako.sh/blog/scale-to-zero-without-containers/","canonical":"https://tako.sh/blog/scale-to-zero-without-containers/","title":"Scale-to-Zero Without Containers","date":"2026-04-05T05:17","description":"How Tako scales apps to zero and cold-starts them on demand — without Docker, containers, or a cloud platform.","author":null,"image":"0ae9be74aa80","imageAlt":null,"headings":[{"depth":2,"slug":"how-it-works","text":"How it works"},{"depth":2,"slug":"what-happens-to-requests-during-cold-start","text":"What happens to requests during cold start"},{"depth":2,"slug":"why-this-matters-for-cost","text":"Why this matters for cost"},{"depth":2,"slug":"configuration","text":"Configuration"},{"depth":2,"slug":"not-serverless","text":"Not serverless"},{"depth":2,"slug":"try-it","text":"Try it"}],"markdown":"Scale-to-zero is usually a cloud or container thing. Google Cloud Run, AWS Lambda, Fly.io Machines — they all do it by pausing or destroying containers or microVMs. If you're running apps on your own servers with native processes, you're expected to keep them running 24/7.\n\nTako does it differently. Your app scales to zero and cold-starts on demand, with no containers involved.\n\n## How it works\n\nEvery Tako app starts with desired instances set to `0` — on-demand mode. 
Here's the lifecycle:\n\n```d2\ndirection: down\n\ndeploy: Deploy {style.fill: \"#9BC4B6\"; style.font-size: 20}\nwarm: Warm instance {style.fill: \"#9BC4B6\"; style.font-size: 20}\nserving: Serving {style.fill: \"#9BC4B6\"; style.font-size: 20}\nidle: Idle timeout {style.fill: \"#E88783\"; style.font-size: 20}\nzero: Zero instances {style.fill: \"#FFF9F4\"; style.stroke: \"#2F2A44\"; style.font-size: 20}\ncold: Cold start {style.fill: \"#E88783\"; style.font-size: 20}\nback: Serving again {style.fill: \"#9BC4B6\"; style.font-size: 20}\n\ndeploy -> warm: start 1 instance\nwarm -> serving: request arrives\nserving -> idle: no requests for 5 min\nidle -> zero: instance stopped\nzero -> cold: next request arrives\ncold -> back: \"often 10s of ms\"\n```\n\n**Deploy.** When you run [`tako deploy`](/docs/deployment), the server starts one warm instance immediately — so your app is reachable right away. If that instance fails to start, the deploy fails. No surprise cold starts after shipping.\n\n**Serve.** Requests route to healthy instances through Tako's [Pingora-based proxy](/blog/pingora-vs-caddy-vs-traefik). Each instance tracks in-flight requests and the timestamp of its last request.\n\n**Idle.** An idle monitor checks instances periodically. If an instance has no in-flight requests and has been idle longer than `idle_timeout` (default: 5 minutes), it gets stopped. The app drops to zero running instances.\n\n**Cold start.** The next request triggers a cold start. The proxy spawns a new process, waits for the app's readiness signal (`TAKO:READY:<port>` via the [SDK](/docs)), and routes the request once the instance is healthy. For lightweight APIs, that first response is often only tens of milliseconds slower. Heavier apps can take longer.\n\n## What happens to requests during cold start\n\nThis is the tricky part. What if 50 requests arrive while the app is booting?\n\nTako uses a leader/waiter pattern. 
The first request becomes the \"leader\" and triggers the instance spawn. Every subsequent request becomes a \"waiter\" and queues behind it. Up to 1000 requests can queue per app. When the instance is ready, all waiters are unblocked simultaneously.\n\n| Scenario                    | Response                                                |\n| --------------------------- | ------------------------------------------------------- |\n| Instance starts in time     | Normal response (after cold start delay)                |\n| Startup exceeds 30s         | `504 App startup timed out`                             |\n| Process crashes on start    | `502 App failed to start`                               |\n| Queue exceeds 1000 requests | `503 App startup queue is full` (with `Retry-After: 1`) |\n\nInstances are never killed while serving in-flight requests. The idle monitor only stops instances that are both idle _and_ have zero active connections.\n\n## Why this matters for cost\n\nIf you're running one app per server, scale-to-zero doesn't save much. But most people don't run one app per server.\n\nA typical Tako setup might have a production API (always-on), plus a staging environment, an admin dashboard, a webhook processor, and a docs site — all on the same box. Without scale-to-zero, each of those keeps processes running around the clock. A Node.js process idles at 50-100MB. Five idle apps? That's 250-500MB of RAM doing nothing.\n\nWith Tako's on-demand model, those low-traffic apps consume zero resources when idle. The staging environment that nobody touches on weekends? Gone. The admin panel your team uses twice a day? Boots in 200ms when someone opens it.\n\nThis is especially useful on VPS instances where RAM is the constraint. A $6/month box with 1GB of RAM can comfortably host a handful of apps when most of them aren't loaded into memory at the same time.\n\n## Configuration\n\nScale-to-zero is the default. You don't need to configure anything for it to work. 
But you can tune it:\n\n```toml\n# tako.toml\n[envs.production]\nidle_timeout = 300  # seconds (default: 5 minutes)\n\n[envs.staging]\nidle_timeout = 60   # aggressive timeout for staging\n```\n\nFor always-on apps, use [`tako scale`](/docs/cli) to set a minimum instance count:\n\n```bash\ntako scale 2 --env production  # always keep 2 instances running\n```\n\nThis persists across deploys, rollbacks, and server restarts.\n\n## Not serverless\n\nThis isn't serverless. There's no per-request billing, no function isolation, no event-driven invocation model. Your app is a normal long-running process — it just doesn't run when nobody's using it.\n\nThe cold start is a real process spawn, not a container unpause or a microVM boot. That's why it's fast: no image layers to unpack, no filesystem to mount, no network namespace to create. Just fork, exec, wait for readiness.\n\nAnd because Tako's proxy handles the queuing transparently, your app doesn't need to know it was cold-started. No special warming logic, no readiness hacks. The [SDK's status endpoint](/docs) is enough.\n\n## Try it\n\nEvery Tako app gets scale-to-zero out of the box. Deploy anything and watch it idle down after 5 minutes of quiet:\n\n```bash\ntako deploy\ntako status  # see instance count drop to 0\n# visit your app — it cold-starts on the first request\n```\n\nCheck the [deployment docs](/docs/deployment) for the full setup, or [how Tako works](/docs/how-tako-works) for the architecture behind on-demand scaling."}