Latest public benchmark
Serious load, measured in public.
A single small VM pushed through heavy HTTPS concurrency with CPU, memory, latency, clean-response behavior, and raw rows published for inspection.
TLDR: Tako stayed clean through c20000 on a small exe.dev VM while doing app-aware routing, limiter accounting, forwarding headers, and instance selection.
- largest run: c20000
- non-200: 0
- client errors: 0
Load generator, proxy, and app all share this host.
AMD EPYC 9554P on KVM.
No swap configured.
Linux 6.12.90.
0 non-200, 0 client errors.
Stable at the largest tested HTTP concurrency.
0 client errors in every heavy Tako row.
Channels and workflows both stay clean through c4000.
Proxy comparison
Raw HTTPS proxy path
Same route, same self-signed TLS certificate, same upstream application, same benchmark VM. The heavy rows show the real capacity of one small VM when load generator, proxy, and app all share two vCPUs.
throughput
HTTP 200 RPS by concurrency
Tako stays clean at high concurrency and delivers more HTTP 200 RPS than Caddy and Envoy at c5000, c10000, and c20000. nginx and HAProxy show the static-proxy ceiling for this VM.
Series: nginx, HAProxy, Tako, Envoy, Caddy
tail latency
p99 latency by concurrency
Tako completes every high-load row cleanly, with tail latency published beside RPS so the tradeoff stays visible. nginx is the tightest p99 reference in this run.
Series: nginx, HAProxy, Tako, Envoy, Caddy
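For readers reproducing the p99 figures from the raw rows, here is a minimal sketch of a nearest-rank percentile. The report does not say which percentile method the load generator uses, so the method, the function name `percentile`, and the sample values are all assumptions made for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # 1-based rank; pct * len is computed first to keep the arithmetic exact
    # for integer percentiles.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Made-up latencies in milliseconds, not taken from the report.
latencies_ms = [12, 15, 11, 250, 14, 13, 16, 12, 900, 15]
p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
```

With only ten samples the p99 is simply the slowest request, which is why tail latency needs a long enough measurement window to be meaningful.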
errors
Clean-run behavior by concurrency
The line combines non-200 responses and client-side errors, so lower is better. Tako remains at 0% through c20000 on this run.
Series: nginx, HAProxy, Tako, Envoy, Caddy
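The combined error line can be sketched as a single percentage. The report does not state the denominator, so this sketch assumes it is every attempted request (HTTP 200 responses plus non-200 responses plus client-side errors). The function name `error_percent` is hypothetical.

```python
def error_percent(http_200, non_200, client_errors):
    """Share of attempted requests that did not end in a clean HTTP 200.

    Assumption: attempted = 200s + non-200s + client-side errors.
    """
    attempted = http_200 + non_200 + client_errors
    if attempted == 0:
        return 0.0
    return 100.0 * (non_200 + client_errors) / attempted
```

A row with no non-200 responses and no client errors yields 0.0, which matches how a clean run reads on the chart.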
memory
Proxy memory by concurrency
Memory is published beside throughput so connection cost stays visible. The high-concurrency Tako rows include Pingora/TLS keepalive state from live downstream connections.
Series: nginx, HAProxy, Tako, Envoy, Caddy
| proxy | c5000 RPS | c10000 RPS | c20000 RPS | c20000 p99 | proxy RSS | readout |
|---|---|---|---|---|---|---|
| nginx | 17.7k | 15.3k | 11.0k | 3.8s | 262 MiB | Static-proxy RPS reference |
| HAProxy | 17.1k | 14.8k | 11.2k | 15.7s | 896 MiB | High RPS, wider p99 |
| Tako | 12.5k | 10.4k | 7.3k | 15.5s | 2.7 GiB | Clean through c20000 |
| Envoy | 4.7k | 3.7k | 0.8k | 26.6s | 999 MiB | High-load pressure |
| Caddy | 5.2k | 1.7k | 1.3k | 26.4s | 1.5 GiB | High-load pressure |
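To put the table in relative terms, the short sketch below divides each proxy's throughput by the nginx reference at the same concurrency. The numbers are copied from the table (in thousands of RPS); the names `rps_k` and `relative` are illustrative only.

```python
# Throughput in thousands of RPS, copied from the table above.
rps_k = {
    "nginx":   {"c5000": 17.7, "c10000": 15.3, "c20000": 11.0},
    "HAProxy": {"c5000": 17.1, "c10000": 14.8, "c20000": 11.2},
    "Tako":    {"c5000": 12.5, "c10000": 10.4, "c20000": 7.3},
    "Envoy":   {"c5000": 4.7,  "c10000": 3.7,  "c20000": 0.8},
    "Caddy":   {"c5000": 5.2,  "c10000": 1.7,  "c20000": 1.3},
}


def relative(proxy, reference, level):
    """Throughput of `proxy` as a fraction of `reference` at one concurrency level."""
    return rps_k[proxy][level] / rps_k[reference][level]


for level in ("c5000", "c10000", "c20000"):
    print(level, f"Tako vs nginx: {relative('Tako', 'nginx', level):.0%}")
```

On these figures Tako delivers roughly 71%, 68%, and 66% of the nginx throughput at c5000, c10000, and c20000.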
Channels and workflows
Built-in feature paths, measured separately
These rows exercise more than the proxy. The app uses the JavaScript SDK, publishes durable channel messages, and enqueues workflows with persisted steps while everything shares the same 2 vCPU budget.
built-in features
Channels and workflows HTTP 200 RPS
Both feature paths stay clean through c4000 on the same 2 vCPU VM, while still using the SDK, SQLite-backed persistence, and the proxy path.
Series: Channel publish, Workflow enqueue
feature tail latency
Channels and workflows p99 latency
Workflow enqueue persists steps, so it naturally carries more work than channel publish. Both paths stay clean through c4000 in this single-instance run.
Series: Channel publish, Workflow enqueue
What it means
App-aware routing under load.
Tako stayed clean while doing product work a static reverse proxy does not need to do: route lookup, source IP derivation, per-client limiter accounting, app and instance selection, in-flight accounting, upstream peer construction, and forwarding header normalization.
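To make that per-request work concrete, here is a deliberately simplified sketch of the same steps. It is not Tako's implementation: every name (`Instance`, `App`, `FixedWindowLimiter`, `plan_request`), the fixed-window limiter, and the least-in-flight selection rule are assumptions chosen for illustration.

```python
import time
from dataclasses import dataclass


@dataclass
class Instance:
    addr: str
    in_flight: int = 0


@dataclass
class App:
    name: str
    instances: list


class FixedWindowLimiter:
    """Per-client fixed-window counter (illustrative, not Tako's limiter)."""

    def __init__(self, limit, window_s=1.0):
        self.limit = limit
        self.window_s = window_s
        self.counts = {}  # client ip -> (window index, requests seen)

    def allow(self, client_ip, now=None):
        now = time.monotonic() if now is None else now
        window = int(now // self.window_s)
        seen, count = self.counts.get(client_ip, (window, 0))
        if seen != window:
            count = 0  # a new window started, reset the counter
        if count >= self.limit:
            self.counts[client_ip] = (window, count)
            return False
        self.counts[client_ip] = (window, count + 1)
        return True


# Client-supplied forwarding headers are dropped and rebuilt by the proxy.
DROPPED = {"x-forwarded-for", "x-forwarded-proto", "x-forwarded-host"}


def plan_request(routes, limiter, host, peer_ip, headers, now=None):
    """Return (status, instance, upstream_headers); status 200 means forward."""
    app = routes.get(host.lower())                 # route lookup
    if app is None:
        return 404, None, {}
    client_ip = peer_ip                            # source IP: trust the socket peer only
    if not limiter.allow(client_ip, now):          # per-client limiter accounting
        return 429, None, {}
    instance = min(app.instances, key=lambda i: i.in_flight)  # instance selection
    instance.in_flight += 1                        # caller decrements when the response ends
    upstream = {k: v for k, v in headers.items() if k.lower() not in DROPPED}
    upstream["X-Forwarded-For"] = client_ip        # forwarding header normalization
    upstream["X-Forwarded-Proto"] = "https"
    upstream["X-Forwarded-Host"] = host
    return 200, instance, upstream
```

Even in this toy form, each request touches a route table, a limiter map, and per-instance counters before a single byte is forwarded, which is the kind of bookkeeping a static reverse proxy can skip.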
The report still keeps static proxy references, p99, memory, and clean-run percentages in view because those are the measurements that guide tuning. Future runs can isolate larger-VM behavior, external same-region load generation, and narrower Pingora session and upstream-proxy costs under 10k to 20k live TLS connections.
Method
Same conditions, public raw data.
The public report intentionally omits hostnames, public IPs, private network addresses, peer names, and user identifiers.
Load generator, proxy, and app all run on the same VM. The route is bench.test:18443, resolved to loopback, with Host and SNI set to bench.test.
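A single request against that route can be sketched with the Python standard library: connect to loopback, send bench.test as the TLS SNI name, and set the Host header to match. This is not the load generator used for the report. The request path `/`, the function names, and the choice to disable certificate verification (because the certificate is self-signed) are assumptions.

```python
import socket
import ssl


def make_client_context():
    """TLS context that accepts the self-signed benchmark certificate."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # must be disabled before setting CERT_NONE
    ctx.verify_mode = ssl.CERT_NONE  # benchmark-only: no certificate verification
    return ctx


def build_request(host, path="/"):
    """Minimal HTTP/1.1 request with an explicit Host header."""
    return f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n".encode("ascii")


def fetch_status(connect_ip="127.0.0.1", port=18443, host="bench.test", timeout=60.0):
    """Connect to loopback, present `host` as SNI, and return the HTTP status code."""
    ctx = make_client_context()
    with socket.create_connection((connect_ip, port), timeout=timeout) as raw:
        # server_hostname sets the SNI value even though verification is off.
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(build_request(host))
            status_line = tls.makefile("rb").readline()
    return int(status_line.split()[1])
```

Calling `fetch_status()` only works while one of the proxies is listening on port 18443. Connecting by IP while passing the name separately removes the need for a DNS or hosts-file entry for bench.test.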
Versions tested: Tako (tako-server) 0.0.0-09b3dc6, nginx 1.24.0, HAProxy 2.8.16, Envoy 1.38.0, and Caddy 2.11.3 with rate limiting.
10 second warmup, 30 second measurement window, HTTP/1.1 over TLS, 60 second request timeout, metrics sampled from /proc once per second.
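The once-per-second /proc sampling could look something like the sketch below. The report does not name the files or fields it reads, so using the `VmRSS` line of `/proc/<pid>/status` is an assumption, as are the function names. A proxy that runs several worker processes would need its per-process values summed.

```python
import time


def parse_vm_rss_kib(status_text):
    """Extract resident set size in KiB from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])  # the kernel reports this value in kB
    return None  # e.g. kernel threads have no VmRSS line


def sample_rss(pid, seconds, interval_s=1.0):
    """Collect one RSS sample per interval for `seconds` samples (Linux only)."""
    samples = []
    for _ in range(seconds):
        with open(f"/proc/{pid}/status") as f:
            samples.append(parse_vm_rss_kib(f.read()))
        time.sleep(interval_s)
    return samples
```

For a 30 second measurement window, `sample_rss(pid, 30)` would return 30 readings, from which a peak or mean RSS can be taken.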