Timeouts are the silent killer of large file uploads. Everything looks fine on a fast office connection, then a user on hotel Wi-Fi hits a 60-second proxy timeout mid-chunk and the whole upload stalls with no useful error. Configuring timeouts correctly across the browser, application server, reverse proxy, and load balancer is essential to making Resumable.js uploads actually resumable in practice. This page covers every timeout layer in the stack, the math for choosing values, keepalive behavior, AbortController patterns, and strategies for detecting stuck uploads. For the complete list of infrastructure topics, see the ops hub.
The Timeout Stack
A single chunk upload passes through at least four timeout boundaries before it completes. Each one can kill the connection independently, and the error messages are rarely helpful.
| Layer | Config directive | Typical default |
|---|---|---|
| Browser | XMLHttpRequest.timeout / AbortController | None (waits forever) |
| Load balancer | Idle timeout | 60 s (AWS ALB), 30 s (some others) |
| Reverse proxy (Nginx) | proxy_read_timeout | 60 s |
| Application server | Request timeout | Varies (30–120 s) |
When a chunk upload takes longer than the shortest timeout in this chain, the connection drops. The browser sees a network error. Resumable.js retries. If the timeout is consistently too short for the chunk size and user bandwidth, the retry also fails. The upload enters a death loop of failed retries, and the user sees a progress bar stuck at 47%.
Timeout Math: Chunk Size and Throughput
Here's the fundamental question: how long should a single chunk upload be allowed to take? The answer comes from dividing chunk size by the slowest realistic upload speed you want to support, then adding headroom.
timeout = (chunkSize in bytes / min expected throughput in bytes per second) × safety multiplier
A 2 MB chunk on a 500 Kbps upload connection (roughly 62.5 KB/s) takes about 32 seconds to transfer. With a 2x safety multiplier, you'd set your timeout to 64 seconds. At 60 seconds, you're already cutting it close—and that's exactly why the default Nginx proxy_read_timeout of 60 seconds breaks uploads for slower connections.
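The arithmetic above can be sketched as a small helper. The function name and defaults are illustrative, not part of Resumable.js:

```javascript
// Hypothetical helper: derive a chunk timeout from chunk size and the
// slowest upload speed you want to support.
function chunkTimeoutSeconds(chunkSizeBytes, minKbps, safetyMultiplier = 2) {
  const bytesPerSecond = (minKbps * 1000) / 8; // Kbps → bytes/s
  const transferSeconds = chunkSizeBytes / bytesPerSecond;
  return Math.ceil(transferSeconds * safetyMultiplier);
}

// 2 MB chunk on a 500 Kbps link: ~32 s transfer, 64 s with 2x headroom
console.log(chunkTimeoutSeconds(2 * 1000 * 1000, 500)); // 64
```

Run this against your own chunk size and worst-case bandwidth before picking server-side values; the table below was produced with the same reasoning.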
Practical timeout values
| Chunk size | Min throughput | Transfer time | Recommended timeout |
|---|---|---|---|
| 1 MB | 500 Kbps | ~16 s | 45 s |
| 2 MB | 500 Kbps | ~32 s | 90 s |
| 5 MB | 500 Kbps | ~80 s | 180 s |
| 2 MB | 2 Mbps | ~8 s | 30 s |
If you support mobile users or users in regions with constrained bandwidth, err on the generous side. A timeout that's too long wastes a few seconds on genuinely dead connections. A timeout that's too short kills legitimate uploads.
Server-Side Timeout Configuration
Nginx
```nginx
location /api/upload {
    proxy_read_timeout    180s;
    proxy_send_timeout    180s;
    proxy_connect_timeout 10s;
    client_max_body_size  20m;
    proxy_pass http://upload_backend;
}
```
proxy_read_timeout is the one that matters most. It governs how long Nginx waits between two successive read operations from the upstream server. During a chunk upload, the upstream is busy receiving and writing data—if it doesn't send any response bytes within this window, Nginx drops the connection.
proxy_connect_timeout should stay short (5–10 seconds). If your backend isn't accepting connections within 10 seconds, something is wrong and you want to know immediately.
Apache
Note that `Timeout` and `ProxyTimeout` are server- or virtual-host-level directives; they are not valid inside a `<Location>` block:

```apache
Timeout 180
ProxyTimeout 180

<Location "/api/upload">
    ProxyPass http://localhost:3000/api/upload
    ProxyPassReverse http://localhost:3000/api/upload
</Location>
```
Load Balancers
AWS ALB defaults to a 60-second idle timeout. If your chunk uploads can exceed 60 seconds, increase it. The setting is an attribute of the load balancer itself (not of a target group): set idle_timeout.timeout_seconds to at least your longest expected chunk transfer time plus a buffer.
Google Cloud Load Balancing has a similar timeoutSec on the backend service. Azure Application Gateway uses requestTimeout. Every managed load balancer has this knob. Find it and set it before you go to production.
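As one concrete example, the ALB attribute can be set with the AWS CLI (the ARN is a placeholder):

```shell
# Raise the ALB idle timeout to 200 s; substitute your own load balancer ARN.
aws elbv2 modify-load-balancer-attributes \
  --load-balancer-arn arn:aws:elasticloadbalancing:region:account:loadbalancer/app/my-lb/abc123 \
  --attributes Key=idle_timeout.timeout_seconds,Value=200
```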
Browser-Side Timeout Handling
Resumable.js doesn't set an XMLHttpRequest.timeout by default, which means the browser will wait indefinitely for a response. That's actually fine for most deployments—the server-side timeouts act as the backstop. But there are situations where you want client-side abort control.
AbortController Pattern
Resumable.js issues its chunk requests through XMLHttpRequest, so a fetch-style AbortController cannot cancel those requests directly. What you can tune on the library itself is how it behaves after a failed or timed-out chunk:
```javascript
const r = new Resumable({
  target: '/api/upload',
  chunkSize: 2 * 1024 * 1024,
  chunkRetryInterval: 5000,
  maxChunkRetries: 5,
});
```
The chunkRetryInterval setting controls how long Resumable.js waits before retrying a failed chunk, in milliseconds. After a timeout-induced failure, a 5-second pause gives transient network issues time to resolve before the retry fires. Setting this too low creates a stampede of retries against a server that might already be under load.
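If you manage chunk requests yourself with fetch rather than through Resumable.js, the AbortController pattern applies directly. This is a generic sketch, not Resumable.js internals; the '/api/upload' target and function name are illustrative:

```javascript
// Generic fetch-based sketch: abort a chunk request that exceeds a
// deadline so the caller can schedule a retry.
async function uploadChunkWithTimeout(chunk, timeoutMs) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // signal ties the request's lifetime to the controller
    return await fetch('/api/upload', {
      method: 'POST',
      body: chunk,
      signal: controller.signal,
    });
  } finally {
    clearTimeout(timer); // never let the timer fire after completion
  }
}
```

In runtimes that support it, AbortSignal.timeout(timeoutMs) replaces the manual setTimeout/clearTimeout bookkeeping.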
Detecting Stuck Uploads
A chunk upload can appear alive (the connection is open, bytes are trickling) but effectively stalled. The progress event stops firing, or fires with negligible increments. Detecting this requires tracking progress over time:
```javascript
// Note: this tracks a single file; for concurrent uploads, keep
// per-file state keyed by file.uniqueIdentifier.
let lastProgress = 0;
let lastProgressTime = Date.now();

r.on('fileProgress', (file) => {
  const currentProgress = file.progress();
  if (currentProgress > lastProgress) {
    lastProgress = currentProgress;
    lastProgressTime = Date.now();
  } else if (Date.now() - lastProgressTime > 30000) {
    // No progress for 30 seconds: cancel and retry
    file.retry();
    lastProgressTime = Date.now();
  }
});
```
This pattern catches the scenario where the TCP connection stays open but throughput drops to effectively zero—something that pure timeout values won't detect.
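For uploads with several files in flight, the single-file tracker above can be generalized. This sketch (checkStall and STALL_MS are illustrative names, not Resumable.js API) keeps per-file state keyed by Resumable's uniqueIdentifier:

```javascript
// Multi-file stall detector: track progress per file so one stalled
// file doesn't mask another. STALL_MS should match your chunk timeout math.
const STALL_MS = 30000;
const lastSeen = new Map(); // uniqueIdentifier → { progress, time }

function checkStall(file, now = Date.now()) {
  const p = file.progress();
  const prev = lastSeen.get(file.uniqueIdentifier);
  if (!prev || p > prev.progress) {
    // First observation, or progress was made: reset the clock.
    lastSeen.set(file.uniqueIdentifier, { progress: p, time: now });
    return false;
  }
  if (now - prev.time > STALL_MS) {
    // No progress for STALL_MS: report a stall and reset the clock.
    lastSeen.set(file.uniqueIdentifier, { progress: p, time: now });
    return true;
  }
  return false;
}
```

Wire it into the progress event with `r.on('fileProgress', (file) => { if (checkStall(file)) file.retry(); });`.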
Keepalive Considerations
HTTP keepalive reuses TCP connections across multiple chunk requests, eliminating the overhead of TCP handshake and TLS negotiation for each chunk. This is almost always desirable. But keepalive introduces its own timeout dimension.
Nginx's keepalive_timeout (default 75 seconds) controls how long an idle keepalive connection stays open. If there's a gap between chunk uploads longer than this value—say the user's browser pauses between files—the connection closes silently. The next chunk request opens a new connection, which is fine but adds latency.
More dangerous: the reverse proxy and the load balancer race to close idle keepalive connections. If Nginx's keepalive_timeout is shorter than the load balancer's idle timeout, Nginx can close a connection at the same moment the load balancer forwards a new request on it. That request lands on a dead connection and fails, typically surfacing to the client as a 502 Bad Gateway.
The fix: set your reverse proxy's keepalive_timeout slightly longer than your load balancer's idle timeout (for example, keepalive_timeout 240s behind a load balancer with a 200-second idle timeout). That way the load balancer always closes or recycles the connection first and never forwards a request onto a connection the proxy has already torn down. AWS's ALB documentation makes the same recommendation for targets.
Timeout Alignment Checklist
Before deploying, verify that your timeouts decrease from the outside in, so the innermost layer gives up first and can return a meaningful error:
- Browser abort (optional): longest, or disabled
- Load balancer idle timeout: generous (120–300 s)
- Reverse proxy read timeout: slightly less than the load balancer's
- Application server timeout: slightly less than the reverse proxy's

For example: application server 160 s, Nginx proxy_read_timeout 180 s, load balancer idle timeout 200 s, browser timeout disabled.
If any inner timeout is longer than an outer timeout, the outer layer kills the connection while the inner layer is still happily working. The user sees an error. The server might keep processing a chunk that nobody is waiting for anymore. Aligning these values from the outside in prevents ghost processing and gives meaningful error messages at every layer.
Timeouts feel like a boring configuration task until they cause a production incident. Measure your slowest realistic user, do the math, set every layer accordingly, and revisit the numbers when you change chunk sizes. A fifteen-minute configuration session now saves a midnight debugging session later.
