You can't fix what you can't see. Upload pipelines are multi-step, asynchronous, and distributed across client and server—exactly the kind of system where problems hide until they compound into user-visible failures. Structured logging transforms your upload infrastructure from a black box into a transparent, debuggable system. This page covers what to log at each stage of the Resumable.js upload lifecycle, how to correlate events using resumableIdentifier, client-side event logging, server-side tracking, alerting thresholds, and the dashboard metrics that actually matter in production. For the full infrastructure reference, see the ops hub.
## What to Log: The Upload Lifecycle
A single file upload in Resumable.js passes through distinct stages. Each stage should emit a structured log entry. Miss one and you'll have a gap in your timeline exactly where the bug lives.
**Upload lifecycle events**
| Event | Where | Key data |
|---|---|---|
| Upload started | Client | resumableIdentifier, filename, file size, total chunks, timestamp |
| Chunk upload begin | Client + Server | resumableIdentifier, resumableChunkNumber, chunk size |
| Chunk upload success | Server | resumableIdentifier, resumableChunkNumber, duration, bytes received |
| Chunk upload error | Client + Server | resumableIdentifier, resumableChunkNumber, HTTP status, error message |
| Chunk test (GET) | Server | resumableIdentifier, resumableChunkNumber, result (found/not found) |
| All chunks received | Server | resumableIdentifier, total chunks, total bytes, elapsed time |
| Assembly started | Server | resumableIdentifier, target path, expected size |
| Assembly complete | Server | resumableIdentifier, final file path, final size, checksum, assembly duration |
| Cleanup | Server | resumableIdentifier, chunks deleted, temp space reclaimed |
That's nine distinct event types for a single file. It sounds like a lot. It's not—each one is a single structured log line, and you'll be grateful for every one of them when debugging a partial upload failure at 2 AM.
## Structured Log Format
Plain text logs are useless at scale. When you're processing thousands of concurrent uploads, grepping through unstructured text for a specific file identifier is slow and error-prone. Use structured JSON logs from the start.
```json
{
  "timestamp": "2025-10-15T14:23:07.441Z",
  "level": "info",
  "event": "chunk_received",
  "resumableIdentifier": "abc123-photo-jpg-4194304",
  "chunkNumber": 7,
  "totalChunks": 25,
  "chunkSize": 2097152,
  "durationMs": 847,
  "userId": "user_9f3a2b",
  "ip": "203.0.113.42"
}
```
Every field is queryable. Want to see all chunks for a specific upload? Filter by resumableIdentifier. Want to find slow chunks? Sort by durationMs. Want to correlate with user reports? Search by userId. Structured logs make these queries trivial in any log aggregation tool—ELK, Loki, Datadog, CloudWatch Logs Insights, whatever your stack uses.
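If you don't already have a structured logger, a minimal JSON-lines logger is only a few lines. This is a sketch, not a library recommendation — the `createLogger` name, the `base` context object, and the field names are illustrative assumptions:

```js
// Minimal JSON-lines logger: one structured object per line on stdout,
// which is what most log shippers (Fluent Bit, Vector, etc.) expect.
// The field names and `base` context shape are illustrative assumptions.
function createLogger(base = {}) {
  const emit = (level, fields) => {
    const entry = { timestamp: new Date().toISOString(), level, ...base, ...fields };
    process.stdout.write(JSON.stringify(entry) + '\n');
    return entry; // returned to make the logger easy to test
  };
  return {
    info: (fields) => emit('info', fields),
    error: (fields) => emit('error', fields),
  };
}

const logger = createLogger({ service: 'upload-api' });
logger.info({
  event: 'chunk_received',
  resumableIdentifier: 'abc123-photo-jpg-4194304',
  chunkNumber: 7,
});
```

In production you would typically reach for an established structured logger (pino, winston, bunyan) instead, but the output contract — one queryable JSON object per line — is the same.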
## Correlation with resumableIdentifier
The resumableIdentifier field is the single most important piece of data in your upload logs. Resumable.js generates this identifier from the file's name, size, and a relative path. It stays consistent across retries, browser refreshes, and even resumed sessions. That makes it the natural correlation key for every log entry related to a specific upload.
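If the default identifier doesn't suit your storage layout, Resumable.js accepts a custom `generateUniqueIdentifier` option. A deterministic sketch — the sanitization pattern below is an illustrative assumption, not Resumable.js's exact default:

```js
// Deterministic identifier from file size + name, so retries, refreshes,
// and resumed sessions all map to the same key. The sanitization regex
// is an illustrative assumption; adapt it to your storage constraints.
function makeUploadIdentifier(fileName, size) {
  const clean = fileName.replace(/[^0-9a-zA-Z_-]/g, '');
  return `${size}-${clean}`;
}

// Browser side, passed to the Resumable.js constructor:
// const r = new Resumable({
//   generateUniqueIdentifier: (file) =>
//     makeUploadIdentifier(file.fileName || file.name, file.size),
// });
```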
On the server, extract it from every request:
```js
app.post('/api/upload', (req, res) => {
  // Resumable.js sends metadata in the body for chunk POSTs
  // and in the query string for test GETs.
  const identifier = req.body.resumableIdentifier || req.query.resumableIdentifier;
  const chunkNumber = parseInt(req.body.resumableChunkNumber || req.query.resumableChunkNumber, 10);
  logger.info({
    event: 'chunk_received',
    resumableIdentifier: identifier,
    chunkNumber: chunkNumber,
    totalChunks: parseInt(req.body.resumableTotalChunks, 10),
    contentLength: req.headers['content-length'],
  });
  // Process chunk...
});
```
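The chunk test (GET) event from the lifecycle table deserves the same treatment. A sketch as a pure function so it's easy to test and framework-agnostic — the `store` of already-received chunks is a hypothetical shape, not part of Resumable.js:

```js
// Decide the response to Resumable.js's test request (the GET it sends
// before each chunk POST when testChunks is enabled) and build the
// matching log entry. `store` is a hypothetical Set of
// "identifier:chunkNumber" keys for chunks already staged on disk.
function handleChunkTest(query, store) {
  const identifier = query.resumableIdentifier;
  const chunkNumber = parseInt(query.resumableChunkNumber, 10);
  const found = store.has(`${identifier}:${chunkNumber}`);
  return {
    status: found ? 200 : 404, // 200 = skip this chunk, 404 = upload it
    logEntry: {
      event: 'chunk_test',
      resumableIdentifier: identifier,
      chunkNumber,
      result: found ? 'found' : 'not_found',
    },
  };
}
```

Wiring it into Express is then one line per concern: compute the result, log `logEntry`, send `status`.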
On the client, attach the same identifier to your logs:
```js
r.on('chunkingComplete', (file) => {
  clientLogger.info({
    event: 'upload_started',
    resumableIdentifier: file.uniqueIdentifier,
    fileName: file.fileName,
    fileSize: file.size,
    totalChunks: file.chunks.length,
  });
});

r.on('fileError', (file, message) => {
  clientLogger.error({
    event: 'upload_error',
    resumableIdentifier: file.uniqueIdentifier,
    error: message,
  });
});
```
With both client and server logs keyed to the same identifier, you can reconstruct the complete timeline of any upload—from the moment the user selected the file to the final assembly on the server.
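Reconstructing that timeline is just a merge-and-sort over both log streams. A sketch, assuming entries shaped like the JSON example above:

```js
// Merge client and server log entries for one upload into a single
// chronological timeline, keyed by resumableIdentifier. Each stream
// is an array of structured entries with ISO-8601 timestamps.
function uploadTimeline(identifier, ...logStreams) {
  return logStreams
    .flat()
    .filter((e) => e.resumableIdentifier === identifier)
    .sort((a, b) => new Date(a.timestamp) - new Date(b.timestamp));
}
```

In practice your log aggregator does this for you (a filter on `resumableIdentifier` plus a time sort), but the operation is worth understanding: it's why the identifier has to appear on every entry, client and server alike.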
## Client-Side Event Logging
Resumable.js exposes a rich event system. The events that matter most for logging:
```js
r.on('fileAdded', (file) => { /* Log file metadata */ });
r.on('fileProgress', (file) => { /* Log progress milestones (25%, 50%, 75%) */ });
r.on('fileSuccess', (file, message) => { /* Log completion */ });
r.on('fileError', (file, message) => { /* Log failure with server response */ });
r.on('fileRetry', (file) => { /* Log retry — this is a signal worth tracking */ });
```
Don't log every fileProgress event—that fires continuously and creates noise. Instead, log at meaningful thresholds or at fixed intervals:
```js
const loggedMilestones = new Set();
r.on('fileProgress', (file) => {
  const percent = Math.floor(file.progress() * 100);
  const milestone = Math.floor(percent / 25) * 25;
  if (milestone > 0 && !loggedMilestones.has(`${file.uniqueIdentifier}-${milestone}`)) {
    loggedMilestones.add(`${file.uniqueIdentifier}-${milestone}`);
    clientLogger.info({
      event: 'upload_progress',
      resumableIdentifier: file.uniqueIdentifier,
      percent: milestone,
    });
  }
});
```
## Alerting Thresholds
Raw logs are useful for debugging. Alerts are useful for not needing to debug in the first place. Define thresholds that trigger notifications before users start complaining.
| Metric | Warning threshold | Critical threshold |
|---|---|---|
| Chunk error rate | > 5% over 5 minutes | > 15% over 5 minutes |
| Assembly failure rate | > 1% over 15 minutes | > 5% over 15 minutes |
| p95 chunk upload time | > 10 s | > 30 s |
| Upload abandonment rate | > 20% over 1 hour | > 40% over 1 hour |
| Temp directory disk usage | > 70% capacity | > 90% capacity |
Chunk error rates above 5% usually indicate a server-side issue—disk full, timeout misconfiguration, or a bad deployment. Assembly failures above 1% suggest data corruption or missing chunks, possibly due to premature cleanup of temporary files.
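As a sketch of how the chunk error rate row translates into code — window size and thresholds come from the table above, while the sample shape is an assumption:

```js
// Sliding-window chunk error rate with warning/critical levels.
// `samples` is an assumed shape: one { timestamp, ok } record per
// chunk POST, where `ok` means the request returned 200.
function chunkErrorAlert(samples, nowMs, windowMs = 5 * 60 * 1000) {
  const recent = samples.filter((s) => nowMs - s.timestamp <= windowMs);
  if (recent.length === 0) return { rate: 0, level: 'ok' };
  const errors = recent.filter((s) => !s.ok).length;
  const rate = errors / recent.length;
  // Thresholds from the table: > 5% warning, > 15% critical.
  const level = rate > 0.15 ? 'critical' : rate > 0.05 ? 'warning' : 'ok';
  return { rate, level };
}
```

A real deployment would express this as an alert rule in your monitoring system (Prometheus, Datadog, CloudWatch) rather than application code, but the arithmetic is identical.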
The upload abandonment rate is one that teams often overlook. If users start uploads and never complete them, something in your pipeline is broken—but it might not throw errors. Maybe timeouts are too aggressive. Maybe the UI doesn't communicate progress clearly. This metric surfaces UX problems that error rates alone won't catch.
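Computing it is a set difference over the identifiers in your logs. A sketch, assuming you've already extracted identifier lists from upload_started and completion events:

```js
// Abandonment rate: fraction of uploads that logged a start event in
// the window but never logged completion. Input lists of
// resumableIdentifier values are an assumed extraction from your logs.
function abandonmentRate(startedIds, completedIds) {
  const started = new Set(startedIds);
  if (started.size === 0) return 0;
  const completed = new Set(completedIds);
  let abandoned = 0;
  for (const id of started) {
    if (!completed.has(id)) abandoned += 1;
  }
  return abandoned / started.size;
}
```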
## Dashboard Metrics
Beyond alerting, maintain a dashboard with these operational metrics:
- Upload throughput: total MB/s across all active uploads, sampled every minute
- Active uploads: count of unique resumableIdentifier values with activity in the last 60 seconds
- Chunk success rate: percentage of chunk POST requests returning 200, bucketed by minute
- p50/p95/p99 chunk duration: latency distribution for chunk uploads
- Assembly queue depth: number of files waiting for assembly after all chunks arrived
- Temp storage utilization: disk usage of your chunk staging directory
The assembly queue depth is particularly telling. If it grows steadily, your assembly process is slower than your ingest rate. You either need faster assembly (parallel concatenation, faster disk) or a queue with backpressure that slows down new uploads when assembly falls behind.
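One way to sketch that backpressure — the queue shape and high-water mark here are illustrative, not a prescribed design:

```js
// Simple backpressure gate: refuse new uploads while the assembly
// queue sits above a high-water mark. The limit is illustrative;
// tune it to your assembly throughput.
class AssemblyQueue {
  constructor(highWaterMark = 100) {
    this.pending = [];
    this.highWaterMark = highWaterMark;
  }
  // Check this before accepting the first chunk of a new upload;
  // respond 503 (with Retry-After) when it returns false.
  acceptNewUploads() {
    return this.pending.length < this.highWaterMark;
  }
  enqueue(identifier) {
    this.pending.push(identifier);
  }
  // Called by the assembly worker when it picks up the next file.
  dequeue() {
    return this.pending.shift();
  }
}
```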
## Log Rotation for Temp Directories
Chunk temporary files accumulate in staging directories and must be cleaned up after assembly—or after a timeout if the upload is abandoned. Log these cleanup operations:
```js
const fs = require('fs/promises');
const path = require('path');

async function cleanupStaleTempFiles(maxAgeMs = 24 * 60 * 60 * 1000) {
  const tempDir = '/tmp/resumable-chunks';
  const entries = await fs.readdir(tempDir, { withFileTypes: true });
  for (const entry of entries) {
    if (entry.isDirectory()) {
      const stat = await fs.stat(path.join(tempDir, entry.name));
      const age = Date.now() - stat.mtimeMs;
      if (age > maxAgeMs) {
        await fs.rm(path.join(tempDir, entry.name), { recursive: true });
        logger.info({
          event: 'temp_cleanup',
          identifier: entry.name,
          ageHours: Math.round(age / 3600000),
          action: 'deleted',
        });
      }
    }
  }
}
```
Run this on a schedule—every hour is reasonable for most deployments. The logs tell you how many abandoned uploads you're cleaning up and how old they are. A sudden spike in stale temp directories is an early warning that something upstream is preventing uploads from completing.
Good logging turns your upload pipeline from "it works, I think" into "I know exactly what happened to every byte." The upfront cost is minimal—a structured logger and a few event handlers. The payoff is every production incident you diagnose in minutes instead of hours.
