# Structured Logging

- Status: Complete
- Category: Observability
- Default enforcement: Soft
- Author: PushBackLog team

Tags:

- Topic: observability, quality
- Skillset: backend, devops
- Technology: generic
- Stage: execution, deployment
## Summary
Structured logging emits log entries as machine-readable key-value pairs (typically JSON) rather than unstructured text strings. This makes logs queryable, filterable, and aggregatable by monitoring systems — turning what would otherwise be noise into actionable operational data.
## Rationale

### Logs are for machines, not just humans
Unstructured logs like "User 12345 did something at 14:32" are readable in a terminal but unusable at scale. When you have thousands of requests per second across dozens of services, the only way to find what you need is to query your logs. Querying requires fields — which requires structure.
Structured logs enable:

- Filtering by `user_id`, `correlation_id`, `http_status`, `duration_ms`
- Counting error rates by `error_code` over time
- Joining logs with traces and metrics in a unified observability platform (Datadog, Grafana, Honeycomb)
### 12-Factor App: logs as event streams
Factor XI of the 12-Factor App methodology states that logs should be treated as event streams: the application writes to stdout; the platform collects and routes them. This means:
- No file rotation logic in application code
- No log file paths to configure per environment
- Logs visible immediately in any platform (Kubernetes, Heroku, ECS)
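Writing to stdout needs no library at all. The sketch below (service name and fields are illustrative, not from the source) shows the whole contract: one JSON object per line on stdout, with collection and routing left to the platform.

```typescript
// Minimal 12-factor logging sketch: one JSON object per line to stdout.
// No file paths, no rotation logic; the platform collects and routes the stream.
function logEvent(
  level: string,
  message: string,
  fields: Record<string, unknown> = {},
): string {
  const entry = JSON.stringify({
    timestamp: new Date().toISOString(), // ISO 8601, UTC
    level,
    message,
    service: 'payments-api', // hypothetical service name
    ...fields,
  });
  process.stdout.write(entry + '\n');
  return entry;
}

logEvent('info', 'Request received', { userId: 'usr_123' });
```

Production loggers such as pino do the same thing by default, adding level filtering and fast serialisation on top.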
### What to include, what to exclude

Uniformly including a baseline set of fields turns every log line into a queryable event:

| Always include | Never include |
|---|---|
| `timestamp` (ISO 8601, UTC) | Passwords, tokens, API keys |
| `level` (debug, info, warn, error) | Full credit card numbers |
| `message` | PII beyond what is necessary (emails, full names) |
| service name | Health check noise (suppress or reduce verbosity) |
| `correlation_id` / `trace_id` | Stack traces in INFO-level logs |
| `user_id` (opaque, not email) | |
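The "never include" column can be enforced in code rather than by convention. Below is a minimal redaction sketch; the key list is illustrative and should be tuned to your own payloads. (pino users get this built in via its `redact` option.)

```typescript
// Sketch: replace sensitive values before a log entry is serialised.
// SENSITIVE_KEYS is an illustrative starting point, not an exhaustive list.
const SENSITIVE_KEYS = new Set(['password', 'token', 'apiKey', 'authorization', 'cardNumber']);

function redactFields(fields: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(fields)) {
    out[key] = SENSITIVE_KEYS.has(key) ? '[REDACTED]' : value;
  }
  return out;
}
```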
## Guidance

### Log levels and when to use them

| Level | Use for |
|---|---|
| `debug` | Detailed diagnostic information; disabled in production by default |
| `info` | Normal, expected events: request received, order placed |
| `warn` | Unexpected but recoverable: retry succeeded, deprecated endpoint called |
| `error` | Unexpected failure requiring attention: uncaught exception, payment failed |
| `fatal` | System cannot continue; process about to exit |
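Most loggers treat these levels as an ordered severity scale: an event is emitted only if its level meets the configured threshold. A sketch of that filtering rule:

```typescript
// Sketch: levels as an ordered scale; an event passes only if it
// is at least as severe as the configured threshold.
const LEVELS = ['debug', 'info', 'warn', 'error', 'fatal'] as const;
type Level = (typeof LEVELS)[number];

function shouldLog(eventLevel: Level, threshold: Level): boolean {
  return LEVELS.indexOf(eventLevel) >= LEVELS.indexOf(threshold);
}
```

This is why `debug` events vanish in production when the threshold is `info`.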
### Using a structured logger

```typescript
import pino from 'pino';

const logger = pino({
  level: process.env.LOG_LEVEL ?? 'info',
  base: { service: 'payments-api' }, // injected into every log line
});

// Good: structured fields separate from message
logger.info({ orderId, userId, amount }, 'Payment processed');
// Output (simplified): {"level":"info","service":"payments-api","orderId":"ord_abc","userId":"usr_123","amount":4900,"msg":"Payment processed"}

// Bad: interpolated strings — unqueryable
console.log(`Payment processed for user ${userId}, order ${orderId}`);
```
### Correlation IDs across services

Attach a correlation ID on inbound requests and propagate it to all downstream calls and log lines:

```typescript
app.use((req, res, next) => {
  const correlationId = (req.headers['x-correlation-id'] as string) ?? crypto.randomUUID();
  req.logger = logger.child({ correlationId, path: req.path });
  res.setHeader('x-correlation-id', correlationId);
  next();
});

// In handlers: req.logger.info({ userId }, 'Starting checkout');
// Every log line for this request shares the correlationId — queryable as a group
```
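Propagation to downstream services is the half that is easy to forget. Here is a helper sketch (the name and usage are assumptions, not an established API) that stamps the inbound ID onto outgoing request headers so the next service logs the same ID:

```typescript
// Sketch: merge the inbound correlation ID into outgoing request headers,
// so the downstream service's log lines carry the same ID.
function withCorrelation(
  correlationId: string,
  headers: Record<string, string> = {},
): Record<string, string> {
  return { ...headers, 'x-correlation-id': correlationId };
}

// Usage (hypothetical URL):
// fetch('https://inventory.internal/reserve', { headers: withCorrelation(correlationId) });
```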
## Examples

### Structured vs unstructured side-by-side

```typescript
// Unstructured: try finding all orders of £50 or more that failed
console.log(`Order failed: order_id=${orderId}, user=${userId}, amount=50.00, error: Payment declined`);

// Structured: trivially queryable
logger.error({
  orderId,
  userId,
  amountPence: 5000,
  currency: 'GBP',
  errorCode: 'PAYMENT_DECLINED',
  provider: 'stripe',
}, 'Order payment failed');

// Query: level=error AND errorCode=PAYMENT_DECLINED AND amountPence>=5000
```
### Request logging middleware

```typescript
app.use((req, res, next) => {
  const start = Date.now();
  res.on('finish', () => {
    req.logger.info({
      method: req.method,
      path: req.path,
      statusCode: res.statusCode,
      durationMs: Date.now() - start,
    }, 'Request completed');
  });
  next();
});
```
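One common refinement of request logging (an assumption on our part, not part of the source example) is to derive the log level from the response status, so failed requests surface at `warn` or `error` without extra code in every handler:

```typescript
// Sketch: map HTTP status to log level. 5xx is the server's fault (error),
// 4xx is the client's (warn), everything else is routine (info).
function levelForStatus(statusCode: number): 'info' | 'warn' | 'error' {
  if (statusCode >= 500) return 'error';
  if (statusCode >= 400) return 'warn';
  return 'info';
}
```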
## Anti-patterns

### 1. String concatenation in log messages

```typescript
console.log("User " + userId + " paid " + amount); // Unqueryable
```

Log the values as structured fields; keep the message as a static, human-readable label.
### 2. Logging sensitive data
API keys, auth tokens, passwords, and PII have no place in log output. Logs are often shipped to third-party platforms (Datadog, Splunk) and stored for months. A secret in a log line is an ongoing credential leak.
### 3. Every-request debug logging in production

Debug-level logs in production can double or triple log volume and cost. Set `LOG_LEVEL=info` in production; enable debug logging per request with a header flag, or per service via config, when diagnosing.
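The per-request header flag can be a one-line check. A sketch, assuming a trusted `x-debug` header (in production, gate it behind auth or an internal allow-list):

```typescript
// Sketch: escalate to debug for a single request when the flag header is present.
function effectiveLevel(
  baseLevel: string,
  headers: Record<string, string | undefined>,
): string {
  return headers['x-debug'] === '1' ? 'debug' : baseLevel;
}
```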
### 4. No correlation IDs
Without a shared ID, tracing a user’s failed checkout across 3 services requires matching timestamps manually. Correlation IDs reduce a 30-minute debugging session to a single query.
## Related practices
Part of the PushBackLog Best Practices Library.