Logging & Observability
"Where are my logs?" - the most common question when debugging containers. Docker handles logging differently than traditional deployments. This article covers logging drivers, structured logging, metrics, health checks, and building observable containerized applications.
📋 At a Glance
| Aspect | Details |
|---|---|
| Topic | Logging drivers, structured logging, health checks, metrics |
| Complexity | Intermediate |
| Prerequisites | Basic Docker usage, Part 1 (Container Internals) |
| Key Insight | Containers should log to stdout/stderr - Docker handles the rest |
| Time to Master | 2-3 hours |
🎯 What You'll Learn
- Logging drivers - json-file, syslog, fluentd, and when to use each
- Structured logging - JSON logs for machine parsing
- Health checks - liveness vs readiness, implementing correctly
- Metrics collection - Prometheus patterns for containers
- Debugging without logs - when logs aren't enough
🔥 Production Story: The Silent Failure
An application ran fine for weeks, then started dropping requests. No errors in logs. The team added more replicas, but problems persisted.
Everything looked fine. A closer look told a different story: the health check was too shallow, so the application could still respond to /health but could not process actual requests.
🧠 Mental Model: Container Observability Stack
CONTAINER OBSERVABILITY

APPLICATION
├── Log to stdout/stderr ────────► Logging Driver (Docker daemon)
├── Expose /metrics endpoint ────► Prometheus (monitoring system)
└── Implement /health endpoints ─► Health Check (Docker daemon)

DOCKER DAEMON
├── Logging Driver (json-file, fluentd, etc.)
│   ├── json-file ──► /var/lib/docker/containers/*/
│   ├── syslog ────► syslog server
│   ├── fluentd ───► Fluentd/Fluent Bit
│   └── awslogs ───► CloudWatch
└── Health Check (HEALTHCHECK instruction)
    └── Updates container status

MONITORING SYSTEM
├── Prometheus (scrapes /metrics)
│   └── Grafana (visualization)
└── Log Aggregator (ELK, Loki, etc.)
    └── Receives logs from driver
🔬 Deep Dive
The Twelve-Factor App Logging Principle
Containers follow the 12-factor app methodology for logs:
"Treat logs as event streams. A twelve-factor app never concerns itself with routing or storage of its output stream."
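As an illustration of the principle above, here is a minimal sketch of an application logger that writes only to stdout and leaves routing and storage to the platform. The helper name build_stdout_logger and the log format are assumptions made for this example, not part of any Docker API.

```python
import logging
import sys


def build_stdout_logger(name, stream=None):
    """Return a logger that writes to stdout (or to the given stream)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()   # avoid duplicate handlers on repeated calls
    logger.propagate = False  # keep records out of the root logger
    target = stream if stream is not None else sys.stdout
    handler = logging.StreamHandler(target)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    # No file paths, no rotation logic: the container runtime captures stdout.
    log = build_stdout_logger("orders")
    log.info("service started")
```

The optional stream parameter exists only to make the logger easy to test; in a container the default of sys.stdout is the point.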
Logging Drivers
Docker captures stdout/stderr and sends the output to the configured logging driver:
| Driver | Description | Use Case |
|---|---|---|
| json-file | JSON files on disk (default) | Development, simple deployments |
| syslog | System syslog | Traditional infrastructure |
| journald | systemd journal | Linux with systemd |
| fluentd | Fluentd collector | Kubernetes, centralized logging |
| awslogs | AWS CloudWatch | AWS deployments |
| gcplogs | Google Cloud Logging | GCP deployments |
| none | Disable logging | When logs are handled elsewhere |
JSON-File Driver (Default)
The json-file driver is the default and the most common choice for development and simple deployments.
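To make the json-file format concrete, the sketch below parses records of the kind the driver writes under /var/lib/docker/containers/*/. It assumes the commonly documented layout of one JSON object per line with log, stream, and time keys; the sample lines are invented for illustration.

```python
import json


def parse_json_file_lines(lines):
    """Yield (time, stream, message) for each json-file record."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # The driver stores the trailing newline of each log line; drop it.
        yield record["time"], record["stream"], record["log"].rstrip("\n")


# Invented sample records in the assumed json-file layout.
SAMPLE = [
    '{"log":"server started\\n","stream":"stdout","time":"2024-01-01T00:00:00.000000000Z"}',
    '{"log":"disk almost full\\n","stream":"stderr","time":"2024-01-01T00:00:05.000000000Z"}',
]

if __name__ == "__main__":
    for time, stream, message in parse_json_file_lines(SAMPLE):
        print(f"{time} [{stream}] {message}")
```

In day-to-day work docker logs does this for you; reading the files directly is mainly useful for understanding what the driver stores.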
Structured Logging
JSON logs are machine-parseable:
- Machine-parseable (Elasticsearch, Loki, etc.)
- Filterable by fields
- Aggregatable for metrics
- Consistent format
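A minimal sketch of structured logging with the standard library follows. The JsonFormatter class, the field names, and the convention of passing extra fields under a "fields" key are assumptions chosen for this example; dedicated JSON logging libraries offer richer behavior.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Hypothetical convention: extra={"fields": {...}} adds filterable fields.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload)


def build_json_logger(name, stream=None):
    """Return a logger that emits JSON lines to stdout (or the given stream)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    target = stream if stream is not None else sys.stdout
    handler = logging.StreamHandler(target)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_json_logger("checkout")
    log.info("order placed", extra={"fields": {"order_id": "A-123", "amount": 42.5}})
```

Because every record is a single JSON object, an aggregator can filter on order_id or level without regular expressions.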
Health Checks
Health checks tell Docker (and orchestrators) if a container is working:
| Option | Default | Description |
|---|---|---|
| --interval | 30s | Time between checks |
| --timeout | 30s | Max time for check to complete |
| --start-period | 0s | Initialization grace period |
| --retries | 3 | Consecutive failures before unhealthy |
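Health states:
- starting - Within the start period; failed checks are not counted yet
- healthy - A check has passed and fewer than N consecutive checks have failed since
- unhealthy - The last N consecutive checks failed (N is the --retries value)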
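The state transitions can be sketched as a small simulation. This is a simplified model written for illustration, not Docker's implementation: it ignores the start period and check timing and only replays a sequence of pass/fail results against the retries threshold.

```python
def health_status(results, retries=3):
    """Replay check results (True = passed) and return the final status."""
    status = "starting"
    consecutive_failures = 0
    for passed in results:
        if passed:
            # Any passing check resets the failure streak and marks healthy.
            consecutive_failures = 0
            status = "healthy"
        else:
            consecutive_failures += 1
            if consecutive_failures >= retries:
                status = "unhealthy"
    return status


if __name__ == "__main__":
    print(health_status([True, False, False]))         # streak below retries
    print(health_status([True, False, False, False]))  # streak reaches retries
```

The model shows why a single slow response does not flip a container to unhealthy: only an unbroken run of failures does.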
Metrics with Prometheus
The standard pattern for container metrics is to expose a /metrics endpoint that Prometheus scrapes.
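To show what a scrape of /metrics returns, here is a standard-library sketch that renders a labelled counter in the Prometheus text exposition format. SimpleCounter is a hypothetical class written for this example; a real service would normally use an official Prometheus client library, which also handles label escaping and HTTP serving.

```python
class SimpleCounter:
    """A tiny labelled counter rendered in the Prometheus text format."""

    def __init__(self, name, help_text):
        self.name = name
        self.help_text = help_text
        self._values = {}  # maps a sorted tuple of (label, value) pairs to a float

    def inc(self, amount=1.0, **labels):
        key = tuple(sorted(labels.items()))
        self._values[key] = self._values.get(key, 0.0) + amount

    def render(self):
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            # Label values are not escaped here; this sketch assumes simple strings.
            label_text = ",".join(f'{k}="{v}"' for k, v in key)
            suffix = f"{{{label_text}}}" if label_text else ""
            lines.append(f"{self.name}{suffix} {value}")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    requests_total = SimpleCounter("http_requests_total", "Total HTTP requests")
    requests_total.inc(method="GET", status="200")
    requests_total.inc(method="POST", status="500")
    print(requests_total.render(), end="")
```

Counters only ever increase; Prometheus derives rates from the difference between successive scrapes.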
Log Aggregation Patterns
⚠️ Common Mistakes
Mistake 1: Logging to Files Instead of stdout
Mistake 2: No Log Rotation
Mistake 3: Shallow Health Checks
🐛 Debug This: The Disappearing Logs
A developer reports: "My container logs show nothing! But I know it's writing logs!"
Why doesn't docker logs show anything?
💻 Exercises
Exercise 1: Configure Log Rotation
⭐ Difficulty: Easy | ⏱️ Time: 15 minutes
Exercise 2: Implement Health Check
⭐⭐ Difficulty: Medium | ⏱️ Time: 20 minutes
Exercise 3: Structured Logging
⭐⭐ Difficulty: Medium | ⏱️ Time: 20 minutes
Exercise 4: Prometheus Metrics
⭐⭐⭐ Difficulty: Hard | ⏱️ Time: 30 minutes
Create a complete metrics setup:
Tasks:
- Create a simple app that exposes a /metrics endpoint
- Include request count and duration metrics
- Set up Prometheus to scrape the metrics
- Create a simple Grafana dashboard
Exercise 5: Complete Observability Stack
⭐⭐⭐⭐ Difficulty: Expert | ⏱️ Time: 45 minutes
Build a complete observability setup:
- Application with structured JSON logging
- Health checks (liveness + readiness)
- Prometheus metrics
- Log aggregation with Loki
- Grafana dashboards for both metrics and logs
🎤 Senior-Level Interview Questions
Q1: Why should containers log to stdout instead of files?
"This follows the twelve-factor app methodology and has practical benefits:
- Application produces logs
- Platform handles routing, storage, rotation
- App doesn't need to know about log infrastructure
- Same container works with any logging backend
- json-file for dev, fluentd for prod, CloudWatch for AWS
- No code changes needed
- No rotation logic in app
- No disk full issues from unrotated logs
- No volume mounts for log persistence
- Logs available via docker logs immediately
- Survive container restarts (with json-file driver)
- Aggregatable across containers
- docker logs just works
- No need to exec into container
- Consistent interface across all containers
For applications that can only write to a file, a symlink sends that file to stdout: ln -sf /dev/stdout /var/log/app.log"
Q2: Explain the difference between liveness and readiness health checks.
"They serve different purposes in container orchestration:
Liveness check:
- Question: 'Is the process alive and not deadlocked?'
- Failure action: Restart the container
- Should be: Fast, minimal dependencies
- Example: Can the process respond to a simple request?
GET /health/live
Response: 200 OK (process is alive)
Readiness check:
- Question: 'Can this instance serve traffic?'
- Failure action: Remove from load balancer, don't restart
- Should check: Dependencies (DB, cache, upstream services)
- Example: Are all required connections healthy?
GET /health/ready
Response: 200 if DB + cache + upstream OK, 503 if any dependency is down
Scenario: Database goes down temporarily.
- Readiness fails: Container removed from LB, no traffic
- Liveness passes: Container stays running
- When DB recovers: Readiness passes, traffic resumes
- No unnecessary container restarts
If we only had a liveness check that included the DB:
- DB down → liveness fails → container restarts
- Restarting doesn't fix DB
- Container keeps crash-looping
- Worse than just waiting for DB
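The split between the two checks can be sketched as two plain functions that return a status code and a body. The function names and the shape of the checks dictionary are assumptions for this example; wiring them to /health/live and /health/ready depends on the web framework in use.

```python
def liveness():
    """Liveness: the process can answer at all. No dependency checks."""
    return 200, {"status": "alive"}


def readiness(checks):
    """Readiness: every dependency check must pass, otherwise 503.

    `checks` maps a dependency name to a zero-argument callable returning bool.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # a crashing check counts as "down"
    code = 200 if all(results.values()) else 503
    status = "ready" if code == 200 else "not ready"
    return code, {"status": status, "checks": results}


if __name__ == "__main__":
    # Simulated outage: the database check fails, the cache check passes.
    print(liveness())
    print(readiness({"db": lambda: False, "cache": lambda: True}))
```

With the database down, readiness reports 503 so traffic is withheld, while liveness still reports 200 and the container is not restarted.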
- JSON format for machine parsing
- Include service name in every log
- Include trace/correlation ID
- Propagate trace ID across service calls
- Use OpenTelemetry or similar
- Enables request flow visualization
Container → Logging driver (fluentd) → Collector (Fluentd) → Storage (ES/Loki) → Query UI (Kibana/Grafana)
- ERROR: Requires attention
- WARN: Concerning but handled
- INFO: Business events
- DEBUG: Only in dev/troubleshooting
- Alert on error rate increase
- Alert on specific error types
- Don't alert on every error
The goal is: from any error, I can trace the entire request flow across all services."
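One way to attach a service name and trace ID to every log line, and to pass the ID on to downstream calls, is sketched below with the standard library. The X-Trace-Id header, the SERVICE_NAME value, and the helper names are assumptions for illustration; OpenTelemetry and the W3C traceparent header are the more standardized route.

```python
import contextvars
import json
import logging
import sys
import uuid

# Holds the trace ID of the request currently being handled.
trace_id_var = contextvars.ContextVar("trace_id", default="-")

SERVICE_NAME = "orders"  # hypothetical service name


class TraceJsonFormatter(logging.Formatter):
    """Emit JSON lines that always carry the service name and trace ID."""

    def format(self, record):
        return json.dumps({
            "service": SERVICE_NAME,
            "trace_id": trace_id_var.get(),
            "level": record.levelname,
            "message": record.getMessage(),
        })


def start_request(incoming_headers):
    """Reuse the caller's trace ID or create one; return headers for downstream calls."""
    trace_id = incoming_headers.get("X-Trace-Id") or uuid.uuid4().hex
    trace_id_var.set(trace_id)
    return {"X-Trace-Id": trace_id}


def build_traced_logger(stream=None):
    logger = logging.getLogger("traced")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    target = stream if stream is not None else sys.stdout
    handler = logging.StreamHandler(target)
    handler.setFormatter(TraceJsonFormatter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_traced_logger()
    downstream_headers = start_request({"X-Trace-Id": "abc123"})
    log.error("payment failed")  # the record carries trace_id "abc123"
```

Searching the log store for one trace_id then returns the records from every service that handled the request.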
Q4: How do you configure Docker logging to prevent disk space issues?
"This is a common production issue. The solution involves multiple layers:
Each container: max 50MB logs (5 × 10MB)
Applies to all containers without explicit config.
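The arithmetic behind the rotation limit can be checked with a short sketch. It builds a daemon-level default using the log-driver and log-opts keys (the kind of content that goes in /etc/docker/daemon.json) and computes the per-container ceiling. The helper names are made up for this example, and sizes are assumed to use binary units.

```python
import json

_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a size such as '10m' to bytes (binary units assumed)."""
    text = text.strip().lower()
    if text and text[-1] in _UNITS:
        return int(text[:-1]) * _UNITS[text[-1]]
    return int(text)


def max_log_bytes(log_opts):
    """Upper bound on disk used by one container's rotated log files."""
    return parse_size(log_opts["max-size"]) * int(log_opts["max-file"])


# Default logging configuration for containers without explicit settings.
daemon_config = {
    "log-driver": "json-file",
    "log-opts": {"max-size": "10m", "max-file": "5"},
}

if __name__ == "__main__":
    print(json.dumps(daemon_config, indent=2))
    cap_mib = max_log_bytes(daemon_config["log-opts"]) // 1024 ** 2
    print(cap_mib, "MiB per container")
```

Multiplying the per-container ceiling by the expected number of containers on a host gives a rough budget for the Docker data disk.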
- Production: fluentd/syslog to external storage
- Development: json-file with rotation
- Production: INFO and above
- Debug logs only when troubleshooting
The key is proactive configuration. Default json-file with no limits will eventually fill any disk."
Q5: How would you implement metrics collection for Docker containers?
"I use the Prometheus pull model as the standard approach:
Exposes CPU, memory, network, disk metrics per container.
- Request rate, errors, duration (RED)
- Resource usage (CPU, memory, network)
- Business metrics (orders, users, etc.)
- Dependency health (DB connections, queue depth)
- Per-service panels
- Container resource usage
- Error rates and latencies
- Alerting thresholds
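To make the RED metrics listed above concrete, the sketch below computes rate, error ratio, and a 95th-percentile duration from a batch of request records. In a Prometheus setup these numbers would normally come from queries over counters and histograms; the Request type and red_summary function are hypothetical and exist only to show what each number means.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    duration_s: float
    status: int


def red_summary(requests, window_s):
    """Rate, Errors, Duration for requests observed in a time window."""
    total = len(requests)
    errors = sum(1 for r in requests if r.status >= 500)
    durations = sorted(r.duration_s for r in requests)
    # Nearest-rank 95th percentile; an empty window reports 0.0.
    p95 = durations[max(0, math.ceil(0.95 * total) - 1)] if durations else 0.0
    return {
        "rate_per_s": total / window_s,
        "error_ratio": errors / total if total else 0.0,
        "p95_duration_s": p95,
    }


if __name__ == "__main__":
    sample = [Request(0.1, 200)] * 8 + [Request(0.5, 500), Request(1.0, 200)]
    print(red_summary(sample, window_s=5.0))
```

Alerting on the error ratio rather than on individual errors matches the earlier advice not to alert on every error.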
📝 Summary & Key Takeaways
Logging Best Practices
| Practice | Implementation |
|---|---|
| Log to stdout | App writes to console, Docker captures |
| Structured format | JSON for machine parsing |
| Include context | trace_id, service, request details |
| Configure rotation | max-size and max-file options |
| Centralize logs | Fluentd, Loki, or cloud service |
Health Check Types
| Type | Checks | On Failure |
|---|---|---|
| Liveness | Process alive? | Restart container |
| Readiness | Can serve traffic? | Remove from LB |
| Startup | Initialized? | Delay other checks |
Observability Stack
Application
├── Logs → stdout → Logging driver → Aggregator
├── Metrics → /metrics → Prometheus → Grafana
└── Health → /health/* → Docker/K8s → Orchestrator
📋 Quick Reference
Logging Commands
Health Check Dockerfile
Compose Logging
📅 Review Schedule
| Day | Task | Time |
|---|---|---|
| Day 1 | Review logging driver options | 10 min |
| Day 3 | Configure log rotation in a project | 15 min |
| Day 7 | Implement health check | 20 min |
| Day 14 | Set up structured logging | 25 min |
| Day 30 | Deploy complete monitoring stack | 45 min |
📚 Series Navigation
| Previous | Current | Next |
|---|---|---|
| Part 10: Volumes & Storage | Part 11: Logging & Observability | Part 12: Container Security |
- Part 0: How to Use This Series
- Part 1-10: Previous parts...
- Part 11: Logging & Observability ← You are here
- Part 12: Container Security Hardening