Logging & Observability

"Where are my logs?" - the most common question when debugging containers. Docker handles logging differently than traditional deployments. This article covers logging drivers, structured logging, metrics, health checks, and building observable containerized applications.

📋 At a Glance

Aspect	Details
Topic	Logging drivers, structured logging, health checks, metrics
Complexity	Intermediate
Prerequisites	Basic Docker usage, Part 1 (Container Internals)
Key Insight	Containers should log to stdout/stderr - Docker handles the rest
Time to Master	2-3 hours

🎯 What You'll Learn

Logging drivers - json-file, syslog, fluentd, and when to use each
Structured logging - JSON logs for machine parsing
Health checks - liveness vs readiness, implementing correctly
Metrics collection - Prometheus patterns for containers
Debugging without logs - when logs aren't enough

🔥 Production Story: The Silent Failure

An application ran fine for weeks, then started dropping requests. No errors in logs. The team added more replicas, but problems persisted.

Investigation:

BASH(6 lines)

Everything looked fine. But checking deeper:

BASH(5 lines)

Root cause: The health check only verified that the HTTP server was responding. The database connection pool was exhausted, so the app could answer /health but could not process actual requests.

The fix:

DOCKERFILE(2 lines)

JAVASCRIPT(10 lines)

Lesson: Health checks must verify actual functionality, not just "is the process running."
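
A minimal sketch of a check that exercises the database path, written in Python rather than the article's JavaScript. It assumes a hypothetical fixed-size connection pool, with `sqlite3` standing in for the real database; `make_pool` and `deep_health` are illustrative names, not the original fix. The check has to borrow a connection and run a query, so an exhausted pool produces a 503 instead of a misleading 200.

```python
import queue
import sqlite3
from typing import Tuple


def make_pool(size: int) -> "queue.Queue[sqlite3.Connection]":
    """Hypothetical fixed-size pool; sqlite3 is only a stand-in database."""
    pool: "queue.Queue[sqlite3.Connection]" = queue.Queue(maxsize=size)
    for _ in range(size):
        pool.put(sqlite3.connect(":memory:", check_same_thread=False))
    return pool


def deep_health(pool: "queue.Queue[sqlite3.Connection]",
                timeout: float = 0.5) -> Tuple[int, dict]:
    """Return (HTTP status, body) based on a real round trip to the database."""
    try:
        # Blocks up to `timeout` seconds; raises queue.Empty if every
        # connection is checked out (the failure mode in the story above).
        conn = pool.get(timeout=timeout)
    except queue.Empty:
        return 503, {"status": "unhealthy", "database": "pool exhausted"}
    try:
        conn.execute("SELECT 1").fetchone()
        return 200, {"status": "healthy", "database": "ok"}
    except sqlite3.Error as exc:
        return 503, {"status": "unhealthy", "database": str(exc)}
    finally:
        pool.put(conn)  # always hand the connection back


if __name__ == "__main__":
    pool = make_pool(size=1)
    print(deep_health(pool))
    leaked = pool.get()  # simulate a connection that is never returned
    print(deep_health(pool, timeout=0.1))
```

The short timeout matters: a health check that waits indefinitely for a connection would itself hang and only fail through the HEALTHCHECK timeout.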

🧠 Mental Model: Container Observability Stack

┌─────────────────────────────────────────────────────────────────────────┐
│                    CONTAINER OBSERVABILITY                              │
│                                                                         │
│  ┌──────────────────────────────────────────────────────────────────────┐
│  │                      APPLICATION                                     │
│  │                                                                      │
│  │  Log to stdout/stderr ──────────────────────────────────────────┐    │
│  │  Expose /metrics endpoint ──────────────────────────────────┐   │    │ 
│  │  Implement /health endpoints ─────────────────────────────┐ │   │    │
│  └───────────────────────────────────────────────────────────┼─┼───┼────┘
│                                                              │ │   │
│  ┌───────────────────────────────────────────────────────────┼─┼───┼────┐
│  │                   DOCKER DAEMON                           │ │   │    │
│  │                                                           │ │   │    │
│  │  Logging Driver ◄─────────────────────────────────────────┘ │   │    │
│  │  (json-file, fluentd, etc.)                                 │   │    │
│  │      │                                                      │   │    │
│  │      ├── json-file ──► /var/lib/docker/containers/*/        │   │    │
│  │      ├── syslog ────► syslog server                         │   │    │
│  │      ├── fluentd ───► Fluentd/Fluent Bit                    │   │    │
│  │      └── awslogs ───► CloudWatch                            │   │    │
│  │                                                             │   │    │
│  │  Health Check ◄─────────────────────────────────────────────┘   │    │
│  │  (HEALTHCHECK instruction)                                      │    │
│  │      │                                                          │    │
│  │      └── Updates container status                               │    │
│  │                                                                 │    │
│  └─────────────────────────────────────────────────────────────────┼────┘
│                                                                    │
│  ┌─────────────────────────────────────────────────────────────────┼───┐
│  │                   MONITORING SYSTEM                             │   │
│  │                                                                 │   │
│  │  Prometheus ◄───────────────────────────────────────────────────┘   │
│  │  (scrapes /metrics)                                                 │
│  │      │                                                              │
│  │      └── Grafana (visualization)                                    │
│  │                                                                     │
│  │  Log Aggregator (ELK, Loki, etc.)                                   │
│  │      │                                                              │
│  │      └── Receives logs from driver                                  │
│  │                                                                     │
│  └─────────────────────────────────────────────────────────────────────┘
│                                                                        │
└────────────────────────────────────────────────────────────────────────┘

🔬 Deep Dive

The Twelve-Factor App Logging Principle

Containers follow the 12-factor app methodology for logs:

"Treat logs as event streams. A twelve-factor app never concerns itself with routing or storage of its output stream."

Translation: Write to stdout/stderr. Let the platform (Docker) handle the rest.
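
As a rough illustration, the Python sketch below sends all log output to stdout and never opens a log file. The logger name `app` and the helper `build_logger` are assumptions made for this example.

```python
import logging
import sys


def build_logger(name: str = "app") -> logging.Logger:
    """Return a logger whose only destination is the process's stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
        )
        logger.addHandler(handler)
    logger.propagate = False  # keep records from being emitted twice via root
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.info("server started")
    log.error("upstream request failed")
```

Because nothing here refers to a path on disk, the same image behaves identically whichever logging driver the platform attaches.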

PYTHON(8 lines)

JAVASCRIPT(6 lines)

Logging Drivers

Docker captures each container's stdout and stderr and passes them to a logging driver:

BASH(8 lines)

Available drivers:

Driver	Description	Use Case
json-file	JSON files on disk (default)	Development, simple deployments
syslog	System syslog	Traditional infrastructure
journald	systemd journal	Linux with systemd
fluentd	Fluentd collector	Centralized logging via Fluentd or Fluent Bit
awslogs	AWS CloudWatch	AWS deployments
gcplogs	Google Cloud Logging	GCP deployments
none	Disable logging	When logs handled elsewhere
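
The daemon-wide default driver is set in the Docker daemon configuration file (commonly /etc/docker/daemon.json on Linux). As a hedged sketch, the Python below renders one plausible logging section for that file. The helper name `daemon_logging_config` and the chosen values are assumptions; the `max-size` and `max-file` options shown apply to the json-file driver.

```python
import json


def daemon_logging_config(driver: str = "json-file",
                          max_size: str = "10m",
                          max_file: int = 3) -> dict:
    """Build the logging-related keys of a daemon.json document."""
    return {
        "log-driver": driver,
        # Values under log-opts must be strings, even numeric ones
        # such as max-file.
        "log-opts": {"max-size": max_size, "max-file": str(max_file)},
    }


if __name__ == "__main__":
    print(json.dumps(daemon_logging_config(), indent=2))
```

A changed default only affects containers created after the daemon picks up the new configuration; existing containers keep the driver they were created with.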

Configure default driver:

JSON(8 lines)

JSON-File Driver (Default)

Most common for development and simple deployments:

BASH(11 lines)

Configure log rotation:

YAML(10 lines)

BASH(5 lines)

Structured Logging

JSON logs are machine-parseable:

PYTHON(21 lines)

JAVASCRIPT(5 lines)

Benefits of structured logging:

Machine-parseable (Elasticsearch, Loki, etc.)
Filterable by fields
Aggregatable for metrics
Consistent format
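
One way to get these properties in Python is a custom `logging.Formatter` that emits one JSON object per line. This is a sketch under assumptions: the class name `JsonFormatter` and the context fields `service`, `trace_id`, and `user_id` are illustrative choices, not a fixed schema.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single-line JSON object."""

    # Hypothetical context fields; supply them with extra={...} when logging.
    CONTEXT_FIELDS = ("service", "trace_id", "user_id")

    def format(self, record: logging.LogRecord) -> str:
        entry = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        for field in self.CONTEXT_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        # default=str keeps unusual values (dates, UUIDs) from raising.
        return json.dumps(entry, default=str)


def build_json_logger(name: str = "app") -> logging.Logger:
    """Logger that writes JSON lines to stdout."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_json_logger("orders")
    log.info("order created", extra={"service": "orders", "trace_id": "abc123"})
```

Keeping every record on one line matters with Docker, because the logging driver treats each line of output as a separate log entry.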

Health Checks

Health checks tell Docker (and orchestrators) if a container is working:

DOCKERFILE(11 lines)

Health check options:

Option	Default	Description
`--interval`	30s	Time between checks
`--timeout`	30s	Max time for check to complete
`--start-period`	0s	Initialization grace period
`--retries`	3	Failures before unhealthy
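
How these options interact is easier to see as a small state machine. The sketch below is a simplified model of Docker's documented behavior, not Docker's implementation. The class `HealthTracker` and its fields are hypothetical, and interval and timeout handling are left out.

```python
from dataclasses import dataclass


@dataclass
class HealthTracker:
    """Simplified model of how check results change a container's health."""

    retries: int = 3            # consecutive failures before "unhealthy"
    start_period: float = 0.0   # seconds of grace after container start
    status: str = "starting"
    failing_streak: int = 0

    def record(self, passed: bool, elapsed: float) -> str:
        """Apply one check result taken `elapsed` seconds after start."""
        if passed:
            # A single success is enough to become (or stay) healthy.
            self.failing_streak = 0
            self.status = "healthy"
            return self.status
        if self.status == "starting" and elapsed < self.start_period:
            # Failures inside the grace period are not counted, as long as
            # no check has succeeded yet.
            return self.status
        self.failing_streak += 1
        if self.failing_streak >= self.retries:
            self.status = "unhealthy"
        return self.status


if __name__ == "__main__":
    tracker = HealthTracker(retries=3, start_period=10.0)
    for passed, at in [(False, 2), (True, 5), (False, 35), (False, 65), (False, 95)]:
        print(at, tracker.record(passed, at))
```

The practical consequence is that a slow-starting service needs a start period long enough to cover its initialization; otherwise early failures count against the retry budget.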

Container health states:

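starting - Initial state; failures during the start period do not count toward retries
healthy - The most recent check passed (one success is enough)
unhealthy - The number of consecutive failures set by --retries has been reached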

BASH(5 lines)

Comprehensive health check pattern:

JAVASCRIPT(26 lines)

Metrics with Prometheus

Standard pattern for container metrics:

JAVASCRIPT(39 lines)

Prometheus scrape config:

YAML(6 lines)

Log Aggregation Patterns

Pattern 1: Sidecar container:

YAML(15 lines)

Pattern 2: Logging driver to aggregator:

YAML(15 lines)

Pattern 3: stdout to Loki:

YAML(18 lines)

⚠️ Common Mistakes

Mistake 1: Logging to Files Instead of stdout

DOCKERFILE(8 lines)

Mistake 2: No Log Rotation

YAML(15 lines)

Mistake 3: Shallow Health Checks

DOCKERFILE(8 lines)

🐛 Debug This: The Disappearing Logs

A developer reports: "My container logs show nothing! But I know it's writing logs!"

BASH(9 lines)

Why doesn't docker logs show anything?

✅ Solution:

The application is logging to a file instead of stdout/stderr. Docker only captures stdout/stderr streams.

Fixes:

1. Configure application to log to stdout:

BASH(2 lines)

2. Redirect file to stdout in Dockerfile:

DOCKERFILE(2 lines)

3. Use tail in entrypoint:

DOCKERFILE(2 lines)

4. Change application logging config:

PYTHON(7 lines)

The correct fix is usually option 1 or 4 - configure the application properly. Options 2 and 3 are workarounds.

12-factor principle: Applications should never manage log files. Write to stdout, let the platform handle routing and storage.

💻 Exercises

Exercise 1: Configure Log Rotation

⭐ Difficulty: Easy | ⏱️ Time: 15 minutes

BASH(20 lines)

Exercise 2: Implement Health Check

⭐⭐ Difficulty: Medium | ⏱️ Time: 20 minutes

BASH(57 lines)

Exercise 3: Structured Logging

⭐⭐ Difficulty: Medium | ⏱️ Time: 20 minutes

BASH(47 lines)

Exercise 4: Prometheus Metrics

⭐⭐⭐ Difficulty: Hard | ⏱️ Time: 30 minutes

Create a complete metrics setup:

YAML(22 lines)

YAML(8 lines)

Tasks:

Create a simple app that exposes /metrics endpoint
Include request count and duration metrics
Set up Prometheus to scrape the metrics
Create a simple Grafana dashboard

Exercise 5: Complete Observability Stack

⭐⭐⭐⭐ Difficulty: Expert | ⏱️ Time: 45 minutes

Build a complete observability setup:

Application with structured JSON logging
Health checks (liveness + readiness)
Prometheus metrics
Log aggregation with Loki
Grafana dashboards for both metrics and logs

YAML(7 lines)

🎤 Senior-Level Interview Questions

Q1: Why should containers log to stdout instead of files?

Strong Answer:

"This follows the twelve-factor app methodology and has practical benefits:

1. Separation of concerns:

Application produces logs
Platform handles routing, storage, rotation
App doesn't need to know about log infrastructure

2. Portability:

Same container works with any logging backend
json-file for dev, fluentd for prod, CloudWatch for AWS
No code changes needed

3. No log file management:

No rotation logic in app
No disk full issues from unrotated logs
No volume mounts for log persistence

4. Container lifecycle:

Logs available via docker logs immediately
Survive container restarts (with json-file driver)
Aggregatable across containers

5. Debugging:

docker logs just works
No need to exec into container
Consistent interface across all containers

The exception is when you need file-based logging for specific tools. In that case, symlink the file to /dev/stdout: ln -sf /dev/stdout /var/log/app.log"

Q2: Explain the difference between liveness and readiness health checks.

Strong Answer:

"They serve different purposes in container orchestration:

Liveness:

Question: 'Is the process alive and not deadlocked?'
Failure action: Restart the container
Should be: Fast, minimal dependencies
Example: Can the process respond to a simple request?

GET /health/live
Response: 200 OK (process is alive)

Readiness:

Question: 'Can this instance serve traffic?'
Failure action: Remove from load balancer, don't restart
Should check: Dependencies (DB, cache, upstream services)
Example: Are all required connections healthy?

GET /health/ready
Response: 200 if DB + cache + upstream OK
         503 if any dependency is down

Why separate them:

Scenario: Database goes down temporarily.

Readiness fails: Container removed from LB, no traffic
Liveness passes: Container stays running
When DB recovers: Readiness passes, traffic resumes
No unnecessary container restarts

If we only had liveness including DB check:

DB down → liveness fails → container restarts
Restarting doesn't fix DB
Container keeps crash-looping
Worse than just waiting for DB

In Docker:

DOCKERFILE

In Kubernetes:

YAML(17 lines)

JSON format for machine parsing
Include service name in every log
Include trace/correlation ID

2. Distributed tracing:

Propagate trace ID across service calls
Use OpenTelemetry or similar
Enables request flow visualization

3. Centralized aggregation:

Container → Logging driver → Collector → Storage → Query UI
           (fluentd)       (Fluentd)   (ES/Loki) (Kibana/Grafana)

4. Log levels strategically:

ERROR: Requires attention
WARN: Concerning but handled
INFO: Business events
DEBUG: Only in dev/troubleshooting

5. Include context:

PYTHON(6 lines)

6. Alerting on patterns:

Alert on error rate increase
Alert on specific error types
Don't alert on every error

The goal is: from any error, I can trace the entire request flow across all services."
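
The trace/correlation ID idea from the answer above can be sketched in Python with `contextvars` and a `logging.Filter`, so that every log line in a request carries the same ID without passing it through each function call. All names here (`trace_id_var`, `TraceIdFilter`, `start_request`, `build_traced_logger`) are hypothetical, and a real system would more likely take the ID from a tracing library such as OpenTelemetry.

```python
import contextvars
import logging
import sys
import uuid
from typing import Optional

# Holds the trace ID for the request currently being handled.
trace_id_var: contextvars.ContextVar = contextvars.ContextVar(
    "trace_id", default="-"
)


class TraceIdFilter(logging.Filter):
    """Attach the current trace ID to every record passing through."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.trace_id = trace_id_var.get()
        return True  # never drop the record, only annotate it


def start_request(incoming_trace_id: Optional[str] = None) -> str:
    """Reuse the caller's trace ID if one arrived, otherwise create one."""
    trace_id = incoming_trace_id or uuid.uuid4().hex
    trace_id_var.set(trace_id)
    return trace_id


def build_traced_logger(service: str) -> logging.Logger:
    logger = logging.getLogger(service)
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        handler = logging.StreamHandler(sys.stdout)
        handler.addFilter(TraceIdFilter())
        handler.setFormatter(logging.Formatter(
            "service=%(name)s trace_id=%(trace_id)s "
            "level=%(levelname)s msg=%(message)s"
        ))
        logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_traced_logger("checkout")
    start_request("abc123")  # e.g. taken from an incoming request header
    log.info("payment authorized")
```

When the service calls another service, it would forward the same ID in a request header so the downstream logs can be joined on it.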

Q4: How do you configure Docker logging to prevent disk space issues?

Strong Answer:

"This is a common production issue. The solution involves multiple layers:

1. Configure log rotation (container level):

YAML(7 lines)

Each container: max 50MB logs (5 × 10MB)

2. Set daemon defaults (host level):

JSON(8 lines)

Applies to all containers without explicit config.

3. Monitor disk usage:

BASH(2 lines)

4. Consider alternative drivers:

Production: fluentd/syslog to external storage
Development: json-file with rotation

5. Regular cleanup:

BASH(5 lines)

6. Log level management:

Production: INFO and above
Debug logs only when troubleshooting

The key is proactive configuration. Default json-file with no limits will eventually fill any disk."
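
The 5 × 10MB arithmetic from the answer extends naturally to a whole host. The sketch below estimates the approximate upper bound on json-file log usage; the helper names are illustrative, and it assumes size suffixes are binary multiples (k = 1024 bytes) and that rotated files are not compressed.

```python
# Assumption: k/m/g suffixes are treated as binary multiples.
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(value: str) -> int:
    """Convert a max-size style string such as '10m' to bytes."""
    value = value.strip().lower()
    if value and value[-1] in UNITS:
        return int(value[:-1]) * UNITS[value[-1]]
    return int(value)  # plain number of bytes


def max_log_bytes(max_size: str, max_file: int, containers: int = 1) -> int:
    """Approximate ceiling on log disk usage for a rotation policy."""
    return parse_size(max_size) * max_file * containers


if __name__ == "__main__":
    per_container = max_log_bytes("10m", 5)
    host_total = max_log_bytes("10m", 5, containers=40)
    print(f"per container: {per_container / 1024 ** 2:.0f} MiB")
    print(f"40 containers: {host_total / 1024 ** 3:.2f} GiB")
```

Treat the result as an estimate: a file can grow slightly past `max-size` before it is rotated.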

Q5: How would you implement metrics collection for Docker containers?

Strong Answer:

"I use the Prometheus pull model as the standard approach:

Application metrics:

PYTHON(9 lines)

Container metrics (cAdvisor):

YAML(8 lines)

Exposes CPU, memory, network, disk metrics per container.

Prometheus configuration:

YAML(7 lines)

Key metrics to collect:

Request rate, errors, duration (RED)
Resource usage (CPU, memory, network)
Business metrics (orders, users, etc.)
Dependency health (DB connections, queue depth)

Visualization: Grafana dashboards with:

Per-service panels
Container resource usage
Error rates and latencies
Alerting thresholds

For Kubernetes: Use kube-state-metrics and node-exporter in addition to cAdvisor."
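
To make the pull model concrete, the sketch below hand-rolls a request counter and renders it in the Prometheus text exposition format, which is what a /metrics endpoint returns. It uses only the standard library so it runs anywhere; the `Counter` class here is a simplified stand-in written for this example, and a real service would normally use an official Prometheus client library instead.

```python
from typing import Dict, Tuple


def _escape(value: str) -> str:
    """Escape backslash, double quote, and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


class Counter:
    """Monotonically increasing metric, tracked per label combination."""

    def __init__(self, name: str, help_text: str, label_names: Tuple[str, ...]):
        self.name = name
        self.help_text = help_text
        self.label_names = label_names
        self._values: Dict[Tuple[str, ...], float] = {}

    def inc(self, amount: float = 1.0, **labels: str) -> None:
        if set(labels) != set(self.label_names):
            raise ValueError(f"expected labels {self.label_names}")
        if amount < 0:
            raise ValueError("counters can only increase")
        key = tuple(labels[name] for name in self.label_names)
        self._values[key] = self._values.get(key, 0.0) + amount

    def render(self) -> str:
        """Return the metric in text exposition format."""
        lines = [
            f"# HELP {self.name} {self.help_text}",
            f"# TYPE {self.name} counter",
        ]
        for key, value in sorted(self._values.items()):
            pairs = ",".join(
                f'{name}="{_escape(val)}"'
                for name, val in zip(self.label_names, key)
            )
            lines.append(f"{self.name}{{{pairs}}} {value}")
        return "\n".join(lines) + "\n"


if __name__ == "__main__":
    requests_total = Counter(
        "http_requests_total", "Total HTTP requests.", ("method", "status")
    )
    requests_total.inc(method="GET", status="200")
    requests_total.inc(method="GET", status="500")
    print(requests_total.render(), end="")
```

Rate and error ratio from the RED list are then derived at query time from this counter, which is why the application only needs to expose raw totals.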

📝 Summary & Key Takeaways

Logging Best Practices

Practice	Implementation
Log to stdout	App writes to console, Docker captures
Structured format	JSON for machine parsing
Include context	trace_id, service, request details
Configure rotation	max-size and max-file options
Centralize logs	Fluentd, Loki, or cloud service

Health Check Types

Type	Checks	On Failure
Liveness	Process alive?	Restart container
Readiness	Can serve traffic?	Remove from LB
Startup	Initialized?	Delay other checks
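
The liveness and readiness rows can be sketched as two endpoints that share a process but answer different questions. This is a minimal Python illustration under assumptions: the paths follow the /health/live and /health/ready convention used earlier, while `route`, `serve`, and the dependency probes are hypothetical names with placeholder checks.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple

# Hypothetical dependency probes; each returns True when usable.
DEPENDENCIES: Dict[str, Callable[[], bool]] = {
    "database": lambda: True,
    "cache": lambda: True,
}


def route(path: str, deps: Dict[str, Callable[[], bool]]) -> Tuple[int, dict]:
    """Map a health path to (HTTP status, JSON body)."""
    if path == "/health/live":
        # Liveness: no dependency calls, only "can this process respond?"
        return 200, {"status": "alive"}
    if path == "/health/ready":
        results = {}
        for name, probe in deps.items():
            try:
                results[name] = bool(probe())
            except Exception:
                results[name] = False  # a raising probe counts as down
        ready = all(results.values())
        body = {"status": "ready" if ready else "not ready", "checks": results}
        return (200 if ready else 503), body
    return 404, {"error": "not found"}


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, body = route(self.path, DEPENDENCIES)
        payload = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port: int = 8080) -> None:
    """Run the health endpoints; call this from the application's entrypoint."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

With the database probe failing, /health/ready returns 503 while /health/live keeps returning 200, which is the combination that takes an instance out of rotation without restarting it.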

Observability Stack

Application
├── Logs → stdout → Logging driver → Aggregator
├── Metrics → /metrics → Prometheus → Grafana
└── Health → /health/* → Docker/K8s → Orchestrator

📋 Quick Reference

Logging Commands

BASH(9 lines)

Health Check Dockerfile

DOCKERFILE(3 lines)

Compose Logging

YAML(7 lines)

📅 Review Schedule

Day	Task	Time
Day 1	Review logging driver options	10 min
Day 3	Configure log rotation in a project	15 min
Day 7	Implement health check	20 min
Day 14	Set up structured logging	25 min
Day 30	Deploy complete monitoring stack	45 min

Previous	Current	Next
Part 10: Volumes & Storage	Part 11: Logging & Observability	Part 12: Container Security

Docker Compendium Series:

Part 0: How to Use This Series
Part 1-10: Previous parts...
Part 11: Logging & Observability ← You are here
Part 12: Container Security Hardening
