Service Dependencies & Orchestration
At a Glance
| Aspect | Details |
|---|---|
| Difficulty | Intermediate |
| Prerequisites | Part 14 (Compose Deep Dive) |
| Key Concepts | depends_on, health checks, startup ordering, graceful shutdown |
| Time Investment | 26 minutes read + 45 minutes practice |
| Payoff | Zero-downtime deployments and reliable service orchestration |
What You'll Learn
After this article, you'll be able to:
- Implement proper startup ordering with health-based dependencies
- Design health checks that accurately reflect service readiness
- Handle graceful shutdown to prevent request loss during deployments
- Manage service initialization including migrations and seeding
- Debug dependency issues and circular dependency problems
Production Story: The Race Condition Deployment
The Setup: A microservices application with API, worker, and database. Every deployment, about 10% of initial requests failed with "Connection refused."
The Configuration:
[Code listing: YAML, 12 lines]
The Investigation:
[Code listing: Bash, 7 lines]
Root Cause: depends_on only waits for the container to start, not for the database to be ready. PostgreSQL takes 2-5 seconds to initialize, so the API began connecting before Postgres was accepting connections.
The Fix:
[Code listing: YAML, 22 lines]
Result: Zero failed connections during deployment. The API now waits until the database reports healthy before it starts.
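As a minimal sketch of a health-gated dependency of this kind (not the configuration from the story): the image name example/api, the credentials, and all timing values below are illustrative assumptions.

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app            # assumed credentials, for illustration only
      POSTGRES_PASSWORD: example
      POSTGRES_DB: app
    healthcheck:
      # pg_isready exits 0 once the server accepts connections
      test: ["CMD-SHELL", "pg_isready -U app -d app"]
      interval: 5s
      timeout: 3s
      retries: 5
      start_period: 10s

  api:
    image: example/api:latest       # hypothetical image name
    depends_on:
      db:
        condition: service_healthy  # wait for the health check, not just container start
```

With this shape, Compose holds back the api container until the db health check has passed at least once.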
Mental Model: Service Lifecycle

SERVICE LIFECYCLE

STARTUP PHASE
  [Created] → [Starting] → [Running] → [Healthy]
  start_period (grace time) → healthcheck begins → ready to serve

DEPENDENCY ORDERING
  Infrastructure (db, redis) → Core Services (migrations) → App (api)
  Conditions: service_healthy → service_completed_successfully → service_healthy

SHUTDOWN PHASE
  [Running] → [SIGTERM] → [Draining] → [Stopped]
  stop_signal received → stop_grace_period → SIGKILL if needed
Deep Dive
1. Understanding depends_on Conditions
The Three Conditions:
[Code listing: YAML, 14 lines]
When to Use Each:
| Condition | Use Case | Example |
|---|---|---|
| service_started | Quick-start services, app handles retry | Feature flags service |
| service_healthy | Must be ready before dependent starts | Database, cache, queue |
| service_completed_successfully | One-time setup tasks | Migrations, seed data |
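To make the three conditions in the table concrete, here is a small sketch in which one service uses all of them. The service names flags and migrate, the example/* images, and the npm migration command are assumptions made for illustration.

```yaml
services:
  flags:
    image: example/flags:latest     # hypothetical; the API is assumed to retry if it is not up yet

  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
    healthcheck:                    # service_healthy needs a health check on the target
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 5

  migrate:
    image: example/api:latest
    command: ["npm", "run", "migrate"]   # assumed migration command
    depends_on:
      db:
        condition: service_healthy

  api:
    image: example/api:latest
    depends_on:
      flags:
        condition: service_started                  # container is running
      db:
        condition: service_healthy                  # health check passes
      migrate:
        condition: service_completed_successfully   # exited with code 0
```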
Complete Dependency Example:
[Code listing: YAML, 58 lines]
2. Designing Effective Health Checks
Health Check Components:
[Code listing: YAML, 6 lines]
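As a rough sketch of how the health check fields fit together, assuming an image that ships curl and exposes a /health endpoint on port 8080 (both assumptions):

```yaml
services:
  api:
    image: example/api:latest       # hypothetical image that includes curl
    healthcheck:
      # Command run inside the container; exit code 0 means healthy
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 10s       # time between checks
      timeout: 5s         # a check that runs longer than this counts as a failure
      retries: 3          # consecutive failures before the container is marked unhealthy
      start_period: 30s   # failures during this window do not count toward retries
```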
Database Health Checks:
[Code listing: YAML, 20 lines]
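Beyond PostgreSQL, the same idea carries over to other databases. The following is a sketch for MySQL and MongoDB using the client tools bundled in the official images; the image tags, password, and timings are assumptions, and an instance with authentication enabled may need credentials added to the command.

```yaml
services:
  mysql:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: example  # illustration only
    healthcheck:
      # mysqladmin ping exits 0 once the server responds
      test: ["CMD", "mysqladmin", "ping", "-h", "127.0.0.1"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 30s

  mongo:
    image: mongo:7
    healthcheck:
      # Runs a ping command against the local server through mongosh
      test: ["CMD", "mongosh", "--quiet", "--eval", "db.adminCommand('ping')"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 20s
```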
Cache/Queue Health Checks:
[Code listing: YAML, 27 lines]
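For caches and queues, a sketch along these lines is a reasonable starting point, assuming the official redis and rabbitmq images, unauthenticated local access, and illustrative timings:

```yaml
services:
  redis:
    image: redis:7
    healthcheck:
      # Succeeds when the local server answers PING
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      timeout: 3s
      retries: 5

  rabbitmq:
    image: rabbitmq:3
    healthcheck:
      # Checks that the local node is running and reachable
      test: ["CMD", "rabbitmq-diagnostics", "-q", "ping"]
      interval: 10s
      timeout: 10s
      retries: 5
      start_period: 30s
```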
Application Health Checks:
[Code listing: YAML, 28 lines]
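Application images often lack curl, so the check has to use whatever the image already contains. This sketch assumes a Node.js 18+ image (which has a built-in fetch) and a BusyBox-based image with wget; the image names, ports, and the /health path are hypothetical.

```yaml
services:
  api:
    image: example/api:latest       # hypothetical Node.js 18+ image without curl
    healthcheck:
      test:
        - "CMD"
        - "node"
        - "-e"
        # Exit 0 on a 2xx response, 1 on any other status or connection error
        - "fetch('http://localhost:3000/health').then(r => process.exit(r.ok ? 0 : 1)).catch(() => process.exit(1))"
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 20s

  web:
    image: example/web:latest       # hypothetical Alpine/BusyBox-based image
    healthcheck:
      # --spider requests the URL without saving the body
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:8080/health"]
      interval: 10s
      timeout: 5s
      retries: 3
```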
Liveness vs Readiness (implementing both in one endpoint):
[Code listing: JavaScript, 31 lines]
3. Graceful Shutdown Patterns
Understanding Stop Signals:
[Code listing: YAML, 4 lines]
Signal Handling in Application:
[Code listing: JavaScript, 27 lines]
[Code listing: Java, 17 lines]
Compose Configuration for Graceful Shutdown:
[Code listing: YAML, 10 lines]
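A minimal sketch of the Compose side of graceful shutdown, assuming a Node.js service started from server.js; the image name, command, and the 30-second window are illustrative assumptions.

```yaml
services:
  api:
    image: example/api:latest       # hypothetical image name
    # Exec form keeps the app as the direct child process, so no shell swallows the signal
    command: ["node", "server.js"]
    init: true                      # run a small init as PID 1 that forwards signals and reaps zombies
    stop_signal: SIGTERM            # signal sent on stop (SIGTERM is also the default)
    stop_grace_period: 30s          # time allowed before SIGKILL (default is 10s)
```

The grace period only helps if the application actually handles the signal; it should be somewhat longer than the slowest request you expect to drain.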
Pre-Stop Hook Pattern (wait for connections to drain):
[Code listing: YAML, 16 lines]
4. Migration and Initialization Patterns
One-Time Migration Service:
[Code listing: YAML, 10 lines]
Migration with Locking (prevent concurrent runs):
[Code listing: YAML, 21 lines]
Seed Data Pattern:
[Code listing: YAML, 10 lines]
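The migration and seed patterns can be chained so that each one-shot task runs once, in order, before the API starts. This is a sketch under assumed names: the npm scripts migrate and seed and the example/api image are hypothetical.

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 5

  migrate:
    image: example/api:latest
    command: ["npm", "run", "migrate"]   # assumed script name
    restart: "no"                        # one-shot task: never restart after exit
    depends_on:
      db:
        condition: service_healthy

  seed:
    image: example/api:latest
    command: ["npm", "run", "seed"]      # assumed script name
    restart: "no"
    depends_on:
      migrate:
        condition: service_completed_successfully

  api:
    image: example/api:latest
    depends_on:
      seed:
        condition: service_completed_successfully
```

If either task exits with a non-zero code, the services that depend on it are not started, which surfaces a broken migration before the API serves traffic.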
Wait-For Pattern (when depends_on isn't enough):
[Code listing: YAML, 6 lines]
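One way to sketch the wait-for pattern is a small shell loop in the service command. This assumes the image contains sh and an nc that supports -z, and that the app starts with node server.js; all of these are assumptions.

```yaml
services:
  api:
    image: example/api:latest       # hypothetical image containing sh and nc
    command:
      - "sh"
      - "-c"
      - |
        # Poll until the database port accepts TCP connections
        until nc -z db 5432; do
          echo "waiting for db:5432"
          sleep 1
        done
        # exec replaces the shell so the app receives SIGTERM directly
        exec node server.js
```

An open port only shows that something is listening, not that the database is ready for queries, so this is a fallback for cases where a proper health check is not available.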
Or build it into the image:
[Code listing: Dockerfile, 11 lines]
5. Handling Circular Dependencies
Problem: Circular Dependency:
[Code listing: YAML, 12 lines]
Solution 1: Break the Cycle with service_started:
[Code listing: YAML, 18 lines]
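A sketch of one way to break such a cycle: keep a single depends_on edge with service_started and drop the reverse edge, leaving the second service to reach the first by name with retries. The service names orders and inventory, the images, and the ORDERS_URL variable are hypothetical.

```yaml
services:
  orders:
    image: example/orders:latest
    depends_on:
      inventory:
        condition: service_started   # only ordering, no readiness guarantee

  inventory:
    image: example/inventory:latest
    # No depends_on back to orders: that edge would recreate the cycle.
    # The app is assumed to retry calls to orders until it responds.
    environment:
      ORDERS_URL: http://orders:8080
```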
Solution 2: Design Out the Circular Dependency:
[Code listing: YAML, 20 lines]
Solution 3: Event-Driven Communication:
[Code listing: YAML, 17 lines]
6. Startup and Shutdown Order
Startup Order (compose figures this out from depends_on):
1. Infrastructure (db, redis, queue)
2. Setup tasks (migrations, seeds)
3. Core services (api, worker)
4. Auxiliary services (monitoring, logging)
Shutdown Order (reverse of startup):
[Code listing: YAML, 14 lines]
Controlling Shutdown Order Explicitly:
[Code listing: Bash, 5 lines]
Automated Graceful Shutdown Script:
[Code listing: Bash, 19 lines]
7. Complete Production Example
[Code listing: YAML, 120 lines]
Deployment Script:
[Code listing: Bash, 36 lines]
Common Mistakes
Mistake 1: Using depends_on Without Condition
[Code listing: YAML, 15 lines]
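To illustrate the difference, the sketch below puts the short and long forms of depends_on side by side. The service names api_racy and api_safe and the example/api image are made up for the comparison.

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 5

  api_racy:
    image: example/api:latest
    depends_on:
      - db                          # short form: same as condition: service_started

  api_safe:
    image: example/api:latest
    depends_on:
      db:
        condition: service_healthy  # long form: waits for the health check
```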
Mistake 2: Shallow Health Checks
[Code listing: YAML, 7 lines]
Mistake 3: Missing start_period
[Code listing: YAML, 13 lines]
Mistake 4: Immediate SIGKILL
[Code listing: YAML, 6 lines]
Mistake 5: Not Handling SIGTERM
[Code listing: JavaScript, 14 lines]
Debug This
Services won't start in the right order:
[Code listing: YAML, 26 lines]
[Code listing: Bash, 7 lines]
API fails even though db shows healthy. What's wrong?
Problems identified:
- Missing start_period on API - API health check starts immediately, but API needs time to boot:
[Code listing: YAML, 5 lines]
- Missing PostgreSQL user in health check:
[Code listing: YAML, 2 lines]
- API needs connection retry logic - Even with depends_on, there's a small window where connection might fail.
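For the missing PostgreSQL user in the health check, one possible shape is to pass the user and database explicitly, reusing the container's own environment variables. The credentials and database name here are assumptions.

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app            # illustration only
      POSTGRES_PASSWORD: example
      POSTGRES_DB: appdb
    healthcheck:
      # $$ stops Compose from interpolating the variables itself;
      # the shell inside the container expands them at check time
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB"]
      interval: 5s
      timeout: 3s
      retries: 5
      start_period: 10s
```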
Fixed version:
[Code listing: YAML, 34 lines]
Additionally, the application should implement connection retry:
[Code listing: JavaScript, 12 lines]
Exercises
Exercise 1: Basic Dependency Chain
Create a compose file with:
- PostgreSQL with proper health check
- API that depends on PostgreSQL being healthy
- Worker that depends on API being healthy
Exercise 2: Migration Pattern
Implement a migration service that:
- Runs only once (doesn't restart)
- Waits for database to be healthy
- API waits for migration to complete
Exercise 3: Graceful Shutdown
Create an application that:
- Handles SIGTERM properly
- Completes in-flight requests before stopping
- Has appropriate stop_grace_period
Exercise 4: Health Check Design
Design health checks for:
- A service that needs 45 seconds to start
- A service that should fail if Redis is unreachable
- A service with both liveness and readiness requirements
Exercise 5: Circular Dependency Resolution
You have two services that need to communicate with each other. Design a solution that:
- Avoids circular depends_on
- Ensures both services can find each other
- Handles the case where one service starts before the other
Interview Questions
Q1: What's the difference between service_started and service_healthy conditions?
Answer:
| Condition | Waits For | Use Case |
|---|---|---|
| service_started | Container is running | Quick services, apps with built-in retry |
| service_healthy | Health check passes | Databases, caches that need init time |
[Code listing: YAML, 11 lines]
Important: Without an explicit condition, depends_on defaults to service_started. This is why many deployments have race conditions: developers assume it waits for readiness.
Q2: How would you implement a health check that verifies database connectivity?
Answer: Two approaches - database-side and app-side:
Database-side health check:
[Code listing: YAML, 8 lines]
Application-side health check (more comprehensive):
[Code listing: YAML, 4 lines]
[Code listing: JavaScript, 10 lines]
Best practice is both: database health check ensures it accepts connections, application health check ensures your specific user and database work.
Q3: How do you handle graceful shutdown in a containerized application?
Answer: Multi-layer approach:
- Compose configuration:
[Code listing: YAML, 4 lines]
- Application signal handling:
[Code listing: JavaScript, 26 lines]
- Load balancer coordination: Remove container from pool before stopping:
[Code listing: Bash, 4 lines]
Q4: How do you prevent race conditions when multiple services need the same database?
Answer: Several strategies:
- Single migration service with completion dependency:
[Code listing: YAML, 17 lines]
- Database-level migration locking:
[Code listing: JavaScript, 11 lines]
- Idempotent migrations: Design migrations to be safe if run multiple times:
[Code listing: SQL, 3 lines]
Q5: What's the startup and shutdown order when you run docker compose down?
Answer: Compose reverses the dependency order for shutdown:
Startup order (based on depends_on):
1. db (no dependencies)
2. redis (no dependencies)
3. migrate (depends on db)
4. api (depends on db, redis, migrate)
5. worker (depends on api)
6. nginx (depends on api)
Shutdown order (reverse):
1. nginx (most dependent)
2. worker
3. api
4. migrate (if still running)
5. db, redis (least dependent)
Important nuances:
- docker compose down sends SIGTERM (or the configured stop_signal) to containers in reverse dependency order
- Containers have stop_grace_period (10 seconds by default) to finish gracefully
- After the grace period, SIGKILL is sent
- Networks are removed last; named volumes are kept unless you pass --volumes
Best practice for clean shutdown:
[Code listing: Bash, 7 lines]
Summary & Key Takeaways
Dependency Conditions
| Condition | Wait For | Use Case |
|---|---|---|
| service_started | Container running | Quick services |
| service_healthy | Health check pass | Databases, caches |
| service_completed_successfully | Exit code 0 | Migrations |
Health Check Best Practices
- Always include start_period for slow-starting services
- Test actual functionality, not just port availability
- Use appropriate intervals (5-30s depending on service)
Graceful Shutdown
- Handle SIGTERM in application
- Stop accepting new connections
- Complete in-flight requests
- Close resources (DB, cache)
- Exit cleanly
Startup Order
Infrastructure → Setup → Application → Proxy
Shutdown Order
Proxy → Application → Setup → Infrastructure
Quick Reference
[Code listing: YAML, 22 lines]
Review Schedule
- Day 1: Practice depends_on with conditions
- Day 3: Implement health checks for your services
- Day 7: Add graceful shutdown handling
- Day 14: Design complete startup/shutdown flow
- Day 30: Audit production compose files
Series Navigation
- Previous: Part 14 - Compose File Deep Dive
- Next: Part 16 - Development vs Production Compose
- Index: Docker Compendium Series