Service Dependencies & Orchestration

📋 At a Glance

| Aspect | Details |
|---|---|
| Difficulty | 🟡 Intermediate |
| Prerequisites | Part 14 (Compose Deep Dive) |
| Key Concepts | depends_on, health checks, startup ordering, graceful shutdown |
| Time Investment | 26 minutes read + 45 minutes practice |
| Payoff | Zero-downtime deployments and reliable service orchestration |

🎯 What You'll Learn

After this article, you'll be able to:

  1. Implement proper startup ordering with health-based dependencies
  2. Design health checks that accurately reflect service readiness
  3. Handle graceful shutdown to prevent request loss during deployments
  4. Manage service initialization including migrations and seeding
  5. Debug dependency issues and circular dependency problems

🔥 Production Story: The Race Condition Deployment

The Setup: A microservices application with API, worker, and database. Every deployment, about 10% of initial requests failed with "Connection refused."
The Configuration:
[YAML listing, 12 lines]
The Investigation:
[Bash listing, 7 lines]
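Root Cause: Without a condition (or with condition: service_started), depends_on waits only for the dependency's container to start, not for the database inside it to accept connections. PostgreSQL took 2-5 seconds to initialize, so the API tried to connect before Postgres was ready.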
The Fix:
[YAML listing, 22 lines]
Result: Zero failed connections during deployment. The API now waits for its database dependencies to report healthy before it starts.
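
The shape of that fix can be sketched roughly as follows. This is a minimal illustration, not the configuration from the story: the service names `api` and `db`, the image tags, the credentials, and the timing values are all assumptions.

```yaml
# Sketch only: service names, images, credentials, and timings are assumed.
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: example   # placeholder value; use secrets in real deployments
      POSTGRES_DB: app
    healthcheck:
      # pg_isready exits 0 once the server accepts connections
      test: ["CMD-SHELL", "pg_isready -U app -d app"]
      interval: 5s
      timeout: 3s
      retries: 10

  api:
    image: example/api:latest      # hypothetical application image
    depends_on:
      db:
        condition: service_healthy # wait for the health check, not just container start
```

With this in place, `docker compose up` does not start `api` until the `db` health check has passed.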

🧠 Mental Model: Service Lifecycle

┌──────────────────────────────────────────────────────────────────┐
│                    SERVICE LIFECYCLE                             │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│   STARTUP PHASE                                                  │
│   ┌─────────────────────────────────────────────────────────┐    │
│   │                                                         │    │
│   │   [Created] ──→ [Starting] ──→ [Running] ──→ [Healthy]  │    │
│   │       │             │              │             │      │    │
│   │       │         start_period   healthcheck    ready     │    │
│   │       │         (grace time)   begins        to serve   │    │
│   │       │                                                 │    │
│   └─────────────────────────────────────────────────────────┘    │
│                                                                  │
│   DEPENDENCY ORDERING                                            │
│   ┌─────────────────────────────────────────────────────────┐    │
│   │                                                         │    │
│   │   Infrastructure    ──→    Core Services   ──→   App    │    │
│   │   (db, redis)            (migrations)         (api)     │    │
│   │                                                         │    │
│   │   service_healthy   service_completed    service_healthy│    │
│   │                     _successfully                       │    │
│   └─────────────────────────────────────────────────────────┘    │
│                                                                  │
│   SHUTDOWN PHASE                                                 │
│   ┌────────────────────────────────────────────────────────┐     │
│   │                                                        │     │
│   │   [Running] ──→ [SIGTERM] ──→ [Draining] ──→ [Stopped] │     │
│   │                     │             │              │     │     │
│   │               stop_signal   stop_grace    SIGKILL      │     │
│   │               received      _period       if needed    │     │
│   │                                                        │     │
│   └────────────────────────────────────────────────────────┘     │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

🔬 Deep Dive

1. Understanding depends_on Conditions

The Three Conditions:
[YAML listing, 14 lines]
When to Use Each:
| Condition | Use Case | Example |
|---|---|---|
| service_started | Quick-start services, app handles retry | Feature flags service |
| service_healthy | Must be ready before dependent starts | Database, cache, queue |
| service_completed_successfully | One-time setup tasks | Migrations, seed data |
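
To make the table concrete, here is a minimal sketch that uses all three conditions on one service. Every service name, image, and command in it is hypothetical and chosen only to mirror the examples in the table.

```yaml
# Sketch only: all service names, images, and commands are hypothetical.
services:
  api:
    image: example/api:latest
    depends_on:
      flags:
        condition: service_started                  # container is running; api retries on its own
      db:
        condition: service_healthy                  # db's healthcheck must pass first
      migrate:
        condition: service_completed_successfully   # migrate must exit with code 0

  flags:
    image: example/feature-flags:latest

  db:
    image: postgres:16
    environment:
      POSTGRES_PASSWORD: example
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 3s
      retries: 10

  migrate:
    image: example/api:latest
    command: ["npm", "run", "migrate"]   # assumed migration command
    restart: "no"                        # a one-shot task should not be restarted
    depends_on:
      db:
        condition: service_healthy
```

Here `api` starts only after `flags` has started, `db` is healthy, and `migrate` has exited with code 0.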
Complete Dependency Example:
[YAML listing, 58 lines]

2. Designing Effective Health Checks

Health Check Components:
[YAML listing, 6 lines]
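
As a rough guide to what each health check field does, the sketch below annotates them one by one. The service, the image, the `/health` endpoint on port 3000, and every timing value are assumptions, and the check requires `curl` to exist inside the image.

```yaml
# Sketch only: service, image, endpoint, and timing values are assumed.
services:
  api:
    image: example/api:latest
    healthcheck:
      # Command run inside the container; exit code 0 means healthy
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 10s      # time between checks
      timeout: 5s        # a check that runs longer than this counts as a failure
      retries: 3         # consecutive failures before the container is marked unhealthy
      start_period: 30s  # failures in this window do not count toward retries
```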
Database Health Checks:
[YAML listing, 20 lines]
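
For databases, the client tools that ship in the official images are usually enough. The following is a sketch under assumed names, credentials, and timings. It uses `pg_isready` for PostgreSQL and `mysqladmin ping` for MySQL. The MySQL check connects over TCP (`127.0.0.1`), on the assumption that this avoids reporting healthy while the image is still running its socket-only initialization server.

```yaml
# Sketch only: service names, credentials, and timings are assumed.
services:
  postgres:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: example
      POSTGRES_DB: app
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app -d app"]
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 10s

  mysql:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: example
    healthcheck:
      test:
        - CMD-SHELL
        # $$ is Compose's escape for a literal $, so the variable expands inside the container
        - mysqladmin ping -h 127.0.0.1 -uroot -p"$$MYSQL_ROOT_PASSWORD" --silent
      interval: 5s
      timeout: 3s
      retries: 10
      start_period: 30s
```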
Cache/Queue Health Checks:
[YAML listing, 27 lines]
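
A comparable sketch for a cache and a message queue, again with assumed service names, image tags, and timings. The Redis check looks for the `PONG` reply rather than relying on the exit code of `redis-cli` alone, and the RabbitMQ check uses the `rabbitmq-diagnostics` tool bundled with the official image.

```yaml
# Sketch only: service names, image tags, and timings are assumed.
services:
  redis:
    image: redis:7
    healthcheck:
      # Healthy only if the server actually answers PING with PONG
      test: ["CMD-SHELL", "redis-cli ping | grep -q PONG"]
      interval: 5s
      timeout: 3s
      retries: 5

  rabbitmq:
    image: rabbitmq:3.13
    healthcheck:
      # -q suppresses informational output; ping checks that the node is running
      test: ["CMD", "rabbitmq-diagnostics", "-q", "ping"]
      interval: 10s
      timeout: 10s
      retries: 5
      start_period: 30s
```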
Application Health Checks:
[YAML listing, 28 lines]
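
Many slim application images do not include `curl`. On an Alpine-based image, BusyBox `wget` is one possible substitute. The sketch below assumes a hypothetical image, an HTTP server on port 3000, and a `/health` route, none of which come from the original listing.

```yaml
# Sketch only: image, port, and /health route are assumed.
services:
  api:
    image: example/api:alpine    # hypothetical Alpine-based image with BusyBox wget
    healthcheck:
      # --spider requests the URL without saving the body; an error response fails the check
      test: ["CMD-SHELL", "wget -q --spider http://127.0.0.1:3000/health || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 3
      start_period: 20s
```

The check is only as deep as the endpoint behind it: if `/health` returns 200 without touching the database or cache, the container reports healthy even when its dependencies are down.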
Liveness vs Readiness (implementing both in one endpoint):
[JavaScript listing, 31 lines]

3. Graceful Shutdown Patterns

Understanding Stop Signals:
[YAML listing, 4 lines]
Signal Handling in Application:
[JavaScript listing, 27 lines]
[Java listing, 17 lines]
Compose Configuration for Graceful Shutdown:
[YAML listing, 10 lines]
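
On the Compose side, graceful shutdown mostly comes down to three settings. A minimal sketch, assuming a hypothetical Node.js image and a 30-second drain window:

```yaml
# Sketch only: image, command, and the 30s window are assumed.
services:
  api:
    image: example/api:latest
    init: true                      # run a small init as PID 1 so signals are forwarded to the app
    command: ["node", "server.js"]  # exec form: no wrapping shell to swallow SIGTERM
    stop_signal: SIGTERM            # the default, stated explicitly
    stop_grace_period: 30s          # default is 10s; SIGKILL follows when it expires
```

Pick a `stop_grace_period` slightly longer than the slowest request you expect to drain. The setting only helps if the application actually handles the stop signal.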
Pre-Stop Hook Pattern (wait for connections to drain):
[YAML listing, 16 lines]

4. Migration and Initialization Patterns

One-Time Migration Service:
[YAML listing, 10 lines]
Migration with Locking (prevent concurrent runs):
[YAML listing, 21 lines]
Seed Data Pattern:
[YAML listing, 10 lines]
Wait-For Pattern (when depends_on isn't enough):
[YAML listing, 6 lines]
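
One common form of the wait-for pattern is a small shell loop in the service command. The sketch below assumes the image contains `sh` and `nc`, that a service named `db` is defined elsewhere and listens on port 5432, and that the application starts with `node server.js`.

```yaml
# Sketch only: image, hostname, port, and start command are assumed.
services:
  api:
    image: example/api:alpine
    # Poll the TCP port, then exec the app so it replaces the shell and receives signals directly
    command: ["sh", "-c", "until nc -z db 5432; do echo 'waiting for db'; sleep 1; done; exec node server.js"]
```

An open port shows only that something is listening, not that the database is ready for queries, so treat this as a fallback rather than a replacement for `service_healthy`.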

Or build it into the image:

[Dockerfile listing, 11 lines]

5. Handling Circular Dependencies

Problem: Circular Dependency:
[YAML listing, 12 lines]
Solution 1: Break the Cycle with service_started:
[YAML listing, 18 lines]
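
One way to read this solution is sketched below. Compose rejects a cycle in `depends_on` whatever the condition, so only one direction can keep a declared dependency. The other service has to find its peer at runtime and retry. The `orders` and `inventory` services are hypothetical.

```yaml
# Sketch only: both services and their images are hypothetical.
services:
  orders:
    image: example/orders:latest
    depends_on:
      inventory:
        condition: service_started   # ordering only; orders must retry until inventory answers

  inventory:
    image: example/inventory:latest
    # No depends_on back to orders: inventory resolves "orders" by name
    # when it needs it and retries if the call fails.
```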
Solution 2: Design Out the Circular Dependency:
[YAML listing, 20 lines]
Solution 3: Event-Driven Communication:
[YAML listing, 17 lines]
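
With event-driven communication, the two services stop depending on each other and depend on a shared broker instead. A minimal sketch, assuming RabbitMQ as the broker and two hypothetical services:

```yaml
# Sketch only: broker choice, service names, and timings are assumed.
services:
  broker:
    image: rabbitmq:3.13
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "-q", "ping"]
      interval: 10s
      timeout: 10s
      retries: 10
      start_period: 30s

  orders:
    image: example/orders:latest
    depends_on:
      broker:
        condition: service_healthy   # depends on the broker, not on inventory

  inventory:
    image: example/inventory:latest
    depends_on:
      broker:
        condition: service_healthy   # depends on the broker, not on orders
```

The dependency graph is now a tree rooted at `broker`, so the start order of `orders` and `inventory` no longer matters.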

6. Startup and Shutdown Order

Startup Order (Compose derives this from depends_on; services with no dependency between them start in parallel):
1. Infrastructure (db, redis, queue)
2. Setup tasks (migrations, seeds)
3. Core services (api, worker)
4. Auxiliary services (monitoring, logging)
Shutdown Order (reverse of startup):
[YAML listing, 14 lines]
Controlling Shutdown Order Explicitly:
[Bash listing, 5 lines]
Automated Graceful Shutdown Script:
[Bash listing, 19 lines]

7. Complete Production Example

[YAML listing, 120 lines]
Deployment Script:
[Bash listing, 36 lines]

⚠️ Common Mistakes

Mistake 1: Using depends_on Without Condition

[YAML listing, 15 lines]

Mistake 2: Shallow Health Checks

[YAML listing, 7 lines]

Mistake 3: Missing start_period

[YAML listing, 13 lines]

Mistake 4: Immediate SIGKILL

[YAML listing, 6 lines]

Mistake 5: Not Handling SIGTERM

[JavaScript listing, 14 lines]

🐛 Debug This

Services won't start in the right order:

[YAML listing, 26 lines]
[Bash listing, 7 lines]

API fails even though db shows healthy. What's wrong?

Problems identified:
  1. Missing start_period on the API. The health check starts immediately, but the API needs time to boot:
[YAML listing, 5 lines]
  2. Missing PostgreSQL user in health check:
[YAML listing, 2 lines]
  3. API needs connection retry logic. Even with depends_on, there is a small window in which a connection might fail.
Fixed version:
[YAML listing, 34 lines]

Additionally, the application should implement connection retry:

[JavaScript listing, 12 lines]

πŸ’» Exercises

Exercise 1: Basic Dependency Chain

Create a compose file with:

  • PostgreSQL with proper health check
  • API that depends on PostgreSQL being healthy
  • Worker that depends on API being healthy

Exercise 2: Migration Pattern

Implement a migration service that:

  • Runs only once (doesn't restart)
  • Waits for database to be healthy
  • API waits for migration to complete

Exercise 3: Graceful Shutdown

Create an application that:

  • Handles SIGTERM properly
  • Completes in-flight requests before stopping
  • Has appropriate stop_grace_period

Exercise 4: Health Check Design

Design health checks for:

  • A service that needs 45 seconds to start
  • A service that should fail if Redis is unreachable
  • A service with both liveness and readiness requirements

Exercise 5: Circular Dependency Resolution

You have two services that need to communicate with each other. Design a solution that:

  • Avoids circular depends_on
  • Ensures both services can find each other
  • Handles the case where one service starts before the other

🎤 Interview Questions

Q1: What's the difference between service_started and service_healthy conditions?

Answer:
| Condition | Waits For | Use Case |
|---|---|---|
| service_started | Container is running | Quick services, apps with built-in retry |
| service_healthy | Health check passes | Databases, caches that need init time |
[YAML listing, 11 lines]
Important: Without an explicit condition, depends_on defaults to service_started. This is why many deployments have race conditions: developers assume it waits for readiness.

Q2: How would you implement a health check that verifies database connectivity?

Answer: There are two approaches, database-side and application-side.
Database-side health check:
[YAML listing, 8 lines]
Application-side health check (more comprehensive):
[YAML listing, 4 lines]
[JavaScript listing, 10 lines]

Best practice is to use both: the database health check confirms that the server accepts connections, and the application health check confirms that your specific user and database work.


Q3: How do you handle graceful shutdown in a containerized application?

Answer: Multi-layer approach:
  1. Compose configuration:
[YAML listing, 4 lines]
  2. Application signal handling:
[JavaScript listing, 26 lines]
  3. Load balancer coordination (remove the container from the pool before stopping it):
[Bash listing, 4 lines]

Q4: How do you prevent race conditions when multiple services need the same database?

Answer: Several strategies:
  1. Single migration service with completion dependency:
[YAML listing, 17 lines]
  2. Database-level migration locking:
[JavaScript listing, 11 lines]
  3. Idempotent migrations (design migrations to be safe if run multiple times):
[SQL listing, 3 lines]

Q5: How does Compose order service startup, and in what order does docker compose down stop services?

Answer: Compose reverses the dependency order for shutdown:
Startup order (based on depends_on):
1. db (no dependencies)
2. redis (no dependencies)
3. migrate (depends on db)
4. api (depends on db, redis, migrate)
5. worker (depends on api)
6. nginx (depends on api)
Shutdown order (reverse):
1. nginx (most dependent)
2. worker
3. api
4. migrate (if still running)
5. db, redis (least dependent)
Important nuances:
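  • docker compose down stops containers in reverse dependency order, sending each one its stop signal (SIGTERM by default)
  • Each container gets its stop_grace_period (10 seconds by default) to exit cleanly
  • If a container is still running after the grace period, it receives SIGKILL
  • Containers and networks are then removed; named volumes are kept unless you pass --volumes (-v)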
Best practice for clean shutdown:
[Bash listing, 7 lines]

📝 Summary & Key Takeaways

Dependency Conditions

| Condition | Waits For | Use Case |
|---|---|---|
| service_started | Container running | Quick services |
| service_healthy | Health check pass | Databases, caches |
| service_completed_successfully | Exit code 0 | Migrations |

Health Check Best Practices

  • Always include start_period for slow-starting services
  • Test actual functionality, not just port availability
  • Use appropriate intervals (5-30s depending on service)

Graceful Shutdown

  1. Handle SIGTERM in application
  2. Stop accepting new connections
  3. Complete in-flight requests
  4. Close resources (DB, cache)
  5. Exit cleanly

Startup Order

Infrastructure → Setup → Application → Proxy

Shutdown Order

Proxy → Application → Setup → Infrastructure

📋 Quick Reference

[YAML listing, 22 lines]

📅 Review Schedule

  • Day 1: Practice depends_on with conditions
  • Day 3: Implement health checks for your services
  • Day 7: Add graceful shutdown handling
  • Day 14: Design complete startup/shutdown flow
  • Day 30: Audit production compose files
