Linear Time Sorting
The previous article proved that comparison-based sorting cannot beat O(n log n). But what if we don't use comparisons? This article explores three algorithms that achieve O(n) time by exploiting properties of the data: Counting Sort, Radix Sort, and Bucket Sort.
These algorithms aren't always applicable, but when they are, they can be orders of magnitude faster than QuickSort or MergeSort - especially for large datasets with bounded integer keys.
π At a Glance
| Algorithm | Time | Space | Stable | When to Use |
|---|---|---|---|---|
| Counting Sort | O(n + k) | O(k) | Yes | Small range integers |
| Radix Sort | O(d Γ (n + k)) | O(n + k) | Yes | Fixed-width integers/strings |
| Bucket Sort | O(n + k) | O(n + k) | Yes | Uniformly distributed data |
Where k = range of values, d = number of digits
π― What You'll Learn
After reading this article, you will be able to:
- Implement Counting Sort for integers with known range
- Apply Radix Sort to sort millions of integers faster than QuickSort
- Use Bucket Sort for uniformly distributed floating-point numbers
- Know when O(n) sorts beat O(n log n) in practice
- Combine these algorithms with comparison sorts for hybrid approaches
- Avoid common pitfalls like negative numbers and memory overhead
π₯ Production Story: 100 Million Log Entries
A monitoring system needed to sort 100 million log entries by timestamp for daily analysis. The timestamp was a Unix epoch in milliseconds, ranging from midnight to midnight (86,400,000 values).
JAVA(5 lines)CodeLoading syntax highlighter...
- Timestamps were bounded (one day = 86.4M milliseconds)
- Timestamps were dense (many entries per millisecond)
- MergeSort was doing ~1.7 billion comparisons
JAVA(25 lines)CodeLoading syntax highlighter...
JAVA(8 lines)CodeLoading syntax highlighter...
π§ Mental Model: Trading Assumptions for Speed
Linear time sorts work by avoiding comparisons entirely. Instead, they use direct addressing or arithmetic:
TEXT(27 lines)CodeLoading syntax highlighter...
TEXT(6 lines)CodeLoading syntax highlighter...
π¬ Deep Dive
1. Counting Sort: The Foundation
Counting Sort works by counting occurrences of each value, then using those counts to place elements directly.
JAVA(21 lines)CodeLoading syntax highlighter...
JAVA(27 lines)CodeLoading syntax highlighter...
TEXT(10 lines)CodeLoading syntax highlighter...
JAVA(21 lines)CodeLoading syntax highlighter...
2. Radix Sort: Sorting by Digits
Radix Sort sorts numbers digit by digit, using a stable sort (usually Counting Sort) for each digit.
JAVA(38 lines)CodeLoading syntax highlighter...
TEXT(15 lines)CodeLoading syntax highlighter...
JAVA(27 lines)CodeLoading syntax highlighter...
JAVA(31 lines)CodeLoading syntax highlighter...
3. Bucket Sort: For Uniform Distributions
Bucket Sort works by distributing elements into buckets, sorting each bucket, and concatenating.
JAVA(36 lines)CodeLoading syntax highlighter...
TEXT(9 lines)CodeLoading syntax highlighter...
JAVA(29 lines)CodeLoading syntax highlighter...
4. When to Use Which?
JAVA(28 lines)CodeLoading syntax highlighter...
β οΈ Common Mistakes
Mistake 1: Forgetting About Negative Numbers
JAVA(21 lines)CodeLoading syntax highlighter...
Mistake 2: Using Radix Sort When Range Is Huge
JAVA(6 lines)CodeLoading syntax highlighter...
Mistake 3: Unstable Counting Sort Breaking Radix Sort
JAVA(17 lines)CodeLoading syntax highlighter...
π Debug This
The following Radix Sort produces wrong results. Find the bug:
JAVA(36 lines)CodeLoading syntax highlighter...
Click to reveal the bug and solution
Two issues:
-
Loop condition:
exp <= maxshould bemax / exp > 0- For max=170,
exp=1000is included but shouldn't be
- For max=170,
-
Not traversing backwards: The output loop goes forwards, breaking stability
TEXT(5 lines)CodeLoading syntax highlighter...
JAVA(31 lines)CodeLoading syntax highlighter...
π» Exercises
Exercise 1: Sort Strings by Length β
Use Counting Sort to sort strings by their length:
JAVA(4 lines)CodeLoading syntax highlighter...
Solution
JAVA(28 lines)CodeLoading syntax highlighter...
Exercise 2: MSD Radix Sort ββ
Implement Most Significant Digit (MSD) Radix Sort:
JAVA(4 lines)CodeLoading syntax highlighter...
Solution
JAVA(39 lines)CodeLoading syntax highlighter...
MSD is useful for variable-length strings (can stop early) but more complex than LSD.
Exercise 3: Sort IP Addresses ββ
Sort IP addresses (as strings) efficiently:
JAVA(5 lines)CodeLoading syntax highlighter...
Solution
JAVA(55 lines)CodeLoading syntax highlighter...
Exercise 4: Hybrid Sort βββ
Implement a hybrid sort that chooses the best algorithm:
JAVA(3 lines)CodeLoading syntax highlighter...
Solution
JAVA(56 lines)CodeLoading syntax highlighter...
Exercise 5: External Radix Sort ββββ
Implement Radix Sort for data too large to fit in memory:
JAVA(4 lines)CodeLoading syntax highlighter...
Solution
JAVA(68 lines)CodeLoading syntax highlighter...
π€ Interview Questions
Question #1: When would you use Counting Sort over QuickSort?
-
Keys are integers in a known, small range [0, k]
-
k β€ n log n: If range k exceeds n log n, QuickSort is faster
- Example: n=1M, k=20M β Counting Sort
- Example: n=1M, k=1B β QuickSort
-
Stability is needed: Counting Sort is naturally stable
-
Memory is available: Needs O(k) extra space
- Sorting grades (0-100) for 10,000 students β Counting Sort (k=100, n=10000)
- Sorting salaries ($0-$10M) for 1000 employees β QuickSort (k=10M >> n log n)
Question #2: Why does Radix Sort use LSD (Least Significant Digit) order?
- After sorting by ones digit: numbers ending in 0 come first, etc.
- After sorting by tens digit: stability ensures numbers with same tens digit remain sorted by ones
- MSD must recursively sort each bucket separately
- Can't rely on a global stable sort because higher digits are more significant
- More complex but can short-circuit for variable-length strings
[170, 45, 75, 90] After ones: [170, 90, 45, 75] (0s, then 5s) After tens: [45, 70, 75, 90] (4,7,7,9 - stable keeps 70<75) After hundreds: [45, 75, 90, 170]
Question #3: How would you sort 1 billion 32-bit integers with only 1GB RAM?
- Split into ~8 chunks of 125M integers each
- Sort each chunk in memory with QuickSort
- K-way merge the sorted chunks
- 4 passes (one per byte)
- Each pass: distribute to 256 buckets (files)
- Concatenate buckets
- If range is <1B distinct values, use streaming counting
- First pass: count occurrences (bitmap or count array)
- Second pass: output values in order
- Predictable I/O pattern
- Only 4 passes over the data
- ~10 minutes for 1B integers on SSD
Question #4: Explain why Bucket Sort degrades to O(nΒ²) with non-uniform distribution.
- If uniform: each bucket has O(n/k) elements
- Sorting each bucket: O((n/k)Β²) with insertion sort
- Total: k Γ O((n/k)Β²) = O(nΒ²/k)
- With k=n: O(n)
- All elements in one bucket: O(nΒ²)
- Example: [0.001, 0.002, 0.003, ...] all go to bucket 0
- Use comparison sort for buckets (O(n/k log n/k))
- Adaptive bucket sizing based on samples
- Hybrid: switch to QuickSort if bucket too large
Question #5: How would you sort strings alphabetically using Radix Sort?
1. Sort by first character (256-way counting sort) 2. Recursively sort each bucket by remaining characters 3. Stop when single character or empty Optimization: Switch to insertion sort for small buckets
1. Pad all strings to max length 2. Sort from rightmost character to leftmost 3. Use stable counting sort for each position Problem: Wastes space/time with variable lengths
| Aspect | MSD | LSD |
|---|---|---|
| Variable length | Natural | Needs padding |
| Early termination | Yes | No |
| Implementation | Complex (recursive) | Simple |
| Cache behavior | Poor | Good |
- Partition by first character
- Recursively sort substrings
- Combines QuickSort efficiency with string structure
π Quick Reference
Algorithm Selection
| Condition | Best Algorithm |
|---|---|
| k β€ n | Counting Sort |
| k β€ n log n | Counting Sort |
| k > n log n, bounded integers | Radix Sort |
| Uniform distribution floats | Bucket Sort |
| General/unbounded | QuickSort/TimSort |
Complexity Summary
| Algorithm | Time | Space | Stable | Requirements |
|---|---|---|---|---|
| Counting | O(n+k) | O(k) | Yes | Keys in [0,k] |
| Radix (LSD) | O(d(n+k)) | O(n+k) | Yes | d-digit keys |
| Radix (MSD) | O(nΓlen) | O(n+k) | Yes | Variable-length |
| Bucket | O(n+k) | O(n+k) | Yes | Uniform dist |
Code Templates
JAVA(13 lines)CodeLoading syntax highlighter...
π Summary & Key Takeaways
Core Principles
- Linear sorts bypass Ξ©(n log n) by not using comparisons
- Counting Sort: O(n+k) when k (range) is small
- Radix Sort: O(d(n+k)) for fixed-width integers
- Bucket Sort: O(n) for uniformly distributed data
- Trade-off: speed vs assumptions - linear sorts have requirements
What You Can Do Now
- Implement all three linear sorting algorithms
- Choose the right algorithm based on data characteristics
- Handle edge cases (negatives, large ranges)
- Combine linear sorts with comparison sorts for hybrid approaches
- Analyze when O(n) sorts beat O(n log n) in practice
π Review Schedule
| Day | Task | Time |
|---|---|---|
| Day 1 | Review when to use each algorithm | 5 min |
| Day 3 | Implement Counting Sort from memory | 10 min |
| Day 7 | Implement Radix Sort from memory | 15 min |
| Day 14 | Redo Debug This exercise | 10 min |
| Day 30 | Solve the hybrid sort exercise | 20 min |