ArrayList Deep Dive
ArrayList is the workhorse of Java collections. For most use cases, it's the right choice - but understanding its internals transforms you from someone who uses ArrayList to someone who uses it optimally. This article dives deep into how ArrayList works, why it makes certain trade-offs, and how to avoid common performance pitfalls.
When you understand the internal array, growth strategy, and memory layout, you'll write code that performs well at scale and avoids the OutOfMemoryError surprises that plague production systems.
๐ At a Glance
| Aspect | Details |
|---|---|
| Internal Structure | Dynamic array (Object[] elementData) |
| Default Capacity | 10 elements |
| Growth Strategy | 1.5x (newCapacity = oldCapacity + oldCapacity >> 1) |
| Random Access | O(1) |
| Insertion at End | O(1) amortized |
| Insertion at Index | O(n) - requires shifting |
| Memory Overhead | ~12 bytes per element + unused capacity |
๐ฏ What You'll Learn
After reading this article, you will be able to:
- Explain the internal structure: How
Object[]backs ArrayList - Predict memory usage: Calculate actual memory footprint
- Optimize capacity: Use
ensureCapacity()andtrimToSize()effectively - Avoid common pitfalls: Resize storms, subList traps, inefficient patterns
- Make informed decisions: When ArrayList beats alternatives
Interactive Visualizer
See ArrayList's internal array, automatic resizing, and element shifting during add/remove operations:
Loading visualizer...
๐ฅ Production Story: The Memory Explosion
A data import service was processing a large file - 10 million records. Everything worked in testing with 100K records. In production, with the full dataset, the service crashed:
TEXT(3 lines)CodeLoading syntax highlighter...
JAVA(12 lines)CodeLoading syntax highlighter...
TEXT(13 lines)CodeLoading syntax highlighter...
JAVA(18 lines)CodeLoading syntax highlighter...
- Memory usage reduced by 40%
- No more resize operations during import
- Import time decreased by 25% (no copying)
๐ง Mental Model: ArrayList as a Stretchy Container
Think of ArrayList as an elastic container that holds a fixed-size box inside:
TEXT(30 lines)CodeLoading syntax highlighter...
add() operations are O(1). The occasional O(n) resize is "paid back" over many O(1) operations, giving O(1) amortized time.๐ฌ Deep Dive
1. The Internal Structure
JAVA(19 lines)CodeLoading syntax highlighter...
Object[] and not E[]?new E[capacity]. The array is Object[], and elements are cast on retrieval:JAVA(4 lines)CodeLoading syntax highlighter...
2. The Growth Strategy
When the array is full, ArrayList grows by 50%:
JAVA(18 lines)CodeLoading syntax highlighter...
TEXT(10 lines)CodeLoading syntax highlighter...
3. Memory Layout and Cache Locality
ArrayList stores references contiguously, which is cache-friendly:
TEXT(34 lines)CodeLoading syntax highlighter...
4. The add() Operation
JAVA(31 lines)CodeLoading syntax highlighter...
add(E)at end: O(1) amortizedadd(int, E)at index: O(n) due to shifting
5. System.arraycopy() - The Native Powerhouse
JAVA(6 lines)CodeLoading syntax highlighter...
TEXT(9 lines)CodeLoading syntax highlighter...
6. Common Operations Complexity
JAVA(42 lines)CodeLoading syntax highlighter...
7. subList() - The View Trap
JAVA(11 lines)CodeLoading syntax highlighter...
JAVA(13 lines)CodeLoading syntax highlighter...
JAVA(7 lines)CodeLoading syntax highlighter...
โ ๏ธ Common Mistakes
Mistake 1: Not Setting Initial Capacity
JAVA(4 lines)CodeLoading syntax highlighter...
JAVA(5 lines)CodeLoading syntax highlighter...
JAVA(5 lines)CodeLoading syntax highlighter...
Mistake 2: Forgetting trimToSize()
JAVA(7 lines)CodeLoading syntax highlighter...
JAVA(5 lines)CodeLoading syntax highlighter...
Mistake 3: Inefficient Middle Insertions
JAVA(5 lines)CodeLoading syntax highlighter...
JAVA(13 lines)CodeLoading syntax highlighter...
Mistake 4: Using contains() in a Loop
JAVA(9 lines)CodeLoading syntax highlighter...
JAVA(9 lines)CodeLoading syntax highlighter...
Mistake 5: Modifying subList and Original
JAVA(5 lines)CodeLoading syntax highlighter...
JAVA(9 lines)CodeLoading syntax highlighter...
๐ Debug This: The Slow Removal
A developer reports: "Removing elements from my ArrayList takes forever!"
JAVA(11 lines)CodeLoading syntax highlighter...
Two problems:
- O(n) removal: Each
remove(i)shifts all subsequent elements - Index bug: After removal, index
inow points to what wasi+1, so we skip elements
JAVA(6 lines)CodeLoading syntax highlighter...
JAVA(8 lines)CodeLoading syntax highlighter...
JAVA(4 lines)CodeLoading syntax highlighter...
JAVA(5 lines)CodeLoading syntax highlighter...
๐ป Exercises
Exercise 1: Capacity Benchmark
โญโญ Difficulty: Medium | โฑ๏ธ Time: 15 minutes
JAVA(7 lines)CodeLoading syntax highlighter...
JAVA(60 lines)CodeLoading syntax highlighter...
Exercise 2: Memory Analysis
โญโญโญ Difficulty: Medium | โฑ๏ธ Time: 15 minutes
JAVA(7 lines)CodeLoading syntax highlighter...
JAVA(70 lines)CodeLoading syntax highlighter...
Exercise 3: Implement GrowableArray
โญโญโญโญ Difficulty: Hard | โฑ๏ธ Time: 25 minutes
JAVA(14 lines)CodeLoading syntax highlighter...
JAVA(92 lines)CodeLoading syntax highlighter...
Exercise 4: Efficient Batch Removal
โญโญโญ Difficulty: Medium | โฑ๏ธ Time: 15 minutes
JAVA(7 lines)CodeLoading syntax highlighter...
JAVA(56 lines)CodeLoading syntax highlighter...
Exercise 5: toArray() Patterns
โญโญ Difficulty: Medium | โฑ๏ธ Time: 10 minutes
JAVA(4 lines)CodeLoading syntax highlighter...
JAVA(42 lines)CodeLoading syntax highlighter...
๐ Summary & Key Takeaways
Performance Characteristics
| Operation | Time | Notes |
|---|---|---|
| get(index) | O(1) | Direct array access |
| set(index, e) | O(1) | Direct array access |
| add(e) | O(1)* | Amortized, occasional O(n) resize |
| add(index, e) | O(n) | Requires shifting |
| remove(index) | O(n) | Requires shifting |
| contains(o) | O(n) | Linear search |
| iterator.remove() | O(n) | Requires shifting |
| removeIf() | O(n) | Single pass, bulk shift |
Capacity Planning
JAVA(11 lines)CodeLoading syntax highlighter...
When to Use ArrayList
| Use Case | ArrayList? | Alternative |
|---|---|---|
| Default list | โ Yes | - |
| Random access | โ Yes | - |
| Add at end | โ Yes | - |
| Frequent contains() | โ No | HashSet |
| Frequent add at front | โ No | ArrayDeque |
| Frequent insert in middle | โ ๏ธ Depends | LinkedList* |
| Memory-constrained | โ ๏ธ Careful | Array |
*LinkedList rarely wins in practice due to cache locality.
๐ค Senior-Level Interview Questions
Question #1: What is the default size of ArrayList and why?
new ArrayList<>(), it actually starts with capacity 0 and grows to 10 on the first add - this optimization avoids allocating arrays for lists that might never be used.Question #2: How does ArrayList grow when it runs out of space?
newCapacity = oldCapacity + (oldCapacity >> 1). This 1.5x growth factor balances two concerns: minimizing resize operations (want larger growth) and minimizing wasted capacity (want smaller growth). The formula uses bit shifting (>> 1) instead of division or multiplication for performance. This gives us amortized O(1) add operations - even though occasional resizes are O(n), they happen infrequently enough that the average is constant.Question #3: When should you use ensureCapacity()?
ensureCapacity() before adding many elements when you know the count in advance. This prevents multiple resize operations during bulk additions. For example, before addAll() with a large collection, or before a loop that adds many elements. Without it, adding 1 million elements to a default ArrayList triggers ~27 resize operations, each copying all existing elements. With ensureCapacity(1_000_000), there's zero resizing.Question #4: What does subList() return - a copy or a view?
subList() returns a view backed by the original list, not a copy. Changes to the sublist affect the original, and vice versa. The dangerous part: if you structurally modify the original list (add/remove), the sublist becomes invalid and throws ConcurrentModificationException on any access. This catches many developers by surprise. If you need an independent list, explicitly copy: new ArrayList<>(original.subList(start, end)).Question #5: Why is ArrayList faster than LinkedList for random access?
get(i) is a single memory access: elementData[i] - O(1). LinkedList must traverse from the head (or tail, whichever is closer), following next references one by one - O(n/2) average, O(n) worst case. Beyond algorithmic complexity, ArrayList has better cache locality: array elements are contiguous in memory, so accessing one often brings nearby elements into CPU cache. LinkedList nodes are scattered across the heap, causing cache misses on every hop.Question #6: How does System.arraycopy() work and why is it fast?
System.arraycopy() is a native method implemented in C/C++, not Java bytecode. It can use optimized CPU instructions like memcpy or memmove, and potentially SIMD (Single Instruction Multiple Data) for bulk operations. Unlike a Java loop that checks bounds on every iteration, arraycopy checks bounds once at the start then copies in bulk. Benchmarks show it's 5-10x faster than a manual loop. It's the engine behind ArrayList's resize operations and many other JDK internal operations.๐ Conclusion
ArrayList is simple in concept but rich in optimization opportunities. Understanding its internals transforms how you write code:
- Pre-size when you know the count
- Trim after building large collections
- Avoid middle insertions or use alternatives
- Use removeIf() for bulk removal
- Be cautious with subList() views
- Default capacity is 10, grows by 1.5x
- Pre-allocate for known sizes to avoid resize storms
- Add at end is O(1) amortized; middle insertion is O(n)
- subList() returns a view, not a copy
- System.arraycopy() makes bulk operations fast
- Cache locality makes ArrayList faster than LinkedList for most workloads
๐ Review Schedule for This Article
| Day | Task | Time |
|---|---|---|
| Day 1 | Review growth strategy and capacity planning | 5 min |
| Day 3 | Redo Exercise 1 (Capacity Benchmark) | 15 min |
| Day 7 | Answer interview questions without looking | 10 min |
| Day 14 | Redo Debug This exercise | 10 min |
| Day 30 | Review operation complexity table | 5 min |