Java

ArrayList Deep Dive

ArrayList is the workhorse of Java collections. For most use cases, it's the right choice - but understanding its internals transforms you from someone who uses ArrayList to someone who uses it optimally. This article dives deep into how ArrayList works, why it makes certain trade-offs, and how to avoid common performance pitfalls.

When you understand the internal array, growth strategy, and memory layout, you'll write code that performs well at scale and avoids the OutOfMemoryError surprises that plague production systems.

๐Ÿ“‹ At a Glance

AspectDetails
Internal StructureDynamic array (Object[] elementData)
Default Capacity10 elements
Growth Strategy1.5x (newCapacity = oldCapacity + oldCapacity >> 1)
Random AccessO(1)
Insertion at EndO(1) amortized
Insertion at IndexO(n) - requires shifting
Memory Overhead~12 bytes per element + unused capacity

๐ŸŽฏ What You'll Learn

After reading this article, you will be able to:

  • Explain the internal structure: How Object[] backs ArrayList
  • Predict memory usage: Calculate actual memory footprint
  • Optimize capacity: Use ensureCapacity() and trimToSize() effectively
  • Avoid common pitfalls: Resize storms, subList traps, inefficient patterns
  • Make informed decisions: When ArrayList beats alternatives

Interactive Visualizer

See ArrayList's internal array, automatic resizing, and element shifting during add/remove operations:

Loading visualizer...

๐Ÿ”ฅ Production Story: The Memory Explosion

A data import service was processing a large file - 10 million records. Everything worked in testing with 100K records. In production, with the full dataset, the service crashed:

TEXT(3 lines)
Code
Loading syntax highlighter...
The problematic code:
JAVA(12 lines)
Code
Loading syntax highlighter...
What happened:
TEXT(13 lines)
Code
Loading syntax highlighter...
The fix:
JAVA(18 lines)
Code
Loading syntax highlighter...
Results:
  • Memory usage reduced by 40%
  • No more resize operations during import
  • Import time decreased by 25% (no copying)

๐Ÿง  Mental Model: ArrayList as a Stretchy Container

Think of ArrayList as an elastic container that holds a fixed-size box inside:

TEXT(30 lines)
Code
Loading syntax highlighter...
Key insight: Most add() operations are O(1). The occasional O(n) resize is "paid back" over many O(1) operations, giving O(1) amortized time.

๐Ÿ”ฌ Deep Dive

1. The Internal Structure

JAVA(19 lines)
Code
Loading syntax highlighter...
Why Object[] and not E[]?
Due to type erasure, you can't create new E[capacity]. The array is Object[], and elements are cast on retrieval:
JAVA(4 lines)
Code
Loading syntax highlighter...

2. The Growth Strategy

When the array is full, ArrayList grows by 50%:

JAVA(18 lines)
Code
Loading syntax highlighter...
Why 1.5x and not 2x?
TEXT(10 lines)
Code
Loading syntax highlighter...

3. Memory Layout and Cache Locality

ArrayList stores references contiguously, which is cache-friendly:

TEXT(34 lines)
Code
Loading syntax highlighter...

4. The add() Operation

JAVA(31 lines)
Code
Loading syntax highlighter...
Performance:
  • add(E) at end: O(1) amortized
  • add(int, E) at index: O(n) due to shifting

5. System.arraycopy() - The Native Powerhouse

JAVA(6 lines)
Code
Loading syntax highlighter...
Why is it so fast?
TEXT(9 lines)
Code
Loading syntax highlighter...

6. Common Operations Complexity

JAVA(42 lines)
Code
Loading syntax highlighter...

7. subList() - The View Trap

JAVA(11 lines)
Code
Loading syntax highlighter...
How subList works internally:
JAVA(13 lines)
Code
Loading syntax highlighter...
Safe usage:
JAVA(7 lines)
Code
Loading syntax highlighter...

โš ๏ธ Common Mistakes

Mistake 1: Not Setting Initial Capacity

โŒ Problem:
JAVA(4 lines)
Code
Loading syntax highlighter...
โœ… Solution:
JAVA(5 lines)
Code
Loading syntax highlighter...
Capacity planning formula:
JAVA(5 lines)
Code
Loading syntax highlighter...

Mistake 2: Forgetting trimToSize()

โŒ Problem:
JAVA(7 lines)
Code
Loading syntax highlighter...
โœ… Solution:
JAVA(5 lines)
Code
Loading syntax highlighter...

Mistake 3: Inefficient Middle Insertions

โŒ Problem:
JAVA(5 lines)
Code
Loading syntax highlighter...
โœ… Solution:
JAVA(13 lines)
Code
Loading syntax highlighter...

Mistake 4: Using contains() in a Loop

โŒ Problem:
JAVA(9 lines)
Code
Loading syntax highlighter...
โœ… Solution:
JAVA(9 lines)
Code
Loading syntax highlighter...

Mistake 5: Modifying subList and Original

โŒ Problem:
JAVA(5 lines)
Code
Loading syntax highlighter...
โœ… Solution:
JAVA(9 lines)
Code
Loading syntax highlighter...

๐Ÿ› Debug This: The Slow Removal

A developer reports: "Removing elements from my ArrayList takes forever!"

JAVA(11 lines)
Code
Loading syntax highlighter...
What's wrong and how would you fix it?

โœ… Solution:

Two problems:

  1. O(n) removal: Each remove(i) shifts all subsequent elements
  2. Index bug: After removal, index i now points to what was i+1, so we skip elements
JAVA(6 lines)
Code
Loading syntax highlighter...
Fix 1: Iterate backwards
JAVA(8 lines)
Code
Loading syntax highlighter...
Fix 2: Use removeIf (best)
JAVA(4 lines)
Code
Loading syntax highlighter...
Fix 3: Filter to new list
JAVA(5 lines)
Code
Loading syntax highlighter...

๐Ÿ’ป Exercises

Exercise 1: Capacity Benchmark

โญโญ Difficulty: Medium | โฑ๏ธ Time: 15 minutes

Task: Benchmark adding 1 million elements with different initial capacities.
JAVA(7 lines)
Code
Loading syntax highlighter...
โœ… Solution:
JAVA(60 lines)
Code
Loading syntax highlighter...

Exercise 2: Memory Analysis

โญโญโญ Difficulty: Medium | โฑ๏ธ Time: 15 minutes

Task: Calculate the memory footprint of ArrayList vs raw array.
JAVA(7 lines)
Code
Loading syntax highlighter...
โœ… Solution:
JAVA(70 lines)
Code
Loading syntax highlighter...

Exercise 3: Implement GrowableArray

โญโญโญโญ Difficulty: Hard | โฑ๏ธ Time: 25 minutes

Task: Implement a simplified ArrayList from scratch.
JAVA(14 lines)
Code
Loading syntax highlighter...
โœ… Solution:
JAVA(92 lines)
Code
Loading syntax highlighter...

Exercise 4: Efficient Batch Removal

โญโญโญ Difficulty: Medium | โฑ๏ธ Time: 15 minutes

Task: Remove all elements matching a predicate efficiently.
JAVA(7 lines)
Code
Loading syntax highlighter...
โœ… Solution:
JAVA(56 lines)
Code
Loading syntax highlighter...

Exercise 5: toArray() Patterns

โญโญ Difficulty: Medium | โฑ๏ธ Time: 10 minutes

Task: Demonstrate correct and incorrect ways to use toArray().
JAVA(4 lines)
Code
Loading syntax highlighter...
โœ… Solution:
JAVA(42 lines)
Code
Loading syntax highlighter...

๐Ÿ“ Summary & Key Takeaways

Performance Characteristics

OperationTimeNotes
get(index)O(1)Direct array access
set(index, e)O(1)Direct array access
add(e)O(1)*Amortized, occasional O(n) resize
add(index, e)O(n)Requires shifting
remove(index)O(n)Requires shifting
contains(o)O(n)Linear search
iterator.remove()O(n)Requires shifting
removeIf()O(n)Single pass, bulk shift

Capacity Planning

JAVA(11 lines)
Code
Loading syntax highlighter...

When to Use ArrayList

Use CaseArrayList?Alternative
Default listโœ… Yes-
Random accessโœ… Yes-
Add at endโœ… Yes-
Frequent contains()โŒ NoHashSet
Frequent add at frontโŒ NoArrayDeque
Frequent insert in middleโš ๏ธ DependsLinkedList*
Memory-constrainedโš ๏ธ CarefulArray

*LinkedList rarely wins in practice due to cache locality.


๐ŸŽค Senior-Level Interview Questions

Question #1: What is the default size of ArrayList and why?

What interviewers want to hear: Understanding of the default capacity and reasoning.
Strong answer: The default capacity is 10 elements. This is a balance between memory waste (too large) and resize overhead (too small). For typical use cases, lists often have 10 or fewer elements. If you construct with new ArrayList<>(), it actually starts with capacity 0 and grows to 10 on the first add - this optimization avoids allocating arrays for lists that might never be used.

Question #2: How does ArrayList grow when it runs out of space?

What interviewers want to hear: Understanding of the growth formula and trade-offs.
Strong answer: ArrayList grows by 50% each time it's full: newCapacity = oldCapacity + (oldCapacity >> 1). This 1.5x growth factor balances two concerns: minimizing resize operations (want larger growth) and minimizing wasted capacity (want smaller growth). The formula uses bit shifting (>> 1) instead of division or multiplication for performance. This gives us amortized O(1) add operations - even though occasional resizes are O(n), they happen infrequently enough that the average is constant.

Question #3: When should you use ensureCapacity()?

What interviewers want to hear: Practical optimization knowledge.
Strong answer: Use ensureCapacity() before adding many elements when you know the count in advance. This prevents multiple resize operations during bulk additions. For example, before addAll() with a large collection, or before a loop that adds many elements. Without it, adding 1 million elements to a default ArrayList triggers ~27 resize operations, each copying all existing elements. With ensureCapacity(1_000_000), there's zero resizing.

Question #4: What does subList() return - a copy or a view?

What interviewers want to hear: Understanding of view semantics and the trap.
Strong answer: subList() returns a view backed by the original list, not a copy. Changes to the sublist affect the original, and vice versa. The dangerous part: if you structurally modify the original list (add/remove), the sublist becomes invalid and throws ConcurrentModificationException on any access. This catches many developers by surprise. If you need an independent list, explicitly copy: new ArrayList<>(original.subList(start, end)).

Question #5: Why is ArrayList faster than LinkedList for random access?

What interviewers want to hear: Understanding of data structure fundamentals and memory.
Strong answer: ArrayList uses a contiguous array, so get(i) is a single memory access: elementData[i] - O(1). LinkedList must traverse from the head (or tail, whichever is closer), following next references one by one - O(n/2) average, O(n) worst case. Beyond algorithmic complexity, ArrayList has better cache locality: array elements are contiguous in memory, so accessing one often brings nearby elements into CPU cache. LinkedList nodes are scattered across the heap, causing cache misses on every hop.

Question #6: How does System.arraycopy() work and why is it fast?

What interviewers want to hear: Understanding of native optimization.
Strong answer: System.arraycopy() is a native method implemented in C/C++, not Java bytecode. It can use optimized CPU instructions like memcpy or memmove, and potentially SIMD (Single Instruction Multiple Data) for bulk operations. Unlike a Java loop that checks bounds on every iteration, arraycopy checks bounds once at the start then copies in bulk. Benchmarks show it's 5-10x faster than a manual loop. It's the engine behind ArrayList's resize operations and many other JDK internal operations.

๐Ÿ Conclusion

ArrayList is simple in concept but rich in optimization opportunities. Understanding its internals transforms how you write code:

  • Pre-size when you know the count
  • Trim after building large collections
  • Avoid middle insertions or use alternatives
  • Use removeIf() for bulk removal
  • Be cautious with subList() views
Key takeaways:
  1. Default capacity is 10, grows by 1.5x
  2. Pre-allocate for known sizes to avoid resize storms
  3. Add at end is O(1) amortized; middle insertion is O(n)
  4. subList() returns a view, not a copy
  5. System.arraycopy() makes bulk operations fast
  6. Cache locality makes ArrayList faster than LinkedList for most workloads
Your next step: Continue to Part 7 (LinkedList) to understand when linked structures actually win, or Part 8 (List Utilities) to master the Collections helper methods.

๐Ÿ“… Review Schedule for This Article

DayTaskTime
Day 1Review growth strategy and capacity planning5 min
Day 3Redo Exercise 1 (Capacity Benchmark)15 min
Day 7Answer interview questions without looking10 min
Day 14Redo Debug This exercise10 min
Day 30Review operation complexity table5 min

Next in series: [Part 7: LinkedList & When Linked Structures Win]