Bloom Filter
๐ Quick Reference
| Property | Value |
|---|---|
| Type | Probabilistic set membership |
| Add | O(k) where k = number of hash functions |
| Query | O(k) - may return false positives |
| Delete | Not supported (standard Bloom filter) |
| Space | Much smaller than storing actual elements |
| Best For | Fast membership checks with acceptable false positives |
๐ฎ Interactive Visualizer
Watch how Bloom Filter uses multiple hash functions:
Loading visualizer...
- Add elements and watch multiple bits get set
- Query existing elements (always found)
- Query non-existent elements (possible false positives)
- Observe how more elements increase false positive rate
๐ง How It Works
Core Concept
Bloom Filter = bit array + k hash functions Adding "hello": h1("hello") = 3 โ set bit[3] = 1 h2("hello") = 7 โ set bit[7] = 1 h3("hello") = 12 โ set bit[12] = 1 Querying "hello": Check bit[3] AND bit[7] AND bit[12] All 1? โ "Possibly in set" Any 0? โ "Definitely not in set"
Key Properties
JAVA(7 lines)CodeLoading syntax highlighter...
Optimal Parameters
Optimal number of hash functions: k = (m/n) * ln(2) โ 0.693 * (m/n) False positive probability: p โ (1 - e^(-kn/m))^k For 1% false positive rate with n elements: m โ 9.6n bits (about 1.2 bytes per element) k โ 7 hash functions
๐ Complexity Analysis
| Operation | Time | Space |
|---|---|---|
add(element) | O(k) | - |
contains(element) | O(k) | - |
| Space (total) | - | O(m) bits |
Space Comparison
| Structure | Space for 1M elements |
|---|---|
| HashSet | ~48 MB (avg 48 bytes/entry) |
| Bloom Filter (1% FP) | ~1.2 MB |
| Bloom Filter (0.1% FP) | ~1.8 MB |
โ When to Use Bloom Filter
Good Use Cases
- Cache filtering - avoid expensive lookups for non-existent items
- Spell checkers - quick "probably misspelled" check
- Network routers - packet filtering
- Database queries - skip disk reads for absent keys
- Web crawlers - avoid revisiting URLs
- Distributed systems - reduce unnecessary network calls
Avoid When
- Need exact membership - can't distinguish "definitely" from "maybe"
- Need deletion - standard Bloom filter doesn't support delete
- Very low false positive tolerance - might need more space than HashSet
- Need to enumerate elements - Bloom filter only answers "is X in set?"
๐ Bloom Filter Variants
| Variant | Feature | Trade-off |
|---|---|---|
| Standard | Simple, fast | No delete, false positives |
| Counting | Supports delete | 4x more space (counters vs bits) |
| Scalable | Grows dynamically | Multiple filters, slightly slower |
| Cuckoo Filter | Delete support, better space | More complex |
Counting Bloom Filter
JAVA(6 lines)CodeLoading syntax highlighter...
๐งฉ Implementation Patterns
Guava BloomFilter (Java)
JAVA(18 lines)CodeLoading syntax highlighter...
Redis Bloom Filter
BASH(10 lines)CodeLoading syntax highlighter...
Cache Pattern
JAVA(26 lines)CodeLoading syntax highlighter...
โ ๏ธ Common Pitfalls
1. Ignoring False Positive Rate
JAVA(5 lines)CodeLoading syntax highlighter...
2. Underestimating Element Count
JAVA(5 lines)CodeLoading syntax highlighter...
3. Treating "Maybe" as "Yes"
JAVA(11 lines)CodeLoading syntax highlighter...
4. Expecting Deletion Support
JAVA(4 lines)CodeLoading syntax highlighter...
๐ค Interview Tips
"A probabilistic data structure for set membership testing. It can tell you "definitely not in set" or "possibly in set". Used for caching, spell-checking, and avoiding expensive lookups when an element doesn't exist.
"False positive: filter says "maybe yes" but element was never added. False negative: filter says "no" but element WAS added. Bloom filters guarantee no false negatives - if it says "no", the element is definitely not in the set.
"Multiple hash functions reduce false positive rate. With k hash functions, an element sets k bits. For a false positive, ALL k bits must have been set by other elements, which is less likely with good hash distribution.
"Given expected elements (n) and desired false positive rate (p): bits m โ -n*ln(p)/(ln(2)ยฒ), hash functions k โ (m/n)*ln(2). For 1% FP rate, use about 10 bits per element and 7 hash functions.
๐ Series Navigation
BloomFilterVisualizer from @tomaszjarosz/react-visualizers