Algorithms

Graph Algorithms in Practice

πŸ“‹ At a Glance

AspectDetails
Time to Read35 minutes
PrerequisitesParts 11-14 (Graph algorithms)
Key ConceptsSocial Networks, PageRank, Community Detection, Recommendations
Difficulty⭐⭐⭐ (Intermediate-Advanced)
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚           GRAPH ALGORITHMS IN THE REAL WORLD                    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                 β”‚
β”‚  SOCIAL NETWORKS              SEARCH & RANKING                  β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”               β”‚
β”‚  β”‚  U1 ─── U2      β”‚          β”‚    PageRank     β”‚               β”‚
β”‚  β”‚   β”‚ β•²   β”‚       β”‚          β”‚   β”Œβ”€β”€β”€β”         β”‚               β”‚
β”‚  β”‚   β”‚  β•²  β”‚       β”‚          β”‚  β†’β”‚ A β”‚β†’ 0.35   β”‚               β”‚
β”‚  β”‚  U3 ─── U4      β”‚          β”‚   β””β”€β”€β”€β”˜         β”‚               β”‚
β”‚  β”‚                 β”‚          β”‚    ↓ ↓          β”‚               β”‚
β”‚  β”‚ Friend-of-friendβ”‚          β”‚  β”Œβ”€β”€β”€β” β”Œβ”€β”€β”€β”    β”‚               β”‚
β”‚  β”‚ recommendations β”‚          β”‚  β”‚ B β”‚ β”‚ C β”‚    β”‚               β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜          β”‚  β””β”€β”€β”€β”˜ β””β”€β”€β”€β”˜    β”‚               β”‚
β”‚                               β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜               β”‚
β”‚                                                                 β”‚
β”‚  COMMUNITY DETECTION          RECOMMENDATIONS                   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”          β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”               β”‚
β”‚  β”‚ β”Œβ”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β” β”‚          β”‚  User β†’ Item    β”‚               β”‚
β”‚  β”‚ β”‚     β”‚ β”‚     β”‚ β”‚          β”‚   ↓      ↓      β”‚               β”‚
β”‚  β”‚ β”‚  A  │─│  B  β”‚ β”‚          β”‚ Similar users   β”‚               β”‚
β”‚  β”‚ β”‚     β”‚ β”‚     β”‚ β”‚          β”‚ bought this     β”‚               β”‚
β”‚  β”‚ β””β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”˜ β”‚          β”‚                 β”‚               β”‚
β”‚  β”‚   Clusters      β”‚          β”‚ Collaborative   β”‚               β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜          β”‚ filtering       β”‚               β”‚
β”‚                               β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

🎯 What You'll Learn

After completing this article, you will be able to:

  1. Analyze social networks: Calculate centrality measures, find influencers
  2. Implement PageRank: Understand and code the algorithm behind Google
  3. Detect communities: Use modularity-based algorithms
  4. Build recommendation systems: Graph-based collaborative filtering
  5. Use graph databases: Basic Neo4j patterns

πŸ”₯ Production Story: The Friend Recommendation Engine

A social gaming platform grew to 50 million users. Their friend recommendation system was struggling - suggestions were irrelevant, and user engagement was dropping.

The Problem

❌ Original system (2018):
- Simple: "People who played same game"
- Result: Recommending strangers with one common game
- Click-through rate: 2.1%
- Friend request acceptance: 0.8%

The team realized they needed to understand the social graph structure.

The Investigation

JAVA(17 lines)
Code
Loading syntax highlighter...

The Solution

They implemented a multi-signal graph-based approach:

JAVA(54 lines)
Code
Loading syntax highlighter...

The Impact

MetricBeforeAfterImprovement
Click-through rate2.1%8.7%+314%
Friend request acceptance0.8%4.2%+425%
Daily active friendships formed12K89K+642%
User retention (30-day)34%47%+38%
Key insight: Social graph structure reveals relationship potential better than content similarity alone.

🧠 Mental Model: The Social Graph

Think of a social network as a city with different neighborhoods:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    SOCIAL GRAPH AS A CITY                       β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                 β”‚
β”‚  CENTRALITY = IMPORTANCE                                        β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚                                                         β”‚   β”‚
β”‚  β”‚  Degree: How many direct connections (popular person)   β”‚   β”‚
β”‚  β”‚     πŸ‘€β”€πŸ‘€β”€πŸ‘€                                            β”‚   β”‚
β”‚  β”‚        β””β”€πŸ‘€                                             β”‚   β”‚
β”‚  β”‚                                                         β”‚   β”‚
β”‚  β”‚  Betweenness: How many paths go through you (broker)    β”‚   β”‚
β”‚  β”‚     πŸ‘€β”€β”€[πŸ‘€]β”€β”€πŸ‘€   (bridge between groups)             β”‚   β”‚
β”‚  β”‚     πŸ‘€     πŸ‘€                                          β”‚   β”‚
β”‚  β”‚                                                         β”‚   β”‚
β”‚  β”‚  Closeness: How quickly you can reach everyone          β”‚   β”‚
β”‚  β”‚     (central location)                                  β”‚   β”‚
β”‚  β”‚                                                         β”‚   β”‚
β”‚  β”‚  PageRank: Who points to you matters                    β”‚   β”‚
β”‚  β”‚     (endorsed by important people)                      β”‚   β”‚
β”‚  β”‚                                                         β”‚   β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚                                                                β”‚
β”‚  COMMUNITIES = NEIGHBORHOODS                                   β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚                                                         β”‚   β”‚
β”‚  β”‚    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”        β”‚   β”‚
β”‚  β”‚    β”‚ Gaming  │──────│ Bridge  │──────│  Tech   β”‚        β”‚   β”‚
β”‚  β”‚    β”‚ Friends β”‚      β”‚ Person  β”‚      β”‚ Friends β”‚        β”‚   β”‚
β”‚  β”‚    β”‚ πŸ‘€πŸ‘€πŸ‘€  β”‚      β”‚   πŸ‘€    β”‚      β”‚ πŸ‘€πŸ‘€πŸ‘€  β”‚       β”‚   β”‚
β”‚  β”‚    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜        β”‚   β”‚
β”‚  β”‚                                                         β”‚   β”‚
β”‚  β”‚    Dense connections within, sparse between             β”‚   β”‚
β”‚  β”‚                                                         β”‚   β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚                                                                β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ”¬ Deep Dive: Centrality Measures

Degree Centrality

The simplest measure - just count connections:

JAVA(34 lines)
Code
Loading syntax highlighter...
Use case: Finding popular users, identifying hubs.

Betweenness Centrality

Measures how often a node appears on shortest paths:

JAVA(90 lines)
Code
Loading syntax highlighter...
Visualization:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                 BETWEENNESS CENTRALITY                          β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                 β”‚
β”‚     A ─── B ─── C                                               β”‚
β”‚     β”‚           β”‚                                               β”‚
β”‚     D ───[E]─── F     E has high betweenness                    β”‚
β”‚     β”‚           β”‚     (bridge between left and right)           β”‚
β”‚     G ─── H ─── I                                               β”‚
β”‚                                                                 β”‚
β”‚  Shortest paths through E:                                      β”‚
β”‚  A↔F, A↔I, B↔D, B↔G, C↔D, C↔G, etc.                             β”‚
β”‚                                                                 β”‚
β”‚  Betweenness(E) = Ξ£ (paths through E / total paths)             β”‚
β”‚                                                                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
Use case: Finding brokers, identifying bottlenecks, detecting influential spreaders.

Closeness Centrality

How close a node is to all others:

JAVA(48 lines)
Code
Loading syntax highlighter...

πŸ”¬ Deep Dive: PageRank Algorithm

The Intuition

PageRank models a "random surfer" browsing the web:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    PAGERANK INTUITION                           β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                 β”‚
β”‚  Random Surfer Model:                                           β”‚
β”‚  1. Start on a random page                                      β”‚
β”‚  2. With probability d (0.85), follow a random link             β”‚
β”‚  3. With probability (1-d), jump to random page                 β”‚
β”‚  4. PageRank = long-term probability of being on each page      β”‚
β”‚                                                                 β”‚
β”‚  Example:                                                       β”‚
β”‚       β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                                       β”‚
β”‚       β”‚                 β”‚                                       β”‚
β”‚       β–Ό                 β”‚                                       β”‚ 
β”‚       A ────→ B ────→ C β”‚                                       β”‚
β”‚       β”‚       β”‚         β”‚                                       β”‚
β”‚       β”‚       β–Ό         β”‚                                       β”‚
β”‚       └─────→ D β”€β”€β”€β”€β”€β”€β”€β”€β”˜                                       β”‚
β”‚                                                                 β”‚
β”‚  Page C: Only incoming from B                                   β”‚
β”‚  Page A: Incoming from C (sole outlink), D                      β”‚
β”‚  A likely has higher PageRank than C                            β”‚
β”‚                                                                 β”‚
β”‚  Formula:                                                       β”‚
β”‚  PR(A) = (1-d)/N + d Γ— Ξ£ PR(v)/out(v)                           β”‚
β”‚              for all v linking to A                             β”‚
β”‚                                                                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Implementation

JAVA(99 lines)
Code
Loading syntax highlighter...

Matrix Formulation

PageRank can be computed using matrix operations:

JAVA(45 lines)
Code
Loading syntax highlighter...

Personalized PageRank

For recommendations, bias the random jump toward specific nodes:

JAVA(35 lines)
Code
Loading syntax highlighter...
Use case: "Find pages similar to my favorites" - seed with user's liked pages.

πŸ”¬ Deep Dive: Community Detection

Modularity

Modularity measures the quality of a community partition:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    MODULARITY CONCEPT                           β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                 β”‚
β”‚  Q = (1/2m) Γ— Ξ£ [A_ij - (k_i Γ— k_j)/(2m)] Γ— Ξ΄(c_i, c_j)         β”‚
β”‚                                                                 β”‚
β”‚  Where:                                                         β”‚
β”‚  - A_ij: adjacency matrix (1 if edge exists)                    β”‚
β”‚  - k_i: degree of node i                                        β”‚
β”‚  - m: total number of edges                                     β”‚
β”‚  - c_i: community of node i                                     β”‚
β”‚  - Ξ΄: 1 if same community, 0 otherwise                          β”‚
β”‚                                                                 β”‚
β”‚  Intuition: Q > 0 means more edges within communities           β”‚
β”‚            than expected by random chance                       β”‚
β”‚                                                                 β”‚
β”‚  Good partition: Q β‰ˆ 0.3 to 0.7                                 β”‚
β”‚                                                                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Louvain Algorithm

Fast and effective community detection:

JAVA(127 lines)
Code
Loading syntax highlighter...

Label Propagation

Simpler algorithm, often faster:

JAVA(86 lines)
Code
Loading syntax highlighter...

πŸ”¬ Deep Dive: Graph-Based Recommendations

Collaborative Filtering with Graphs

JAVA(95 lines)
Code
Loading syntax highlighter...

Random Walk Recommendations

JAVA(56 lines)
Code
Loading syntax highlighter...

πŸ”¬ Deep Dive: Graph Databases (Neo4j Patterns)

Basic Cypher Queries

CYPHER(30 lines)
Code
Loading syntax highlighter...

Java Integration with Neo4j

JAVA(51 lines)
Code
Loading syntax highlighter...

⚠️ Common Mistakes

Mistake 1: PageRank Dangling Nodes

JAVA(16 lines)
Code
Loading syntax highlighter...

Mistake 2: Community Detection Convergence

JAVA(15 lines)
Code
Loading syntax highlighter...

Mistake 3: Jaccard Similarity Division by Zero

JAVA(6 lines)
Code
Loading syntax highlighter...

πŸ› Debug This: PageRank Convergence Issue

This PageRank implementation doesn't converge properly. Can you find the bug?

JAVA(39 lines)
Code
Loading syntax highlighter...
πŸ” Click to reveal the bug
Bug: Using rank[v] instead of rank[u] in the contribution calculation.
JAVA(5 lines)
Code
Loading syntax highlighter...
Why this matters:
PageRank formula: PR(v) = (1-d)/n + d Γ— Ξ£ PR(u)/out(u) for all u linking to v
The contribution to v comes from u's rank, not v's own rank. Using rank[v] makes the update self-referential and prevents proper propagation of rank through the graph.
The symptom: Ranks don't stabilize properly, or all nodes end up with nearly equal rank regardless of graph structure.

πŸ’» Exercises

Exercise 1: Influence Maximization ⭐⭐

Find k nodes that, if they adopt a product, maximize spread through the network (greedy approximation).

JAVA(6 lines)
Code
Loading syntax highlighter...

Predict which edges are likely to form in a social network.

JAVA(9 lines)
Code
Loading syntax highlighter...

Exercise 3: Community Evolution ⭐⭐⭐

Track how communities change over time given snapshots of the network.

JAVA(7 lines)
Code
Loading syntax highlighter...

Exercise 4: Personalized Recommendations ⭐⭐⭐

Implement Personalized PageRank for item recommendations.

JAVA(7 lines)
Code
Loading syntax highlighter...

Exercise 5: Network Robustness ⭐⭐⭐⭐

Analyze how the network's connectivity degrades as nodes fail.

JAVA(7 lines)
Code
Loading syntax highlighter...

🎀 Interview Questions

Q1: "How would you find the most influential person in a social network?"

Answer: It depends on what "influential" means:
Degree centrality (most connections):
JAVA(4 lines)
Code
Loading syntax highlighter...
PageRank (endorsed by important people):
JAVA(3 lines)
Code
Loading syntax highlighter...
Betweenness centrality (controls information flow):
JAVA(3 lines)
Code
Loading syntax highlighter...
For viral marketing: Use influence maximization (greedy + Monte Carlo).

Q2: "Design a 'People You May Know' feature."

Answer:
JAVA(30 lines)
Code
Loading syntax highlighter...
Considerations:
  • Privacy: Don't reveal why someone is recommended
  • Diversity: Don't recommend only from one circle
  • Freshness: Boost recently active users

Q3: "Explain how PageRank handles the 'spider trap' problem."

Answer: A spider trap is a group of pages that link only to each other, which would accumulate all PageRank over time.
Spider trap:    Normal web:
A β†’ B β†’ C       A β†’ B β†’ C
↑     ↓         ↓   ↓   ↓
β””β”€β”€β”€β”€β”€β”˜         external links
Solution: The damping factor (typically 0.85):
PR(A) = (1-d)/N + d Γ— Ξ£ PR(v)/out(v)
         ↑                    ↑
    Random jump          Link following
    (15% chance)         (85% chance)

With probability 15%, the surfer teleports to a random page, escaping the trap. This ensures:

  1. All pages have minimum rank (1-d)/N
  2. Rank flows out of traps through random jumps
  3. Algorithm converges to unique solution

Q4: "How do you detect fake accounts in a social network?"

Answer: Graph-based features reveal anomalies:
JAVA(40 lines)
Code
Loading syntax highlighter...

Q5: "How would you implement real-time recommendations at scale?"

Answer:
Architecture:
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                                                                 β”‚
β”‚  1. Offline Pipeline (daily/hourly)                             β”‚
β”‚     - Compute PageRank, communities                             β”‚
β”‚     - Build similarity indices                                  β”‚
β”‚     - Store in Redis/Cassandra                                  β”‚
β”‚                                                                 β”‚
β”‚  2. Near-real-time (minutes)                                    β”‚
β”‚     - Stream processing (Kafka + Flink)                         β”‚
β”‚     - Update user activity vectors                              β”‚
β”‚     - Incremental graph updates                                 β”‚
β”‚                                                                 β”‚
β”‚  3. Real-time (milliseconds)                                    β”‚
β”‚     - Pre-computed candidates from Redis                        β”‚
β”‚     - Light re-ranking with fresh signals                       β”‚
β”‚     - A/B test different algorithms                             β”‚
β”‚                                                                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
JAVA(23 lines)
Code
Loading syntax highlighter...

πŸ“‹ Quick Reference

Algorithm Complexity

AlgorithmTimeSpaceUse Case
Degree CentralityO(V)O(1)Find popular nodes
BetweennessO(VE)O(V)Find brokers
PageRankO(EΓ—iter)O(V)Rank importance
LouvainO(EΓ—iter)O(V)Community detection
Label PropagationO(EΓ—iter)O(V)Fast communities

Key Formulas

Degree Centrality:       C_D(v) = degree(v) / (n-1)
Closeness Centrality:    C_C(v) = (n-1) / Ξ£ d(v,u)
Betweenness Centrality:  C_B(v) = Ξ£ Οƒ_st(v) / Οƒ_st
PageRank:                PR(v) = (1-d)/n + d Γ— Ξ£ PR(u)/out(u)
Modularity:              Q = (1/2m) Γ— Ξ£ [A_ij - k_iΓ—k_j/2m] Γ— Ξ΄(c_i,c_j)
Jaccard Similarity:      J(A,B) = |A ∩ B| / |A βˆͺ B|

Decision Guide

Need to find important nodes?
β”œβ”€β”€ Most connected β†’ Degree centrality
β”œβ”€β”€ Information brokers β†’ Betweenness centrality
β”œβ”€β”€ Endorsed by important β†’ PageRank
└── Quick access to all β†’ Closeness centrality

Need to find groups?
β”œβ”€β”€ High quality β†’ Louvain (modularity optimization)
β”œβ”€β”€ Fast, large graphs β†’ Label Propagation
└── Overlapping communities β†’ BigCLAM, DEMON

Need recommendations?
β”œβ”€β”€ User-based β†’ Find similar users
β”œβ”€β”€ Item-based β†’ Find similar items
└── Graph-based β†’ Random walks, PPR

πŸ”— What's Next?

In Part 16: Dynamic Programming Foundations, we'll explore:
  • Optimal substructure and overlapping subproblems
  • Memoization vs tabulation
  • State definition techniques
  • Classic problems: Fibonacci, climbing stairs

πŸ“… Review Schedule

DayFocusTime
0Full read + exercises90 min
1PageRank implementation15 min
3Community detection code20 min
7Build simple recommender25 min
14Debug exercise + centrality15 min
30Interview questions10 min