📝 Problem Description
Design a system to compute and display the most shared articles in real-time. Handle millions of share events per minute and maintain accurate top-K rankings with minimal latency.
👤 Use Cases
1.
User wants to shares article so that share event recorded
2.
System wants to updates ranking so that top-K list refreshed
3.
User wants to views trending page so that sees current top articles
✅ Functional Requirements
- •Track article share counts in real-time
- •Compute top-K for multiple time windows (1h, 24h, 7d)
- •Handle different share types (FB, Twitter, Email)
- •Display trending articles with share counts
- •Decay older shares for freshness
⚡ Non-Functional Requirements
- •Process 1M shares/minute
- •Top-K update latency < 5 seconds
- •Serve trending page < 100ms
- •High accuracy (eventual consistency acceptable)
⚠️ Constraints & Assumptions
- •Exact counts expensive at scale
- •Top-K must be fresh
- •Millions of articles compete
📊 Capacity Estimation
👥 Users
100M daily shares
💾 Storage
10GB (article metadata, counts)
⚡ QPS
Shares: 15K/sec, Trending: 100K/sec
📐 Assumptions
- • 100M shares per day
- • 10M unique articles shared
- • Top 100 displayed
- • Multiple time windows
💡 Key Concepts
HIGH
Count-Min Sketch
Approximate counting with bounded error.
CRITICAL
Heap-based Top-K
Maintain top K using min-heap of size K.
HIGH
Sliding Windows
Time-decayed windows for freshness.
MEDIUM
Approximate Algorithms
Trade accuracy for speed/memory.
💡 Interview Tips
- 💡Start with the streaming constraint
- 💡Discuss Count-Min Sketch and Space-Saving
- 💡Emphasize the memory efficiency
- 💡Be prepared to discuss windowing
- 💡Know the accuracy vs memory tradeoff
- 💡Understand the heap-based top K extraction