Very Hard | ~60 min | Messaging Systems

Design Distributed Message Streaming Platform

LinkedIn, Confluent, AWS, Google, Uber

Problem Description

Design a distributed message streaming platform like Apache Kafka. Support high-throughput message ingestion, durable storage, consumer groups, and exactly-once semantics.

Use Cases

1. A producer publishes a message so that it is stored durably.
2. A consumer reads messages so that it receives them in order.
3. A consumer group scales out its consumers so that partitions are distributed among them.
4. A stream processor transforms data and writes the results to a new topic.

Functional Requirements

  • Publish messages to topics
  • Topic partitioning for parallelism
  • Consumer groups with rebalancing
  • Message retention (time or size based)
  • Seek to offset/timestamp
  • Replication for durability
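
To make the retention requirement concrete, here is a minimal sketch of time- and size-based retention applied to log segments. The `Segment` fields, the `apply_retention` function, and the rule that the newest (active) segment is never deleted are assumptions made for illustration, not a description of any particular broker's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    base_offset: int        # offset of the first message in the segment
    size_bytes: int         # bytes stored in the segment
    last_timestamp_ms: int  # timestamp of the newest message in the segment


def apply_retention(
    segments: List[Segment],
    now_ms: int,
    retention_ms: Optional[int] = None,
    retention_bytes: Optional[int] = None,
) -> List[Segment]:
    """Return the segments that survive retention.

    Assumes `segments` is sorted oldest first. The newest segment is
    treated as the active one and is always kept.
    """
    kept = list(segments)
    # Time-based: drop old segments whose newest message is past the cutoff.
    if retention_ms is not None:
        while len(kept) > 1 and now_ms - kept[0].last_timestamp_ms > retention_ms:
            kept.pop(0)
    # Size-based: drop oldest segments until the total fits the budget.
    if retention_bytes is not None:
        while len(kept) > 1 and sum(s.size_bytes for s in kept) > retention_bytes:
            kept.pop(0)
    return kept
```

Deleting whole segments rather than individual messages keeps retention cheap: the broker only ever removes files from the head of the log.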

Non-Functional Requirements

  • Throughput: 1M messages/sec
  • Latency: < 10 ms for publish
  • Durability: No message loss
  • Ordering: Per-partition guarantee

Constraints & Assumptions

  • Ordering only within partition
  • Must handle broker failures
  • Consumer rebalancing can cause delays
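
Because ordering holds only within a partition, messages that must stay ordered relative to each other need to land in the same partition. One common way to achieve that is hashing a message key. The sketch below uses CRC32 purely as a stand-in for a stable hash (Kafka's default partitioner uses murmur2); `partition_for` is a hypothetical helper name.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition with a stable hash.

    CRC32 is used only for illustration; any hash that is stable across
    processes and restarts would serve the same purpose.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key) % num_partitions


# Every message for the same key goes to the same partition,
# so per-key ordering follows from per-partition ordering.
p1 = partition_for(b"user-42", 8)
p2 = partition_for(b"user-42", 8)
```

Note that changing the partition count changes the modulus, so existing keys may map to different partitions afterwards.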

Capacity Estimation

Users
1,000 producers, 10,000 consumers
Storage
~605 TB before replication, ~1.8 PB at replication factor 3 (1M messages/sec × 1 KB × 7-day retention)
QPS
Writes: 1M/sec, Reads: 5M/sec
Assumptions
  • 1M messages/sec ingestion
  • Average message: 1 KB
  • 7-day retention
  • 10,000 partitions across 100 topics
  • Replication factor: 3
  • 5:1 read-to-write ratio (consumers replay)
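
The assumptions above can be turned into a quick back-of-the-envelope calculation. This sketch assumes decimal units (1 KB = 1,000 bytes), an even spread of traffic across partitions, and no compression; compression or smaller messages would lower the storage figures.

```python
MESSAGES_PER_SEC = 1_000_000
MESSAGE_BYTES = 1_000          # assumption: 1 KB = 1,000 bytes
RETENTION_DAYS = 7
REPLICATION_FACTOR = 3
READ_WRITE_RATIO = 5
PARTITIONS = 10_000
SECONDS_PER_DAY = 86_400

# Ingest bandwidth before replication: 1 GB/s.
ingest_bytes_per_sec = MESSAGES_PER_SEC * MESSAGE_BYTES

# Unique data retained over the retention window.
retained_bytes = ingest_bytes_per_sec * RETENTION_DAYS * SECONDS_PER_DAY

# Disk footprint once every message is stored on three replicas.
replicated_bytes = retained_bytes * REPLICATION_FACTOR

# Consumer read rate implied by the 5:1 read-to-write ratio.
read_msgs_per_sec = MESSAGES_PER_SEC * READ_WRITE_RATIO

# Average write rate per partition, assuming an even spread.
msgs_per_partition_per_sec = MESSAGES_PER_SEC / PARTITIONS

TB = 10**12
print(f"ingest: {ingest_bytes_per_sec / 10**9:.1f} GB/s")
print(f"retained: {retained_bytes / TB:.1f} TB")
print(f"with replication: {replicated_bytes / TB:.1f} TB")
print(f"reads: {read_msgs_per_sec:,} msg/s")
print(f"per partition: {msgs_per_partition_per_sec:.0f} msg/s")
```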

Key Concepts

CRITICAL
Log-Structured Storage
Append-only log segments for sequential I/O.
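
As a rough illustration of the idea, the sketch below models one partition as an append-only log split into segments, with offsets assigned sequentially. In-memory lists stand in for segment files on disk, and the `PartitionLog` class and its `segment_max_messages` parameter are hypothetical names chosen for this example.

```python
import bisect
from typing import List


class PartitionLog:
    """Append-only log for one partition, rolled into fixed-size segments."""

    def __init__(self, segment_max_messages: int = 3):
        self.segment_max_messages = segment_max_messages
        self.segments: List[List[bytes]] = [[]]  # each segment holds payloads
        self.base_offsets: List[int] = [0]       # first offset in each segment
        self.next_offset = 0

    def append(self, payload: bytes) -> int:
        """Append to the active segment and return the assigned offset."""
        if len(self.segments[-1]) >= self.segment_max_messages:
            # Roll a new segment; older segments are never modified again.
            self.segments.append([])
            self.base_offsets.append(self.next_offset)
        self.segments[-1].append(payload)
        offset = self.next_offset
        self.next_offset += 1
        return offset

    def read(self, offset: int, max_messages: int = 10) -> List[bytes]:
        """Read up to max_messages starting at offset, in order."""
        if not 0 <= offset < self.next_offset:
            return []
        # Binary search over base offsets finds the segment holding `offset`.
        idx = bisect.bisect_right(self.base_offsets, offset) - 1
        out: List[bytes] = []
        while idx < len(self.segments) and len(out) < max_messages:
            start = offset - self.base_offsets[idx]
            take = self.segments[idx][start:start + max_messages - len(out)]
            out.extend(take)
            offset += len(take)
            idx += 1
        return out
```

Writes only ever touch the tail of the newest segment, which is what makes the I/O pattern sequential; reads locate a segment by its base offset and then scan forward.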
CRITICAL
ISR (In-Sync Replicas)
Replicas that are caught up with the leader.
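
A simplified model can show how the ISR relates to what consumers are allowed to see. In this sketch a replica counts as in sync when it trails the leader by at most `max_lag` messages, and the high watermark is the smallest log end offset across the ISR. The message-count threshold is a simplifying assumption (Kafka itself uses a time-based lag criterion), and all names here are hypothetical.

```python
from typing import Dict, List, Optional, Set


class PartitionReplicas:
    """Tracks log end offsets for the leader and followers of one partition."""

    def __init__(self, leader: str, followers: List[str], max_lag: int = 2):
        self.leader = leader
        # Log end offset = the next offset each replica would write.
        self.log_end: Dict[str, int] = {r: 0 for r in [leader, *followers]}
        self.max_lag = max_lag  # hypothetical threshold, in messages

    def leader_append(self, count: int = 1) -> None:
        self.log_end[self.leader] += count

    def follower_fetch(self, replica: str, up_to: Optional[int] = None) -> None:
        """Follower copies the leader's log, optionally only up to an offset."""
        leader_end = self.log_end[self.leader]
        self.log_end[replica] = leader_end if up_to is None else min(up_to, leader_end)

    def isr(self) -> Set[str]:
        """Replicas whose lag behind the leader is within the threshold."""
        leader_end = self.log_end[self.leader]
        return {r for r, end in self.log_end.items() if leader_end - end <= self.max_lag}

    def high_watermark(self) -> int:
        """Offsets below this value are on every in-sync replica."""
        return min(self.log_end[r] for r in self.isr())
```

A message below the high watermark survives the loss of any single in-sync replica, which is why only those messages are normally exposed to consumers.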
HIGH
Consumer Offset
Track consumption position per partition.
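
The following sketch shows one way offsets could be tracked per consumer group and partition. The `OffsetStore` class and `poll_and_process` function are hypothetical, a plain list stands in for the partition log, and committing after processing is an assumption that yields at-least-once delivery.

```python
from typing import Callable, Dict, List, Tuple


class OffsetStore:
    """Committed positions keyed by (group, topic, partition)."""

    def __init__(self) -> None:
        self._committed: Dict[Tuple[str, str, int], int] = {}

    def commit(self, group: str, topic: str, partition: int, next_offset: int) -> None:
        # Store the offset of the NEXT message the group should read.
        self._committed[(group, topic, partition)] = next_offset

    def position(self, group: str, topic: str, partition: int, default: int = 0) -> int:
        return self._committed.get((group, topic, partition), default)


def poll_and_process(
    store: OffsetStore,
    group: str,
    topic: str,
    partition: int,
    log: List[str],
    handler: Callable[[str], None],
    max_messages: int = 2,
) -> int:
    """Read a batch from the committed position, process it, then commit."""
    start = store.position(group, topic, partition)
    batch = log[start:start + max_messages]
    for message in batch:
        handler(message)
    # Committing after processing means a crash before this line causes
    # the batch to be re-read: at-least-once, not exactly-once.
    store.commit(group, topic, partition, start + len(batch))
    return len(batch)
```

Because each group keeps its own position, several groups can read the same partition independently, and a group can replay by committing an earlier offset.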
HIGH
Partition Rebalancing
Redistribute partitions when consumers change.
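
A minimal sketch of rebalancing, assuming a simple round-robin strategy that recomputes the whole assignment whenever group membership changes. The `assign` function is hypothetical; real systems may use other strategies (for example, sticky or cooperative assignment) to reduce partition movement.

```python
from typing import Dict, Iterable, List


def assign(partitions: Iterable[int], members: Iterable[str]) -> Dict[str, List[int]]:
    """Spread partitions across group members round-robin.

    Sorting both inputs makes the result deterministic, so every member
    would compute the same assignment from the same membership list.
    """
    ordered_members = sorted(members)
    if not ordered_members:
        return {}
    assignment: Dict[str, List[int]] = {m: [] for m in ordered_members}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered_members[i % len(ordered_members)]].append(partition)
    return assignment


# Two consumers share six partitions; a third joins and triggers a rebalance.
before = assign(range(6), ["c1", "c2"])
after = assign(range(6), ["c1", "c2", "c3"])
```

In this naive scheme most partitions change owner when a member joins, which illustrates why rebalancing can pause consumption for the whole group.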

Interview Tips

  • Start with the core concepts: topics, partitions, consumer groups
  • Emphasize the log-based architecture and its benefits
  • Discuss exactly-once semantics and idempotent producers
  • Be prepared to explain consumer group rebalancing
  • Know the tradeoffs between throughput and durability
  • Understand the difference between Kafka and traditional message queues
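
When discussing idempotent producers, it can help to sketch how a broker might discard duplicate retries. The example below assumes each producer attaches a producer ID and a per-partition sequence number to every message. The `DedupingPartition` class and its return values are hypothetical and cover only deduplication, not the transactional part of exactly-once semantics.

```python
from typing import Dict, List


class DedupingPartition:
    """Partition log that accepts each (producer_id, seq) at most once."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.last_seq: Dict[str, int] = {}  # producer_id -> last accepted seq

    def append(self, producer_id: str, seq: int, payload: str) -> str:
        last = self.last_seq.get(producer_id, -1)
        if seq <= last:
            # A retry of something already written: acknowledge, do not append.
            return "duplicate"
        if seq != last + 1:
            # A gap means an earlier message is missing; reject to keep order.
            return "out_of_order"
        self.log.append(payload)
        self.last_seq[producer_id] = seq
        return "appended"


partition = DedupingPartition()
results = [
    partition.append("p1", 0, "a"),
    partition.append("p1", 0, "a"),  # network retry of the same message
    partition.append("p1", 2, "c"),  # arrives before seq 1
    partition.append("p1", 1, "b"),
]
```

With this check in place, a producer can retry safely after a timeout: the retried write either lands once or is recognized as a duplicate.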