📝 Problem Description
Design a document management system like Google Docs or Notion that allows users to create, edit, organize, and share documents. Support real-time collaboration, version history, and access control.
👤 Use Cases
1. User wants to create a document so that the new document is saved and accessible
2. User wants to edit a document so that changes are auto-saved
3. User wants to share a document with a team so that team members can view or edit based on permissions
4. Multiple users want to edit the same document so that their changes merge in real time
5. User wants to view version history so that they can see and restore previous versions
✅ Functional Requirements
- Create, read, update, delete documents
- Rich text editing (formatting, images, tables)
- Organize documents in folders/workspaces
- Share documents with permission levels
- Real-time collaborative editing
- Version history and restore
- Full-text search across documents
- Comments and suggestions
⚡ Non-Functional Requirements
- Real-time sync latency < 100ms
- Support 10M documents
- Handle 100K concurrent editors
- No data loss (strong durability)
- 99.9% availability
⚠️ Constraints & Assumptions
- Documents can be up to 10MB
- Version history kept for 1 year
- Maximum 100 concurrent editors per document
📊 Capacity Estimation
👥 Users
1M total users, 100K DAU
💾 Storage
1TB for current document content (10M docs × 100KB avg), excluding version history
⚡ QPS
Document reads: 1K/sec, Edits: 500/sec, Searches: 100/sec
📐 Assumptions
- 10M total documents
- Average document size: 100KB
- 10 versions per document on average
- 5% of documents edited per day
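The figures above can be cross-checked with a few lines of arithmetic. This is a rough sketch, not a sizing model: it uses decimal units (1KB = 1,000 bytes), and the versioned-storage number assumes every version is stored as a full copy, which is a worst case that snapshots plus operation logs would reduce.

```python
# Back-of-envelope check of the capacity figures stated above.
TOTAL_DOCS = 10_000_000
AVG_DOC_BYTES = 100 * 1000        # 100KB, decimal units (assumption)
VERSIONS_PER_DOC = 10
DAILY_EDIT_PERCENT = 5
EDITS_PER_SEC = 500
SECONDS_PER_DAY = 86_400

# Storage for the latest version of every document.
current_storage_tb = TOTAL_DOCS * AVG_DOC_BYTES / 1e12

# Worst case: every version kept as a full copy (assumption).
versioned_storage_tb = current_storage_tb * VERSIONS_PER_DOC

# How many documents see at least one edit per day.
docs_edited_per_day = TOTAL_DOCS * DAILY_EDIT_PERCENT // 100

# Average edit operations per edited document per day implied by the QPS figure.
edits_per_edited_doc = EDITS_PER_SEC * SECONDS_PER_DAY / docs_edited_per_day

print(f"current storage:   {current_storage_tb:.1f} TB")
print(f"with full history: {versioned_storage_tb:.1f} TB (upper bound)")
print(f"docs edited/day:   {docs_edited_per_day:,}")
print(f"edits per doc/day: {edits_per_edited_doc:.1f}")
```

Under these assumptions the current content is 1 TB, full-copy history is at most 10 TB, and 500 edits/sec spread over 500,000 edited documents works out to roughly 86 edit operations per edited document per day.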
💡 Key Concepts
CRITICAL
Operational Transformation (OT)
An algorithm that transforms concurrent edits against each other so that every user's copy of the document converges in real time.
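A minimal sketch of the transform step may help when explaining OT at a high level. It handles only two operation types on a plain string (text insert and single-character delete); the `Insert`, `Delete`, `transform`, and `converge` names and the `site` tie-breaker are illustrative assumptions, and it assumes concurrent operations come from different sites. A real system would also track revisions and operation history on a server.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Insert:
    pos: int
    text: str
    site: int  # hypothetical client id, used only to break ties


@dataclass(frozen=True)
class Delete:
    pos: int  # removes the single character at pos


Op = Union[Insert, Delete]


def apply_op(doc: str, op: Optional[Op]) -> str:
    if op is None:  # operation was cancelled out by transform
        return doc
    if isinstance(op, Insert):
        return doc[:op.pos] + op.text + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(op: Op, against: Op) -> Optional[Op]:
    """Rewrite `op` so it keeps its intent after `against` has been applied."""
    if isinstance(against, Insert):
        n = len(against.text)
        if isinstance(op, Insert):
            # Same position: the lower site id goes first.
            before = (against.pos, against.site) < (op.pos, op.site)
            return Insert(op.pos + n, op.text, op.site) if before else op
        return Delete(op.pos + n) if against.pos <= op.pos else op
    # `against` is a Delete
    if isinstance(op, Insert):
        return Insert(op.pos - 1, op.text, op.site) if against.pos < op.pos else op
    if against.pos == op.pos:
        return None  # both sides deleted the same character
    return Delete(op.pos - 1) if against.pos < op.pos else op


def converge(doc: str, a: Op, b: Op) -> tuple[str, str]:
    """Apply a then transformed b, and b then transformed a."""
    via_a = apply_op(apply_op(doc, a), transform(b, a))
    via_b = apply_op(apply_op(doc, b), transform(a, b))
    return via_a, via_b
```

For example, `converge("ab", Insert(1, "X", 1), Insert(1, "Y", 2))` returns `("aXYb", "aXYb")`: both application orders end in the same text, which is the convergence property OT is meant to provide.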
HIGH
CRDTs
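Conflict-free Replicated Data Types: an alternative to OT in which replicas merge concurrent edits without central coordination and still converge to the same state (eventual consistency).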
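The core CRDT idea is a merge function that is commutative, associative, and idempotent, so replicas can exchange state in any order. The sketch below is deliberately simplified: it is a last-writer-wins map keyed by block id (a Notion-style block model is assumed), which resolves conflicts per whole block rather than per character as real text CRDTs do. The `write` and `merge` helpers and the `(timestamp, site)` tag are hypothetical, and tags are assumed to be unique.

```python
# A replica maps block_id -> (tag, text), where tag = (timestamp, site).
# Tags are assumed unique: a site never reuses a timestamp.

def merge(a: dict, b: dict) -> dict:
    """Per block, keep the entry with the larger (timestamp, site) tag."""
    merged = dict(a)
    for block_id, entry in b.items():
        if block_id not in merged or entry[0] > merged[block_id][0]:
            merged[block_id] = entry
    return merged


def write(replica: dict, block_id: str, text: str, timestamp: int, site: str) -> dict:
    """A local write is just a merge with a one-entry state."""
    return merge(replica, {block_id: ((timestamp, site), text)})


# Two replicas edit concurrently while disconnected.
replica_a = write({}, "b1", "Hello", 1, "A")
replica_b = write({}, "b1", "Hi", 1, "B")
replica_b = write(replica_b, "b2", "World", 2, "B")

# Merging in either order yields the same state.
ab = merge(replica_a, replica_b)
ba = merge(replica_b, replica_a)
```

Here `ab` and `ba` are equal, and merging a state with itself changes nothing. The block `b1` ends up as `"Hi"` because the tag `(1, "B")` sorts after `(1, "A")`, so the other write is silently dropped; that loss is the trade-off of last-writer-wins and the reason character-level sequence CRDTs are used for text.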
HIGH
Version Snapshots
Periodic full snapshots plus an operation log: a version is restored by loading the nearest earlier snapshot and replaying the operations recorded after it.
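The snapshot-plus-log idea can be sketched as follows. This is an in-memory illustration only: the `VersionStore` class, the insert-only operation format `(pos, text)`, and the snapshot interval of 3 are assumptions chosen to keep the example short.

```python
class VersionStore:
    """Keeps an operation log plus a full snapshot every `snapshot_every` versions."""

    def __init__(self, snapshot_every: int = 3):
        self.snapshot_every = snapshot_every
        self.ops = []              # ops[i] turns version i into version i + 1
        self.snapshots = {0: ""}   # version number -> full document text
        self.current = ""

    @staticmethod
    def _apply(doc: str, op: tuple) -> str:
        pos, text = op
        return doc[:pos] + text + doc[pos:]

    def append(self, pos: int, text: str) -> int:
        """Record an insert and return the new version number."""
        self.current = self._apply(self.current, (pos, text))
        self.ops.append((pos, text))
        version = len(self.ops)
        if version % self.snapshot_every == 0:
            self.snapshots[version] = self.current
        return version

    def restore(self, version: int) -> str:
        """Rebuild a version from the nearest earlier snapshot plus log replay."""
        if not 0 <= version <= len(self.ops):
            raise ValueError(f"unknown version {version}")
        base = max(v for v in self.snapshots if v <= version)
        doc = self.snapshots[base]
        for op in self.ops[base:version]:
            doc = self._apply(doc, op)
        return doc


store = VersionStore()
for pos, text in [(0, "a"), (1, "b"), (2, "c"), (3, "d"), (0, "X")]:
    store.append(pos, text)
```

With this data, `store.restore(4)` starts from the snapshot at version 3 (`"abc"`) and replays one operation to get `"abcd"`, rather than replaying all four. The snapshot interval trades storage for restore time: a smaller interval stores more full copies but bounds replay length more tightly.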
HIGH
Access Control Lists
Fine-grained permissions (owner, editor, commenter, viewer).
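One simple way to model the four permission levels is as an ordered role, where each action requires a minimum role. The sketch below assumes the roles are strictly hierarchical (owner ⊇ editor ⊇ commenter ⊇ viewer); the action names and the rule that only owners may share or delete are illustrative assumptions, not requirements stated above.

```python
from enum import IntEnum


class Role(IntEnum):
    VIEWER = 1
    COMMENTER = 2
    EDITOR = 3
    OWNER = 4


# Minimum role needed for each action (hypothetical policy).
REQUIRED_ROLE = {
    "view": Role.VIEWER,
    "comment": Role.COMMENTER,
    "edit": Role.EDITOR,
    "share": Role.OWNER,
    "delete": Role.OWNER,
}


def can(acl: dict, user: str, action: str) -> bool:
    """Return True if `user` holds at least the role `action` requires."""
    role = acl.get(user)
    return role is not None and role >= REQUIRED_ROLE[action]


# Per-document ACL: user id -> role.
doc_acl = {"alice": Role.OWNER, "bob": Role.EDITOR, "carol": Role.VIEWER}
```

For example, `can(doc_acl, "bob", "edit")` is `True`, `can(doc_acl, "carol", "comment")` is `False`, and a user absent from the ACL is denied every action. A fuller design would likely add group or workspace-level grants and link sharing on top of this per-user check.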
💡 Interview Tips
- 💡 OT/CRDT is the core complexity; explain it at a high level
- 💡 Version history is important for enterprise customers
- 💡 The permission model matters for sharing