Medium · ~40 min · File Storage & Transfer

Design Large File Downloader

Dropbox, Google, Microsoft, Apple, Box

📝 Problem Description

Design a reliable file download service that handles large files efficiently. Support resume on network failure, parallel chunk downloads, and bandwidth optimization.

👤 Use Cases

1. A user requests a file download; the download begins and progress is reported.
2. A user pauses a download; its state is saved so that it can be resumed later.
3. The system detects a failure; the download resumes from the last completed chunk.

✅ Functional Requirements

  • Download files up to 100GB
  • Resume interrupted downloads
  • Parallel chunk downloading
  • Show download progress
  • Verify file integrity (checksum)
  • Bandwidth throttling option
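
One common way to provide the bandwidth throttling option is a token bucket on the client. The sketch below is an illustration under stated assumptions, not a prescribed design: `TokenBucket` and `delay_for` are hypothetical names, tokens are counted in bytes, and the clock value is passed in explicitly so the behaviour is deterministic.

```python
class TokenBucket:
    """Byte-based token bucket (hypothetical helper, not from the source).

    Tokens refill at `rate_bytes_per_sec` up to `capacity_bytes`. The bucket
    may go into debt; the debt is converted into a wait time for the caller.
    """

    def __init__(self, rate_bytes_per_sec, capacity_bytes, now=0.0):
        self.rate = float(rate_bytes_per_sec)
        self.capacity = float(capacity_bytes)
        self.tokens = float(capacity_bytes)  # start with a full burst allowance
        self.last = now

    def delay_for(self, nbytes, now):
        """Charge `nbytes` and return how many seconds the caller should wait."""
        # Refill for the time elapsed since the previous call.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now

        # Spend tokens; a negative balance means the caller is over the limit.
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate
```

A download loop would call `delay_for(len(block), time.monotonic())` after each block it reads and sleep for the returned duration before reading the next one.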

⚡ Non-Functional Requirements

  • Maximize throughput (saturate bandwidth)
  • Minimize storage for incomplete files
  • Handle millions of concurrent downloads
  • CDN integration for edge caching

⚠️ Constraints & Assumptions

  • Network can be unreliable
  • Large files don't fit in memory
  • Multiple users may download the same file at the same time
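
Because large files do not fit in memory, the integrity check from the functional requirements has to be computed incrementally. A minimal sketch, assuming SHA-256 as the checksum (the source does not name an algorithm) and a 1 MiB read buffer; `sha256_stream` and `verify_stream` are illustrative names.

```python
import hashlib
import hmac


def sha256_stream(stream, buffer_size=1024 * 1024):
    """Hash a binary stream block by block so memory use stays bounded."""
    digest = hashlib.sha256()
    while True:
        block = stream.read(buffer_size)
        if not block:  # empty bytes object signals end of stream
            break
        digest.update(block)
    return digest.hexdigest()


def verify_stream(stream, expected_hex):
    """Return True if the stream's SHA-256 matches the expected hex digest."""
    actual = sha256_stream(stream)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual.encode(), expected_hex.lower().encode())
```

For a file on disk this would be used as `with open(path, "rb") as f: verify_stream(f, expected)`. The same pattern applies per chunk if the server publishes per-chunk hashes.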

📊 Capacity Estimation

👥 Users
1M concurrent downloads
💾 Storage
100PB total files
⚡ QPS
Download requests: 10K/sec, Chunk requests: 1M/sec
📐 Assumptions
  • 1M concurrent downloads
  • Average file: 1GB
  • Chunk size: 5MB
  • Average download time: 10 minutes
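
The assumptions above can be cross-checked with Little's law (arrival rate = concurrency / average duration). The sketch below uses decimal units (1 GB = 1000 MB), which is an assumption. Under these inputs the steady-state rates come out near 1.7K download starts/sec and 333K chunk requests/sec, so the 10K/sec and 1M/sec figures stated above are probably best read as peak targets with headroom. That reading is an interpretation, not something the source states.

```python
def estimate(concurrent, avg_file_bytes, chunk_bytes, avg_seconds):
    """Back-of-the-envelope steady-state load from the stated assumptions."""
    # Ceiling division: a partial final chunk still costs one request.
    chunks_per_file = -(-avg_file_bytes // chunk_bytes)

    # Little's law: arrivals per second = items in system / time in system.
    starts_per_sec = concurrent / avg_seconds
    chunk_requests_per_sec = starts_per_sec * chunks_per_file

    # Average throughput one download needs to finish in avg_seconds.
    per_download_bytes_per_sec = avg_file_bytes / avg_seconds
    aggregate_tbps = concurrent * per_download_bytes_per_sec * 8 / 1e12

    return {
        "chunks_per_file": chunks_per_file,
        "download_starts_per_sec": starts_per_sec,
        "chunk_requests_per_sec": chunk_requests_per_sec,
        "per_download_mbps": per_download_bytes_per_sec * 8 / 1e6,
        "aggregate_tbps": aggregate_tbps,
    }


est = estimate(
    concurrent=1_000_000,
    avg_file_bytes=1_000_000_000,  # 1 GB, decimal units assumed
    chunk_bytes=5_000_000,         # 5 MB
    avg_seconds=10 * 60,           # 10 minutes
)
```

The aggregate egress of roughly 13 Tbps is far beyond what a single origin cluster would serve, which is one way to motivate the CDN requirement.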

💡 Key Concepts

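CRITICAL: HTTP Range Requests
Request specific byte ranges of a file, so an interrupted download can resume where it stopped instead of restarting.
HIGH: Parallel Chunk Download
Fetch several chunks over separate connections at once to raise throughput, up to the point where the link or the server is saturated.
MEDIUM: Content-Addressable Storage
Key each chunk by its content hash so that identical chunks are stored only once.
HIGH: Signed URLs
Time-limited download links that carry a signature, so they cannot be altered or reused after expiry.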
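
To make the signed URL concept concrete, here is a minimal sketch of one possible scheme, assuming an HMAC-SHA256 signature over the URL path and an expiry timestamp. The query parameter names `expires` and `sig`, the function names, and the example host are all hypothetical; real CDNs and object stores each define their own formats.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(path, expires_at, secret):
    """HMAC-SHA256 over the path and expiry, hex encoded."""
    message = f"{path}\n{expires_at}".encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(base_url, secret, expires_at):
    """Append an expiry (Unix seconds) and signature to a query-free URL."""
    sig = _signature(urlparse(base_url).path, expires_at, secret)
    return f"{base_url}?{urlencode({'expires': expires_at, 'sig': sig})}"


def verify_url(url, secret, now):
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters
    if now > expires_at:
        return False  # link has expired
    expected = _signature(parsed.path, expires_at, secret)
    # Constant-time comparison of the two signatures.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

Because the signature covers the path, a link issued for one file cannot be rewritten to fetch another, and the edge can validate it without calling back to the origin.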

💡 Interview Tips

  • Start with the HTTP range request mechanism
  • Discuss the chunk-based download strategy
  • Emphasize the resume capability
  • Be prepared to discuss integrity verification
  • Know the tradeoffs between chunk sizes
  • Understand the parallel download optimization
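
The range request, chunking, resume, and parallelism points above fit together in a small amount of code. The following is a sketch under stated assumptions rather than a reference client: `fetch` is a hypothetical callable that performs the HTTP GET with the given headers and returns the body bytes (a server that honours the range replies with 206 Partial Content), and `completed` is a set of chunk indexes that a real client would persist to disk.

```python
from concurrent.futures import ThreadPoolExecutor


def plan_chunks(file_size, chunk_size, completed=frozenset()):
    """Return (index, start, end_inclusive) for every chunk still missing."""
    chunks = []
    for index, start in enumerate(range(0, file_size, chunk_size)):
        end = min(start + chunk_size, file_size) - 1  # Range ends are inclusive
        if index not in completed:
            chunks.append((index, start, end))
    return chunks


def range_header(start, end):
    """Build the HTTP Range header for an inclusive byte range."""
    return {"Range": f"bytes={start}-{end}"}


def download_missing(fetch, out, file_size, chunk_size, completed, workers=4):
    """Fetch missing chunks in parallel and write each at its own offset.

    `out` is a seekable binary file object. Only the calling thread writes,
    so no lock is needed. Caveat: pool.map submits every chunk up front, so
    a production client would bound the number of in-flight chunks.
    """
    def work(chunk):
        index, start, end = chunk
        body = fetch(range_header(start, end))
        if len(body) != end - start + 1:
            raise IOError(f"chunk {index}: short or oversized response")
        return index, start, body

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for index, start, body in pool.map(work, plan_chunks(
                file_size, chunk_size, completed)):
            out.seek(start)
            out.write(body)
            completed.add(index)  # persist this set to survive restarts
```

Resuming after a failure is then just calling `download_missing` again with the saved `completed` set. Smaller chunks lose less work on failure but cost more requests; larger chunks do the reverse.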