Design Instagram
Design a photo-sharing social media platform like Instagram. Users can upload photos, follow others, like posts, and view a personalized feed.
Instagram requires: (1) Object storage plus a CDN for image storage and delivery, (2) News feed generation algorithm, (3) Follower/following graph storage, (4) Real-time notifications, (5) Scalable image processing pipeline.
Consider read vs write patterns - which is more frequent?
How do you handle celebrity users with millions of followers?
Think about image sizes: original, thumbnails, different resolutions.
How would you ensure feed generation is fast for all users?
functional
- Upload photos/videos
- Follow/unfollow users
- Like and comment on posts
- Generate personalized news feed
- Search users and hashtags
- Real-time notifications
- Stories feature (24hr expiry)
non-functional
- Handle 500M daily active users
- Low latency feed generation (<300ms)
- High availability for uploads
- Consistent user experience globally
posts
100M photos uploaded per day
users
1B total users, 500M DAU
storage
2MB avg photo × 100M = 200TB/day
bandwidth
Upload: ~2.3GB/s average (200TB/day ÷ 86,400s), View: ~23GB/s (10:1 read-to-write ratio)
feed reads
500M users × 20 feed loads = 10B requests/day
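The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch assuming decimal units (1MB = 10^6 bytes) and traffic spread evenly over the day; the variable names are illustrative, and real peak load would be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

# Inputs taken from the estimates above
photos_per_day = 100_000_000        # 100M uploads/day
avg_photo_bytes = 2 * 10**6         # 2MB average photo (decimal units)
daily_active_users = 500_000_000    # 500M DAU
feed_loads_per_user = 20
read_write_ratio = 10               # views : uploads

daily_bytes = photos_per_day * avg_photo_bytes
storage_per_day_tb = daily_bytes / 10**12
upload_gb_per_s = daily_bytes / SECONDS_PER_DAY / 10**9
view_gb_per_s = upload_gb_per_s * read_write_ratio
feed_requests_per_day = daily_active_users * feed_loads_per_user
feed_requests_per_s = feed_requests_per_day / SECONDS_PER_DAY

print(f"storage:       {storage_per_day_tb:.0f} TB/day")
print(f"upload:        {upload_gb_per_s:.1f} GB/s average")
print(f"view:          {view_gb_per_s:.1f} GB/s average")
print(f"feed requests: {feed_requests_per_day:,} per day "
      f"(~{feed_requests_per_s:,.0f}/s average)")
```

Dividing by 86,400 is the step that is easiest to get wrong by an order of magnitude, so it is worth keeping explicit.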
components
- API Gateway - authentication, rate limiting
- Upload Service - image processing, CDN upload
- Feed Generation Service - fanout-on-write or fanout-on-read
- Graph Service - follower relationships
- Notification Service - push notifications
- Search Service - Elasticsearch
- Object Storage - S3 for images, served through a CDN
database schema
feed cache
user_id, post_ids[] (Redis sorted set by timestamp)
likes table
user_id, post_id, created_at
posts table
post_id, user_id, image_url, caption, created_at
users table
user_id, username, profile_pic, bio, created_at
follows table
follower_id, followee_id, created_at
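As a rough illustration, the relational tables above could be declared as follows. This sketch uses an in-memory SQLite database purely so it runs anywhere; the column types, the composite primary keys, and the two indexes are assumptions added for the example and are not part of the schema as listed. The feed cache is omitted because it lives in Redis rather than in a relational table.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    user_id     INTEGER PRIMARY KEY,
    username    TEXT NOT NULL UNIQUE,
    profile_pic TEXT,
    bio         TEXT,
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE posts (
    post_id    INTEGER PRIMARY KEY,
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    image_url  TEXT NOT NULL,
    caption    TEXT,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- Assumed index: fetch one author's newest posts (used by fanout-on-read)
CREATE INDEX idx_posts_user_created ON posts (user_id, created_at DESC);
CREATE TABLE follows (
    follower_id INTEGER NOT NULL REFERENCES users(user_id),
    followee_id INTEGER NOT NULL REFERENCES users(user_id),
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (follower_id, followee_id)
);
-- Assumed index: list the followers of an author (used by fanout-on-write)
CREATE INDEX idx_follows_followee ON follows (followee_id);
CREATE TABLE likes (
    user_id    INTEGER NOT NULL REFERENCES users(user_id),
    post_id    INTEGER NOT NULL REFERENCES posts(post_id),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id, post_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO users (user_id, username) VALUES (1, 'alice'), (2, 'bob')")
conn.execute("INSERT INTO follows (follower_id, followee_id) VALUES (2, 1)")
conn.execute("INSERT INTO posts (post_id, user_id, image_url) VALUES (10, 1, 'img/10.jpg')")
conn.execute("INSERT INTO likes (user_id, post_id) VALUES (2, 10)")
# The composite primary key makes a repeated like a no-op instead of a duplicate row
conn.execute("INSERT OR IGNORE INTO likes (user_id, post_id) VALUES (2, 10)")

like_count = conn.execute(
    "SELECT COUNT(*) FROM likes WHERE post_id = 10"
).fetchone()[0]
follower_ids = [row[0] for row in conn.execute(
    "SELECT follower_id FROM follows WHERE followee_id = 1"
)]
```

Keying `likes` and `follows` on the pair of IDs is one way to make likes and follows idempotent; at the scale described here these tables would be sharded rather than held in a single database.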
feed generation
fanout on read
cons
Slow read (must query and merge)
pros
Fast write, saves storage
description
Compute feed when user requests it
implementation
Query recent posts from followed users, rank
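The query-and-merge step can be sketched as a k-way merge over each followed user's timeline. The dictionaries below are hypothetical in-memory stand-ins for the follows and posts tables, timestamps are plain integers, and ranking is reduced to newest-first ordering.

```python
import heapq
from itertools import islice

# Hypothetical stand-ins for the follows and posts tables.
# Each author's posts are stored newest-first as (created_at, post_id).
following = {"alice": ["bob", "carol"]}
posts_by_author = {
    "bob": [(105, "b2"), (100, "b1")],
    "carol": [(103, "c2"), (101, "c1")],
}

def feed_on_read(user, limit=3):
    """Merge the timelines of everyone `user` follows, newest first."""
    timelines = [posts_by_author.get(author, []) for author in following.get(user, [])]
    # heapq.merge is lazy, so only about `limit` items are pulled from the inputs
    merged = heapq.merge(*timelines, key=lambda post: post[0], reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]

feed = feed_on_read("alice")
print(feed)
```

The cost of this read grows with the number of accounts followed, which is the "slow read" listed under cons.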
fanout on write
cons
Slow write for users with many followers (celebrities)
pros
Fast read (already computed), simple
description
Pre-compute feed when post is created
implementation
Push post to all followers' feed cache
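A minimal sketch of the push step, assuming an in-memory dictionary as a stand-in for the per-user Redis sorted set. The names `followers`, `feed_cache`, and `FEED_MAX_LEN` are illustrative; with Redis, the insert and trim would roughly correspond to `ZADD` and `ZREMRANGEBYRANK`.

```python
import bisect
from collections import defaultdict

FEED_MAX_LEN = 3  # hypothetical cap on cached entries per user

followers = {"alice": ["bob", "carol"]}
# user -> list of (created_at, post_id), kept sorted oldest to newest
feed_cache = defaultdict(list)

def fanout_on_write(author, post_id, created_at):
    """Push a new post into every follower's cached feed."""
    for follower in followers.get(author, []):
        feed = feed_cache[follower]
        bisect.insort(feed, (created_at, post_id))
        del feed[:-FEED_MAX_LEN]  # drop everything except the newest entries

def read_feed(user, limit=10):
    """Reads are a slice of the precomputed list, newest first."""
    cached = feed_cache.get(user, [])
    return [post_id for _, post_id in reversed(cached[-limit:])]

for ts, pid in [(100, "p1"), (101, "p2"), (102, "p3"), (103, "p4")]:
    fanout_on_write("alice", pid, ts)

print(read_feed("bob"))
```

The write loop runs once per follower, so a single post from an account with millions of followers triggers millions of cache writes.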
hybrid approach
Use fanout-on-write for regular users, fanout-on-read for celebrities
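One way to picture the hybrid: regular authors push into follower caches at write time, while celebrity posts are stored once and merged in at read time. The threshold, the data structures, and the function names below are all assumptions made for the sketch, and posts are assumed to arrive in timestamp order.

```python
import heapq
from itertools import islice

CELEBRITY_THRESHOLD = 2  # hypothetical; a real cutoff would be far higher

followers = {"star": ["alice", "bob", "carol"], "bob": ["alice"]}
following = {"alice": ["star", "bob"]}
feed_cache = {}       # user -> [(created_at, post_id)], newest first
celebrity_posts = {}  # author -> [(created_at, post_id)], newest first

def is_celebrity(author):
    return len(followers.get(author, [])) > CELEBRITY_THRESHOLD

def publish(author, post_id, created_at):
    entry = (created_at, post_id)
    if is_celebrity(author):
        # Fanout-on-read path: store once, no per-follower writes
        celebrity_posts.setdefault(author, []).insert(0, entry)
        return
    # Fanout-on-write path: push to each follower's cache
    for follower in followers.get(author, []):
        feed_cache.setdefault(follower, []).insert(0, entry)

def read_feed(user, limit=10):
    sources = [feed_cache.get(user, [])]
    sources += [celebrity_posts.get(author, [])
                for author in following.get(user, []) if is_celebrity(author)]
    merged = heapq.merge(*sources, key=lambda post: post[0], reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]

publish("bob", "b1", 100)
publish("star", "s1", 101)
publish("bob", "b2", 102)
feed = read_feed("alice")
print(feed)
```

The read path now merges the cached feed with only the few celebrity timelines a user follows, which keeps both the write cost and the read cost bounded.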
image upload flow
1. Client uploads image to upload service
2. Service validates, generates unique ID
3. Store original in object storage (S3)
4. Async: Create thumbnails (multiple sizes)
5. Upload processed images to CDN
6. Create post record in database
7. Trigger feed fanout or mark for on-read
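The ID-generation and thumbnail steps can be sketched without any image library by planning which variants to produce and where to store them. The variant widths, the key layout under `photos/`, and the function names are hypothetical; actual resizing and the object-storage upload are out of scope here.

```python
import uuid

# Hypothetical target widths for the resized variants
VARIANT_WIDTHS = {"thumb": 160, "medium": 640, "large": 1080}

def variant_sizes(width, height):
    """Scale to each target width, preserving aspect ratio and never upscaling."""
    sizes = {}
    for name, target in VARIANT_WIDTHS.items():
        w = min(target, width)
        h = max(1, round(height * w / width))
        sizes[name] = (w, h)
    return sizes

def plan_upload(width, height, post_id=None):
    """Return the post ID and the object-storage key for each stored image."""
    post_id = post_id or uuid.uuid4().hex  # step 2: generate a unique ID
    keys = {"original": f"photos/{post_id}/original.jpg"}  # step 3
    for name, (w, h) in variant_sizes(width, height).items():  # step 4
        keys[name] = f"photos/{post_id}/{name}_{w}x{h}.jpg"
    return post_id, keys

post_id, keys = plan_upload(4000, 3000, post_id="abc123")
for name, key in keys.items():
    print(f"{name:>8}: {key}")
```

Putting the dimensions in the key is one option for making each variant independently cacheable at the CDN; a fixed naming scheme per variant would work as well.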
ranking algorithm
- Factors: Recency, likes count, commenter relationship
- Machine learning: Personalized based on user interests
- Real-time signals: Recent interactions boost score
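A hand-tuned scoring function gives a feel for how recency, like count, and relationship strength might combine. The weights, the half-life, and the `affinity` input (a 0 to 1 measure of how much the viewer interacts with the author) are assumptions for this sketch; it does not represent a learned model or any production formula.

```python
import math

# Hypothetical tuning constants
HALF_LIFE_HOURS = 6.0  # score halves every 6 hours
W_LIKES = 1.0
W_AFFINITY = 2.0

def score(age_hours, likes, affinity):
    """Exponential time decay multiplied by engagement and relationship terms."""
    recency = 0.5 ** (age_hours / HALF_LIFE_HOURS)
    engagement = W_LIKES * math.log1p(likes)  # log damps very large like counts
    return recency * (1.0 + engagement + W_AFFINITY * affinity)

candidates = [
    {"post_id": "old_viral", "age_hours": 24, "likes": 5000, "affinity": 0.0},
    {"post_id": "fresh_friend", "age_hours": 1, "likes": 3, "affinity": 0.9},
    {"post_id": "fresh_stranger", "age_hours": 1, "likes": 3, "affinity": 0.0},
]

ranked = sorted(
    candidates,
    key=lambda c: score(c["age_hours"], c["likes"], c["affinity"]),
    reverse=True,
)
ranked_ids = [c["post_id"] for c in ranked]
print(ranked_ids)
```

With these constants a fresh post from a close connection outranks a day-old viral post; a personalized model would replace the fixed weights with learned ones.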
scaling
- Shard users table by user_id
- Shard posts table by post_id
- Replicate graph database for read scaling
- Use CDN for global image delivery
- Cache hot data in Redis (trending posts, user profiles)
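Sharding by ID needs a stable mapping from key to shard. The sketch below assumes simple hash-modulo routing with a hypothetical shard count; `shard_for` is an illustrative helper, not part of any particular database.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 16  # hypothetical shard count

def shard_for(key, num_shards=NUM_SHARDS):
    """Map a user_id or post_id to a shard using a stable hash."""
    # hashlib is used instead of the built-in hash(), which is randomized
    # per process for strings and so is unsuitable for routing
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

# Sequential IDs should still spread roughly evenly across shards
distribution = Counter(shard_for(user_id) for user_id in range(10_000))
print(sorted(distribution.items()))
```

Modulo routing remaps most keys whenever the shard count changes, so designs that expect to add shards often use consistent hashing or a lookup directory instead.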