Design a URL Shortening Service
Design a system like bit.ly that converts long URLs into short, memorable links. The system should handle billions of URLs with high availability and low latency.
A URL shortener requires: (1) Unique ID generation using base62 encoding, (2) Key-value store for URL mapping, (3) Caching layer for popular URLs, (4) Analytics tracking, (5) Rate limiting to prevent abuse.
Start with a simple hash function, then consider collisions
Think about read vs write ratio - this is read-heavy
Consider what happens when a short URL is clicked millions of times
How would you prevent malicious users from exhausting your ID space?
functional requirements
- Generate unique short URL for a given long URL
- Redirect users from short URL to original URL
- Custom short URLs (optional)
- Expiration time for links
- Analytics (click tracking, referrer data)
non-functional requirements
- High availability (99.99%)
- Low latency (<100 ms redirects)
- Scalable to billions of URLs
- Handle 1000 requests/second
cache
80/20 rule: 20% of URLs generate 80% of traffic, so cache that hot 20%
storage
500 bytes/URL × 100M = 50GB/month, 600GB/year
traffic
100M new URLs/month, 10B redirects/month
bandwidth
Read-heavy: roughly 100:1 read-to-write ratio, so egress bandwidth dominates ingress by the same factor
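These figures follow from simple arithmetic; a quick back-of-the-envelope check (a sketch assuming 30-day months and the 500 bytes/URL record size above):

```python
# Back-of-the-envelope numbers for the estimates above.
SECONDS_PER_MONTH = 30 * 24 * 3600            # ~2.6 million

new_urls_per_month = 100_000_000              # 100M writes/month
redirects_per_month = 10_000_000_000          # 10B reads/month
bytes_per_record = 500                        # assumed average mapping size

write_qps = new_urls_per_month / SECONDS_PER_MONTH    # ~40 writes/sec
read_qps = redirects_per_month / SECONDS_PER_MONTH    # ~4,000 reads/sec

storage_gb_per_month = new_urls_per_month * bytes_per_record / 1e9   # ~50 GB
storage_gb_per_year = storage_gb_per_month * 12                      # ~600 GB
cache_gb = 0.2 * storage_gb_per_month                                # hot 20% -> ~10 GB

write_bw_kb_s = write_qps * bytes_per_record / 1e3    # ~20 KB/s ingress
read_bw_mb_s = read_qps * bytes_per_record / 1e6      # ~2 MB/s egress

print(f"{write_qps:.0f} writes/s, {read_qps:.0f} reads/s")
print(f"{storage_gb_per_month:.0f} GB/month, {storage_gb_per_year:.0f} GB/year, {cache_gb:.0f} GB cache")
print(f"{write_bw_kb_s:.0f} KB/s in, {read_bw_mb_s:.1f} MB/s out")
```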
data flow
1. User submits long URL
2. Generate unique short code (base62)
3. Store mapping in database
4. Return short URL to user
5. On redirect: check cache → DB → return original URL
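A minimal sketch of this flow; the `db` and `cache` clients and the `next_code` generator are hypothetical stand-ins (a base62 sketch appears later in this section), and `sho.rt` is a placeholder domain:

```python
def shorten(long_url: str, db, next_code) -> str:
    """Steps 1-4: generate a unique short code, store the mapping, return the short URL."""
    short_code = next_code()                   # e.g. base62 of a unique counter (sketched later)
    db.insert(short_code=short_code, long_url=long_url)
    return f"https://sho.rt/{short_code}"      # hypothetical short domain

def redirect(short_code: str, cache, db) -> str:
    """Step 5: check cache, fall back to the database, return the original URL."""
    long_url = cache.get(short_code) or db.lookup(short_code)
    if long_url is None:
        raise KeyError(f"unknown short code: {short_code}")
    return long_url                            # caller responds with HTTP 301/302
```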
components
- Load Balancer - distributes requests
- Application Servers - handle business logic
- Database Cluster - stores URL mappings
- Cache Layer (Redis) - fast lookups
- Analytics Service - tracks clicks
analytics
storage
Time-series database (InfluxDB) or data warehouse
tracking
Click events, referrer, geo-location, device
real time
Kafka stream processing for live stats
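One way to feed the real-time pipeline is to publish one event per redirect; a sketch assuming the kafka-python client, a broker at `kafka:9092`, and a hypothetical `clicks` topic:

```python
import json
import time

from kafka import KafkaProducer  # kafka-python client (assumed dependency)

producer = KafkaProducer(
    bootstrap_servers="kafka:9092",                     # assumed broker address
    value_serializer=lambda v: json.dumps(v).encode(),  # JSON-encode events
)

def record_click(short_code: str, referrer: str, geo: str, device: str) -> None:
    """Publish one click event per redirect; a stream job aggregates them for live stats."""
    event = {
        "short_code": short_code,
        "referrer": referrer,
        "geo": geo,
        "device": device,
        "ts_ms": int(time.time() * 1000),
    }
    producer.send("clicks", value=event)  # async send, batched by the client
```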
rate limiting
limits
10 URL creations per hour per user
algorithm
Token bucket per user/IP
implementation
Redis with INCR and EXPIRE commands
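A sketch of the Redis INCR + EXPIRE approach (strictly a fixed-window counter rather than a full token bucket, which would typically need a small Lua script); assumes the redis-py client and a local Redis instance:

```python
import redis

r = redis.Redis(host="localhost", port=6379)  # assumed Redis endpoint

LIMIT = 10        # 10 URL creations...
WINDOW = 3600     # ...per hour, per user

def allow_creation(user_id: str) -> bool:
    """Fixed-window counter via INCR + EXPIRE; returns False once the hourly limit is hit."""
    key = f"rl:create:{user_id}"
    count = r.incr(key)          # atomic increment, creates the key at 1 if absent
    if count == 1:
        r.expire(key, WINDOW)    # start the one-hour window on the first request
    return count <= LIMIT
```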
database schema
indexes
- INDEX on created_at
- INDEX on user_id
partitioning
Shard by hash(short_code) for horizontal scaling
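A minimal sketch of the shard-routing function this implies; the shard count is an assumption:

```python
from zlib import crc32

NUM_SHARDS = 16  # assumed shard count

def shard_for(short_code: str) -> int:
    """Map a short code to a database shard by hashing it."""
    return crc32(short_code.encode()) % NUM_SHARDS
```

Plain modulo hashing keeps routing trivial; consistent hashing is the usual upgrade when shards must be added without re-mapping most keys.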
url_mappings
- short_code: VARCHAR(10) PRIMARY KEY
- long_url: TEXT NOT NULL
- user_id: INTEGER
- created_at: TIMESTAMP
- expires_at: TIMESTAMP
- click_count: INTEGER DEFAULT 0
caching strategy
ttl
24 hours for popular URLs
cache layer
Redis with LRU eviction
cache warming
Preload trending URLs
what to cache
Top 20% most accessed URLs
cache invalidation
On URL deletion or expiration
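A cache-aside read path matching the strategy above, assuming the redis-py client, a local Redis instance, and a hypothetical `db_lookup` accessor:

```python
import redis

cache = redis.Redis(host="localhost", port=6379)  # assumed Redis endpoint
TTL_SECONDS = 24 * 3600                           # 24-hour TTL, per above

def resolve(short_code: str, db_lookup) -> str | None:
    """Cache-aside lookup: try Redis, fall back to the DB, then populate the cache."""
    key = f"url:{short_code}"
    cached = cache.get(key)
    if cached is not None:
        return cached.decode()
    long_url = db_lookup(short_code)              # hypothetical database accessor
    if long_url is not None:
        cache.setex(key, TTL_SECONDS, long_url)   # write back with expiry
    return long_url
```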
unique id generation
base62
Uses [a-zA-Z0-9] = 62 characters
length
7 characters → 62^7 ≈ 3.5 trillion possible URLs
approach
Base62 encoding of auto-incrementing counter or hash
collision handling
Use distributed ID generator (Snowflake) or check-and-increment
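A minimal base62 encoder/decoder over a unique integer ID (from an auto-increment counter or a Snowflake-style generator), matching the approach described above:

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"  # 62 chars

def base62_encode(n: int, length: int = 7) -> str:
    """Encode a non-negative integer ID as a fixed-length base62 short code."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    digits = []
    while n:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    code = "".join(reversed(digits)) or ALPHABET[0]
    return code.rjust(length, ALPHABET[0])        # zero-pad to 7 characters

def base62_decode(code: str) -> int:
    """Inverse mapping, e.g. to look a row up by its numeric ID."""
    n = 0
    for ch in code:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

For example, `base62_encode(125)` gives `"0000021"`, and `base62_decode("0000021")` returns 125.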
trade-offs
- MD5 hash vs auto-increment: Hash may collide, auto-increment needs coordination
- SQL vs NoSQL: NoSQL better for key-value lookups at scale
- Cache size: Balance memory cost vs cache hit rate
scaling
- Horizontal scaling: Add more app servers behind the load balancer
- Database sharding: Partition by hash of short code
- Read replicas: For the read-heavy workload
- CDN: Cache static content and popular redirects