Skip to main content

System Design Interview: URL Shortener

October 14, 2025By CTO16 min read
...
questions

Complete system design interview question for building a URL shortener like bit.ly, including requirements gathering, architecture, and trade-offs.

Role: Senior Engineer / Staff Engineer
Level: Senior
Type: System Design

System Design Interview: URL Shortener

Design a URL shortening service like bit.ly or TinyURL. This is a classic system design interview question that tests understanding of scalability, distributed systems, and trade-offs.

Interview Format (45 minutes)

Time Allocation:

  • Requirements gathering: 5-8 minutes
  • High-level design: 10-15 minutes
  • Deep dive: 15-20 minutes
  • Scale and edge cases: 5-10 minutes

Step 1: Requirements Gathering (5-8 min)

A strong candidate will ask clarifying questions before jumping to solutions.

Functional Requirements

Good questions to ask:

  • What's the primary use case? (shortening URLs)
  • Do shortened URLs expire? (yes, configurable TTL)
  • Can users customize short URLs? (yes, but most are auto-generated)
  • Do we need analytics? (yes, click tracking)
  • Do we need user accounts? (optional, focus on core first)

Agreed requirements:

  1. Given a long URL, return a short URL
  2. Given a short URL, redirect to original long URL
  3. URLs expire after configurable time (default: 1 year)
  4. Track basic analytics (click count)
  5. Custom short codes (optional)

Non-Functional Requirements

Good questions to ask:

  • Expected scale? (100M URLs, 10B redirects/month)
  • Read/write ratio? (read-heavy, 100:1)
  • Latency requirements? (redirects {"<"}100ms)
  • Availability requirements? (99.9% uptime)
  • Geographic distribution? (global)

Agreed requirements:

  • High availability (redirects critical)
  • Low latency ({"<"}100ms for redirects)
  • Scalable (10B redirects/month = ~3,800 req/sec average, 20K peak)
  • Durable (URLs shouldn't be lost)

Calculations

Storage:

100M URLs/year
500 bytes per URL record (original URL + metadata)
500 bytes × 100M = 50GB/year
Over 5 years = 250GB (very manageable)

Bandwidth:

10B redirects/month
Assume 200 byte response
10B × 200 bytes = 2TB/month = 66GB/day
Peak: 20K req/sec × 200 bytes = 4MB/sec

QPS:

Average: 10B/month = 3,800 req/sec
Peak (assume 2x): ~8,000 req/sec

Red flags if candidate:

  • Doesn't ask clarifying questions
  • Jumps straight to implementation details
  • Doesn't think about scale
  • Ignores read/write ratio

Step 2: High-Level Design (10-15 min)

API Design

POST /api/shorten
Request:
{
  "url": "https://example.com/very/long/url",
  "customCode": "my-link",  // optional
  "expiresAt": "2026-01-01"  // optional
}

Response:
{
  "shortUrl": "https://short.ly/abc123",
  "code": "abc123",
  "expiresAt": "2026-01-01T00:00:00Z"
}

GET /{code}
Response: 302 Redirect to original URL

Good candidate discusses:

  • RESTful API design
  • HTTP 302 (temporary) vs 301 (permanent) redirect
  • Error handling (expired, not found)

Core Components

┌─────────────┐
│   Client    │
└──────┬──────┘
       │
   ┌───▼────────┐
   │  Load      │
   │  Balancer  │
   └───┬────────┘
       │
   ┌───▼──────────────┐
   │  API Servers     │
   │  (Stateless)     │
   └───┬──────────────┘
       │
   ┌───▼───────────────┐
   │    Database       │
   │  (URL mappings)   │
   └───────────────────┘

Tables:

sql
CREATE TABLE urls (
    id BIGSERIAL PRIMARY KEY,
    short_code VARCHAR(10) UNIQUE NOT NULL,
    long_url TEXT NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    expires_at TIMESTAMP,
    click_count INTEGER DEFAULT 0,
    user_id INTEGER  -- optional
);

CREATE INDEX idx_short_code ON urls(short_code);
CREATE INDEX idx_expires_at ON urls(expires_at) WHERE expires_at IS NOT NULL;

Step 3: Deep Dive (15-20 min)

Short Code Generation

Approach 1: Random + Collision Check

python
import random
import string

def generate_short_code(length=6):
    characters = string.ascii_letters + string.digits  # 62 characters
    while True:
        code = ''.join(random.choices(characters, k=length))

        # Check if exists
        if not db.exists(code):
            return code

Pros:

  • Simple to implement
  • Evenly distributed

Cons:

  • Requires database check (latency)
  • Collision rate increases over time
  • Not suitable for high-write scenarios

Math:

62^6 = ~57 billion possible codes
After 100M URLs (< 0.2% full), collision probability is low
But still requires DB check

Approach 2: Auto-incrementing ID + Base62 Encoding

python
BASE62 = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(num):
    if num == 0:
        return BASE62[0]

    result = []
    while num > 0:
        result.append(BASE62[num % 62])
        num //= 62

    return ''.join(reversed(result))

def decode_base62(code):
    num = 0
    for char in code:
        num = num * 62 + BASE62.index(char)
    return num

# Example
id = 12345
code = encode_base62(id)  # "dnh"

Pros:

  • No collisions
  • No database lookup needed
  • Deterministic

Cons:

  • Sequential (predictable)
  • Exposes number of URLs
  • Doesn't work with custom codes

Approach 3: Hash Function (MD5/SHA256)

python
import hashlib

def generate_short_code(url):
    # Generate hash
    hash_digest = hashlib.md5(url.encode()).hexdigest()

    # Take first 6 characters
    code = hash_digest[:6]

    # Handle collisions
    if db.exists(code):
        # Add counter to URL and retry
        code = hash_digest[:6] + str(counter)

    return code

Pros:

  • Same URL always gets same code
  • Can deduplicate URLs

Cons:

  • Collisions possible
  • Still need database check
  • Longer codes to reduce collisions

Recommended Approach: Combination

python
def generate_short_code(url, custom_code=None):
    if custom_code:
        # Validate and use custom code
        if db.exists(custom_code):
            raise ValueError("Custom code already exists")
        return custom_code

    # Use auto-increment + base62 for generated codes
    next_id = get_next_id()  # from ID generator service
    return encode_base62(next_id)

Strong candidate discusses:

  • Trade-offs of each approach
  • Collision handling
  • Scalability implications
  • Custom code support

ID Generation at Scale

Problem: Single database auto-increment doesn't scale

Solution: Distributed ID Generation

Option 1: UUID

python
import uuid

id = uuid.uuid4()  # 128-bit

Pros: No coordination needed Cons: 128-bit is too long for short URLs

Option 2: Twitter Snowflake

64-bit ID structure:
[1 bit unused][41 bits timestamp][10 bits machine ID][12 bits sequence]

Allows:
- 2^41 milliseconds (~69 years)
- 2^10 machines (1024)
- 2^12 IDs per millisecond per machine (4096)
python
class SnowflakeIDGenerator:
    def __init__(self, machine_id):
        self.machine_id = machine_id
        self.sequence = 0
        self.last_timestamp = -1

    def generate_id(self):
        timestamp = current_timestamp_ms()

        if timestamp == self.last_timestamp:
            self.sequence = (self.sequence + 1) & 0xFFF
            if self.sequence == 0:
                # Wait for next millisecond
                timestamp = wait_next_millis(timestamp)
        else:
            self.sequence = 0

        self.last_timestamp = timestamp

        return ((timestamp << 22) |
                (self.machine_id << 12) |
                self.sequence)

Option 3: Pre-generated Range

python
# Coordination service assigns ranges
# Server 1: 1-100,000
# Server 2: 100,001-200,000

class RangeIDGenerator:
    def __init__(self):
        self.current = get_range_from_coordinator()  # 1-100,000
        self.end = self.current + 100000

    def generate_id(self):
        if self.current >= self.end:
            # Get new range
            self.current = get_range_from_coordinator()
            self.end = self.current + 100000

        self.current += 1
        return self.current

Best for URL shortener: Snowflake or Range-based

Caching Strategy

Read-heavy workload (100:1 ratio) → Perfect for caching

Cache hit ratio target: >99%

Implementation:

python
def redirect(short_code):
    # Check cache first
    long_url = cache.get(f"url:{short_code}")

    if long_url is None:
        # Cache miss: fetch from database
        url_record = db.get_by_code(short_code)

        if url_record is None or url_record.expired():
            return 404

        long_url = url_record.long_url

        # Cache for 24 hours
        cache.set(f"url:{short_code}", long_url, ttl=86400)

    # Async analytics update
    queue.publish("click", {
        "code": short_code,
        "timestamp": now()
    })

    return redirect_to(long_url)

Cache invalidation:

python
# When URL expires or is deleted
def delete_url(short_code):
    db.delete(short_code)
    cache.delete(f"url:{short_code}")

Strong candidate discusses:

  • Cache-aside pattern
  • TTL strategy
  • Invalidation approach
  • Cache warming for hot URLs

Analytics

Challenge: Updating click_count on every request kills performance

Solution 1: Async Updates

python
# Write to queue instead of DB
def track_click(short_code):
    queue.publish("analytics", {
        "code": short_code,
        "timestamp": now(),
        "user_agent": request.user_agent,
        "ip": request.ip
    })

# Consumer batches updates
def analytics_consumer():
    while True:
        messages = queue.consume(batch_size=1000)

        # Batch update
        updates = Counter(msg["code"] for msg in messages)
        db.batch_update_click_counts(updates)

Solution 2: Write to Time-Series DB

python
# Use InfluxDB or TimescaleDB
def track_click(short_code):
    influxdb.write_point(
        measurement="clicks",
        tags={"code": short_code},
        fields={"count": 1},
        timestamp=now()
    )

# Query for analytics
def get_clicks(short_code, start_date, end_date):
    return influxdb.query(f"""
        SELECT SUM(count)
        FROM clicks
        WHERE code = '{short_code}'
          AND time >= '{start_date}'
          AND time <= '{end_date}'
    """)

Step 4: Scale and Edge Cases (5-10 min)

Scaling to Billions

Updated Architecture:

┌────────────────────────────────────────────────┐
│              CDN / Cloudflare                  │
└────────────────────┬───────────────────────────┘
                     │
┌────────────────────▼───────────────────────────┐
│           Load Balancer (Global)               │
└────────────────────┬───────────────────────────┘
                     │
        ┌────────────┼────────────┐
        │            │            │
   ┌────▼────┐  ┌───▼─────┐  ┌──▼──────┐
   │  API    │  │  API    │  │  API    │
   │Server 1 │  │Server 2 │  │Server 3 │
   └────┬────┘  └───┬─────┘  └──┬──────┘
        │           │           │
        └────────┬──┴───────────┘
                 │
        ┌────────▼──────────┐
        │   Redis Cluster   │
        │   (Cache Layer)   │
        └────────┬──────────┘
                 │
        ┌────────▼──────────┐
        │  PostgreSQL       │
        │  (Sharded)        │
        │  - Shard 1: a-m   │
        │  - Shard 2: n-z   │
        └───────────────────┘

Database Sharding:

python
def get_shard(short_code):
    # Hash-based sharding
    shard_count = 16
    shard_id = hash(short_code) % shard_count
    return f"shard_{shard_id}"

def get_url(short_code):
    shard = get_shard(short_code)
    return db_connections[shard].get(short_code)

Rate Limiting

python
from redis import Redis

def check_rate_limit(ip_address):
    key = f"ratelimit:{ip_address}:{current_minute()}"
    count = redis.incr(key)

    if count == 1:
        redis.expire(key, 60)

    if count > 100:  # 100 requests per minute
        raise RateLimitExceeded()

Security Considerations

1. Malicious URLs

python
BLACKLIST_DOMAINS = ["malware.com", "phishing.com"]

def validate_url(url):
    domain = extract_domain(url)
    if domain in BLACKLIST_DOMAINS:
        raise ValueError("URL not allowed")

    # Check against safe browsing API
    if not safe_browsing_check(url):
        raise ValueError("URL flagged as unsafe")

2. Abuse Prevention

python
# Limit URLs per IP
def check_creation_limit(ip_address):
    key = f"create:{ip_address}:{current_day()}"
    count = redis.incr(key)

    if count == 1:
        redis.expire(key, 86400)

    if count > 10:  # 10 URLs per day per IP
        raise TooManyURLs()

3. DDoS Protection

  • Rate limiting
  • CAPTCHA for suspicious traffic
  • WAF (Web Application Firewall)

Monitoring and Alerts

python
# Metrics to track
metrics = {
    "api.latency.p95": target_ms < 100,
    "api.error_rate": target < 0.1%,
    "cache.hit_rate": target > 99%,
    "db.query_time.p95": target < 50ms,
    "redirect.success_rate": target > 99.9%
}

# Alerts
if cache_hit_rate < 95%:
    alert("Cache hit rate dropped")

if p95_latency > 200ms:
    alert("Latency degraded")

Evaluation Rubric

Strong Performance (Hire)

  • ✅ Asks clarifying questions
  • ✅ Considers scale from the start
  • ✅ Discusses multiple approaches with trade-offs
  • ✅ Designs scalable solution
  • ✅ Handles edge cases
  • ✅ Thinks about operations (monitoring, debugging)
  • ✅ Clear communication and diagrams

Adequate Performance (Maybe)

  • ✅ Functional design
  • ✅ Some scaling considerations
  • ⚠️ Limited exploration of alternatives
  • ⚠️ Misses some edge cases
  • ✅ Can be guided to better solutions

Weak Performance (No Hire)

  • ❌ Jumps to solution without requirements
  • ❌ Doesn't consider scale
  • ❌ Can't explain trade-offs
  • ❌ Monolithic thinking
  • ❌ Poor communication

Follow-up Questions

For senior candidates:

  • How would you handle 1 trillion URLs?
  • Design the analytics system in detail
  • How would you migrate this system with zero downtime?
  • How would you handle geographic distribution?

For staff+ candidates:

  • Design the organization structure for this service
  • How would you ensure 99.99% uptime?
  • Design the disaster recovery plan
  • How would you handle compliance (GDPR, data retention)?

This question tests foundational distributed systems concepts: caching, sharding, ID generation, and trade-offs. A strong candidate will navigate these topics methodically while engaging in technical discussion.

Want more insights like this?

Join thousands of CTOs and technical leaders getting weekly insights on leadership and system design.

No spam. Unsubscribe anytime.