URL Shortener

A URL shortener takes a long URL like https://www.example.com/articles/how-to-design-scalable-systems-2026 and converts it to a compact alias like https://s.dev/x7k2pQr. When someone visits the short URL, the service redirects them to the original.

You've used this before — bit.ly, TinyURL, and x.com's link shortener (t.co) are all URL shorteners processing millions of redirects per day. Designing one from scratch is the classic "intro to databases and hashing" problem because it teaches you how to think about storage, encoding algorithms, caching, and database trade-offs in a system that is small enough to understand completely.

Every case study in this section follows the same framework used in real system design interviews. This case study works through eight focused steps that map to four core phases:

Clarify constraints (Steps 1–2) — What does the system need to do, and how much traffic must it handle?
High-level design (Step 3) — What are the major components and how do they connect?
Deep dives (Steps 4–7) — How do the trickiest parts actually work?
Trade-offs (Step 8) — What did we give up, and when would we choose differently?

Step 1: Clarify Requirements

Before drawing any architecture diagrams, always start by asking: what problem are we actually solving? A URL shortener seems obvious, but there are important choices hiding in the details.

Functional Requirements

These describe what the system does.

Feature	Description	Priority
Shorten a URL	Accept a long URL and return a unique short alias (7 characters)	Core
Redirect	When a user visits a short URL, redirect them to the original long URL	Core
Custom alias	Allow users to choose their own short code (e.g., `/my-brand`)	Optional
Expiration	Allow links to expire after a specified date and time	Optional
Delete	Allow the link creator to delete a short URL	Optional
Click analytics	Track click counts, timestamps, referrers, and geography	Optional

Non-Functional Requirements

These describe how well the system works.

Property	Requirement	Why It Matters
Availability	99.99% uptime on the redirect path	When the shortener is down, every link ever shared through it is broken
Latency	Redirects complete in < 10ms at the server layer	Users feel redirects that take > 200ms end-to-end — it erodes trust
Durability	URL mappings must never be lost	Losing a mapping permanently breaks every link that used it
Scalability	Handle a 100:1 read-to-write ratio at peak traffic	Shortening is rare; redirecting is constant — optimize for reads
Uniqueness	Every short code maps to exactly one long URL	Collisions would silently route users to the wrong destination
Security	Rate-limit creation; validate and block malicious URLs	Shorteners are frequently abused to disguise phishing and malware links

The single most important non-functional requirement is availability on the redirect path. A slow shorten operation is annoying. A broken redirect is catastrophic: every published link containing that short URL stops working immediately. Design the redirect path to be fast, simple, and resilient above all else.

Step 2: Back-of-the-Envelope Estimation

Back-of-the-envelope math validates your assumptions before you commit to a design. At a medium scale, assume 100 million new URLs are created per month.

Metric	Calculation	Result
Write QPS (average)	100M URLs/month ÷ 2.6M seconds/month	~40 writes/second
Write QPS (peak)	40 × 2 (2× peak factor)	~80 writes/second
Read:Write ratio	Assumed 100:1 (redirects far outnumber creations)	—
Read QPS (average)	40 writes/s × 100	~4,000 reads/second
Read QPS (peak)	80 writes/s × 100	~8,000 reads/second
Storage per URL	200 bytes long URL + 7 bytes short code + 50 bytes metadata	~257 bytes/record
URLs over 5 years	100M/month × 60 months	6 billion URLs
Total raw storage	6B × 257 bytes	~1.5 TB
With 3× replication	1.5 TB × 3	~4.5 TB total

To verify the seconds-per-month figure: 30 days × 24 hours × 3,600 seconds = 2,592,000 ≈ 2.6 million seconds. The ÷ 2.6M step simply divides total monthly creations by total monthly seconds to get average per-second throughput.
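The arithmetic in the table can be reproduced with a short script. This is only a sketch of the estimate above; the constant names are illustrative, and the unrounded results differ slightly from the table because the table rounds at each step (for example, 38.6 writes/second is rounded to ~40).

```python
# Back-of-the-envelope estimation using the assumptions stated in the table.
SECONDS_PER_MONTH = 30 * 24 * 3600           # 2,592,000
NEW_URLS_PER_MONTH = 100_000_000
PEAK_FACTOR = 2
READ_WRITE_RATIO = 100
BYTES_PER_RECORD = 200 + 7 + 50              # long URL + short code + metadata
RETENTION_MONTHS = 5 * 12
REPLICATION_FACTOR = 3

avg_write_qps = NEW_URLS_PER_MONTH / SECONDS_PER_MONTH
peak_write_qps = avg_write_qps * PEAK_FACTOR
avg_read_qps = avg_write_qps * READ_WRITE_RATIO
peak_read_qps = peak_write_qps * READ_WRITE_RATIO

total_urls = NEW_URLS_PER_MONTH * RETENTION_MONTHS
raw_storage_tb = total_urls * BYTES_PER_RECORD / 1e12
replicated_storage_tb = raw_storage_tb * REPLICATION_FACTOR

print(f"write QPS: avg {avg_write_qps:.1f}, peak {peak_write_qps:.1f}")
print(f"read QPS:  avg {avg_read_qps:.0f}, peak {peak_read_qps:.0f}")
print(f"URLs over 5 years: {total_urls:,}")
print(f"storage: {raw_storage_tb:.2f} TB raw, {replicated_storage_tb:.2f} TB replicated")
```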

Key insight from the math: This is a heavily read-dominated system. The redirect path runs 100× more often than the shorten path. Every major architectural decision should optimize for fast reads.

What is Base62? Base62 is a number system that uses 62 symbols instead of the 10 digits we normally use. The 62 symbols are: uppercase A–Z (26), lowercase a–z (26), and digits 0–9 (10). Just as our decimal (Base10) system represents any number with 10 digit symbols, Base62 represents any integer with 62, and does so much more compactly. An integer like 1,000,000 encodes to just 4 Base62 characters (62⁴ ≈ 14.8 million), and every value below 62⁷ ≈ 3.5 trillion fits in 7 characters.
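A minimal sketch of Base62 encoding and decoding follows. The symbol order (digits, then uppercase, then lowercase) is an assumption made for this example; any fixed ordering of the 62 symbols works as long as the encoder and decoder agree on it.

```python
import string

# Assumed symbol order: 0-9, A-Z, a-z. The text does not prescribe an order.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)  # 62


def base62_encode(n: int) -> str:
    """Convert a non-negative integer to its Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))  # most significant symbol first


def base62_decode(code: str) -> int:
    """Convert a Base62 string back to the integer it represents."""
    n = 0
    for ch in code:
        n = n * BASE + ALPHABET.index(ch)  # raises ValueError on unknown symbols
    return n


print(base62_encode(1_000_000))            # 4C92
print(len(base62_encode(62 ** 7 - 1)))     # 7
print(base62_decode(base62_encode(123456789)))  # 123456789
```

With this alphabet, the largest 7-character code is `zzzzzzz`, which corresponds to 62⁷ − 1.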

How many unique short codes do we need? A 7-character Base62 string uses 62 characters (A–Z, a–z, 0–9) and can represent 62⁷ ≈ 3.5 trillion unique combinations. At 100 million new URLs per month, that capacity lasts over 2,900 years. Seven characters is more than enough.

Why Base62 and not Base64? Base64 adds two extra symbols — + and / (or - and _ in the URL-safe variant) — to reach 64 characters. Those symbols are either unsafe in a URL path as-is, or require a URL-safe variant that still looks visually noisy to users. Base62 sticks to alphanumeric characters only, which are universally safe in URLs, easy to select and copy, and readable without confusion. The capacity difference is negligible: Base64 gives 64⁷ ≈ 4.4 trillion codes vs. Base62's 3.5 trillion — both are effectively unlimited at realistic scale.
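The capacity figures quoted above are easy to check directly. The sketch below only restates the arithmetic from the text; the variable names are illustrative.

```python
CODE_LENGTH = 7
NEW_URLS_PER_MONTH = 100_000_000

base62_capacity = 62 ** CODE_LENGTH   # 3,521,614,606,208  (about 3.5 trillion)
base64_capacity = 64 ** CODE_LENGTH   # 4,398,046,511,104  (about 4.4 trillion)

# How long the Base62 code space lasts at the assumed creation rate.
months_until_exhausted = base62_capacity / NEW_URLS_PER_MONTH
years_until_exhausted = months_until_exhausted / 12

print(f"Base62, 7 chars: {base62_capacity:,} codes")
print(f"Base64, 7 chars: {base64_capacity:,} codes")
print(f"Years of capacity at 100M URLs/month: {years_until_exhausted:,.0f}")
```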

Step 3: High-Level Design

With requirements and scale understood, here is the high-level architecture.

[Diagram: high-level architecture]

What each component does:

Load Balancer — Distributes incoming traffic across app servers. Also the right place to enforce rate limits on how many short URLs a single IP or API key can create per hour, blocking abuse before it reaches application code.
App Servers (stateless) — Handle both the shorten and redirect operations. They are stateless — no in-memory session data — so any server can handle any request. This makes horizontal scaling trivial: add more servers when traffic grows without any coordination between them.
Redis Cache — Stores the most frequently accessed short_code → long_url mappings in memory. Since roughly 20% of URLs receive 80% of traffic, caching those hot links serves the vast majority of redirects without touching the database.
Database — The durable, permanent store of all URL mappings. Receives writes (new short URLs) and handles the cache misses that Redis could not serve.
ID Generator — The component responsible for producing unique short codes. We'll deep dive into how this works in Step 4.
Analytics Queue — Click events are enqueued here asynchronously. The redirect itself does not wait for analytics to complete — this decoupling keeps the redirect path fast and simple. Common choices: Kafka (high-throughput, durable log; preferred at large scale), Redis Streams (lightweight, good if Redis is already in the stack), or a managed queue like AWS SQS. A separate analytics consumer service reads from the queue and writes to a time-series or analytics database.
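The decoupling described for the analytics queue can be illustrated with an in-process queue. This is a sketch only: Python's standard `queue.Queue` and a worker thread stand in for Kafka, Redis Streams, or SQS, and the function names `record_click` and `analytics_consumer` are hypothetical.

```python
import queue
import threading
from collections import Counter
from typing import Optional

click_events: queue.Queue = queue.Queue()   # stand-in for the message queue
click_counts: Counter = Counter()           # stand-in for the analytics store


def record_click(short_code: str, referrer: Optional[str] = None) -> None:
    """Called on the redirect path: enqueue the event and return immediately."""
    click_events.put({"short_code": short_code, "referrer": referrer})


def analytics_consumer() -> None:
    """Runs off the redirect path: drains events and aggregates them."""
    while True:
        event = click_events.get()
        if event is None:                    # sentinel value: stop the worker
            click_events.task_done()
            break
        click_counts[event["short_code"]] += 1
        click_events.task_done()


worker = threading.Thread(target=analytics_consumer, daemon=True)
worker.start()

record_click("x7k2pQr", referrer="https://www.example.com")
record_click("x7k2pQr")

click_events.join()          # wait for the consumer; only needed for this demo
print(click_counts["x7k2pQr"])   # 2
```

In the real redirect handler nothing would wait on the queue; the `join()` call exists here only so the demonstration can print a settled count.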

API Design

Endpoint	Method	Request	Response
`POST /api/v1/shorten`	POST	`{ long_url, custom_alias?, expires_at? }`	`{ short_url }` — HTTP 201 Created
`GET /{short_code}`	GET	Short code in URL path	HTTP 302 redirect to long URL (or 404/410 if not found/expired)
`DELETE /api/v1/{short_code}`	DELETE	API key in Authorization header	HTTP 204 No Content

Why POST for shorten, not GET? Creating a short URL is a write operation with a side effect (inserting a new database record). GET requests are defined as safe: they should only read state, which is why browsers, proxies, and crawlers feel free to repeat, prefetch, and cache them. POST is the right verb here because each call creates a new resource.
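Before the shorten endpoint writes anything, the request body should be validated. The sketch below shows one possible check for `long_url` and `custom_alias`. The 2,048-character limit, the allowed alias characters, and the function names are assumptions for illustration; the 16-character alias limit follows the schema in the next section. It does not cover reputation or malware checks.

```python
import re
from urllib.parse import urlsplit

MAX_URL_LENGTH = 2048                               # assumed limit, not from the text
ALIAS_PATTERN = re.compile(r"[A-Za-z0-9-]{1,16}")   # assumed alias character set


def is_valid_long_url(long_url) -> bool:
    """Accept only absolute http(s) URLs that include a host."""
    if not isinstance(long_url, str) or not long_url:
        return False
    if len(long_url) > MAX_URL_LENGTH:
        return False
    try:
        parts = urlsplit(long_url)
        host = parts.hostname
    except ValueError:                 # e.g. a malformed IPv6 literal
        return False
    return parts.scheme in ("http", "https") and bool(host)


def is_valid_custom_alias(alias) -> bool:
    return isinstance(alias, str) and ALIAS_PATTERN.fullmatch(alias) is not None


def validate_shorten_request(body: dict) -> list:
    """Return a list of problems; an empty list means the request is acceptable."""
    errors = []
    if not is_valid_long_url(body.get("long_url")):
        errors.append("long_url must be an absolute http(s) URL")
    alias = body.get("custom_alias")
    if alias is not None and not is_valid_custom_alias(alias):
        errors.append("custom_alias must be 1-16 letters, digits, or hyphens")
    return errors


print(validate_shorten_request({"long_url": "https://www.example.com/a"}))   # []
print(validate_shorten_request({"long_url": "javascript:alert(1)"}))
```

A handler built on this would typically answer an invalid body with a 4xx status and only proceed to code generation when the list is empty.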

Database Schema

CREATE TABLE urls (
  id           BIGINT PRIMARY KEY,          -- the counter value (e.g. 1000001)
  short_code   VARCHAR(16) UNIQUE NOT NULL, -- Base62(id) for generated codes; up to 16 chars for custom aliases
  long_url     TEXT NOT NULL,               -- original full URL (validated before insert)
  user_id      BIGINT,                      -- creator (NULL for anonymous)
  custom_alias BOOLEAN DEFAULT FALSE,       -- was this alias user-chosen?
  created_at   TIMESTAMP NOT NULL,
  expires_at   TIMESTAMP,                   -- NULL means no expiry
  is_deleted   BOOLEAN DEFAULT FALSE        -- soft delete flag (rows are never physically removed)
);

How id and short_code relate: When using the counter-based approach, id holds the raw integer assigned by the counter (e.g. 1000001) and short_code is simply base62_encode(id) — a compact string representation of that same integer. For custom aliases, short_code holds whatever the user typed, and id is still a unique counter value (used internally but not exposed). The column is widened to 16 characters to accommodate user-chosen aliases longer than 7 characters.

The primary lookup pattern is SELECT * FROM urls WHERE short_code = ?. An index on short_code (automatically created by the UNIQUE constraint) makes this a fast point lookup even with billions of rows. The query selects all columns — not just long_url — so the application can check is_deleted and expires_at in the same round trip before deciding whether to redirect or return 410.
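The lookup can be tried end to end with Python's built-in `sqlite3` module. This is a sketch under assumptions: SQLite stands in for the production database, the column types are adapted to SQLite, only the columns needed for the redirect decision are included, and the short code `4C93` is the Base62 form of 1000001 under an assumed 0-9, A-Z, a-z symbol order.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row          # allow access to columns by name

# Trimmed version of the urls table, using SQLite-friendly types.
conn.execute("""
    CREATE TABLE urls (
        id         INTEGER PRIMARY KEY,
        short_code TEXT UNIQUE NOT NULL,
        long_url   TEXT NOT NULL,
        created_at TEXT NOT NULL,
        expires_at TEXT,
        is_deleted INTEGER DEFAULT 0
    )
""")

conn.execute(
    "INSERT INTO urls (id, short_code, long_url, created_at) VALUES (?, ?, ?, ?)",
    (
        1000001,
        "4C93",
        "https://www.example.com/articles/how-to-design-scalable-systems-2026",
        datetime.now(timezone.utc).isoformat(),
    ),
)


def find_by_short_code(connection, short_code):
    """Point lookup on the unique short_code index; returns a row or None."""
    return connection.execute(
        "SELECT * FROM urls WHERE short_code = ?", (short_code,)
    ).fetchone()


row = find_by_short_code(conn, "4C93")
print(row["long_url"], row["is_deleted"], row["expires_at"])

# The query plan shows that the lookup uses the index created by UNIQUE.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM urls WHERE short_code = ?", ("4C93",)
).fetchall()
print(plan[0]["detail"])
```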

Step 4: Deep Dive - URL Generation

This is the core algorithmic challenge of the system. When a user shortens a URL, how do you produce a unique 7-character code?

Approach A: Hash and Truncate

Hash the long URL with a cryptographic hash function (MD5 or SHA-256), then take the first few bytes of the output and encode them in Base62.

[Diagram: Approach A, hash and truncate]

Why 42 bits? A 7-character Base62 string can represent 62⁷ ≈ 3.5 trillion values. To express any number up to 3.5 trillion in binary, you need log₂(62⁷) ≈ 41.68 bits, so you must take at least 42 bits from the hash output before encoding in Base62. Taking fewer bits would give you a smaller integer range and map to fewer than 62⁷ possible codes, artificially increasing collision probability. Because 2⁴² ≈ 4.4 trillion is slightly larger than 62⁷, reduce the 42-bit value modulo 62⁷ (or discard out-of-range values) so that the encoded result never needs an eighth character.

The problem with truncating a hash is collision risk: two different long URLs can produce the same 7-character prefix after truncation. The retry loop handles collisions, but it adds latency on each retry and complexity to the code. Hash-based approaches also produce the same short code for the same long URL on every call — which means two users shortening the same URL would get the same short code, which may or may not be desirable behavior.
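The hash-and-truncate flow, including the retry on collision, might look like the sketch below. Several details are assumptions made for illustration: SHA-256 as the hash, an attempt number appended as a salt to vary the hash on retry, a modulo reduction to stay within 7 characters, a plain dict standing in for the database, and the 0-9, A-Z, a-z symbol order.

```python
import hashlib
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
CODE_LENGTH = 7
CODE_SPACE = 62 ** CODE_LENGTH


def to_base62_fixed(n: int) -> str:
    """Encode 0 <= n < 62**7 as exactly 7 Base62 characters (zero-padded)."""
    chars = []
    for _ in range(CODE_LENGTH):
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def hash_code(long_url: str, attempt: int = 0) -> str:
    # The attempt number acts as a salt so that a retry produces a new hash.
    digest = hashlib.sha256(f"{long_url}#{attempt}".encode("utf-8")).digest()
    top_42_bits = int.from_bytes(digest[:6], "big") >> 6   # 48 bits, keep top 42
    return to_base62_fixed(top_42_bits % CODE_SPACE)


def shorten(long_url: str, store: dict, max_attempts: int = 5) -> str:
    """Return a short code for long_url, retrying when another URL owns the code."""
    for attempt in range(max_attempts):
        code = hash_code(long_url, attempt)
        existing = store.get(code)
        if existing is None:
            store[code] = long_url
            return code
        if existing == long_url:        # same URL shortened again: reuse its code
            return code
        # A different URL already owns this code, so try again with a new salt.
    raise RuntimeError("could not find a free short code")


store = {}
code = shorten("https://www.example.com/articles/how-to-design-scalable-systems-2026", store)
print(code, len(code))
```

In a real database the check-then-insert would need to be a single atomic operation, for example an insert that relies on the UNIQUE constraint and retries on a constraint violation.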

Approach B: Counter + Base62 (Recommended)

Maintain a global auto-incrementing counter. For each new URL, increment the counter and convert the integer to Base62 to produce the short code.

[Diagram: Approach B, counter + Base62]

The counter approach is collision-free by design. Each URL receives a unique integer. The short code is simply a compact representation of that integer. No collision detection loop is needed.
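A single-process sketch of the counter approach is shown below. The lock-protected integer stands in for whatever atomic counter the deployment uses (a database sequence or a shared counter service); the class name `CounterIdGenerator` and the 0-9, A-Z, a-z symbol order are assumptions for this example.

```python
import string
import threading

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def base62_encode(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n > 0:
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


class CounterIdGenerator:
    """Hands out strictly increasing integers; safe to call from many threads."""

    def __init__(self, start: int = 1) -> None:
        self._next_id = start
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:                 # makes read-and-increment atomic
            value = self._next_id
            self._next_id += 1
            return value


ids = CounterIdGenerator(start=1_000_001)
first_id = ids.next_id()
print(first_id, base62_encode(first_id))   # 1000001 4C93
```

Because every ID is handed out exactly once and the encoding is one-to-one, two URLs can never receive the same short code.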

The problem with a single global counter: It becomes a bottleneck. Every app server must atomically increment the same counter for every new URL creation. At high write throughput, this becomes a single point of contention — and if the counter service is down, URL creation is down.

Approach C: Distributed Counter (Production Scale)

The simplest way to share one counter across many app servers is Redis atomic INCR. Redis executes commands on a single thread and guarantees that INCR (increment and return) is atomic, so no two app servers will ever get the same integer back, even under concurrent load. Each app server calls INCR url_counter on the shared Redis instance and immediately gets a unique ID without any locking logic of its own. This is fast (sub-millisecond round trip on a local network) and handles tens of thousands of writes per second on a single Redis node. It is still a single shared counter, though, so it reduces the contention problem rather than removing it.

When Redis itself becomes a bottleneck, or when you need fault tolerance if Redis restarts, use a coordination service such as ZooKeeper to pre-assign a range of counter values to each app server. ZooKeeper is a distributed coordination service: think of it as a reliable, shared registry that multiple servers can consult atomically. Each time a server needs a new ID range, it contacts ZooKeeper, which records the assignment and hands back a non-overlapping block of integers. This ensures no two servers are ever allocated the same range, even under concurrent requests.

App Server	Assigned Range	Current Local Counter	Status
Server 1	1,000,000 – 1,999,999	1,000,047	47 URLs created; 999,953 remaining
Server 2	2,000,000 – 2,999,999	2,000,012	12 URLs created; 999,988 remaining
Server 3	3,000,000 – 3,999,999	3,000,000	Just assigned a fresh range

Each server increments its own local counter independently, with no network coordination per request. When a server exhausts its range, it contacts ZooKeeper for a new range. This removes the per-request bottleneck while still guaranteeing globally unique IDs, because no two servers are ever assigned overlapping ranges.
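The range-allocation scheme can be sketched as follows. The `RangeAllocator` class is an in-memory stand-in for the coordination service, not a client for any real one, and both class names are hypothetical. The block size of one million matches the table above.

```python
import threading


class RangeAllocator:
    """In-memory stand-in for the coordination service that hands out ID blocks."""

    def __init__(self, start: int = 1_000_000, block_size: int = 1_000_000) -> None:
        self._next_start = start
        self._block_size = block_size
        self._lock = threading.Lock()

    def allocate(self) -> range:
        with self._lock:                     # each block is handed out exactly once
            block = range(self._next_start, self._next_start + self._block_size)
            self._next_start += self._block_size
            return block


class LocalIdGenerator:
    """Per-server generator: consumes its block locally, refills when exhausted."""

    def __init__(self, allocator: RangeAllocator) -> None:
        self._allocator = allocator
        self._lock = threading.Lock()
        self._ids = iter(allocator.allocate())

    def next_id(self) -> int:
        with self._lock:
            try:
                return next(self._ids)       # common case: no network call
            except StopIteration:
                self._ids = iter(self._allocator.allocate())
                return next(self._ids)


allocator = RangeAllocator()
server_1 = LocalIdGenerator(allocator)       # gets 1,000,000 to 1,999,999
server_2 = LocalIdGenerator(allocator)       # gets 2,000,000 to 2,999,999
print(server_1.next_id(), server_2.next_id())   # 1000000 2000000
```

One consequence of this design is that a server that crashes or restarts abandons the unused remainder of its block, so the ID sequence can have gaps. With a code space of 3.5 trillion, that is usually an acceptable cost.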

The trade-off of sequential IDs: Counter-based short codes are predictable. Someone who receives the short code x7k2pQr can easily guess that x7k2pQs and x7k2pQq also exist, enabling enumeration attacks. If your use case requires non-guessable short codes (for example, private document sharing), use a randomly generated Base62 string of sufficient length (10+ characters), with collision checking on write.
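For the non-guessable variant, a sketch using Python's `secrets` module is shown below. The set `existing_codes` stands in for the database's uniqueness check, and the function names are illustrative.

```python
import secrets
import string

ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase


def random_code(length: int = 10) -> str:
    """Generate a Base62 string from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def create_private_code(existing_codes: set, length: int = 10, max_attempts: int = 5) -> str:
    """Pick a random code that is not already taken, retrying on collision."""
    for _ in range(max_attempts):
        code = random_code(length)
        if code not in existing_codes:
            existing_codes.add(code)
            return code
    raise RuntimeError("could not find a free random code")


existing_codes = set()
print(create_private_code(existing_codes))
```

At 10 characters the code space is 62¹⁰, roughly 8.4 × 10¹⁷, so collisions are rare and the retry loop almost never runs more than once. In a real database the check and insert would be combined into one atomic write guarded by the UNIQUE constraint.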

Step 5: Deep Dive - The Redirect Path

The redirect path is the most performance-critical part of the system. It runs 100× more often than the shorten path and must be as fast and simple as possible.

[Diagram: redirect path]

HTTP 410 Gone vs. 404 Not Found: When a link has expired (expires_at is in the past) or been soft-deleted (is_deleted = TRUE), return 410 Gone rather than 404 Not Found. The semantic difference matters: 404 means "I don't know what you're looking for," while 410 means "this resource existed but has been intentionally removed." Search engines treat 410 as a permanent signal to deindex the URL; 404 may trigger re-crawl attempts. Because deleted links still exist as rows in the database (soft delete), the application must check both conditions — expiry and the deleted flag — before deciding to redirect.
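The status-code decision described above reduces to a small function. This is a sketch: the row is modeled as a dict with the schema's column names, and `resolve_redirect` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def resolve_redirect(row: Optional[dict], now: datetime) -> Tuple[int, Optional[str]]:
    """Map a urls row (or None) to an HTTP status and, for a redirect, its target."""
    if row is None:
        return 404, None                       # short code was never created
    if row["is_deleted"]:
        return 410, None                       # soft-deleted: intentionally removed
    expires_at = row["expires_at"]
    if expires_at is not None and expires_at <= now:
        return 410, None                       # existed, but has expired
    return 302, row["long_url"]


now = datetime(2026, 1, 1, tzinfo=timezone.utc)
active = {"long_url": "https://www.example.com/a", "is_deleted": False, "expires_at": None}
expired = {**active, "expires_at": now - timedelta(days=1)}
deleted = {**active, "is_deleted": True}

print(resolve_redirect(active, now))    # (302, 'https://www.example.com/a')
print(resolve_redirect(expired, now))   # (410, None)
print(resolve_redirect(deleted, now))   # (410, None)
print(resolve_redirect(None, now))      # (404, None)
```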

301 vs. 302: A Critical Design Decision

301 Permanent vs. 302 Temporary Redirect

The HTTP status code you return on a redirect has major consequences for analytics, link expiration, and browser behavior. This is an easy decision to get wrong and a hard one to change after deployment, because browsers may cache a 301 redirect indefinitely and stop contacting your server for that link.

[Diagram: 301 Permanent vs. 302 Temporary Redirect]

Step 6: Deep Dive - Database Choice

SQL vs. NoSQL for URL Storage

The data access pattern for a URL shortener is almost perfectly simple: look up a short code, get back a long URL. No joins. No aggregations. No complex queries. This shape directly informs the database decision.

[Diagram: SQL vs. NoSQL for URL Storage]

Step 7: Deep Dive - Caching the Read Path

Since reads (redirects) outnumber writes 100:1, the read path deserves dedicated optimization. Multiple caching layers work together to absorb the vast majority of traffic before it reaches the database.

Layered Caching for Redirects

The 80/20 rule applies strongly to URL shorteners: roughly 20% of short codes — the recently shared, viral, and frequently bookmarked ones — receive 80% of all redirect traffic. Caching that hot 20% in memory is extremely cheap and eliminates most database load.
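The cache-in-front-of-database pattern can be sketched with a small LRU cache. Everything here is a stand-in: `LruCache` approximates what Redis with LRU eviction provides, a dict plays the database, and `lookup_long_url` is a hypothetical name for the read path.

```python
from collections import OrderedDict
from typing import Optional


class LruCache:
    """Tiny LRU cache: evicts the least recently used key when over capacity."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._items: OrderedDict = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)           # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)    # drop the least recently used


def lookup_long_url(short_code: str, cache: LruCache, database: dict, stats: dict) -> Optional[str]:
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    cached = cache.get(short_code)
    if cached is not None:
        stats["hits"] += 1
        return cached
    stats["misses"] += 1
    long_url = database.get(short_code)
    if long_url is not None:
        cache.put(short_code, long_url)
    return long_url


database = {"aaa0001": "https://a.example/", "bbb0002": "https://b.example/", "ccc0003": "https://c.example/"}
cache = LruCache(capacity=2)
stats = {"hits": 0, "misses": 0}

for code in ["aaa0001", "aaa0001", "bbb0002", "ccc0003", "aaa0001"]:
    lookup_long_url(code, cache, database, stats)

print(stats)   # {'hits': 1, 'misses': 4}
```

The last lookup misses because inserting the third code pushed the least recently used entry out of the two-slot cache. With a realistic capacity sized to the hot set, most lookups would be hits.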

[Diagram: Layered Caching for Redirects]

Step 8: Trade-offs & Production Reality

Every design decision involves a trade-off. Here is a consolidated view of the choices made in this design and when you would choose differently.

Decision	Choice	What We Gave Up	Choose Differently When...
URL generation	Counter + Base62	Sequential IDs are guessable — adjacent short codes can be enumerated	Private document sharing requires non-guessable codes; use random Base62 with collision detection
Redirect type	HTTP 302 Temporary	One extra server hop per redirect vs. 301	Use 301 only for permanent, non-expiring, non-trackable links where you explicitly want browser caching
Database	Start with SQL; migrate to NoSQL at scale	SQL requires manual sharding past ~1–2 billion rows	Start NoSQL directly if you are certain of internet scale from day one
Cache eviction	LRU (Least Recently Used)	Recently expired links may keep being served from cache until they are evicted	Use TTL-based expiry in Redis keyed to the link's `expires_at` for accurate expiry enforcement
Analytics	Async via message queue	Analytics data has slight delay; eventual consistency	Use synchronous analytics only when real-time accuracy is a hard business requirement
Custom aliases	Allowed, stored with a flag	Users can squat on valuable slugs; uniqueness checks are more complex	Restrict or charge for custom aliases to prevent abuse and namespace exhaustion

What this design intentionally excludes

Real production URL shorteners include additional layers that are out of scope for this case study:

URL reputation scanning — checking new URLs against malware and phishing databases (via Google Safe Browsing API or similar) before creating the short code
Multi-tenancy — per-organization namespacing (e.g., company.short.domain/slug) with separate access controls and billing
Link-in-bio pages — a single short URL that resolves to a curated landing page of links (the common Instagram "link in bio" pattern)
A/B redirect testing — routing a configurable percentage of clicks to URL A vs. URL B for experimentation

Each of these adds new systems without fundamentally changing the core redirect architecture described here.

Summary

Component	Design Decision	Key Reasoning
URL generation	Counter + Base62 encoding, distributed via ZooKeeper ranges	Collision-free by design; 7 Base62 chars = 3.5 trillion codes, enough for thousands of years
Short code length	7 characters	62⁷ ≈ 3.5 trillion unique codes; never needs to increase at realistic scale
Database	SQL for early/medium scale; NoSQL (DynamoDB/Cassandra) at internet scale	Simple key-value access pattern is ideal for NoSQL; SQL is a correct and simpler starting point
Cache	Redis with LRU eviction on the redirect path	~95% cache hit rate from the hot 20% of links; extremely cheap in memory relative to impact
Redirect status	HTTP 302 Temporary	Preserves analytics, supports link expiry and deletion, allows destination updates at any time
Read/write separation	Read replicas for redirect traffic; primary for writes	100:1 read ratio means the read path must not compete with writes on the same DB node
Analytics	Asynchronous via message queue	Decouples click tracking from the latency-critical redirect path — analytics must not slow down redirects
Scaling	Stateless app servers + Redis cluster + DB sharding/replication	Each layer scales independently; no single component is a mandatory bottleneck

The URL shortener is deceptively simple on the surface — it's just a lookup table. But the design process surfaces the same decisions that appear in every system you will ever build: how to generate unique IDs at scale, when to use SQL vs. NoSQL, why cache invalidation is a first-class concern, and how HTTP semantics affect your ability to evolve a live system. Master these decisions here, and they become pattern recognition everywhere else.
