Clarifying Requirements: Ask First, Design Second

There is one mistake that causes more wasted effort in software engineering than any other: starting to design — or worse, starting to code — before you truly understand what you are building and why.

Before drawing a single diagram or writing a single line of code, the most important thing you can do is ask: "What problem are we actually solving?"

This is not a warm-up exercise. It is the foundation of every decision that follows. The architecture you choose should be a direct response to your requirements — and if your requirements are wrong, vague, or missing, your architecture will be wrong too. No amount of clever code can fix a system designed to solve the wrong problem.

Why Requirements Drive Everything

Every architectural decision you make — which database to use, whether to add a cache, whether to split into microservices — is a trade-off: choosing one benefit often means sacrificing another. And you cannot evaluate a trade-off without knowing what you are optimizing for.

Consider this: "Design a chat application." Without further context, this prompt could describe:

A real-time customer support widget for a SaaS product, used by thousands of support agents
A private messaging app for a small team of 20 people
A live chat system for a gaming platform with millions of concurrent users

These three systems share the same description but require radically different architectures, because they differ fundamentally in scale, concurrency demands, and data requirements. The first might need persistent message history and read receipts. The second might work fine as a simple REST API with no real-time requirement. The third needs WebSocket scaling, message fan-out across data centers, and careful capacity planning.

The requirements determine which world you are in.

Requirements fall into three categories. Functional requirements define what the system does — the features users interact with. Non-functional requirements define how well the system must perform — its speed, reliability, and security. Constraints define the boundaries you must operate within — budget, team size, existing technology, legal compliance. All three feed into the design.

Functional Requirements: Scoping What You Build

Functional requirements describe user-visible behavior. The core question is: what does the system need to do, and what is explicitly out of scope?

The most important skill here is ruthless scoping. When faced with an open-ended system ("build a social media platform"), the right move is to identify the 3–5 core features that define the minimum viable system and explicitly defer everything else.

Key questions to ask:

What are the core actions a user can perform?
What data is created, read, updated, or deleted?
Are there multiple user roles with different permissions?
Is the workload primarily reads, writes, or balanced?
What does success look like for the first version?

Example: "Design Twitter" → Functional scope might be: users can post tweets (280 characters), follow other users, and see a feed of tweets from people they follow. Out of scope for now: direct messages, media uploads, trending topics, advertising.

Defining scope explicitly does two things: it tells you what to design, and it tells you what not to design. Every component you don't add is complexity you don't have to operate, debug, or scale.
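One way to make the in-scope and deferred lists explicit is to record them as data, so that a new feature request is either accepted, deferred, or flagged as a gap in the scope. The sketch below uses the Twitter example; the `classify_feature` helper and the feature identifiers are illustrative assumptions, not part of any real system.

```python
# Scope for a first version, taken from the "Design Twitter" example.
# The identifiers are hypothetical names chosen for this sketch.
IN_SCOPE = {"post_tweet", "follow_user", "view_feed"}
DEFERRED = {"direct_messages", "media_uploads", "trending_topics", "advertising"}


def classify_feature(feature: str) -> str:
    """Return how a requested feature relates to the agreed scope."""
    if feature in IN_SCOPE:
        return "in scope"
    if feature in DEFERRED:
        return "deferred"
    # Neither list mentions it: the scope has a gap that needs a decision.
    return "unclassified: ask before designing"


for request in ("view_feed", "media_uploads", "bookmarks"):
    print(f"{request}: {classify_feature(request)}")
```

The third branch is the useful one: a feature that appears in neither list is a question to ask, not something to assume.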

Non-Functional Requirements: Quantify or They Are Meaningless

Non-functional requirements (NFRs) describe the quality attributes of the system. The critical rule is: always quantify them. "The system should be fast" is not a requirement. "p99 read latency under 200ms" is a requirement.

A quick note on latency percentiles: p50 is the median, meaning 50% of requests complete within this time. p99 means 99% of requests complete within this time, so only 1 in 100 requests is slower. Engineers care about p99 because it captures the slow tail that real users regularly encounter and that an average hides. A system where the average response is fast but the slowest 1% of requests take 5 seconds will still frustrate users.
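To see why an average can hide the tail, here is a minimal sketch that computes the mean, p50, and p99 of a set of latencies. It uses the nearest-rank definition of a percentile, which is one of several common definitions; the sample data and the `percentile` helper are made up for illustration.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical latencies in milliseconds: 98 fast requests and 2 very slow ones.
latencies_ms = [40] * 98 + [5000] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.1f} ms  p50={p50_ms} ms  p99={p99_ms} ms")
```

With this data the mean is about 139 ms and the median is 40 ms, both of which look healthy, while the p99 of 5000 ms exposes the slow requests.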

Without quantification, you cannot make trade-off decisions objectively — you will end up making architectural choices based on gut feeling rather than data. You can't choose between two database designs without knowing whether you need 10ms response times or 500ms is acceptable. You can't decide whether to add a caching layer without knowing the read volume. Vague requirements lead to architectures that seem sound in theory but fail in production.

NFR	Vague (Useless)	Quantified (Actionable)
Availability	The system should be highly available	99.9% uptime = max ~8.8 hours downtime/year
Latency	Responses should be fast	p99 read latency under 200ms; p50 under 50ms
Throughput	Should handle lots of traffic	Must sustain 10,000 writes/second at peak load
Durability	Data should not be lost	Zero data loss; writes acknowledged only after being persisted to disk
Consistency	Data should be accurate	Reads must reflect the most recent write (strong consistency required)
Security	The system should be secure	All PII encrypted at rest (AES-256); GDPR-compliant data residency in EU
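The availability row converts an uptime percentage into a yearly downtime budget. A minimal sketch of that arithmetic, assuming a 365-day year (leap years ignored); the function name is illustrative:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years are ignored in this sketch


def max_downtime_hours_per_year(availability_pct: float) -> float:
    """Convert an availability target (for example 99.9) into allowed downtime."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be between 0 and 100")
    return (100 - availability_pct) / 100 * HOURS_PER_YEAR


for target in (99.0, 99.9, 99.99):
    hours = max_downtime_hours_per_year(target)
    print(f"{target}% uptime -> {hours:.2f} hours/year ({hours * 60:.0f} minutes)")
```

Each additional "nine" cuts the downtime budget by a factor of ten, which is why the target should be chosen deliberately and not by default.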

Four NFRs matter most for the majority of systems. These are the questions that quantify them:

Availability: What is the acceptable downtime per year? Is this a life-critical system where even minutes of downtime cause significant harm, or is brief unavailability tolerable?

Latency: What response time do users expect? Interactive features (search, checkout) require fast responses because users wait for the result. Background jobs like sending emails or generating reports can tolerate delays of seconds or minutes. As a general guideline, interactive features typically need p99 latency under 200ms.

Scale: How many users do you have today, how many in 12 months, and what does peak traffic look like relative to average? A steady workload and a spiky one (such as flash sales or viral moments) require very different approaches to capacity.

Consistency: When two users read the same data simultaneously, do they need to see exactly the same value? Strong consistency means every read reflects the most recent write — essential for financial transactions where reading a stale balance is dangerous. Eventual consistency means different users might see slightly different values for a short time, but the system guarantees they will converge to the same value eventually — acceptable for a social media "like" count, where a brief discrepancy of a few seconds causes no real harm. This question drives your database choice more than almost any other requirement.
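The difference between the two consistency models can be shown with a toy in-memory store. This is only a sketch under simplifying assumptions: one primary, one replica, and replication that happens when `sync()` is called. The class and method names are hypothetical and do not correspond to any real database API.

```python
class LaggingReplicaStore:
    """Toy model: writes land on a primary and reach the replica only on sync()."""

    def __init__(self):
        self._primary = {}
        self._replica = {}
        self._pending = []  # writes not yet applied to the replica

    def write(self, key, value):
        self._primary[key] = value
        self._pending.append((key, value))

    def read_strong(self, key):
        # Always served by the primary, so it reflects the latest write.
        return self._primary.get(key)

    def read_eventual(self, key):
        # Served by the replica, which may lag behind the primary.
        return self._replica.get(key)

    def sync(self):
        # Apply pending writes in order; afterwards the replica has converged.
        for key, value in self._pending:
            self._replica[key] = value
        self._pending.clear()


store = LaggingReplicaStore()
store.write("balance", 100)
store.sync()
store.write("balance", 40)                 # not yet replicated

stale = store.read_eventual("balance")     # replica still holds the old value
fresh = store.read_strong("balance")       # primary holds the new value
store.sync()
converged = store.read_eventual("balance")
print(stale, fresh, converged)
```

For a "like" count the stale read is harmless; for an account balance it is the kind of bug the consistency requirement exists to rule out.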

The Four Non-Functional Properties Every System Must Define

These four properties are the inputs to your architecture's most consequential decisions. Every property has a cost — optimizing for one often makes another harder. Defining them upfront lets you make those trade-offs deliberately instead of by accident.

Scale: The Requirement That Changes Everything

Scale is the most commonly misunderstood requirement — and the one that causes the most premature architectural complexity.

The single most important thing to understand about scale is this: the architecture that is correct at 100 users is not the architecture that is correct at 100 million users — and building for 100 million users when you have 100 is not ambitious, it is wasteful.

Scale	Appropriate Architecture	What Over-Engineering Adds
1–1,000 users	Single server, single PostgreSQL database, no caching layer, no queue	Redis, CDN, read replicas, message queues — all operational overhead with zero benefit
1,000–100,000 users	Horizontal app scaling, database indexes, basic CDN, maybe one cache layer	Sharding, microservices, distributed tracing — complexity the team cannot operate
100,000–10M users	Read replicas, caching layer, async job queues, database connection pooling	Multi-region deployment, custom consensus protocols — engineering months ahead of actual need
10M+ users	Horizontal sharding, multi-region, specialized storage per workload, CDN at edge	This architecture is justified only here — and most systems never reach this scale
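The table can be read as a simple lookup from user count to architecture tier. The sketch below encodes it that way. Because the ranges in the table share their boundary values, it assumes each upper bound is inclusive; the function name and that boundary choice are assumptions of this example.

```python
# (inclusive upper bound on users, architecture summary), smallest tier first.
TIERS = [
    (1_000, "single server, single PostgreSQL database, no cache, no queue"),
    (100_000, "horizontal app scaling, database indexes, basic CDN, maybe one cache layer"),
    (10_000_000, "read replicas, caching layer, async job queues, connection pooling"),
]
LARGEST_TIER = "horizontal sharding, multi-region, specialized storage per workload, CDN at edge"


def architecture_tier(users: int) -> str:
    """Return the architecture summary for a given user count."""
    if users < 1:
        raise ValueError("users must be at least 1")
    for upper_bound, summary in TIERS:
        if users <= upper_bound:
            return summary
    return LARGEST_TIER


for count in (100, 50_000, 2_000_000, 100_000_000):
    print(f"{count:>11,} users -> {architecture_tier(count)}")
```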

The questions that define your scale requirement:

Current state: How many users does the system serve today?
Growth trajectory: What is the realistic user count in 6 and 12 months?
Traffic pattern: Is load roughly constant, or does it spike (e.g., flash sales, event-driven)?
Read/write ratio: Is the workload mostly reads (news feed), mostly writes (logging), or balanced (social posting)?
Data volume: How many records today, and how fast does that grow?

Once you have these numbers, you can do a simple back-of-the-envelope calculation to sanity-check your architecture:

Step	Calculation	Example Result
Start with Daily Active Users (DAU)	Given: 50,000 DAU	50,000 users/day
Estimate requests per user per day	50,000 × 30 requests	1.5M requests/day
Convert to average requests per second (RPS)	1.5M ÷ 86,400 seconds/day	≈ 17 RPS average
Design for peak traffic (3–5× average)	17 × 4	≈ 68 RPS peak
Size application instances	68 RPS ÷ ~200 RPS/instance	1 instance is sufficient with headroom

At 68 peak RPS, a single application instance handles the load comfortably. A single-instance system has zero distributed complexity: there is no need to manage multiple servers, coordinate between them, or debug the class of failures that only appear across a network. There is no capacity case for load balancing, horizontal scaling, or read replicas at this scale. Adding those components now creates operational overhead (more things to configure, monitor, and debug) with no performance benefit.
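The same back-of-the-envelope calculation can be written as a small function so it is easy to rerun with different inputs. This is a sketch: the function and parameter names are illustrative, and the 200 RPS per instance figure is the rough assumption used in the table, not a measured value. The code keeps the unrounded average, so its peak comes out near 69 RPS; the table gets 68 because it rounds the average to 17 first.

```python
import math

SECONDS_PER_DAY = 86_400


def estimate_capacity(dau, requests_per_user_per_day, peak_multiplier, rps_per_instance):
    """Estimate average RPS, peak RPS, and the instance count needed at peak."""
    requests_per_day = dau * requests_per_user_per_day
    average_rps = requests_per_day / SECONDS_PER_DAY
    peak_rps = average_rps * peak_multiplier
    # Always at least one instance, rounded up so peak load fits.
    instances = max(1, math.ceil(peak_rps / rps_per_instance))
    return {
        "requests_per_day": requests_per_day,
        "average_rps": average_rps,
        "peak_rps": peak_rps,
        "instances": instances,
    }


estimate = estimate_capacity(
    dau=50_000,
    requests_per_user_per_day=30,
    peak_multiplier=4,
    rps_per_instance=200,
)
print(
    f"{estimate['requests_per_day']:,} requests/day, "
    f"{estimate['average_rps']:.1f} RPS average, "
    f"{estimate['peak_rps']:.1f} RPS peak, "
    f"{estimate['instances']} instance(s)"
)
```

Rerunning it with ten times the users still yields a handful of instances at most, which is a useful check before reaching for heavier infrastructure.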

The practical rule: Design for roughly 10× your current scale. If you have 1,000 users, design for 10,000. This gives you room to grow without requiring a redesign, but does not force you to build infrastructure for a scale you may never reach.

The Cost of Skipping Requirements: YAGNI

The YAGNI principle — You Aren't Gonna Need It — is the antidote to over-engineering. It comes from Extreme Programming and states: implement only what is required right now. Do not add components because they might be useful later.

YAGNI is not about being short-sighted. It is about recognizing that software requirements change, and that code written for hypothetical future requirements is often ill-suited for those requirements when they actually arrive — because the future was not what you imagined.

The Over-Engineering Trap

Every component added before it is needed adds operational overhead: it must be deployed, monitored, debugged, and secured. The trap is that each addition seems individually reasonable — it is only when you look at the full picture that the cost becomes clear.

A real-world case study illustrates the cost. In 2018, TSB Bank's IT migration failed catastrophically: customers were locked out of accounts for weeks, costing over £500 million in remediation. Post-mortems identified that new features were added during the migration, requirements were not frozen, and there was no documentation of the platform's architectural design, so there was no way to verify whether the delivered system matched the intended design.

The lesson is not unique to banking. Architecture documentation and frozen requirements are not bureaucratic overhead. They are the mechanism by which you verify the system you built is the system you intended to build.

A Requirements Checklist

Before designing any system, answer each of these questions. If you cannot answer a question, that is a gap in your requirements — not something to assume.

Category	Question	Why It Matters
Functional scope	What are the 3–5 core features? What is explicitly out of scope?	Defines what you are building and prevents scope creep
Users	How many users today? What is the realistic 12-month projection?	Drives scale decisions; determines appropriate architecture tier
Traffic pattern	Is load constant or spiky? What does peak look like relative to average?	Determines whether you need auto-scaling, queuing, or burst capacity
Read/write ratio	Is the workload read-heavy, write-heavy, or balanced?	Drives database choice, caching strategy, and replica topology
Availability	What is acceptable downtime per year?	Determines redundancy requirements and failover design
Latency	What are the p50 and p99 latency targets for key operations?	Determines whether synchronous or asynchronous patterns are needed
Consistency	Can users tolerate briefly stale data, or must every read reflect the latest write?	Drives database selection and replication strategy
Data durability	What happens if a write is lost? Is any data loss acceptable?	Determines persistence guarantees and backup strategy
Compliance	Are there regulatory requirements (GDPR, HIPAA, PCI-DSS)?	May mandate data residency, encryption, audit logs, or access controls
Team & operations	What is the team's operational expertise? What is the sustainable on-call burden?	A technically superior architecture the team cannot operate is a bad architecture
Budget	What is the monthly infrastructure budget?	Eliminates architectures that are technically valid but economically infeasible
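A checklist like this is easy to track as a data structure, so unanswered questions show up as explicit gaps instead of silent assumptions. The sketch below mirrors the table's categories; the class name, field names, and sample answers are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class RequirementsChecklist:
    """One field per checklist category; None means "not answered yet"."""
    functional_scope: Optional[str] = None
    users: Optional[str] = None
    traffic_pattern: Optional[str] = None
    read_write_ratio: Optional[str] = None
    availability: Optional[str] = None
    latency: Optional[str] = None
    consistency: Optional[str] = None
    data_durability: Optional[str] = None
    compliance: Optional[str] = None
    team_and_operations: Optional[str] = None
    budget: Optional[str] = None


def find_gaps(checklist: RequirementsChecklist) -> list:
    """Return the names of categories that are missing or left empty."""
    return [f.name for f in fields(checklist) if not getattr(checklist, f.name)]


draft = RequirementsChecklist(
    functional_scope="post, follow, view feed; no DMs or media",
    users="50,000 DAU today",
    availability="99.9% uptime",
    latency="p99 read latency under 200ms",
)
gaps = find_gaps(draft)
print("Unanswered:", ", ".join(gaps))
```

Design work starts when `find_gaps` returns an empty list, or when each remaining gap has been consciously accepted.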

Applying Requirements Clarification with AI Coding Agents

Requirements clarification is more important in the age of AI agents, not less. An AI agent that starts in the wrong direction amplifies that mistake at high speed — generating thousands of lines of code for the wrong architecture before anyone notices. The cost of a wrong direction scales with the speed of the tool.

The effective pattern for working with AI agents is: plan first, code second.

Practical rules for directing AI agents with requirements:

Always specify constraints explicitly. AI agents fill gaps in requirements with defaults from their training data, and those defaults may not match your context. An agent will often reach for PostgreSQL, synchronous HTTP calls, and a monolithic structure unless you specify otherwise. If you need eventual consistency, say so. If you need to stay within a $50/month infrastructure budget, say so. If you need HIPAA compliance, say so. The agent cannot infer constraints it has not been told about.

Ask the agent to ask you questions first. Before asking an agent to implement anything non-trivial, prompt it: "Before you write any code, what clarifying questions do you have about the requirements?" A well-prompted agent will surface ambiguities — authentication mechanism, expected request volume, whether multi-tenancy is needed — that you may have overlooked. This is the same conversation a senior engineer would have before starting a feature.

Scope each session to one requirement. Agent output quality tends to degrade as prompt complexity increases, because the agent has more implicit choices to make and less focus on each individual requirement. A single focused requirement such as "implement user authentication using NextAuth with GitHub OAuth, storing sessions in the Postgres schema in prisma/schema.prisma, with no other changes to existing code" produces more reliable results than a multi-requirement prompt. Break your requirements checklist into individual implementation units and deliver them sequentially.

Use CLAUDE.md to persist requirements across sessions. Most AI coding agents keep no memory between sessions by default. A requirement you communicated in Monday's session is unknown on Tuesday unless it is in the project's instruction file. The CLAUDE.md pattern (the instruction file read by Claude Code; other agents have equivalents) encodes project-level requirements such as architecture decisions, forbidden patterns, required test coverage, and compliance constraints, so that every session starts from the same foundation.

Review architecture, not just code. When the agent produces a plan or implementation, evaluate it against your requirements checklist, not just whether the code runs. Does the data model support the stated consistency requirement? Does the API design handle the stated throughput? Does the caching strategy introduce consistency bugs you said you cannot tolerate? The questions come from the requirements you defined before the agent started.
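The rules about stating constraints explicitly and persisting them can be combined by generating the instruction file from a single list of requirements. The following is a sketch only: the function name, the section headings, and the project name are assumptions made for this example, and an instruction file such as CLAUDE.md has no required layout that this code relies on.

```python
def render_instruction_file(project: str, constraints: list[str], out_of_scope: list[str]) -> str:
    """Render project requirements as Markdown text for an agent instruction file."""
    lines = [f"# {project}: project requirements", ""]
    lines.append("## Constraints (do not violate)")
    lines.extend(f"- {item}" for item in constraints)
    lines.append("")
    lines.append("## Out of scope (do not implement)")
    lines.extend(f"- {item}" for item in out_of_scope)
    lines.append("")
    lines.append("## Before writing code")
    lines.append("- Ask clarifying questions about any ambiguous requirement.")
    lines.append("- Produce a plan and wait for it to be reviewed.")
    return "\n".join(lines) + "\n"


text = render_instruction_file(
    project="chat-service",  # hypothetical project name
    constraints=[
        "Eventual consistency is acceptable for like counts",
        "Infrastructure budget: $50/month",
        "HIPAA compliance is required",
    ],
    out_of_scope=["Direct messages", "Media uploads"],
)
print(text)
```

The returned text could be written to the project's instruction file and kept under version control, so a change to a requirement is reviewed like any other change.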

Summary

Principle	What It Means in Practice
Ask before you design	Requirements are not a warm-up — they are the inputs to every architectural decision. Without them, trade-offs are arbitrary
Functional requirements = scope	Identify the 3–5 core features and explicitly defer everything else. Stating what is out of scope is as important as stating what is in scope
Non-functional requirements must be quantified	'Fast' and 'reliable' are not requirements. 'p99 < 200ms' and '99.9% uptime' are requirements
Scale defines your architecture tier	100 users and 100M users require different architectures. Build for your actual scale, not for the ceiling
YAGNI: don't build what you don't need yet	Every component added before it is required adds operational overhead — and is often ill-suited for the actual future requirement when it arrives
AI agents amplify both speed and mistakes	Wrong requirements at AI speed produce wrong code at scale. Requirements clarification is a multiplier on agent effectiveness
Plan first, code second	Write the requirements down (for example in requirements.md or CLAUDE.md), have the agent produce a plan, and review the plan before any code is written

The most valuable skill in system design is not knowing every architecture pattern. It is knowing which questions to ask before you start, and having the discipline to answer them fully before drawing a single box.
