Non-functional Requirements in System Design Interviews

Functional requirements tell us which features we need to implement. Non-functional requirements, on the other hand, describe how the system should behave and the constraints under which it must operate. They determine how well a system operates under specific conditions, such as a high number of users. Serving a small user base, say 100 users, takes a simpler design than serving a million users. For instance, Twitter's timeline for 100 users could be built by pulling data from a database each time it's needed. At a million users, that approach quickly becomes a bottleneck, so the design has to pre-fetch data into a cache before a user accesses their timeline.
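The difference between the two timeline designs can be sketched in a few lines. This is a toy, in-memory illustration only, not how Twitter implements its timeline; the class `TimelineService` and all of its method names are hypothetical.

```python
from collections import defaultdict, deque


class TimelineService:
    """Toy model comparing pull-on-read with a pre-filled timeline cache."""

    def __init__(self, max_cached=100):
        self.followers = defaultdict(set)   # author -> users following them
        self.following = defaultdict(set)   # user -> authors they follow
        self.posts = []                     # stand-in for the database, oldest first
        # Per-user cache of the newest posts, filled at write time.
        self.timeline_cache = defaultdict(lambda: deque(maxlen=max_cached))

    def follow(self, follower, author):
        self.followers[author].add(follower)
        self.following[follower].add(author)

    def post(self, author, text):
        self.posts.append((author, text))
        # Pre-fetch step: push the new post into each follower's cached timeline.
        for follower in self.followers[author]:
            self.timeline_cache[follower].appendleft((author, text))

    def timeline_pull(self, user):
        # Simple design: scan the "database" on every read.
        return [p for p in reversed(self.posts) if p[0] in self.following[user]]

    def timeline_cached(self, user):
        # Scaled design: the timeline is already assembled in the cache.
        return list(self.timeline_cache[user])
```

Both methods return the same timeline for posts made after a follow, but `timeline_pull` does work proportional to the number of stored posts on every read, while `timeline_cached` moves that work to write time.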

The most common non-functional requirements are availability, scalability, performance (latency and throughput), and consistency.

Availability

Definition: Availability refers to the degree to which a system is operational and accessible when needed. It's typically expressed as a percentage of uptime over the total time.
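
To make the percentage concrete, the sketch below converts an uptime target into the downtime it allows per year. The function name `allowed_downtime_minutes` and the 365-day year are assumptions made for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # assumes a 365-day year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted by an uptime target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Each additional "nine" cuts the allowed downtime by a factor of ten, which is why higher targets call for more redundancy and automation.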

How to Achieve This in Design:

Use load balancers to distribute network traffic evenly across servers.
Implement a health monitoring system to detect failures promptly, and set up automated processes for failover and recovery.
Implement redundant hardware and software components, such as multiple servers spread across different availability zones or regions.
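
As a rough illustration of how load balancing, health monitoring, and redundancy fit together, here is a minimal round-robin balancer that skips servers marked unhealthy. `Server` and `RoundRobinBalancer` are hypothetical names; a real load balancer would update the `healthy` flag from periodic health checks instead of having it set by hand.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    healthy: bool = True  # assumed to be updated by a separate health monitor


class RoundRobinBalancer:
    """Distributes requests in rotation, failing over past unhealthy servers."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = 0

    def pick(self) -> Server:
        # Try each server at most once, starting from the rotation point.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server.healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

With redundant servers behind the balancer, one failure reduces capacity but does not make the system unavailable.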

Examples:

Amazon S3 achieves high availability by using redundant storage and automatic failover mechanisms.
Google Cloud Spanner uses replication and synchronous writes across multiple zones to ensure high availability.

Scalability

Definition: Scalability is a system's ability to handle increased load without a significant drop in performance.

How to Achieve This in Design:

Use stateless servers so that more servers can be added as demand increases (horizontal scaling); auto-scaling can automate this.
Optimize database performance and use caching to reduce load.
Implement message queues for asynchronous processing, helping to manage the processing of tasks in the background and bridge the gap in processing speed between services.
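
The message queue idea can be sketched with Python's standard library. This is an in-process stand-in for a real message broker; `run_pipeline`, the worker count, and the uppercase "processing" step are all placeholders chosen for illustration.

```python
import queue
import threading


def run_pipeline(tasks, worker_count=2):
    """Enqueue tasks and let background workers process them; return the results."""
    task_queue = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = task_queue.get()
            if item is None:  # sentinel: no more work for this worker
                task_queue.task_done()
                break
            processed = item.upper()  # stand-in for slow background work
            with results_lock:
                results.append(processed)
            task_queue.task_done()

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()
    for task in tasks:
        task_queue.put(task)  # the producer does not wait for processing
    for _ in threads:
        task_queue.put(None)  # one sentinel per worker
    task_queue.join()
    for thread in threads:
        thread.join()
    return results
```

The producer only pays the cost of `put`, so a fast service can hand work to a slower one without blocking, and workers can be added to drain the queue faster. Results may arrive in any order.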

Examples:

Twitter has used a message queue system called Kestrel to help handle high volumes of tweets, showcasing effective scalability.
Netflix uses a combination of caching, partitioning, and load balancing to handle the massive load of streaming requests.

Latency

Definition: Latency is the delay between a request being made and the response arriving.

How to Achieve This in Design:

Optimize database queries and employ efficient algorithms and data structures.
Use caching to store and quickly retrieve frequently accessed or recently accessed data.
Implement Content Delivery Networks (CDNs) to serve static content closer to users, reducing the delay caused by the physical distance between the server and the client.
Optimize network performance and employ edge computing where appropriate to reduce the round trip time of requests.
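
As one way to picture the caching item, the sketch below keeps recently loaded values for a fixed time so that repeated requests skip the slow lookup. `TTLCache`, `get_or_load`, and the injectable `clock` parameter are hypothetical names introduced here, not part of any particular library.

```python
import time


class TTLCache:
    """Caches loaded values for ttl_seconds to avoid repeated slow lookups."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the cache can be tested
        self._entries = {}       # key -> (value, expires_at)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]      # cache hit: no call to the slow loader
        value = loader(key)      # cache miss or expired entry: load and store
        self._entries[key] = (value, now + self._ttl)
        return value
```

The time-to-live is the trade-off knob: a longer value means more hits and lower latency, but readers may see older data.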

Examples:

Cloudflare uses a global CDN to reduce latency for its users, delivering content faster by serving it from locations closer to the end-user.
Google Search uses a variety of techniques including caching, efficient data structures, and algorithms to provide low latency results.

Consistency

Definition: Consistency ensures that all nodes see the same data at the same time in a distributed system.

How to Achieve This in Design:

Select the proper level of consistency. Depending on the system's needs, opt for a stronger or weaker consistency model. Strong consistency guarantees that all nodes see the same data at the same time, while eventual consistency allows for temporary inconsistencies between nodes.
Use database transactions or consensus algorithms in distributed systems, ensuring all nodes agree on the state of the system. However, this comes at the cost of higher complexity.
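
One common way to reason about the consistency level in a replicated store is the quorum rule: with N replicas, W write acknowledgements, and R replicas consulted per read, every read overlaps the latest acknowledged write when R + W > N. The helper below is a sketch of that check only; the function name is hypothetical, and real systems add further conditions (for example, how conflicting writes are resolved).

```python
def read_overlaps_latest_write(n: int, w: int, r: int) -> bool:
    """Return True if every read quorum intersects every write quorum."""
    if n < 1 or not (1 <= w <= n) or not (1 <= r <= n):
        raise ValueError("require n >= 1 and 1 <= w, r <= n")
    return r + w > n


if __name__ == "__main__":
    print(read_overlaps_latest_write(n=3, w=2, r=2))  # quorum reads and writes
    print(read_overlaps_latest_write(n=3, w=1, r=1))  # fastest, but may read stale data
```

Lower values of W and R reduce latency and tolerate more unavailable replicas, at the price of possibly reading stale data.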

Examples:

Distributed databases like Apache Cassandra can be configured for strong or eventual consistency depending on the needs of the application. This flexibility enables applications to balance consistency needs with performance and availability considerations.
Amazon DynamoDB uses eventual consistency by default but also offers strong consistency options depending on application requirements.

Here’s a table summarizing what we have learned:

Non-functional Requirement	Definition	Technologies to Achieve
Availability	Operational and accessible when needed. % uptime/total time.	Load Balancers, Data Replication, Availability Zones, Monitoring
Scalability	Handle increased load without performance drop.	Stateless Servers, Caching, Message Queues, Data Partitioning
Latency	Time taken to respond to requests.	DB Query Optimization, Algorithms, Caching, CDNs, Edge Computing
Consistency	All nodes see the same data at the same time.	Consistency Level, Distributed DB Transactions, Consensus Algos

System Design Components and Non-functional Requirements

Here’s another view, mapping non-functional requirements to commonly used system design components. We will dive deep into each of the non-functional requirements in the following articles.

Non-functional Requirement	Load Balancing	Caching	Partitioning	Replication	Message Queue	Batch Processing
Availability	X	X	X	X	X
Scalability	X	X	X	X	X	X
Latency	X	X	X	X
Consistency (affected by)		X	X	X

Note that in every row except the last, an ‘X’ means the technology can help with that non-functional requirement. The last row, “Consistency”, is different. The consistency requirement is really about picking the right level of consistency and then choosing technologies that satisfy it, so an ‘X’ in that row means the technology affects consistency, not that it improves it.