Software Development Engineering Cheat Sheet
- Authors
- Name
- Justin Kek
- @justin_kek
- Developer @ Theodo Group
- Published on

Cheat sheets for software development and engineering. If software were dishes, developers are the chefs who prepare them and engineers are the architects who design the kitchen.
1. Engineering
How to design software that is robust, scalable, and efficient.
1.1. Data Structures & Algorithms
How to solve problems with code.
1.1.1. Methods to Reinterpret Problems
- Write the problem as a formula and check whether rearranging the variables simplifies the solution
1.1.2. Modulo
Application | Modulo by | Example |
---|---|---|
Get n trailing digits | 10^n | 1234 % 100 = 34 |
Check even/odd | 2 | isEven = x % 2 == 0 |
Get value of bit after addition | 2 | (1 + 1) % 2 = 0, (0 + 1) % 2 = 1, (0 + 0) % 2 = 0 |
Check divisible by n | n | isXDivisibleByN = x % n == 0 |
1.1.3. Floor Division
Application | Denominator | Example |
---|---|---|
Remove n trailing digits | 10^n | 12345 // 100 = 123 |
Get carry over bit after addition | 2 | (1 + 1) // 2 = 1, (0 + 1) // 2 = 0, (0 + 0) // 2 = 0 |
Get midpoint of any array ([0,1,2] [0,1,2,3]) | 2 | midpoint = len(arr) // 2 |
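The modulo and floor-division rows above pair up naturally: `%` keeps what `//` throws away. A minimal sketch of that pairing follows; the function names are hypothetical and only illustrate the table entries.

```python
def split_trailing_digits(x: int, n: int) -> tuple[int, int]:
    """Return (x without its n trailing digits, the n trailing digits)."""
    # divmod(x, d) is equivalent to (x // d, x % d)
    return divmod(x, 10 ** n)


def add_bits(a: int, b: int, carry_in: int = 0) -> tuple[int, int]:
    """Add single bits and return (sum bit, carry bit)."""
    total = a + b + carry_in
    return total % 2, total // 2


def midpoint(arr: list) -> int:
    """Index of the middle element (upper middle for even lengths)."""
    return len(arr) // 2
```

For example, `split_trailing_digits(12345, 2)` separates `123` from `45`, matching the two table rows for `10^n`.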
1.1.4. Binary Trees
Sizes
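- no. of nodes: $n$
- height of tree: $h$, $h \approx \log_k n$ for a balanced tree
- where $k$ is the branching factor, e.g. $k = 2$ for a binary tree, $k$ for a $k$-ary tree
- width of tree: at most $k^l$
- where $l$ is the level of the tree for which you want the width (root is level 0)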
How to navigate a Tree
There are two methods of navigating a tree: Depth-First Search (DFS) and Breadth-First Search (BFS)
DFS
There are three ways to perform traversal:
- In-Order Traversal (IOT) -> left, node, right
- Pre-Order Traversal (PreOT) -> node, left, right
- Post-Order Traversal (PostOT) -> left, right, node
There are two ways to implement DFS:
'''
1. Recursively
- Adv.: Clean and intuitive
- Disadv.: Limited by recursion depth, stack overflow risk
'''
# process(node) is a placeholder for the per-node work

def recursive(root):
    iot(root)
    preOT(root)
    postOT(root)

def iot(node):
    if node is None:
        return
    iot(node.left)
    process(node)
    iot(node.right)

def preOT(node):
    if node is None:
        return
    process(node)
    preOT(node.left)
    preOT(node.right)

def postOT(node):
    if node is None:
        return
    # recurse with postOT itself, not preOT
    postOT(node.left)
    postOT(node.right)
    process(node)
'''
2. Iteratively
- Adv.: Robust for large or unbounded inputs
- Disadv.: Less intuitive and readable
'''
def iot(root):
    if root is None:
        return
    stack = []
    node = root
    while stack or node:
        # go left as far as possible
        while node:
            stack.append(node)
            node = node.left
        node = stack.pop()
        process(node)
        # continue with the right subtree (None falls back to the stack)
        node = node.right

def preOT(root):
    if root is None:
        return
    stack = [root]  # switching this to a queue changes the DFS to BFS
    while stack:
        node = stack.pop()
        process(node)
        # push right first so left is processed first
        if node.right:
            stack.append(node.right)
        if node.left:
            stack.append(node.left)

def postOT(root):
    if root is None:
        return
    stack = []
    lastNode = None
    node = root
    while stack or node:
        # go left as far as possible
        if node:
            stack.append(node)
            node = node.left
            continue
        # at leftmost node: if the candidate has a right child that was
        # not the last visited node, traverse the right subtree first
        candidateNode = stack[-1]
        if candidateNode.right and lastNode is not candidateNode.right:
            node = candidateNode.right
            continue
        node = stack.pop()
        process(node)
        lastNode = node
        node = None  # do not process node again
BFS
There are two ways to perform traversal:
- Flat Traversal (FT)
- Level-Order Traversal (LOT)
BFS is primarily done iteratively - it can be implemented recursively but there is no practical benefit.
from collections import deque

def ft(root):
    if root is None:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()
        process(node)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)

def lot(root):
    if root is None:
        return
    queue = deque([root])
    while queue:
        # for LOT, we just need to wrap the flat traversal logic in a for loop with levelSize iterations
        levelSize = len(queue)
        for _ in range(levelSize):
            # same as flat traversal
            node = queue.popleft()
            process(node)
            if node.left is not None:
                queue.append(node.left)
            if node.right is not None:
                queue.append(node.right)
Note:
- You can also add metadata for each node by appending tuples (node, metadata) to the queue instead of just nodes
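As an illustration of the tuple idea, the sketch below carries each node's depth through the queue. The `Node` class and `bfs_with_depth` name are assumptions made for this example, not part of the snippets above.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Hypothetical minimal tree node used only for this example
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def bfs_with_depth(root: Optional[Node]) -> list[tuple[int, int]]:
    """Flat BFS that returns (value, depth) pairs in visiting order."""
    result = []
    if root is None:
        return result
    queue = deque([(root, 0)])  # metadata travels alongside the node
    while queue:
        node, depth = queue.popleft()
        result.append((node.value, depth))
        if node.left is not None:
            queue.append((node.left, depth + 1))
        if node.right is not None:
            queue.append((node.right, depth + 1))
    return result
```

The same pattern works for any per-node metadata, such as the path from the root or a running sum.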
1.1.5. Array
How many times can I slide a window over an array?
- Intuition
- Start from the base case - window size 1
- How many times can you slide it?
- Increase window size
- Formula
len(array) - windowSize + 1
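A quick way to sanity-check the formula is to enumerate the windows and count them. This is a small sketch; `count_windows` and `windows` are hypothetical helper names.

```python
def count_windows(array: list, window_size: int) -> int:
    """Number of positions a window of window_size can take over array."""
    if window_size <= 0 or window_size > len(array):
        return 0
    return len(array) - window_size + 1


def windows(array: list, window_size: int) -> list[list]:
    """All contiguous windows, one per valid start index."""
    return [array[i:i + window_size] for i in range(count_windows(array, window_size))]
```

With window size 1 every element is its own window, and each increase in window size removes exactly one valid start position.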
1.1.6. Bitwise Operations
Operation | Application | Example |
---|---|---|
AND & | Get carry for binary addition of two numbers | 1 & 1 = 1 |
AND & | Get last bit | 10 & 1 = 0, 11 & 1 = 1 |
XOR ^ | Get sum without carry for binary addition of two numbers | 1 ^ 1 = 0, 0 ^ 1 = 1, 1 ^ 0 = 1 |
XOR ^ | Find differences between two bit patterns | 0110 ^ 1010 = 1100, i.e. different in first two bits |
Bit Shift | Multiply/divide by 2 | x = 2, x << 1 = 4, x >> 1 = 1 |
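The AND (carry) and XOR (sum without carry) rows combine into addition without the `+` operator. The sketch below assumes non-negative integers, since Python integers are unbounded and the loop would not terminate for negative inputs; `add_without_plus` is a hypothetical name.

```python
def add_without_plus(a: int, b: int) -> int:
    """Add two non-negative integers using only bitwise operations."""
    if a < 0 or b < 0:
        raise ValueError("this sketch only handles non-negative integers")
    while b:
        carry = (a & b) << 1  # AND finds the carry bits, shift moves them up
        a = a ^ b             # XOR is the sum without carry
        b = carry             # repeat until no carry is left
    return a
```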
1.1.7. Dynamic Programming
- Caching results for fibonacci-style recurrence
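A minimal sketch of caching a Fibonacci-style recurrence, using the standard library's `functools.lru_cache` as the cache. Without the cache the recursion is exponential; with it each `n` is computed once.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n: int) -> int:
    """n-th Fibonacci number with memoised sub-results."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)
```

This top-down version is still bound by the recursion depth limit for large `n`; a bottom-up loop avoids that.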
1.1.8. Binomial Theorem
Theory
- The Binomial Theorem describes how to expand binomial expressions without brute force
- Binomial Expression:
- An expression formed from two terms,
- e.g. $a + b$
- Binomial Theorem Formula: $(a + b)^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^{k}$
- where $\binom{n}{k}$ is the binomial coefficient a.k.a. combinations
Applications
- The binomial coefficient can be used to describe symmetric number sequences, e.g. 1 4 6 4 1
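The symmetric sequence above is the row of binomial coefficients for n = 4. A short sketch using `math.comb` from the standard library; the helper names are hypothetical.

```python
from math import comb


def binomial_row(n: int) -> list[int]:
    """Coefficients of (a + b)^n, i.e. row n of Pascal's triangle."""
    return [comb(n, k) for k in range(n + 1)]


def expand(a: int, b: int, n: int) -> int:
    """Evaluate (a + b)^n term by term via the binomial theorem."""
    return sum(comb(n, k) * a ** (n - k) * b ** k for k in range(n + 1))
```

The symmetry follows from `comb(n, k) == comb(n, n - k)`.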
1.1.9. Describing Symmetry
- Linear Symmetry
- Combinations / Binomial Coefficient
- Modulus
- Even Functions
- Cosine
- Rotational Symmetry
- Odd Functions
- Sine
1.2. System Design
How to design scalable and efficient systems.
1.2.1. Encryption / Decryption with Keys
There are two types of encryption/decryption patterns
Key Type | Description | E.g. | Adv | Disadv | Use Case |
---|---|---|---|---|---|
Symmetric | Private key is shared, i.e. one key for both encryption and decryption | AES | Computationally faster | Hard to distribute | Bulk data encryption (disks, HTTPS session data, VPNs) |
Asymmetric | Public/private key is set up, i.e. two keys | RSA, ECDSA | Easier to distribute | Computationally slower | Key exchange, digital signatures, SSL/TLS handshake, email encryption |
Public and private keys are used for two main purposes:
Key Use Case | Private Key | Public Key / Shared Private Key |
---|---|---|
Message Authentication and Integrity (Digital Signatures) | Sign message | Verify message came from sender (authentication) + Ensure message wasn't modified in transit (integrity) |
Message Confidentiality | Decrypt message | Encrypt message |
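The shared-key variant of message authentication and integrity can be sketched with the standard library's `hmac` module. This covers only the "Shared Private Key" column; asymmetric signatures need a third-party library and are not shown. The `sign` and `verify` names are hypothetical.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Check the tag in constant time; fails if the message or key differs."""
    return hmac.compare_digest(sign(key, message), tag)


shared_key = secrets.token_bytes(32)  # both parties must hold this key
tag = sign(shared_key, b"pay 10")
```

A modified message or a different key produces a different tag, which is what gives the receiver both integrity and authentication.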
1.2.2. API Architectural Styles
The three main types of API architectural style are REST APIs, RPC APIs, and GraphQL APIs.
Style | Description | Use Case | Adv | Disadv |
---|---|---|---|---|
REST, e.g. express.js, Spring Boot, Flask, FastAPI | Perform HTTP verbs on resources. Entity based, e.g. POST /users | Most common | Universally understood + docgen tools e.g. Swagger, OpenAPI | Slowest - One request for each entity unlike GraphQL + less space efficient than RPC |
GraphQL, e.g. Apollo | Query or mutate entities. Entity based, e.g. mutation CreateUser() {...} | APIs for FE | Faster - One request for multiple entities | More setup e.g. defining the schema, resolvers + less standardised docgen e.g. GraphiQL |
RPC, e.g. gRPC + Protobuf | Call functions remotely. Action based, e.g. await client.createUser() | Internal APIs | Fastest and most space efficient because it uses binary instead of text payloads | Only for internal use unless public clients use the same setup |
1.2.3. Databases
Paradigm | Examples | Use Case | Adv | Disadv |
---|---|---|---|---|
SQL | PostgreSQL, MySQL, MSSQL | Structured relationships + strong consistency e.g. financial data | Powerful Querying + ACID | Slower writes due to B-Trees, slower reads/writes due to stronger consistency/locks |
Key-Value | RocksDB, DynamoDB, Cassandra | High-throughput writes, caching | Extremely fast writes + BASE | Slower reads due to LSMT |
Document | MongoDB, Firestore | Semi-structured JSON-like data, e.g. mobile/web apps | Flexible schema + BASE | Slower writes due to LSMT |
Columnar | Cassandra | Time series data, e.g. analytics, event logging | Fast on columnar queries, aggregations | Slower reads due to LSMT |
Graph | Neo4j | Social graphs, recommendation engines | Optimised for graph traversal and relationship modeling | Limited for heavy aggregations |
Graph | TypeDB | Complex knowledge graphs, strongly typed and structured relationships | | Small ecosystem |
1.2.4. Scaling
Type | Principle | Use Case | Adv | Disadv |
---|---|---|---|---|
Vertical | Upgrading CPU/RAM/Storage | Small to medium apps, monolithic systems, startups | No code change + lower latency | Limited by hardware ceilings + expensive at scale + SPOF |
Horizontal | Adding more servers | Distributed systems | Fault tolerance via redundancy + Infinite scalability | Network latency + Higher complexity |
Types of horizontal scaling:
- Database Horizontal Scaling
- Compute Horizontal Scaling
Database Horizontal Scaling, i.e. sharding
Type | Principle | Use Case | Adv | Disadv |
---|---|---|---|---|
Directory/Lookup-based | A manually maintained directory maps each record to its shard | Frequently changing shards / manual control | Easy to add / remove shards | Directory is a SPOF, lookup adds latency |
Range-based | Each shard owns a contiguous key range (e.g. A-F, G-L, ...) | Time-series data, ordered data, range queries | Efficient for range queries + simple to implement | Data skew possible, hotspots risk |
Hash-based | The hash of the key determines the shard | High-write, evenly distributed workloads | Good load balancing, no need to manage ranges | Range queries inefficient, rebalancing expensive |
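The three sharding strategies differ only in how a key is mapped to a shard index. The sketch below shows one possible mapping for each; the directory contents, range boundaries and function names are all made up for illustration.

```python
import bisect
import hashlib

# Directory-based: an explicit, manually maintained mapping (hypothetical data)
DIRECTORY = {"tenant-a": 0, "tenant-b": 2}

# Range-based: inclusive upper bounds of each shard's range, i.e. A-F, G-L, M-R, S-Z
RANGE_UPPER_BOUNDS = ["F", "L", "R", "Z"]


def directory_shard(key: str) -> int:
    """Look the shard up; raises KeyError if the directory has no entry."""
    return DIRECTORY[key]


def range_shard(key: str) -> int:
    """Pick the first range whose upper bound is >= the key's first letter."""
    index = bisect.bisect_left(RANGE_UPPER_BOUNDS, key[0].upper())
    return min(index, len(RANGE_UPPER_BOUNDS) - 1)  # clamp keys beyond "Z"


def hash_shard(key: str, num_shards: int) -> int:
    """Use a stable hash (not Python's per-process hash()) modulo shard count."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Note that `hash_shard` remaps most keys when `num_shards` changes, which is the "rebalancing expensive" drawback in the table.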
Compute Horizontal Scaling
Type | Principle | Use Case | Adv | Disadv |
---|---|---|---|---|
Centralised Load Balancing / Orchestrator-based Scheduling | Requests are routed based on a load balancer or scheduler | Workloads are heterogeneous, resource usage unpredictable, fine-grained control over task placement | Assign request based on compute needs + Easy to add/remove nodes + Supports complex scheduling policies | Orchestrator / scheduler is SPOF + can be bottleneck |
Static Partitioning | Requests are routed based on predefined ranges or affinity rules, e.g. ID range, location | Tasks are grouped logically | Low latency as no lookup is needed | Hotspots + manual rebalancing + difficult to add/remove nodes |
Consistent Hashing | Requests are routed based on hash of request key | Stateless workloads, e.g. microservices, serverless, API gateways | Automatic load balancing + no central load balancer needed | Range-based tasks difficult + rebalancing required when nodes are added/removed |
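A minimal consistent-hashing ring can be sketched as a sorted list of hashed virtual nodes, where a key is routed to the first virtual node at or after its own hash. The `HashRing` class, the `replicas` default and the `node#i` labelling are assumptions for this example, not a reference implementation.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 64-bit hash derived from SHA-256."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), replicas: int = 100):
        self.replicas = replicas  # virtual nodes per real node, smooths the load
        self._ring = []           # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node clockwise from the key's hash, wrapping around
        index = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[index % len(self._ring)][1]
```

Removing a node only moves the keys that were routed to it; every other key keeps its node, which is the property that distinguishes this from plain `hash % n`.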
1.2.5. CAP Theorem
Consistency: All nodes in the system see the same data at the same time
Availability: System remains operational even if some nodes fail
Partition Tolerance: System remains operational even if network communication with some nodes fails
In a distributed system, you can only achieve two out of CAP.
P isn't optional because networks are not reliable, so the tradeoff is usually between C and A.
C is usually preferred for financial systems
A is usually preferred for social media / streaming apps
1.2.6. Authentication
- There is a trade-off between safety and convenience
- Best practice is to use a pre-built library, but understanding the principles is helpful in system design
- Authentication: verifying identity
- Authorisation: checking permissions
1.2.6.1. Authentication
Transporting Passwords
- Use HTTPS for password submissions
- Avoid logging raw credentials
1.2.6.2. Authentication Methods
Method | Use Case |
---|---|
Username + Password | |
Username + Password + 2FA | |
SSO | |
Custom-built SSO | |
Securing Passwords
- Hashing
- Passwords should be stored as irreversible cryptographic hashes
- Salting
- A random, user-specific unique value (salt) is added to the plain-text password before hashing; the salt is stored in plaintext in the database
- Prevents
- two users with the same passwords from getting the same hash
- hackers using rainbow tables (precomputed mappings of common passwords -> hashes)
- Peppering
- A random, global value (pepper) is added to the plain-text password before hashing; the pepper is stored outside the database, e.g. as an env variable on the server
- An additional layer of security on top of salting
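Hashing, salting and peppering together can be sketched with the standard library's `hashlib.pbkdf2_hmac`. The pepper value, iteration count and function names below are illustrative assumptions; in practice a maintained password-hashing library is the safer choice, as noted above.

```python
import hashlib
import hmac
import secrets

# Assumption: in a real system the pepper is loaded from the environment
PEPPER = b"hypothetical-pepper-from-env"
ITERATIONS = 100_000  # illustrative work factor, tune upward for production


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest); both are stored in the database."""
    salt = secrets.token_bytes(16)  # random and unique per user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8") + PEPPER, salt, ITERATIONS
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-hash with the stored salt and compare in constant time."""
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8") + PEPPER, salt, ITERATIONS
    )
    return hmac.compare_digest(digest, expected)
```

Because each call draws a fresh salt, two users with the same password end up with different digests.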
1.2.6.3. Proof of Authentication a.k.a access tokens
- After a user is authenticated, a token needs to be stored on the client
- There are two main types of tokens used: session tokens and JWTs
Aspect | Session Token | JWT |
---|---|---|
Structure | Random opaque string, e.g. b8c9d7f1e6a24f38b1d80b7d849d3e4e | Three base64url-encoded segments (JSON header, JSON payload, signature), e.g. <header>.<payload>.<signature> |
Data access | Client cannot read it, server must retrieve data for client | Client can decode payload easily, e.g. { "email" : "...", "iat": 1665385660, "roles": ["admin"] } |
Where data lives | In the backend (server/db/cache) alongside the token | Inside the token |
Generation | Server uses cryptographically secure RNG | Server builds JSON payload and signs it |
Verification | Server checks that client token string matches a stored token | Server verifies signature with public key (or shared secret for HMAC-signed tokens) |
Revocation | Easy - Delete from backend (server/db/cache) | Hard - Blacklist / short expiry |
Transport | Authorization header, or cookie with HttpOnly + Secure + SameSite=Strict | Authorization header, or cookie with HttpOnly + Secure + SameSite=Strict |
Client-side Storage | Cookies | Cookies |
Server-side Storage | In the backend (server/db/cache) | n.a. |
Use Case | Monolithic app | Distributed services, OAuth |
1.2.6.4. Refresh Tokens
- Clients can be provided with a refresh token that is used to refresh access tokens
- Access tokens should be short-lived (minutes)
- Refresh tokens can be long-lived (hours/days/weeks)
- Adv
- Reduced exposure
- Centralised control if using JWT access tokens and session refresh tokens
1.2.7. Authorisation
Access Control Approach | Principle | Use Case |
---|---|---|
Role-Based (RBAC) | Users -> Roles -> Permissions | Easiest to implement / reason about |
Attribute-Based (ABAC) | Permission based on user attributes, e.g. user.department == doc.department and time < 18:00 | Highly customisable |
Relationship-Based (ReBAC) | Permissions via graph relations, e.g. editor of project X | Collaboration apps |
Scope-Based (SBAC) | Users -> Scope -> Permissions, e.g. contacts.read | OAuth |
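The first two approaches in the table can be sketched as plain lookups and predicates. The users, roles and permission strings below are hypothetical sample data.

```python
# RBAC: Users -> Roles -> Permissions (hypothetical sample data)
ROLE_PERMISSIONS = {
    "admin": {"doc.read", "doc.write"},
    "viewer": {"doc.read"},
}
USER_ROLES = {"alice": {"admin"}, "bob": {"viewer"}}


def rbac_allows(user: str, permission: str) -> bool:
    """A user is allowed if any of their roles grants the permission."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )


def abac_allows(user_department: str, doc_department: str, hour: int) -> bool:
    """ABAC rule from the table: same department and before 18:00."""
    return user_department == doc_department and hour < 18
```

RBAC decisions depend only on static assignments, while the ABAC rule is evaluated against request-time attributes such as the current hour.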
1.2.8. Where to authenticate and authorise
Authenticate token in | Adv | Disadv | Use Case |
---|---|---|---|
App | Most flexible (custom logic, fine grained checks) | Adds latency per request | Authorisation |
Gateway | Offloads auth early, blocks bad traffic before app | Less flexible | Basic checks |
Load Balancer | Centralised | Limited to basic checks (signature, expiration) | Basic checks |
1.2.9. Tenancy
A tenant is a customer/organisation space with its own users, data, config
Aspect | Single-tenant | Multi-tenant |
---|---|---|
Definition | One tenant per isolated stack | Multiple tenants per stack |
Isolation | Strong | Weak |
Per-tenant customisation | Easy | Harder |
OpEx | Higher | Lower |
Scale | Worse (under-utilised) | Better (pooling) |
Compliance / Data residency | Easier | Harder (needs partitioning) |
Onboarding Speed | Slower | Faster |
1.2.10. Logging
- Avoid auto logging POST bodies and GET parameters
- If the auto logging runs on auth endpoints, passwords could be written in plaintext to logs
1.2.11. Sandboxing
1.2.12. Encoding
Encoding is used to serialise user facing data (text/image/audio/video) for storage / transport over the network.
Type | Description | Use Case | E.g. |
---|---|---|---|
Base32 | 32-character set encoding (A-Z, 2-7) | QR codes, OTP secrets | JBSWY3DPEBLW64TMMQ====== |
Base64 | Represents binary data in ASCII | Images, API keys, JWT segments | SGVsbG8gd29ybGQ= |
Base85 | Represents binary data in ASCII | | <~87cURD_*#TDfTZ)+T~> |
URL | Makes data safe for URLs | URLs | %20 -> spaces |
Hex | Represents binary as hex strings | | 0x12ab |
ASCII / UTF-8 | Maps chars as numeric codes | Text | 65 -> "A" |
Unicode (UTF-16, UTF-32) | Maps characters to numeric codes | Text (International) | U+4F60 -> "你" |
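Most of the encodings in the table are available in Python's standard library. The short sketch below round-trips a few sample values; the variable names are arbitrary.

```python
import base64
from urllib.parse import quote, unquote

b64 = base64.b64encode(b"Hello world").decode("ascii")  # Base64
b32 = base64.b32encode(b"Hello World").decode("ascii")  # Base32
a85 = base64.a85encode(b"Hello world")                  # Base85 (Ascii85 variant)
url = quote("a b")                                      # URL / percent-encoding
hexed = b"\x12\xab".hex()                               # Hex
code_point = ord("你")                                   # Unicode code point
utf8 = "你".encode("utf-8")                              # UTF-8 bytes

# Every encoding here is reversible
assert base64.b64decode(b64) == b"Hello world"
assert base64.a85decode(a85) == b"Hello world"
assert unquote(url) == "a b"
```

None of these provide confidentiality: encoding is reversible by anyone, unlike encryption.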
1.3. Concrete Knowledge
- BFF: Backend for Frontend
- e.g. GET /dashboard instead of GET /users + GET /orders + GET /recommendations
1.3.1. JavaScript
Engines
- V8 (Chrome)
- SpiderMonkey (Firefox)
- JavaScriptCore (Safari)
- Hermes (React Native)
Runtimes
- Node
- V8 engine
- Adv.
- Mature ecosystem
- Safest bet
- Disadv.
- Slower
- Security via containers/OS policies
- Deno
- V8 engine
- Adv.
- Sandboxed (permission-based security model)
- Disadv.
- Mostly compatible with node modules
- Bun
- JavaScriptCore engine
- Adv.
- Like node but faster
- Disadv.
- Least compatibility with node modules
- Security via containers/OS policies
1.3.2. CPU Optimisations
- Branch Prediction
- Variable reassignment
- CPU Pipelining
- CPU Preloading
- CPU Prefetching
- Cache Locality
- Memory Access Patterns
1.3.3. Language Optimisations
- Peephole Optimisations
- Inline
- Unroll
1.3.4. Operating Systems
- Stack Size
- Linux: 8MB
- macOS: 8MB
- Windows: 1MB
1.3.5. Recursion Depth Limits
- C++: 100,000
- Depends on frame size + OS stack size
- Dart: 10,000
- Set by default
- JS: 10,000
- (V8 engine/chrome)
- Depends on
- Java: 1,000
- Depends on frame size + OS stack size
- Python: 1,000
- Set by default
1.3.6. Typical Cloud Infrastructure
Layer | Component | Description | Use Case | E.g. |
---|---|---|---|---|
Edge | DNS | Resolves domain name | | AWS Route53, GCP DNS |
Edge | CDN | Caches static content for low latency | | AWS CloudFront |
Edge | WAF / DDoS Protection | Protects from malicious acts | | AWS WAF |
Edge | Application Load Balancer | Distributes traffic to apps using HTTP info | | AWS ALB |
Edge | Network Load Balancer | Distributes traffic to apps using TCP/UDP info | | AWS NLB |
Edge | Gateway Load Balancer | Distributes traffic to third-party security/network appliances using TCP/UDP info | | AWS GWLB |
Edge | Global Load Balancer | Distributes traffic geographically | | AWS ELB |
Gateway | Gateway | Routing to different services, security | | AWS API Gateway |
Gateway | Gateway | Protocol Translation (HTTP to gRPC, REST to GraphQL) | | |
Gateway | Gateway | Aggregation (Compose multiple backend calls into one) | | |
App | App Servers | | Steady high throughput, long-lived connections, heavy local state, custom networking, predictable workloads, higher memory/CPU/GPU, strict latency floors | AWS ECS, AWS EKS |
App | Serverless | | Spiky demand, low ops overhead, pay-per-use | AWS Lambda |
Data Proxy | Database Proxy | Manages a pool of persistent connections to the DB | >10k client connections | AWS RDS Proxy |
Data | Relational DBs | | | AWS RDS, AWS Aurora (Serverless RDS) |
Data | Document DBs | | | DynamoDB |
Data | KV DBs / Caches | | | AWS ElastiCache (Redis) |
Data | Object Storage | | | AWS S3 |
Data | DSQL | Distributed SQL Query Engine | Query large-scale data across object storage / data lake with SQL | AWS Athena |
Data | Data Lake | Centralised storage for raw data | Analytics, ML workloads, batch processing | AWS Lake Formation, Iceberg on S3 |
Networking | VPC | Isolated virtual network for cloud resources | Define public/private subnets, control routing, isolation, multi-tier deployments | AWS VPC |
Networking | Subnets | Segments inside a VPC | Controls traffic flow and exposure of resources (e.g. public ALB, private DB) | AWS Subnets |
Networking | Security Groups | Virtual firewalls attached to resources | Control traffic at instance/service level | AWS Security Groups |
Observability | Logging | Collect, aggregate and index logs from all services | | AWS CloudWatch |
Observability | Monitoring / Metrics | Monitor resource usage, uptime, etc. | | AWS CloudWatch |
Observability | Tracing | Traces request flow across different services | | AWS X-Ray |
DevOps | CI/CD | | | AWS CodeBuild, GitHub Actions |
1.3.7. Networking Model
There are two main models that are used in the industry today:
- Open Systems Interconnection (OSI) model
- Abstract: Typically used to discuss concepts
- TCP/IP model
- Concrete: This is what is used in the internet today
OSI Layer | Name | Purpose | TCP/IP Layer | Data Unit | Examples |
---|---|---|---|---|---|
7 | Application | User Apps | Application | Data | Zoom, WhatsApp, Teams |
7 | Application | App Protocols | Application | Data | HTTP, WebSockets, WebRTC, SIP, DNS, WebRTC API, WebRTC Signaling, gRPC, RTP/SRTP |
6 | Presentation | Data formatting | Application | Data | JSON, XML, Protobuf |
6 | Presentation | Encoding & Compression | Application | Data | JPEG, MP3, H.264, gzip |
6 | Presentation | Encryption | Application | Data | TLS, DTLS, SSL, SRTP |
5 | Session | Manage session lifecycle | Application | Data | NetBIOS, RPC, WebRTC session setup |
4 | Transport | Reliable/unreliable delivery, multiplexing, manage connections | Transport | Segment (TCP) / Datagram (UDP) | TCP, UDP, QUIC |
3 | Network | Routing, addressing | Internet | Packet | IP, ICMP, BGP |
2 | Data Link | Framing, error detection | Link | Frame | Ethernet, Wi-Fi MAC, PPP, 5G NR |
1 | Physical | Raw bits over a medium | Link | Bits | Fiber, RF, copper, modulation |
1.3.8. Sessions and connections
Definition
Aspect | Connection | Session |
---|---|---|
Layer | Transport | Application |
Definition | A channel between two peers | A context between two peers |
Lifespan | Exists only while data flows on the transport | Can span multiple connections, until either peer terminates the session |
Signaling: Session Management
Signaling is the process of setting up, managing, and tearing down a communication session before real-time data flows. Signaling encompasses multiple processes:
- Session Setup
- Codec Negotiation
- Process where two peers agree on a common codec for audio/video during signaling
- NAT Traversal
- Techniques + Protocols that allow devices behind NAT to communicate directly
- There are three main techniques
- Session Traversal Utilities for NAT (STUN)
- Device asks STUN server "What's my public IP:port?"
- Device shares info with other peer (P2P)
- Works only if NAT keeps mappings stable
- Traversal Using Relays around NAT (TURN)
- Both devices send media to a TURN server
- Used as fallback if direct P2P fails
- Higher latency + server bandwidth cost
- Interactive Connectivity Establishment (ICE)
- Gathers candidates
- Private IP:port
- Public IP:port from STUN
- Relay addresses from TURN
- Tries all possible paths
- Picks the fastest, lowest-latency route
- Encryption keys exchange
- Exchange session metadata
1.3.9. HTTP
There are 3 main versions of HTTP being used
Version | Description | Adv | Disadv | Use Case |
---|---|---|---|---|
1.1 | Most widely supported | Simple, easy to debug, universally compatible | One request at a time per connection -> head-of-line blocking -> higher latency, more open connections = higher infra cost | Legacy, IoT |
2 | Multiplexed streams over one TCP connection | Big improvements in latency and throughput over HTTP/1, fewer connections per client, required for gRPC | Head-of-line blocking if packet loss occurs, more complex load balancing | gRPC |
3 | Runs over QUIC (UDP) | Lowest latency | Less mature, harder debugging, firewalls may block UDP | Mobile / unstable networks |
- Modern clients auto-negotiate best protocol via Application Layer Protocol Negotiation (ALPN)
- client says “I support h2, http/1.1, h3”, server picks one
1.3.10. Transmission Control Protocol (TCP)
- Lossless
1.3.11. User Datagram Protocol (UDP)
- Lossy
1.3.12. Quick UDP Internet Connections (QUIC)
- UDP at Transport Layer + Reliability at App Layer
1.3.13. Which transport protocol
1.3.14. WebRTC
Frameworks
- Web Real-Time Communication (WebRTC)
- Open source framework for P2P RTC
- Components
- Signaling
- Media Capture
- Media Transport
- Encryption
- NAT Traversal
- Adaptive Quality
- Data Channels
Signaling Protocols
- Session Initiation Protocol (SIP)
- Set up, modify, tear down real-time sessions for voice/video/messaging
Monitoring Protocols
- Real-time Transport Control Protocol (RTCP)
- Measures network performance metrics for RTP
Security Protocols
- Transport Layer Security (TLS)
- Secures TCP
- Datagram Transport Layer Security (DTLS)
- Secures UDP
- i.e. TLS for UDP
Transport Protocols
- Real-time Transport Protocol (RTP)
- Transports real-time media (audio/video)
- Rides on UDP, sometimes TCP
- Secure Real-time Transport Protocol (SRTP)
- Encrypted RTP
- Uses DTLS for key exchange
- RTCP
Network Address Translation (NAT)
- NAT Devices
- Home Routers
- Corporate Firewalls
- Vanilla NAT
- 1:1 mapping between private IPs and public IPs (e.g. 192.168.0.1 (private) : 203.0.113.1 (public))
- Provides control over private IP ranges
- Single source of truth for configuring public/private IP mappings (e.g. ISP changes IP allocations)
- Port Address Translation (PAT) a.k.a NAT Overload
- many:1 mapping of private IPs to a single public IP by using ports as well
- e.g.
- 192.168.0.10:52301 -> 203.0.113.7:40001
- 192.168.0.11:52301 -> 203.0.113.7:40002
- Workaround to IPv4's small address space, not needed in IPv6 where 1:1 mappings are encouraged
Firewall
- Decides which packets are allowed/blocked
- Lives between private network and public internet
- Typically blocks incoming connections, not outgoing
- Corporates typically block UDP entirely because the lack of handshakes makes it hard for firewalls to understand the session state
If asked: “How would you design WhatsApp voice calls?”
- Signaling: WebSockets (or SIP for enterprise VoIP)
- Transport: RTP/SRTP for media
- NAT traversal: STUN + TURN fallback
- Encryption: SRTP end-to-end
- QoS handling: Adaptive bitrate, jitter buffer
If asked: “How does WebRTC work?”
- WebRTC = framework, uses:
- Signaling (custom, often WebSocket)
- RTP/SRTP for audio/video streams
- STUN/TURN for NAT traversal
- DTLS/SRTP for security
- Adaptive bitrate + codec negotiation
If asked: “How does VoLTE differ from WhatsApp?”
- VoLTE -> Managed SIP + RTP inside carrier network, guaranteed QoS, low jitter
- WhatsApp -> WebRTC over the public Internet, no QoS guarantees
1.3.15. Performance Metrics
Metric | Description | Layer | Units | E.g. |
---|---|---|---|---|
Bitrate | Rate at which app encodes and sends data | Application | bits/s | Voice: 10 kbps 2G, 64 kbps 3G, 64 kbps LTE, 12-64 kbps VoLTE, 128 kbps Vo5G; Video: 1 Mbps (360p), 2 Mbps (720p), 5 Mbps (1080p), 15 Mbps (4K) |
Throughput | Rate at which data is sent over the network | Network | bits/s | Zoom bitrate 2 Mbps, network throughput only 1.5 Mbps due to packet loss |
Available Bandwidth | Rate at which a network link can support data transfer | Network | bits/s | Wi-Fi: 5 Mbps |
Latency / Round Trip Time (RTT) | Time taken for packet to go to peer and back | | ms | <150 ms before humans detect delay |
Packet Loss | % of dropped packets between nodes in one direction | | % | <1% before choppy/freezing video/audio |
Jitter | Variability in packet arrival time in one direction | | ms | <30 ms before video stutters / audio crackles |
1.3.16. Adaptive Performance Strategies
Strategy | Description | Layer | Use Cases |
---|---|---|---|
Jitter Buffer | Temporary storage in receiver's app that smooths out variations in packet arrival times before playback | Application | Jitter |
Bitrate | Bitrate Reduction + ... | ||
Bitrate Reduction | Reducing the encoding and sending of data | Application | Packet Loss |
1.3.17. Network Protocols
Application Layer
Signaling Layer
- Voice over Public Switched Telephone Network (PSTN)
- Dedicated E2E path between landlines/mobile phones using circuit switchers
- Transmits uncompressed voice using Pulse Code Modulation (PCM) at 64 kbps per call
- Used in landlines and mobile phones when on connections of < 4G
- >4G and above
- Carrier provides QoS guarantees
- Voice over IP
- Transmits voice using IP
- No QoS guarantees, call quality depends on network connection
- Video over IP
- Transmits video using IP
1.3.18. Wireless Systems
- Application
- Transport / IP
- Radio Resource Control (RRC): Manages radio resources and connection states between base station and user device
- Types of radio resources:
- Time
- Frequency
- Power
- Modulation & Coding
- Bearer
- Control
- Random access
- Beamforming
- Types of connection states:
- RRC_IDLE
- RRC_INACTIVE (5G)
- RRC_CONNECTED
- PDCP
- RLC
- Medium Access Control (MAC) Layer: Decides who gets to transmit, when, and how much bandwidth
- Physical (PHY) Layer: Deals with actual signal transmission over radio waves (modulation, power levels etc.)
1.3.19. Telco 101:
- Cell Tower
- Software Components i.e. Base Station Software Stack
- Radio Access Network (RAN) Software
- Handles communication between mobile devices and cell tower, e.g.
- Handover Control: Deciding when phone switches from one tower to another
- Radio Resource Control (RRC): managing spectrum and assigning frequencies to devices
- MAC & PHY Scheduling: Deciding which user gets how much bandwidth every millisecond
- Security & Authentication: Encrypting radio traffic before it hits the core
- Quality of Service: Prioritising latency-sensitive traffic like voice and video
- Cell Tower OS
- Manages hardware scheduling, memory and task prioritisation
- Management Software
- For engineers to monitor and configure the cell tower
- Hardware Components
- Antennas: Send/receive radio signals
- Remote Radio Unit (RRU): Converts radio waves to/from digital data
- Baseband Unit (BBU): Runs the base station software stack
- In 5G, BBUs
- are centralised in regional data centers
- serve dozens of towers
- do not exist on the cell tower
- Backhaul: Connection to core network via
- Fiber (Most common)
- Microwave (rural areas)
- Satellite (remote locations)
1.3.20. Scheduler
Scheduler
- Does
- Assign task to node
- Does not
- Start or manage the workload
Orchestrator
- Does
- Scheduler
- Provisioning and starting workloads on nodes
- Scaling workloads up/down based on demand
- Health monitoring and self-healing
- Rolling updates and rollback management
- Managing networking, storage and service discovery
1.3.21. Firewalls
Type | E.g. | Layer | Found In | Checks | Use Case |
---|---|---|---|---|---|
Web-Application (WAF) | AWS WAF | Application | CDNs, gateways, load balancer | Examines HTTP payload for attack detection | Web app / API protection against SQLi, XSS, bots, malicious patterns |
Proxy | Nginx reverse proxy | Application | Proxy servers, gateways | Examines payload for access control and anonymisation | |
Packet Filtering | iptables (basic rules) | Network & Transport | Routers | Examines packets based on source/destination IP, port, protocol | Simple allow/deny rules, port blocking |
Host-Based | Windows Firewall, iptables | Network & Transport | Individual servers / VMs | Examines traffic per host | Protects single servers, last line of defense |
1.3.22. Optimising for reads/writes
Read Optimisation Strategy |
---|
CDN caching |
The disadvantages in general are:
- Higher storage
- Stale data
- Additional complexity with invalidation strategy
Write Optimisation Strategy |
---|
The disadvantages in general are:
- More complex read paths
- Additional complexity with background preprocessors
Balanced Approach |
---|
CQRS + messaging |
per-endpoint SLAs with targeted caching |
tiered storage (hot cache -> primary DB -> datalake) |
2. Development
Software development revolves around turning ideas into working software. Developers are the chefs who prepare the dishes and organise the kitchen, with the goal of getting the best-tasting food to customers while wasting as little time and as few ingredients as possible.
2.1. Delivery
Delivery is about delivering value to users with minimal waste using business processes.
2.1.1. Tickets
Tickets are the backbone of software delivery. They help track work, manage priorities, and ensure that the team is aligned on what needs to be done. This is similar to how chefs use order tickets in a restaurant to manage customer orders.
2.1.2. Typical stages a ticket flows through
Tickets go through multiple stages, just like an order ticket in a restaurant: the waiter takes the order from the customer and sends it to the kitchen, the chefs prepare the different components of the dish, the head chef does the final check, and the waiter brings the dish to the customer.
- Epic Refinement
- Functional Design
- BPMN Diagrams
- High Level User Stories
- e.g. "I want spagbol"
- Technical Design
- High-level answer to the question "What do we need to do?"
- e.g. "We need to buy tomatoes, mince, etc."
- Avoid diving too deep into "How do we need to do it?"
- e.g. "We need to cook the tomatoes for x mins"
- Ticket Refinement
- Business Refinement
- Validation Steps
- e.g. "There should be mirepoix, arrabiata, browned mince, cooked spaghetti..."
- Technical Refinement
- Tech Steps
- e.g. "We need to chop the tomatoes, celery, onions into squares, cook them for x mins"
- Delivery
- Development
- Do the tech steps
- e.g. Chefs carrying out the recipe steps
- Code Review
- Functional Review
- e.g. Head chef checking the food
- Validation
- e.g. Waiter asks the user "How's the food?"
2.1.3. Poka-Yoke
- What was the root cause of the issue?
- How could we have detected this issue earlier?
- How can we prevent this issue from happening again?
2.2. Maintainability
How to deliver value to users with minimal waste using code.
- Single Layer of Abstraction Principle (SLAP)
- Dependency Injection
- Clean Conditionals
- Conventional Commits
- Early Returns / Continues
- Prefer for loops over while
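Several of these principles can be seen together in one small function. The sketch below is illustrative only: `total_paid_orders`, `fake_converter`, the order fields and the exchange rates are all hypothetical. It shows dependency injection (the currency converter is passed in), early continues, and a bounded for loop.

```python
from typing import Callable, Iterable


def total_paid_orders(
    orders: Iterable[dict],
    convert_to_gbp: Callable[[float, str], float],  # injected dependency
) -> float:
    """Sum paid orders in GBP, skipping anything that should not count."""
    total = 0.0
    for order in orders:  # bounded for loop rather than a manual while
        if order.get("status") != "paid":
            continue  # early continue keeps the happy path unindented
        if order["amount"] <= 0:
            continue
        total += convert_to_gbp(order["amount"], order["currency"])
    return total


def fake_converter(amount: float, currency: str) -> float:
    """Test double with fixed, made-up rates so no network call is needed."""
    rates = {"GBP": 1.0, "EUR": 0.5}
    return amount * rates[currency]
```

Because the converter is injected, the same function can run against a real exchange-rate client in production and against `fake_converter` in unit tests.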
2.3. Testing
- E2E
- Main user stories, happy paths
- Integration
- Edge cases not caught by E2E
- Unit
- Small functions
2.4. Concrete Knowledge
2.4.1. Types of Development
- Web
- Frontend
- Backend
- Mobile
- Game
- Desktop
- Embedded
- DevOps
- Data
- ML / AI
- Security
2.4.2. Choosing a language for mobile app development
2.4.3. Choosing a language for frontend web development
Language | Use Case | Adv. | Disadv. |
---|---|---|---|
JS | Default | Natively supported - browsers come with JS engine | Single-threaded by default |
Dart (compiled to JS) | Cross-platform | No UI interactivity | |
C/C++/Rust (through WASM) | 3D graphics, gaming, video editing (e.g. Figma, Canva, AutoCAD Web) | High performance | No UI interactivity |
Python (through WASM) | AI/ML in the browser | High performance, mature AI/ML library ecosystem | No UI interactivity |
C# (through Blazor WASM) | Existing .NET implementation | UI interactivity | Young ecosystem, large initial payload (downloads 6MB .NET runtime) |
JS is the default choice as it is the only language that has direct access to the DOM to render UI.
2.4.4. Choosing a language / framework for backend web development
The choice of language for backend web development is tightly coupled to the language's runtime, libraries and frameworks as they provide key tradeoffs.
Language | Use Case | Adv. | Disadv. |
---|---|---|---|
JavaScript | Real-time apps, typically preferred over PHP these days | Mature ecosystem, same language for FE and BE, great for concurrency (<10k users) | Not statically typed |
PHP | WordPress, CMS, e-commerce | Huge CMS ecosystem, powers WordPress | Process-per-request model limits real-time apps without extra tooling, JS is typically preferred |
Python | ML / AI | Huge AI/ML ecosystem | |
Java | Enterprise, finance | Strict typing, battle tested | Heavier setup |
C# | Enterprise with Microsoft ecosystem | Great integrations with Microsoft / Azure | Tied to Microsoft ecosystem |
Go | Microservices, cloud-native, high-concurrency APIs | Extremely fast, great concurrency with goroutines | Less suited for CMS, e-commerce |
Rust | High-performance APIs | ||
Ruby | Replaced by JS | - | Declining in popularity due to memory usage, scaling, and struggling with concurrency |
2.4.5. Choosing an Infrastructure as Code (IaC) framework for cloud
Framework | Description | Use Case | Adv. | Disadv. |
---|---|---|---|---|
AWS | ||||
SST (Serverless Stack) | Third party abstraction on top of CDK | Small projects | Ultra-fast local lambdas with hot reload, DevX | Less flexible than CDK, third party solution, risky with breaking changes |
CDK | AWS high-level code-first framework built on CloudFormation | Best all-round choice for AWS | Common programming languages supported | Steep learning curve, no local emulators for lambdas and API gateways |
SAM (Serverless Application Model) | AWS high-level serverless-first legacy framework built on CloudFormation | Prefer CDK | DevX with emulators for local lambdas/API gateways | YAML config, serverless projects only |
CloudFormation | AWS low-level framework | Low-level control | Access to L1 constructs for high customisability | JSON/YAML config, verbose |
Azure | ||||
Bicep | ||||
ARM Templates | JSON Config | |||
GCP | ||||
Deployment Manager | YAML | |||
Multi-vendor | ||||
Terraform | ||||
Pulumi | ||||
Serverless Framework | Legacy vendor agnostic framework | Do not use, it is dead | Supports AWS, Azure, GCP | YAML config, mocking AWS locally required |
2.4.6. Choosing a library for local dev of cloud resources
AWS
Library / Tool | Description | Use Case | Adv. | Disadv. |
---|---|---|---|---|
LocalStack | Full AWS service emulator in Docker | Best library to start with before using other libraries for specific functionality | Broad AWS coverage, runs in one container | Slower than service-specific emulators, partial coverage of some services |
MinIO | S3 compatible object store | Local S3 | Fast | S3 only, some S3 features differ |
ElasticMQ | SQS emulator | Local SQS | Fast | SQS only |
DynamoDB Local | DynamoDB emulator | Local KV | Fast | DynamoDB only |
SAM CLI | Lambdas / API Gateway emulator | Local lambdas / API Gateway | Fast | Serverless services only |
SST | Lambda emulator with hot reload | Extremely fast local lambda dev | Extremely fast | Need to use SST |
2.4.7. Browser Storage
Storage Type | Description | Set by | Access via | Lifetime | Access scope | Capacity | Use Cases | Security Notes |
---|---|---|---|---|---|---|---|---|
Cookies | KV pairs | Responses (Set-Cookie) + JS (document.cookie) | Requests (auto-sent) + JS | Configurable to clear after session / expiry datetime | Browser + domain | 4KB each, 50 per domain | Auth, prefs | Use HttpOnly, Secure, SameSite flags |
Session Storage | KV pairs | JS | JS | Cleared on tab close | Tab / Session | 5MB | Temporary UI state, multi-tab separation | Accessible to JS -> XSS risk |
Local Storage | KV pairs | JS | JS | Persistent until cleared | Browser + Origin | 10MB | App state, non-sensitive prefs | Accessible to JS -> XSS risk |
Extension Storage | ??? | JS (Extensions only) | JS (Extensions only) | Persistent until cleared | Extension | 5MB (sync), 10MB (local) | Extension settings, sync across devices | |
IndexedDB | NoSQL DB | JS | JS | Persistent until cleared | Browser + Origin | xGB, depending on disk space | PWAs, offline apps, large structured data | Origin-scoped, but XSS risk |
2.4.8. Request/Response Flags
Flag | Purpose | Use Case |
---|---|---|
HttpOnly | Prevents JS from reading cookies | Protect tokens from XSS |
Secure | Cookie only sent over HTTPS | Protect plaintext cookies from being leaked |
SameSite | Controls if cookies are sent on cross-site requests (Strict/Lax/None) | CSRF protection / cross-site marketing |
Cache-Control | Controls caching of response data (no-store, max-age etc.) | Ensure sensitive data isn't cached |
CORS headers | Control which domains can make cross-origin requests | APIs that need controlled access |
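As a rough illustration of the cookie flags in the table, the sketch below builds the value of a `Set-Cookie` header with Python's standard `http.cookies` module. The cookie name `session`, the helper `build_session_cookie` and the one-hour lifetime are assumptions made for this example.

```python
from http.cookies import SimpleCookie


def build_session_cookie(token: str) -> str:
    """Return a Set-Cookie header value with the protective flags set."""
    cookie = SimpleCookie()
    cookie["session"] = token
    morsel = cookie["session"]
    morsel["httponly"] = True    # not readable from document.cookie
    morsel["secure"] = True      # only sent over HTTPS
    morsel["samesite"] = "Lax"   # withheld on most cross-site requests
    morsel["path"] = "/"
    morsel["max-age"] = 3600     # assumption: expire after one hour
    return morsel.OutputString()


header_value = build_session_cookie("abc123")
```

A server would send this as `Set-Cookie: <header_value>`, and sensitive responses would usually also carry `Cache-Control: no-store`.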
2.4.9. Response Codes
Code | Meaning | When to use | Benefit of using |
---|---|---|---|
Informational | Request received, continuing process | Rare in practice, mostly for protocol-level interactions | |
100 | Continue | Client should continue sending request body (after headers OK) | Saves bandwidth if request is rejected early |
101 | Switching Protocols | Used for HTTP to WebSocket upgrade or HTTP/1 to HTTP/2 switch | Necessary to start persistent connections |
Success | Request succeeded | ||
200 | OK | Standard response for successful request (e.g. GET, POST when no resource creation) | |
201 | Created | New resource created successfully (e.g. POST /users) | |
202 | Accepted | Request accepted for async processing but is not done yet | |
204 | No Content | Success, but no response body (e.g. DELETE) | |
Redirection | Further action needed | ||
301 | Moved Permanently | Resource permanently moved | Tells crawlers to update their search index, better SEO |
302 | Found (Moved Temporarily) | Temporary redirect (historically used like 303) | |
303 | See Other | Redirect after POST -> GET (common for web forms), e.g. sending the browser to a confirmation page after a form submission | |
304 | Not Modified | Used with caching | Client can reuse its cached response; lowers latency and bandwidth because no response body is sent |
Client Error | Problem with request | ||
400 | Bad Request | Malformed syntax, invalid patterns | |
401 | Unauthorized | Missing/invalid authentication | |
403 | Forbidden | Authenticated but not authorised | |
404 | Not Found | Resource doesn't exist; can also be returned instead of 401/403 to avoid revealing which endpoints exist to unauthenticated or unauthorised callers | Security through obscurity + clear feedback |
409 | Conflict | Resource conflict (e.g. duplicate unique field) | |
429 | Too Many Requests | Rate limiting / throttling | |
Server Error | Problem on server side | ||
500 | Internal Server Error | Generic server crash/error | |
502 | Bad Gateway | Upstream server error (e.g. reverse proxy can't reach backend) | |
503 | Service Unavailable | Server overloaded, down for maintenance | |
504 | Gateway Timeout | Upstream service didn't respond in time |
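The class of a status code is its first digit, which floor division by 100 extracts directly. The sketch below uses Python's standard `http.HTTPStatus` enum. `STATUS_CLASSES`, `describe` and `access_status` are hypothetical helpers, and the check order in `access_status` is one reasonable convention rather than a rule.

```python
from http import HTTPStatus

STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return e.g. '404 Not Found (Client Error)' for a known status code."""
    status = HTTPStatus(code)  # raises ValueError for unknown codes
    return f"{status.value} {status.phrase} ({STATUS_CLASSES[code // 100]})"


def access_status(authenticated: bool, authorised: bool, exists: bool) -> HTTPStatus:
    """Pick between 401, 403, 404 and 200 for a protected resource."""
    if not authenticated:
        return HTTPStatus.UNAUTHORIZED  # 401: missing/invalid credentials
    if not authorised:
        return HTTPStatus.FORBIDDEN     # 403: known caller, not allowed
    if not exists:
        return HTTPStatus.NOT_FOUND     # 404: nothing at this path
    return HTTPStatus.OK
```

For example, `describe(429)` returns `"429 Too Many Requests (Client Error)"`.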
2.4.10. Web Identifiers
Term | Definition | E.g. |
---|---|---|
Domain | Registrable name of a website / portion of host | example.com |
Host | Network address (domain name / IP) in a request | example.com , shop.example.com |
Scheme | Protocol used to access the resource | http:// , ws:// |
Port | Network port on the host that the connection is made to | 443 |
Origin | Scheme + Host + Port | https://example.com:443 |
Fragment | Client-side reference to a section within the resource, not sent to the server | #reviews |
Uniform Resource Name (URN) | Name of a resource, not how to locate it | urn:isbn:0451450523 (book ISBN), urn:uuid:6fa459ea-ee8a-3ca4-894e-db77e160355e (UUID) |
Uniform Resource Locator (URL) | How to locate a resource | https://shop.example.com:443/products?id=10#reviews |
Uniform Resource Identifier (URI) | URL / URN | - |
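The example URL from the table can be split into these parts with Python's standard `urllib.parse`. This is a minimal sketch: `origin_of` and `DEFAULT_PORTS` are hypothetical helpers, and the origin is always written with an explicit port to mirror the table (browsers usually omit default ports when serialising an origin).

```python
from urllib.parse import urlsplit

# Assumption: only these schemes are needed for the example.
DEFAULT_PORTS = {"http": 80, "https": 443, "ws": 80, "wss": 443}


def origin_of(url: str) -> str:
    """Return scheme + host + port, filling in the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return f"{parts.scheme}://{parts.hostname}:{port}"


parts = urlsplit("https://shop.example.com:443/products?id=10#reviews")
# parts.scheme   -> "https"
# parts.hostname -> "shop.example.com"
# parts.port     -> 443
# parts.path     -> "/products"
# parts.query    -> "id=10"
# parts.fragment -> "reviews"
```

Two URLs are same-origin when `origin_of` returns the same string for both, so `https://shop.example.com/cart` matches the URL above while `http://shop.example.com/cart` does not.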
2.4.11. React
- Avoid useEffect if there are no external deps (source)
2.4.12. Database Terminology
- Clause
- A single keyword-led part of a statement
- e.g. SELECT, FROM, WHERE, SET
- Statement
- A complete command made up of one or more clauses
- Ends with a semicolon
- Read / Query / Data Query Language (DQL)
- A statement that reads data
- e.g. SELECT * FROM fooTable;
- Write / Update / Data Manipulation Language (DML)
- A statement that modifies data
- e.g. UPDATE fooTable SET colName = x;
- Read Result Set
- Data returned from a query
- Update Acknowledgement
- Confirmation returned from a write
- e.g. x rows inserted
- Transaction
- A group of queries executed as a single unit
- e.g. BEGIN / START TRANSACTION -> COMMIT / ROLLBACK
- Session
- A client's connection to the DB
- Database Object
- Anything defined in a DB
- e.g. Tables, Views, Indices, Stored Procedures, Triggers, Functions
- Schema
- Logical grouping of DB objects
- Execution Plan
- The strategy the DB optimiser chooses to run your query
- e.g. index scan vs full scan, hash join
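Most of these terms can be observed with Python's built-in `sqlite3` module. The sketch below is SQLite-specific and uses an in-memory database purely so it is self-contained. The table name `fooTable` follows the examples above, and other databases expose execution plans through different commands.

```python
import sqlite3

# Session: one client's connection to the DB.
conn = sqlite3.connect(":memory:")

# Database object: a table defined in the DB.
conn.execute("CREATE TABLE fooTable (id INTEGER PRIMARY KEY, colName TEXT)")

# Transaction: `with conn` commits on success and rolls back on an exception.
with conn:
    cursor = conn.execute("INSERT INTO fooTable (colName) VALUES ('a'), ('b')")
    inserted = cursor.rowcount  # update acknowledgement: rows affected

# Read result set: the rows returned by a query.
rows = conn.execute("SELECT id, colName FROM fooTable ORDER BY id").fetchall()

# A failing transaction leaves the data untouched.
try:
    with conn:
        conn.execute("UPDATE fooTable SET colName = 'x'")
        raise RuntimeError("simulated failure before commit")
except RuntimeError:
    pass
after_rollback = conn.execute("SELECT colName FROM fooTable ORDER BY id").fetchall()

# Execution plan: how the optimiser intends to run the query.
plan = conn.execute("EXPLAIN QUERY PLAN SELECT * FROM fooTable WHERE id = 1").fetchall()
```

The last column of each `plan` row is a human-readable description of the chosen strategy; its exact wording depends on the SQLite version.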
2.4.13. Database Data Persistence
Data in Tables (Persistent)
- Base/Regular Table
- Data stored in disk
- Data is persistent across sessions
- Temporary Table
- Data stored in disk
- Data exists only in session
- Data can exist across sessions if cached
Data in Queries (In Memory)
- Result Set
- Data stored in memory
- Data exists onl
- Derived / Subquery e.g. FROM
- Data stored in memory
- Data exists only in query
- Common Table Expression (CTE) e.g. WITH
- Same as a subquery, but provides a named alias so the subquery can be reused
Named Queries
- View/Virtual
- Query definition stored in disk
- Data only stored
- Materialised View
- Data stored in disk
- Manual/scheduled refresh
- Stored Procedure
- Data stored in disk
- ???
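The difference between a stored snapshot and a stored query definition can be seen with `sqlite3`. This is a sketch under stated assumptions: the `orders` table and its values are made up, and an in-memory database is used for self-containment, so it shows scope and lifetime rather than on-disk storage. SQLite has neither materialised views nor stored procedures, so those are not covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO orders (amount) VALUES (10), (25), (40);

    -- Temporary table: a copy of the rows, visible only to this connection.
    CREATE TEMP TABLE big_orders AS SELECT * FROM orders WHERE amount > 20;

    -- View: only the query definition is stored; rows are computed on read.
    CREATE VIEW order_totals AS SELECT SUM(amount) AS total FROM orders;
    """
)

total_before = conn.execute("SELECT total FROM order_totals").fetchone()[0]
conn.execute("INSERT INTO orders (amount) VALUES (100)")
total_after = conn.execute("SELECT total FROM order_totals").fetchone()[0]

# The temp table is a snapshot, so it does not see the new row.
temp_count = conn.execute("SELECT COUNT(*) FROM big_orders").fetchone()[0]

# Derived table (subquery) and CTE: exist only while the query runs.
subquery_count = conn.execute(
    "SELECT COUNT(*) FROM (SELECT * FROM orders WHERE amount > 20)"
).fetchone()[0]
cte_count = conn.execute(
    "WITH big AS (SELECT * FROM orders WHERE amount > 20) SELECT COUNT(*) FROM big"
).fetchone()[0]
```

The view reflects the new row immediately (75 becomes 175) while the temporary table still holds 2 rows, whereas the subquery and the CTE both count 3.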
2.4.14. Database Isolation Levels
| Isolation Level | Dirty Reads |
2.4.15. Testing Frameworks
Frontend
- Web
- Playwright (purpose built from the ground up)
- Cypress (multiple packages patched together)
- Cross Platform
- integration_test (flutter)
- Mobile
- Maestro (js)
- Supports OS level interaction, e.g. going to system settings
2.4.16. Cold/Warm/Hot Starts on Mobile
- Cold Start
- binary not in memory
- e.g. launching app after killing it
- Warm Start
- binary in memory, app process in background
- e.g. when switching between apps
- Hot Start
- binary in memory, app process in foreground
- e.g. when locking and unlocking the screen momentarily, or switching between apps briefly
- This occurs because Android and iOS give apps a grace period (~2s) before backgrounding
- App still has GPU and CPU priority
2.4.17. Splash Screen
Splash screens are only shown on cold starts.
Phase | Native iOS | Native Android | React Native | Flutter |
---|---|---|---|---|
Process Startup | OS launches app process | Same | Same | Same |
Show OS-level Splash | Launch splash | Same | Same | Same |
Runtime Init + Framework Bootstrap | Initializes iOS runtime + UIKit, sets up main run loop, prepares initial UIViewController | Init Android Runtime + base Activity, inflates first layout | Native layer starts JS engine, loads JS bundle, sets up React tree & JS x native bridge | Native layer starts Flutter engine, loads Dart VM, initializes widget tree & Skia renderer |
App Init | Set up SDKs, DB, config etc. | Same | Same | Same |
Remove Splash | OS removes splash once first UIViewController is ready | OS removes splash once Activity content is ready | Native splash removed after JS bundle + RN root view are mounted | Native splash removed after Flutter engine renders first frame |
First Frame Rendered | First frame is rendered | Same | Same | Same |
3. Resources
- DS&A
- LeetCode
- System Design
4. TODO
- Designing Data-Intensive Applications: Big Ideas Behind Reliable, Scalable, and Maintainable Systems
- Cheatsheet
- 2's complement
- Hypergeometric/Binomial: Given 100 faulty in sample size of 1000, what is the probability of getting
- Prefer Playwright (purpose built from the ground up) over Cypress (multiple packages patched together)
- Cross Platform
- Flutter
- integration_test