Five Books — Software Architecture

Nilendu Misra
10 min read · Oct 23, 2023

Architecture is essentially what Jeff Bezos calls a “Type 1 decision”. These are typically day-zero choices that, once made, are extremely difficult if not impossible to change. A second way to look at architecture is as the highest-level structure. Once you move into a Craftsman house, you cannot easily change it to a Ranch-style one. The final way to look at it is as shared vocabulary and meaningful information packing in a collaborative field. “Chimney” has fewer bytes and far easier recall than “that thing that sticks out of the roof”.

That is, Primacy (to decide), Structure (to lay out), and Abstraction (to communicate) are all there is to Software Architecture.

The challenge arises from abstraction. Often, mangled metaphors, misunderstanding or FOMO cloud the concrete understanding of a concept. Microservices, a term that became very popular in the past decade, is essentially a distributed system that aims to decouple components of a codebase to facilitate rapid delivery. To intuit microservices, therefore, it pays to learn all the complexities, nuances, and trade-offs of a distributed system. Unlike a house, which somehow needs to conform to the usual style of its neighborhood, a technical system is only answerable to the business it serves.

The following five books should help in five areas:

  1. What exactly is architecture, and how it changes with technology capability and time.
  2. Programming is “Hello World” running on your computer plus-plus. Architecture is when code runs on someone else’s computer plus-plus.
  3. It is easy to get lost with so many terms these days. Learn in a conceptual broad brush what they mean and, most importantly, their trade-offs.
  4. ~20 things we learned from the successes and failures of the last 96 years.
  5. You do not even need to decide if you have ample time in hand. A few billion years and random walks can design the most elegant (and very vulnerable) systems as well.

1. ABC of Architecture

Why — What exactly is architecture

Software architecture is what Amazon calls a “Type 1 decision”. These are one-way doors that are, for all practical purposes, irreversible. Once you build an “event-driven system” based on lambda/serverless technology with a persistent message queue (say, Kafka), you will have to live there. This short book can help seed the metaphor and find concrete parallels. It stays true to the principle that “incompetence will show in the use of too many words”.

Code embodies the system, design represents our good faith “digitization” of the system, and architecture is the structural metaphor of the system itself. After you see a house, the very first thing that comes to your mind — “craftsman”, “Bronx Apartments” — is the architecture. Architecture, therefore, offers a shared and common vocabulary that compresses the essence of the system. “That thing sticking out of the roof” is a “chimney”.

ABC… shows how traditional architecture was shaped by advances in technology. The Egyptians built giant monoliths: robust but not resilient. The Greeks built post and beam: large rectangular spaces enclosed by relatively larger mass. The Romans added arches and vaults, which gave it all “angles”. The Industrial Revolution introduced steel and reinforced concrete to achieve a much larger scale and utilization with huge savings in materials. Extend that metaphor, and you can write in one line of Go what would take you hundreds of lines of assembly language!

Vitruvius framed architecture on a trifecta of function, structure and beauty (Utilitas, Firmitas and Venustas). This Vitruvian triangle, moving through time, creates architectural imprints. The difference between Stonehenge and the Empire State Building is, therefore, basically historical rather than statical. The posts of the latter are made of steel, and not of stones.

One thing from the book

“The arch shows technological advance over post and lintel because it avoids the limitations of the one available stone horizontal member by using relatively small stones in combination to achieve relatively large spans….and it is easier to man-handle than a large lintel”.

Could very well be how larger monolithic codebases were broken into more manageable “microservices” each of which was far easier to handle!

Similar
“101 Things I Learned in Architecture School” is another delightful little book with aphorisms like “An architect knows something about everything. An engineer knows everything about one thing.”

2. Designing Data-Intensive Applications

Why — code written on your computer eventually runs on many other computers you had no clue about when you wrote it.

Modern software design is essentially building distributed systems with libraries and tools, hoping the latter will do the “heavy lifting”. From microservices (or micro-frontends) to multi-region active-active cloud topologies, the goal is to focus on developing our own business logic and use just enough practices to avoid the “8 fallacies of distributed computing”. It works till it does not.

This book is a relatively accessible and interesting introduction to distributed computing. The author also does a great job of articulating the “systems” aspects of data engineering. He goes from a functional four-line piece of code that builds a database to the ways one can interpret and implement concurrency, serializability, isolation and linearizability (the last for distributed systems). His book also has over 800 pointers to state-of-the-art research as well as some of computer science’s classic papers.
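To make the “database in a few lines” idea concrete, here is a minimal Python sketch of an append-only, log-structured key-value store. It is an analog written for this post, not the book's own code; the function names `db_set` and `db_get` and the file path are illustrative assumptions, and it ignores everything a real database cares about (concurrency, crashes, compaction, indexes).

```python
import os
import tempfile
from typing import Optional


def db_set(path: str, key: str, value: str) -> None:
    """Append a 'key,value' record to the end of the log file."""
    # Assumption: keys contain no commas and values contain no newlines.
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"{key},{value}\n")


def db_get(path: str, key: str) -> Optional[str]:
    """Scan the whole log and return the most recent value for key."""
    if not os.path.exists(path):
        return None
    latest = None
    with open(path, encoding="utf-8") as f:
        for line in f:
            k, _, v = line.rstrip("\n").partition(",")
            if k == key:
                latest = v  # later records override earlier ones
    return latest


if __name__ == "__main__":
    log_path = os.path.join(tempfile.mkdtemp(), "database")
    db_set(log_path, "city", "Seattle")
    db_set(log_path, "city", "Austin")  # an update is just another append
    print(db_get(log_path, "city"))
```

Writes are cheap because they only append, while every read scans the entire file. That asymmetry is the kind of trade-off that motivates indexes in the first place.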

One thing from the book
Safety means that nothing bad happens (for example, wrong data is not written to the database), and liveness means that something good eventually happens (for example, after a leader node fails, eventually a new leader is elected).

Similar
“Guide to Reliable Distributed Systems” is a more in-depth study of the same.

3. Microservices Patterns

Why — most modern terms, what they mean and their trade-offs

Wittgenstein famously said, “limits of my language are the limits of my world”. For the majority of technology leaders, sadly, “limits of their silence are the limits of their understanding”. The moment a piece of jargon is casually verbalized, it stops being understood well. Just ask someone how to address transactional complexity in so-called microservices land, where domain entities are scattered across various services. Anyone who has just read a blog post would likely say “Saga”. Anyone who has built it would probably say “We used a messaging tool to transfer data”. Anyone who has built it and scaled it would pause and explain how this is essentially a two-phase commit problem, except that it has to be solved in code rather than in the database. They would add that, after two years of struggling with “replication lag” at scale, they just refactored some objects and accepted about 0.1% failures, addressed out-of-band with compensating transactions. Hire the third person!
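For readers who have only met “Saga” as a word, here is a minimal in-process Python sketch of the idea behind compensating transactions: run the steps in order and, if one fails, undo the completed ones in reverse. Everything here (`run_saga`, the step names, the log format) is a hypothetical illustration, not the book's code. A real saga spans services, is usually asynchronous, and must cope with compensations that themselves fail.

```python
from typing import Callable, List, Tuple

# A step is (name, action, compensation). All names are illustrative.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_saga(steps: List[Step]) -> Tuple[bool, List[str]]:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    log: List[str] = []
    completed: List[Step] = []
    for step in steps:
        name, action, _ = step
        try:
            action()
        except Exception:
            log.append(f"failed:{name}")
            # Undo what already happened, most recent first.
            for done_name, _, compensate in reversed(completed):
                compensate()
                log.append(f"compensated:{done_name}")
            return False, log
        completed.append(step)
        log.append(f"done:{name}")
    return True, log


if __name__ == "__main__":
    inventory = {"reserved": False}

    def reserve() -> None:
        inventory["reserved"] = True

    def release() -> None:
        inventory["reserved"] = False

    def charge() -> None:
        raise RuntimeError("card declined")

    def refund() -> None:
        pass  # nothing to refund in this toy example

    ok, log = run_saga([("reserve", reserve, release), ("charge", charge, refund)])
    print(ok, log, inventory)
```

There is no global rollback here. Each service exposes an explicit “undo”, and between the failure and the compensation the system is visibly inconsistent, which is the catch the third person in the story would bring up.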

Best modern software architecture book out there. Competence is reflected in rational negation: buzzwords are cool, but what is the catch? If someone just excitedly lays out the positives of a certain technology principle (say, microservices, or React), s/he could be a salesperson or naive or both! Neither would survive the long-term impact of a critical decision taken through, basically, “mimetic copying”, just because others were doing it. You would trust your leader more when she can contextualize the real-life challenges of a proposal. This is even more relevant for any _recent_ trend embraced suddenly.

This book thoughtfully compiles and explains a set of techniques you need to know as a CTO/VPE to build modern apps. Other architecture books are too high level, do not cover the breadth, or preach without practical data. This one has concrete design and code examples.

Most importantly, it shares where something is not a good fit. This is critical because we engineers choose shiny tech, the usage profile changes with time, and the median tenure of engineers is 4 years. The choice (say, backends-for-frontends with 3 different API gateways) becomes someone else’s problem down the line. That person now hires 15 platform engineers to redo it. There goes the entire future cash flow.

My favorite interview question is “What is the drawback of your favorite tech/framework/tool? Share a real-life example of how you conquered it”. Every technology has its ‘annus horribilis’. The best engineers may not know the Lamport clock, but they can explain, empirically and passionately, how they overcame the challenges of their go-to technology in a real-life, complex domain. This book follows that philosophy and teaches how to do that with 20 or more modern technology primitives.

One thing from the book

It shines brightest in three areas. One, microservices as essentially a distributed architecture, with all the associated challenges. Two, how to minimize overhead-per-service, e.g., by first deploying an API gateway and then, if polyglot, using a “service mesh”. Three, how to deploy and manage with modern primitives like deploy-as-container, and why that minimizes bootstrap time. The chapter on observability/monitoring, which looks at it from six different angles (from distributed tracing to aggregated logs and exception reports), is alone worth the price of entry. I have not come across another single book that walks through the ENTIRE lifecycle with honest trade-off analysis and a working domain model/code.

The “how” and “why this is better” of the identity solution (authentication & authorization) for BOTH API clients and login-based apps are done exceptionally well. I have not seen a better top-down explanation of it, starting from canonical JSESSIONIDs/server-stored context and going to OAuth 2.0 with access & refresh tokens. With clear sequence diagrams too.

Simply brilliant.

4. Principles of Computer System Design

Why — timeless wisdom of success and failures from computing universe

Brevity is the output of confidence. Insight is the output of competence. This unique book, while not really small, is full of insights. It offers a panoramic tour of the computing universe (naming, storage, networking, persistence, performance, fault tolerance) and distills the wisdom into a set of about 20 timeless “Design Principles” that hold incredibly well in real life and in large systems.

An odd characteristic of a really good book is that, after finishing it, it feels like such a natural extension of a few insights that you might wonder why you even read the 300 pages! As the saying goes, “Education is what remains after you’ve forgotten what was taught”. For example, “Incommensurate scaling rule: changing a parameter by a factor of ten requires a new design.” is an almost canonical digital rule that holds perhaps as well as Moore’s law. “Software becomes slower faster than hardware becomes faster”.

This excerpt from the security chapter is yet another marvel:

Security has a negative goal. Having a narrow view of security is dangerous because the objective of a secure system is to prevent all unauthorized actions. This requirement is a negative kind of requirement. It is hard to prove that this negative requirement has been achieved, for one must demonstrate that every possible threat has been anticipated. Therefore, a designer must take a broad view of security and consider any method in which the security scheme can be penetrated or circumvented.

An example from the field of biology illustrates nicely the difference between proving a positive and proving a negative. Consider the question “Is a species (for example, the Ivory-Billed Woodpecker) extinct?” It is generally easy to prove that a species exists; just exhibit a live example. But to prove that it is extinct requires exhaustively searching the whole world. Since the latter is usually difficult, the most usual answer to proving a negative is “we aren’t sure”.

One thing from the book

The chapter on “Fault Tolerance” stands out, especially its focus on (a) baking fault tolerance into design (e.g., centralization is anti-robust), (b) considering people within the fault domain of technical systems (complex systems fail with, or because of, people) and (c) distinguishing between “fault” and “failure” and how to decrease the probability of faults resulting in failures. This should be on the yearly reading list of every leader!
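The fault/failure distinction is easier to see in code. The Python sketch below is a hypothetical illustration, not taken from the book; `call_with_redundancy` and `ServiceFailure` are made-up names. It treats an exception inside one replica as a fault and lets it surface as a failure only when no replica can answer.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


class ServiceFailure(Exception):
    """Raised only when every replica faulted, i.e. faults became a failure."""


def call_with_redundancy(replicas: Sequence[Callable[[], T]]) -> Tuple[T, int]:
    """Try replicas in order; return (result, number of faults masked)."""
    faults_masked = 0
    for replica in replicas:
        try:
            return replica(), faults_masked
        except Exception:
            # A fault in one component: contained here, invisible to the caller.
            faults_masked += 1
    raise ServiceFailure(f"all {len(replicas)} replicas faulted")


if __name__ == "__main__":
    def broken_replica() -> int:
        raise TimeoutError("disk timeout")

    def healthy_replica() -> int:
        return 42

    result, masked = call_with_redundancy([broken_replica, healthy_replica])
    print(result, masked)
```

Under the (strong) assumption that replicas fault independently with probability p, all n of them fault together with probability p^n, which is how redundancy lowers the chance that a fault turns into a failure. Correlated faults, including human ones, break that assumption.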

5. Immune

Why — great systems don’t even need architects if they have billions of years to iterate

We learn best from adjacencies. It is surreal to see how much the design of our immune system appears to have been built from the ground up on the “first principles” of system design. Nature has no ultimate goal, just infinite time, somewhat randomized data transfer between consecutive generations, and a selection process where better randomizations have a higher probability of thriving. Give it a billion or so years and everything works!

To analogize with three core architectural principles:

Function — it exists to ‘distinguish the other from self’.

Structure — it is layered (innate vs. adaptive), modular with specialization (skin, macrophages, antibodies, T cells, etc.), self-healing (inflammation, cell suicide, etc.), message-driven (cytokines carry the relevant information), and built for scale (30 proteins, think of them as the analog of ‘functions’ in programming, essentially do it all).

Beauty — just one example: there is a “mini me” natural selection implemented within the immune system. The thymus is the “murder university” that every T cell has to pass. If a T cell does not pass, it gets killed.

This is a vast complex system architecture with emergent behavior as the intended unintentional outcome. Like all systems, its gift is its shadow as we learned during COVID-19.

This book is a fun way to learn system design at scale from a designer that is either absent or whose presence cannot be proved yet.

One thing from the book

Autoimmune diseases are essentially system bugs that arise out of “combinatorial explosion”, a scale problem if you will. Each T cell has a “receptor” for an antigen (i.e., bad stuff that could harm us) and nukes it. With the vast number of T cells, there are inevitably a few that happen to have receptors for proteins from within our own cells. This is against the immune system spec, “distinguish the other from self”, but is way too expensive to eliminate at scale. Almost 1 in 10 people suffer from such diseases.

Similar

This 10-minute video from the author is a masterclass in communicating succinctly and interestingly.


"We must be daring and search after Truth; even if we do not succeed in finding her, we shall at least be closer than we are at the present." - Galen, 200 AD