Mastering Distributed Applications - Laying the Foundations

It's been quite some time since software architectural patterns veered towards distributed models. It's unthinkable that any modern stack does not include at least some MicroServices or FaaS elements... or is it? Engineers have dealt with Distributed Systems for a very long time, but with the growth of Cloud computing services, particularly Serverless services, and the rising popularity of MicroServices patterns, the impact this concept has on our day-to-day work has grown dramatically. Our understanding of what a Distributed Application is has also changed in recent years, distancing itself from its original meaning.

Sadly, not all engineers have adapted quickly or well enough to the changes this paradigm brings, and more often than not these changes create more problems than they solve.

I have been asked several times to cover the subject for our dev teams, so I finally decided to take it on in a series of posts (number of posts still unknown, it's a vast topic...). The first thing I need to settle in this first post is expectations:

What this post series wants to be...

  • a general overview of the core aspects of distributed application design
  • a starting point for your journey into the complex (extremely...) world of Modern Distributed Application Design
  • a gentle but somewhat thorough introduction to the subject
  • a biased series, more aligned with HTTP/web-based distributed applications than a truly generic one

What this post series is not...

  • a training or course on Software Architecture Design
  • a formal guide to Distributed Applications Patterns
  • a set of recipes to succeed at building your first Distributed Application

# What is a Distributed Application?

You can probably find a trillion different answers to this (well, Google hints at about 652M results for "what is a distributed application", so maybe not a trillion...), but the core aspect that unites all those answers is probably the fact that a distributed application (or a distributed system) must be running on multiple computers within a network at the same time. That sounds easy enough, but it is probably quite far from what came to your mind when you first faced the question.

With that concept in mind, what we currently call Monolithic or Non-Distributed applications are in fact Distributed applications if they use a database, which is probably running on a different server, or if they include any 3rd party service such as Google Analytics or any API-driven service... Even without those, any web-based application is by definition a distributed Client/Server application, simply because of the nature of the web itself (the browser client handles rendering and input, the server handles all or part of the logic). But don't worry, we will put most of the focus on what you probably had in mind when you started reading this article, which is likely closer to a Microservices or Service Mesh application architecture, even though that's just one of the available patterns.

Let's take a quick trip back in time and look at how computer systems have evolved, so we can understand why we ended up where we are now:

The first computer systems were bulky, extremely expensive and highly specialized, often designed to handle a single particular problem. That combination, in particular the "expensive" part of it, brought into existence the first Distributed Application Architecture, usually referred to as Mainframe Computers. In this model, some "dumb" (and far cheaper...) terminals connected to a central computer that did all the computing, leaving to the terminals the task of presenting information and capturing user input to send to the central computer.

As computers became cheaper, more compact and more powerful, Mainframes were gradually replaced by a Client/Server Model in which the terminals gave way to full computers with enough capacity to handle part or most of the work, while the servers assumed the role of shared storage and coordinators. With the arrival of public networks (the Internet) this model became the basis of Web Applications. The rising popularity and usage of the Internet, and the creation of data centers with thousands of reasonably priced commodity servers, brought the next step and laid the foundation for today's distributed architectural models. In the early days you had a few machines in a data center (or even just one if you did not have much load) on the same network, and you created what we now call monolithic applications: a back-end server handling requests through a web server, potentially with a sidekick database server, and maybe even multiple instances to balance your workloads or database read replicas to mitigate database locking.

What we have today is an evolution of those data centers to an extreme where servers are not even relevant anymore (Virtualized Servers, Serverless computing...), and our applications have transformed into a Service Oriented Architecture (SOA, MicroServices, Service Mesh, FaaS... pick your flavor) that forces us to consider far more things in our designs than we used to keep in mind.

# Reasons to go Distributed

My first piece of advice would be: don't go distributed if you don't really need to... Distributed Systems are far more complex than Monoliths or single-machine applications, and they require a much more advanced engineering skill set to build and operate. The cost in complexity is quite high, so you really should evaluate your options before going full steam ahead with a distributed design.

There are situations where there is no other feasible option, or where the benefits are so high that you must take this path:

  • Parallelization: When you are running long operations or your application needs to handle a big volume of requests you can solve things by running multiple processes in parallel or even decomposing a task in steps and parallelizing the most critical/c
  • Isolation: Sometimes it is important to isolate parts of your system for reasons such as security or independent maintenance by different teams. It could even be needed because a piece must be physically placed on a particular server while the rest runs on another one, e.g. for critical computations that require a particular server infrastructure and cannot run alongside other software.
  • Externalization: You need to integrate a 3rd party service that lives outside of your controlled infrastructure. In this case you handle only the "Client" side of the problem, so this is a particular kind of isolation scenario.
  • Responsibility Segregation: When your system is complex enough, there's a decent chance that the complexity of going distributed is compensated by the simplification it brings elsewhere.
  • Cost: Horizontal scaling or moving to Cloud Computing can be driven by cost, even though on-premises infrastructure is essentially cheaper in direct cost (but more expensive in dedicated specialist staff).
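
To make the Parallelization reason a bit more concrete, here is a minimal sketch of the idea on a single machine, using Python's standard `concurrent.futures` module. The `handle_request` function and its 0.1 second delay are hypothetical stand-ins for any slow, I/O-bound operation; they are not taken from a real system. A distributed design applies the same idea across machines instead of threads, which is where the extra complexity discussed above comes in.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def handle_request(request_id: int) -> str:
    """Hypothetical slow operation; the sleep simulates waiting on I/O."""
    time.sleep(0.1)
    return f"request-{request_id}-done"


def run_sequential(request_ids: Iterable[int]) -> List[str]:
    """Handle the requests one after another."""
    return [handle_request(request_id) for request_id in request_ids]


def run_parallel(request_ids: Iterable[int], workers: int = 4) -> List[str]:
    """Handle the requests on a pool of worker threads.

    executor.map returns results in the same order as the inputs.
    """
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(handle_request, request_ids))


if __name__ == "__main__":
    ids = list(range(8))

    start = time.perf_counter()
    sequential = run_sequential(ids)
    sequential_seconds = time.perf_counter() - start

    start = time.perf_counter()
    parallel = run_parallel(ids, workers=4)
    parallel_seconds = time.perf_counter() - start

    print(f"sequential: {sequential_seconds:.2f}s, parallel: {parallel_seconds:.2f}s")
    print("same results:", sequential == parallel)
```

Both functions return the same results; only the elapsed time should differ, since the parallel version overlaps the waiting time of several requests.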

# Core aspects to consider when designing Distributed Systems

A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.

– Leslie Lamport

There is a group of aspects you need to consider thoroughly when designing distributed systems, no matter the approach, infrastructure, language, framework or tooling you are using. Any of these aspects can derail your project in no time if not handled correctly:

  • Communication: Distribution implies separation, so servers or nodes need to communicate with each other through one or even many networks. This has implications in terms of, for example, latency (travel time is never 0, even if it is very small), availability (the network is not always available) and consistency (your message may get scrambled going through a faulty intermediary, losing or corrupting data).

  • Coordination: Sometimes it matters that certain actions take place only once, in a given order, at a given time or on a particular system. ACID operations involving multiple systems are also far more difficult to implement.

  • Scalability: Systems have a limited capacity. Distributed computing is one way to increase capacity cost-effectively, but it comes at a different cost: complexity. Also, when running through a complex service mesh, each service has its own capacity and response to load, which you will need to fine-tune and balance to prevent one service from bringing down your whole system.

  • Resiliency: Your system is as resilient as its capacity to keep operating when failures occur. What happens if a node crashes, or a particular service takes longer to answer (or does not answer at all...)? Also consider that failure points increase dramatically as we add more components to a distributed system, and that each failing service raises the probability of other services failing in cascade.

  • Operations: All the elements in your Distributed System need to be deployed, tested and maintained in an organized and consistent way. Deploys and rollbacks need to be safe and keep the full system stable to prevent downtime. Every piece that makes up your Distributed Application also needs to be observable and thoroughly monitored, firing alerts so that any issue can be addressed before its impact brings everything down.
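
The Resiliency point about failure points multiplying can be illustrated with some simple arithmetic. The sketch below assumes that services fail independently of each other, which real systems often violate (cascading failures make things worse), and the 99% availability figure is just an illustrative number, not a measurement. The function names are made up for this example.

```python
import math
from typing import Sequence


def serial_availability(availabilities: Sequence[float]) -> float:
    """Probability that every service in a call chain is up at once.

    Assumes failures are independent, so the probabilities multiply.
    """
    return math.prod(availabilities)


def redundant_availability(availability: float, replicas: int) -> float:
    """Probability that at least one of several identical replicas is up.

    Assumes replicas fail independently of each other.
    """
    return 1.0 - (1.0 - availability) ** replicas


if __name__ == "__main__":
    # A request that must cross 10 services, each available 99% of the time.
    chain = serial_availability([0.99] * 10)
    print(f"chain of 10 services: {chain:.4f}")

    # The same 99% service deployed as 2 independent replicas.
    pair = redundant_availability(0.99, replicas=2)
    print(f"2 redundant replicas: {pair:.4f}")
```

Under these assumptions a request that depends on ten services, each available 99% of the time, succeeds only about 90.4% of the time, while running two independent replicas of a single 99% service raises its availability to about 99.99%.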

# Service Orientation

To close this first post, I want to give a quick overview of Service Orientation and the reason why SOA is important to Distributed Application design. There are plenty of valid reasons to go Service Oriented and, to a point, Service Orientation is on its own a driver for distributed design. A Service is an atomic block of logic offered by a piece of software so that other software can consume it, e.g. by delegating a particular operation entirely to it and moving all responsibility for that operation to the Service.

Services provide a way to segregate responsibility and to decouple explicit knowledge from different parts of your application. As a pattern, SOA can be applied to monolithic applications as well as to distributed ones, but by their very nature distributed applications are ideally suited to it. Many of the most modern approaches rely on some kind of SOA design: MicroServices, Service Mesh and Web Services (as in Amazon Web Services) are just a few examples of this symbiotic pairing. In SOA, a bunch of Services collaborate to meet a goal, communicating over some channel. I am quite sure this rings a bell and that you can easily map it to how a distributed application works.
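
As a rough illustration of how a Service decouples the consumer from the implementation, the sketch below defines one service contract and two interchangeable implementations: one running in the same process (the monolithic case) and one that talks over a channel (the distributed case). Everything here is hypothetical: `PriceService`, the catalog and the `fake_transport` function are invented for the example, and the "network" is simulated with a plain function that exchanges JSON strings.

```python
import json
from typing import Callable, Dict, List, Protocol


class PriceService(Protocol):
    """The contract consumers depend on, whatever sits behind it."""

    def price_for(self, sku: str) -> float: ...


class InProcessPriceService:
    """Service implemented inside the same process, as in a monolith."""

    def __init__(self, prices: Dict[str, float]) -> None:
        self._prices = prices

    def price_for(self, sku: str) -> float:
        return self._prices[sku]


class RemotePriceService:
    """Service reached over a channel; the transport is injected."""

    def __init__(self, transport: Callable[[str], str]) -> None:
        self._transport = transport

    def price_for(self, sku: str) -> float:
        response = self._transport(json.dumps({"sku": sku}))
        return float(json.loads(response)["price"])


def order_total(service: PriceService, skus: List[str]) -> float:
    """Consumer code: it only knows the contract, not the implementation."""
    return sum(service.price_for(sku) for sku in skus)


# Illustrative data and a simulated network hop (assumptions for the sketch).
CATALOG = {"apple": 1.5, "pear": 2.25}


def fake_transport(request_body: str) -> str:
    sku = json.loads(request_body)["sku"]
    return json.dumps({"price": CATALOG[sku]})


if __name__ == "__main__":
    basket = ["apple", "pear"]
    print(order_total(InProcessPriceService(CATALOG), basket))
    print(order_total(RemotePriceService(fake_transport), basket))
```

The consumer function `order_total` is identical in both cases. What changes when the service goes remote is everything the sketch leaves out: latency, failures and the other aspects listed in the previous section.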

# Recap

In this first article of the series I wanted to focus on the key concepts that I will cover in more depth in future posts, while giving a global picture of everything that is relevant when designing your Distributed Applications.

Next in the series, I will analyze some common Distributed Application architectural patterns, and after that we'll dive into each of the core aspects mentioned in this post.
