Other posts in this series:
- Part 1 – Microservices: It’s not (only) the size that matters, it’s (also) how you use them
- Part 2 – Microservices: It’s not (only) the size that matters, it’s (also) how you use them
- Part 4 – Microservices: It’s not (only) the size that matters, it’s (also) how you use them
- Part 5 – Microservices: It’s not (only) the size that matters, it’s (also) how you use them
- Part 6 – Service vs Components vs Microservices
In Microservices: It’s not (only) the size that matters, it’s (also) how you use them – part 2, we discussed the problems with using (synchronous) 2 way communication between distributed (micro)services. We discussed how the coupling problems caused by 2 way communication, combined with microservices, actually result in a reinvention of distributed objects. We also discussed how the combination of 2 way communication and the lack of reliable messaging and transactions leads to complex compensation logic in the event of a failure.
After a refresher on the 8 fallacies of distributed computing, we examined an alternative to 2 way communication between services. We applied Pat Helland’s “Life Beyond Distributed Transactions – An Apostate’s Opinion” (PDF format), which takes the position that distributed transactions are not the solution for coordinating updates between services, and we discussed why distributed transactions are problematic.
According to Pat Helland, we must find the solution to our problem by looking at:
- How do we split our data / services
- How do we identify our data / services
- How do we communicate between our data / services
Sections 1 and 2 were covered in Microservices: It’s not (only) the size that matters, it’s (also) how you use them – part 2 and can be summarized as follows:
- Our data must be collected into pieces called entities or aggregates (in DDD terminology).
- Each aggregate is uniquely identifiable by an ID (for example a UUID / GUID).
- These aggregates need to be limited in size, so that they are consistent after each transaction.
- The rule of thumb is: 1 use case = 1 transaction = 1 aggregate.
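As a minimal sketch of these rules (all names here are illustrative, not from the post), an aggregate carries its own unique ID and enforces its invariants inside its own boundary, so that one use case touches one aggregate in one transaction:

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class OrderLine:
    product_id: str
    quantity: int

@dataclass
class Order:
    """Aggregate root: uniquely identifiable by an ID and small
    enough to be made consistent within a single transaction."""
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    lines: list = field(default_factory=list)

    def add_line(self, product_id: str, quantity: int) -> None:
        # Invariants are checked inside the aggregate boundary
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self.lines.append(OrderLine(product_id, quantity))

# 1 use case = 1 transaction = 1 aggregate:
order = Order()
order.add_line("book-42", 2)
```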
In this blog post we will look at section 3, “How do we communicate between our data / services”.
How should we communicate between our data / services?
As we mentioned several times before, using 2 way (synchronous) communication between our services causes hard coupling and other annoyances:
- It results in communication related coupling (because data and logic are not always in the same service)
- It also results in contractual, data and functional coupling, as well as high latency due to network communication
- Layered coupling (persistence is not always in the same service)
- Temporal coupling (our service cannot operate if it is unable to communicate with the services it depends upon)
- The fact that our service depends on other services undermines its autonomy and makes it less reliable
- All of this results in the need for complex compensation logic due to the lack of reliable messaging and transactions.
If the solution is not synchronous communication, then the answer must be asynchronous communication, right?
Yes, but it depends on …… 🙂
Before we dive into the details of what it depends on, let’s first look at the characteristics of synchronous and asynchronous communication:
Based on these characteristics, we briefly categorize the forms of communication as follows:
Synchronous communication is two way communication
The communication pattern visualized in the drawing above is called the Request/Response and is the typical implementation pattern for Remote Procedure Calls (RPC).
With the Request/Response pattern, a Consumer sends a Request message to a Provider. While the Provider processes the request message, the Consumer can basically only wait* until it receives a Response or an error (* some might point out that the Consumer can take advantage of asynchronous platform features, e.g. to perform several calls in parallel while waiting. Unfortunately, this does not remove the temporal coupling between Consumer and Provider – the Consumer simply cannot continue its work BEFORE it has received its Response from the Provider). The typical execution flow of Request/Response, or RPC, is visualized in the diagram below.
As shown in the drawing, there is a strong coupling between the Consumer and the Provider. The Consumer cannot perform its job if the Provider is unavailable. This type of coupling is called temporal coupling (or runtime coupling) and is something we should minimize between our services.
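The coupling described above can be sketched in a few lines (class and method names are hypothetical, purely for illustration). The Consumer blocks on the call and fails outright when the Provider is down:

```python
class ProviderUnavailable(Exception):
    """Raised when the Provider cannot be reached."""

class Provider:
    def __init__(self, available: bool):
        self.available = available

    def handle(self, request: str) -> str:
        if not self.available:
            raise ProviderUnavailable("provider is down")
        return f"response to {request}"

class Consumer:
    def __init__(self, provider: Provider):
        self.provider = provider

    def do_work(self) -> str:
        # The Consumer BLOCKS here: it cannot proceed until the
        # Provider has returned a Response (temporal coupling).
        return self.provider.handle("request")

print(Consumer(Provider(available=True)).do_work())
# Consumer(Provider(available=False)).do_work()  # raises ProviderUnavailable
```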
Asynchronous communication is one way communication
With asynchronous communication, the Sender transmits a message to a Receiver over a (transport) channel. The Sender waits briefly for the channel to confirm receipt of the message (so, seen through the eyes of the Sender, handing the message to the channel is often synchronous), after which the Sender can continue its work. This is the essence of one way communication, and the typical execution flow is visualized below:
Asynchronous communication is also often called messaging. The transport channels in asynchronous communication are responsible for receiving messages from the Sender and for delivering them to the Recipient (or Recipients). The transport channel, so to speak, assumes responsibility for the message exchange. Transport channels can be simple (e.g. plain sockets à la 0MQ) or advanced distributed solutions with durable Queues / Topics (as supported by e.g. ActiveMQ, HornetQ, Kafka, etc.). Messaging and asynchronous communication channels offer different guarantees that govern the message exchange: Guaranteed delivery and Guaranteed message ordering.
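A minimal in-process sketch of one way messaging, using Python's standard `queue.Queue` as a stand-in for a real transport channel (a production system would use e.g. ActiveMQ or Kafka; the message shape is invented for illustration):

```python
import queue

# In-process stand-in for a transport channel.
channel = queue.Queue()

def sender(channel):
    # Handing the message to the channel is synchronous from the
    # Sender's point of view, but the Sender never waits for the
    # Receiver -- it continues its own work immediately.
    channel.put({"type": "OrderAccepted", "order_id": "42"})
    return "sender continues its own work"

def receiver(channel):
    # The channel delivers the message whenever the Receiver is
    # ready; Sender and Receiver are temporally decoupled.
    msg = channel.get()
    return f"handled {msg['type']}"

print(sender(channel))
print(receiver(channel))
```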
Why isn’t asynchronous communication the full solution?
In reality it is, but unfortunately it is not as simple as asynchronous versus synchronous: the integration pattern used between our services determines the actual level of coupling.
If we’re talking true asynchronous one way communication, then we are on target in most cases.
The challenge is that two way communication comes in several forms /variations:
- Remote Procedure Call (RPC) – synchronous communication
- Request / Response – synchronous communication
- Request / Reply – also known as synchronous over asynchronous communication.
I have repeatedly seen projects that used Request/Reply (synchronous over asynchronous) to ensure that their services were not temporally coupled. The devil is, as always, in the details.
Seen from the Consumer, and for most applications of the Request/Reply pattern, there is not a big difference in the degree of temporal coupling between RPC, Request/Response and Request/Reply, as they are all variants of two way communication that force our sender to wait for a response before it can proceed:
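This is easy to demonstrate with a sketch (names invented for illustration): Request/Reply built on two queues makes the transport asynchronous, yet the Consumer still blocks until the reply arrives, so the temporal coupling of two way communication remains.

```python
import queue
import threading

# Two channels simulate "synchronous over asynchronous".
request_channel = queue.Queue()
reply_channel = queue.Queue()

def provider():
    req = request_channel.get()
    reply_channel.put(f"reply to {req}")

def consumer():
    request_channel.put("request-1")
    # The queues decouple the transport, but the Consumer still
    # cannot proceed before the reply arrives -- it waits here,
    # just as it would with RPC or Request/Response.
    return reply_channel.get(timeout=1)

t = threading.Thread(target=provider)
t.start()
print(consumer())
t.join()
```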
So what is the conclusion?
The conclusion is that 2 way communication between services is the root of many problems, and these problems do not get any smaller by making our services smaller (microservices).
We have seen that asynchronous communication can break the temporal coupling between our services, but ONLY if it takes place as a true one-way communication.
The question is: how do we design services (or microservices) that basically only need asynchronous one way communication between each other? (Communication between the UI and services is another matter, which we will return to soon.)
In the next blog post we will look at how we can divide up our services and how they can communicate with each other via asynchronous one way communication.
Appendix: Message guarantees
Guaranteed delivery covers the degree of certainty that a message will be delivered from the Sender to the Receiver. These delivery guarantees are typically implemented by a reliable messaging infrastructure.
At Most Once
With this delivery guarantee, the Receiver will receive a message 0 or 1 times. The Sender guarantees that the message is sent only once. If the Receiver is not available or is not able to store data related to the message (e.g. due to an error), the message will NOT be resent.
At Least Once
With this delivery guarantee, a message is received 1 or more times (i.e. at least once). The message will be resent until the channel has received a receive acknowledgment from the Receiver. This means that the message can be received more than once. A lack of acknowledgment from the Receiver can be due to unavailability or failure. Repeated delivery of the same message can be handled by making the Receiver’s handling of the message idempotent (idempotence describes the quality of an operation where the result and state do not change if the operation is performed more than once).
Exactly Once
This delivery guarantee ensures that messages are received exactly once. If the Receiver is not available or is not able to store data related to the message (e.g. due to an error), the message will be resent until an acknowledgment of receipt has been received. The difference from “At least once” is that the delivery mechanism is controlled through a coordinating protocol that ensures that duplicate messages are ignored.
Some of the protocols used to implement the Exactly Once delivery guarantee are WS-ReliableMessaging and various implementations of 2 Phase Commit.
Another way to obtain the same qualities is to use operations that are idempotent and combine this with the At Least Once delivery guarantee.
Supporting idempotence almost always requires a unique identifier / message ID in every message, so that the receiver can use this ID to verify whether the message has already been received and processed. The unique identifier could for instance be a GUID or a timestamp. Some operations may satisfy idempotence without requiring a unique identifier (such as deleting an Aggregate).
Guaranteed message ordering
Guaranteed message ordering, also known as “In order delivery”, ensures that messages are received in the order they were sent. Guaranteed message ordering can be combined with the above mentioned delivery guarantees.
Where guaranteed delivery focuses on delivery of each message, guaranteed message ordering is concerned with the coupling or the relationship between several messages.
Challenges around message ordering will occur if one or more of the following circumstances apply:
- There are multiple paths through the channel (multipath), e.g. introduced by clustering, load balancers and other fault tolerance built into the network and infrastructure.
- Dead letter queues. If a message is placed on a dead letter queue, due to problems with delivery or handling, we are faced with the challenge of how this error should be handled and what should happen to subsequent messages until the failed message is delivered.
- Clustering of the Sender or Receiver poses the challenge that messages might not be delivered to the channel in the right order, or might not be processed by the Receiver in the order they were delivered.
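One common way to cope with these circumstances is a resequencer on the receiving side: buffer messages that arrive out of order (e.g. due to multipath or clustering) and release them strictly by sequence number. A minimal sketch, assuming each message carries a `seq` field (an invented convention for this example):

```python
def resequence(messages):
    """Return messages in sequence order, starting from seq 0,
    releasing only an unbroken run of consecutive numbers."""
    buffer = {m["seq"]: m for m in messages}
    expected, in_order = 0, []
    while expected in buffer:
        in_order.append(buffer.pop(expected))
        expected += 1
    return in_order

# Messages arrived out of order over the channel:
arrived = [{"seq": 2}, {"seq": 0}, {"seq": 1}]
print([m["seq"] for m in resequence(arrived)])  # [0, 1, 2]
```

Note that a resequencer trades latency for ordering: a gap in the sequence (e.g. a message parked on a dead letter queue) holds back all subsequent messages.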