Saturday, February 8, 2020

The use case for a stream storage in a Message Broker:
A message broker is a software that finds universal appeal to relay messages from one or more publishers to subscribers. Some examples of the commercially well-known message brokers are RabbitMQ, ZeroMQ, SaltStack and others. These message brokers can be visualized as a set of logical queues supported by a distributed cluster that can scale out.
Traditionally, storage for message brokers have been local file systems. Lately, object storage or any form of cloud native queue storage have gained popularity. This article advocates the use of stream storage for message brokers.
Queue services have usually maintained ordered delivery of messages, retries, dead letter handling, along with the journaling. Incoming messages and caches have been mostly write-throughs which reach all the way to the disk. This made periodic asynchronous batching of writes possible with flush to disk being replaced with object storage which was better suited for this kind of a schedule. Object storage was also known to support Zookeeper like paths even with its tri-level Namespace, Bucket and Object hierarchy. S3 worked well even for the processors of the queue as intermediate web accessible storage.
The Queue service needs a cluster-based storage layer or a database server and have traditionally been long standing products in the marketplace. These products brought storage engineering best practice along with their features for geographical replication of objects, content distribution network and message passing algorithms such as Peer to Peer networking.  As long as this queuing layer establishes sync between say a distributed or cluster file system and object storage with duplicity-tool like logic, it can roll over all data eventually to persistence, so the choice of storage does not hamper the core functionality of the message broker.
A stream storage can be hosted on any Tier2 storage whether it is files or blobs. The choice of tier 2 storage does not hamper the functionality of the stream store. In fact, the append only unbounded data nature of messages in the queue is exactly what makes the stream store more appealing to these message brokers.
Overall, there is a separation of concerns in storage and networking in each of these cluster-based products and although message brokers tend to position themselves as peer-to-peer networking over storage in local nodes, it is the exact opposite for the data stores. Since they are independent, the message broker together with a stream store can bring in alternate layers of networking and storage in an integrated and more powerful way that can make the message processors take advantage of stream-based programmability while preserving the exactly once and consistency models for the data.

No comments:

Post a Comment