This article continues a previous post on the design of message queues backed by object storage.
Most message queues scale with the number of nodes in a cluster-based
deployment. The object storage is accessible over S3 APIs from each of these
nodes. Namespaces and buckets are organized by queue so that messages can be
looked up directly from the object storage naming conventions. Since the
storage handles all ingestion-related concerns, the nodes merely use the S3
APIs to get and put the messages. In addition, we brought up the availability
of native queues to act as background processors when the data needs to be
pushed deep into the object storage.
This has at least two advantages. First, each queue is free to decide what it
does with the object. Second, the scheduled saving of all messages into the
object storage suits this background role well, because it is a continuous
feed with very little read access.
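To make the layout concrete, here is a minimal sketch of the convention described above: one prefix per queue, so that a message can be located directly from its queue name and id. The key format, bucket names, and the dict-backed stand-in for an S3-compatible client are illustrative assumptions, not the design's actual API.

```python
def message_key(queue: str, message_id: str) -> str:
    """Derive the object key for a message from the queue-based convention."""
    return f"queues/{queue}/messages/{message_id}"

class ObjectStore:
    """Minimal stand-in for an S3-compatible client (put/get by bucket and key)."""
    def __init__(self):
        self._objects = {}

    def put(self, bucket: str, key: str, body: bytes) -> None:
        self._objects[(bucket, key)] = body

    def get(self, bucket: str, key: str) -> bytes:
        return self._objects[(bucket, key)]

# A node enqueues and dequeues purely through put/get on the store;
# no broker-side ingestion logic is needed.
store = ObjectStore()
store.put("tenant-a", message_key("orders", "msg-001"), b'{"sku": 42}')
payload = store.get("tenant-a", message_key("orders", "msg-001"))
```

The point of the sketch is that the node holds no message state of its own; everything is addressable by convention.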
This prompted us to separate this solution into its own layer, which we call
the cache layer, so that the queues can work with the cache or with the object
storage as required. The propagation of objects from cache to storage can
proceed in the background. The queues tied to the cache are under no mandate
to serve user workloads; they are entirely internal and specific to the
system, so their schedule and operation can be set through system
configuration.
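A rough sketch of that cache layer might look like the following: queues write to an in-memory cache, and a background step propagates pending messages to object storage on a system-configured batch size. The class and parameter names, and the plain dict standing in for the object storage, are assumptions for illustration.

```python
from collections import deque

class CacheLayer:
    def __init__(self, store: dict):
        self._pending = deque()   # messages not yet propagated to storage
        self._store = store       # stand-in for the object storage

    def write(self, key: str, body: bytes) -> None:
        """Queues write to the cache; no object-storage round trip here."""
        self._pending.append((key, body))

    def flush(self, batch_size: int = 100) -> int:
        """Background propagation, run on the system's own schedule,
        independent of user workloads. Returns the number flushed."""
        flushed = 0
        while self._pending and flushed < batch_size:
            key, body = self._pending.popleft()
            self._store[key] = body
            flushed += 1
        return flushed

store = {}
cache = CacheLayer(store)
cache.write("queues/orders/messages/msg-001", b"hello")
cache.write("queues/orders/messages/msg-002", b"world")
flushed = cache.flush()
```

Because the flush is batch-oriented and write-mostly, it matches the continuous-feed, low-read-access pattern noted earlier.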
The queues, on the other hand, have to implement one of the standard messaging
protocols such as AMQP or STOMP.
Also, customers are likely to use the queues in one of the following ways,
each of which implies a different layout for the same instance and cluster
size.
- The queues may be mirrored across multiple nodes – this calls for a cluster
- The queues may be chained, one feeding into the next – this calls for federation
- The queues may be wired arbitrarily to suit the application – this means we build our own, aka the shovel work
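The third layout can be sketched as a simple shovel: drain a source queue and hand each message to a destination queue chosen by an application-supplied routing rule. The prefix-based routing rule and deque-backed queues below are illustrative assumptions.

```python
from collections import deque

def shovel(source: deque, route) -> int:
    """Drain `source`, appending each message to whichever queue
    `route(message)` selects; returns the number of messages moved."""
    moved = 0
    while source:
        message = source.popleft()
        route(message).append(message)
        moved += 1
    return moved

source = deque(["orders:1", "audit:2", "orders:3"])
orders, audit = deque(), deque()
# Application-defined routing: a hypothetical rule by message prefix.
moved = shovel(source, lambda m: orders if m.startswith("orders") else audit)
```

The routing callable is the whole point: it keeps the arbitrary, application-specific wiring out of the cache and storage layers.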
Consequently, the queue layer can be designed independently of the cache and
the object storage. While queue services are available in the cloud, and so
are one-stop-shop cloud databases, this kind of stack holds a lot of promise
in the on-premise market.
While the implementation of the queue layer is open, we can call out what it
should not be. The queues should not be implemented as microservices; that
defeats the purpose of the message broker as a shared platform meant to
alleviate the dependencies among the microservices in the first place. Nor
should the queues be collapsed into the database or the object storage unless
there is a runtime to process the messages and the programmability to store
and execute logic. Between these two extremes, the queue layer can be
fashioned as an API gateway, a switching fabric, or anything that can handle
retries, poison messages, dead letters, and journaling. Transactional
semantics are not a concern here since we rely on versioning. Finally, the
queues can build on existing products such as ZeroMQ or RabbitMQ if they allow
the customizations needed for an on-premise deployment of this stack.
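As a minimal sketch of the retry and dead-letter handling the queue layer must provide: a message is retried up to a limit, after which it is moved to a dead-letter queue instead of blocking the main queue. The retry limit, tuple shape, and deque-backed queues are assumptions for illustration.

```python
from collections import deque

MAX_RETRIES = 3  # hypothetical system-configured limit

def consume(queue: deque, dead_letters: deque, handler) -> None:
    """Process messages, retrying failures and dead-lettering poison
    messages that keep failing past MAX_RETRIES."""
    while queue:
        message, attempts = queue.popleft()
        try:
            handler(message)
        except Exception:
            if attempts + 1 >= MAX_RETRIES:
                dead_letters.append(message)       # poison message, park it
            else:
                queue.append((message, attempts + 1))  # requeue for retry

def handler(message):
    if message == "bad":
        raise ValueError("simulated processing failure")

queue = deque([("good", 0), ("bad", 0)])
dead = deque()
consume(queue, dead, handler)
```

Note that nothing here needs transactional semantics: a message that is processed twice is harmless when the stored objects are versioned, as the post assumes.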