Friday, February 21, 2020

We were discussing the use case of stream storage with a message broker. We continue that discussion today.

A message broker can eventually roll all of its data over to persistent storage, so the choice of storage does not hamper its core functionality.

A stream store can be hosted on any Tier 2 storage, whether files or blobs. The choice of Tier 2 storage does not hamper the functionality of the stream store. In fact, the append-only, unbounded nature of the messages in a queue is exactly what makes the stream store appealing to message brokers.

Together, the message broker and the storage engine provide addressing and persistence for queues, facilitating access from a geographically close location. However, we are suggesting adding native message broker functionality to stream storage in a way that promotes a distributed message broker enhanced by storage engineering best practices. Since we have streams backing the queues, we don't need to give the queues any additional journaling.

The stream store has an existing concept of events with their readers and writers. Queues also read and write continuously, so there does not need to be a dual definition of stream and queue; their purpose is to define a logical boundary between service and store. These queues and streams can be local or global: they were once hosted on independent clusters and can now be hosted from the same cluster. A local queue can be store- or stream-specific. A global queue can be reached from anywhere by any client, and the store takes care of regional availability. The integration of clusters is central to making this work. The stream and the queue can have improved addressing that enables local or regional lookup. The persistence strategy for queue processing can take advantage of scopes and streams to group them together. However, in this case too, we do not have to leverage external technologies such as a gateway for address translation; it can be part of the address that the store recognizes.
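The addressing described above can be sketched as follows. The scope, queue, and stream names are invented for illustration, and a real store would resolve regions rather than consult an in-memory map; this is only a minimal model of an address the store itself recognizes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a queue address the store itself can interpret,
// combining a scope (for grouping) with a queue name that resolves to a
// backing stream in the nearest region. All names here are illustrative.
public class QueueAddressing {
    // scope -> (queueName -> backing stream in the nearest region)
    private final Map<String, Map<String, String>> scopes = new HashMap<>();

    public void register(String scope, String queue, String regionalStream) {
        scopes.computeIfAbsent(scope, s -> new HashMap<>()).put(queue, regionalStream);
    }

    // The store recognizes the address directly; no external gateway is needed.
    public String resolve(String scope, String queue) {
        Map<String, String> queues = scopes.get(scope);
        return queues == null ? null : queues.get(queue);
    }

    public static void main(String[] args) {
        QueueAddressing store = new QueueAddressing();
        store.register("global", "orders", "us-west/orders-stream");
        store.register("local-siteA", "audit", "siteA/audit-stream");
        System.out.println(store.resolve("global", "orders"));
        System.out.println(store.resolve("local-siteA", "audit"));
    }
}
```

A global queue registered under a shared scope is reachable from any client, while a local queue stays within its own scope.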

Thursday, February 20, 2020

While message brokers look up queues by address, a gateway service can help with address translation. In this way, a gateway can significantly expand the distribution of queues.

First, the address mapping is not at the site level; it is at the queue level.



Second, the addresses of the queue, both universal and site-specific, are maintained along with the queue as part of its location information.



Third, instead of internalizing a table of rules from an external gateway, a lookup service can translate a universal queue address to the address of the nearest queue. This service is part of the message broker as a read-only query. Since queue names and addresses are already existing functionality, we only add the ability to translate a universal address to a site-specific address at the queue level.
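A minimal sketch of this read-only translation, assuming hypothetical site names and addresses; the "nearest" rule here is simply "prefer the caller's own site", which is a placeholder for a real proximity policy:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative lookup service: a universal queue address maps to several
// site-specific addresses, and the service returns the one closest to the
// caller. Sites and addresses are invented for the example.
public class QueueLookupService {
    private final Map<String, Map<String, String>> universalToSites = new HashMap<>();

    public void register(String universal, String site, String siteAddress) {
        universalToSites.computeIfAbsent(universal, u -> new HashMap<>()).put(site, siteAddress);
    }

    // Read-only query: prefer the caller's own site, fall back to any replica.
    public String translate(String universal, String callerSite) {
        Map<String, String> sites = universalToSites.get(universal);
        if (sites == null) return null;
        String local = sites.get(callerSite);
        return local != null ? local : sites.values().iterator().next();
    }

    public static void main(String[] args) {
        QueueLookupService svc = new QueueLookupService();
        svc.register("uq://orders", "siteA", "siteA.broker.local/orders");
        svc.register("uq://orders", "siteB", "siteB.broker.local/orders");
        System.out.println(svc.translate("uq://orders", "siteA"));
    }
}
```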



Fourth, the gateway functionality exists as a microservice. It can do more than a static lookup of the physical location of a queue given a universal address instead of the site-specific one. It has the ability to generate tiny urls for the queues based on hashing. This adds aliases to the address as opposed to the conventional domain-based address. The hashing is at the queue level, and since we can store billions of queues in the queue storage, a url-shortening feature is a significant offering from the gateway service within the queue storage. It has the potential to morph into services other than a mere translator of queue addresses. The design of a url hashing service was covered earlier.
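The hashing idea might look like the following sketch. The base-62 alphabet and the use of the string hash are illustrative choices, not a prescribed design; a production shortener would need collision handling across billions of queues.

```java
// Sketch of the url-shortening idea: hash the queue's full address and
// encode it in base-62 to produce a short alias for the queue.
public class QueueUrlShortener {
    private static final String ALPHABET =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    public static String shorten(String queueAddress) {
        // Mask to a non-negative value before encoding.
        long h = queueAddress.hashCode() & 0xffffffffL;
        StringBuilder alias = new StringBuilder();
        do {
            alias.append(ALPHABET.charAt((int) (h % 62)));
            h /= 62;
        } while (h > 0);
        return alias.reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(shorten("broker.example.com/scope/orders-queue"));
    }
}
```

The alias is deterministic, so the same queue address always yields the same short url.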



Fifth, the conventional gateway functionality of load balancing can also be handled with an elastic scale-out of just the gateway service within the message broker.  



Sixth, this gateway can also improve access to the queue by making more copies of the queue elsewhere and adding the extra mappings for the duration of the traffic. It need not even interpret the originating IP addresses to determine the volume, as long as it can keep track of the number of read requests against the existing address of the same queue.
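One possible sketch of this traffic tracking, with an arbitrary threshold and invented queue addresses; real replica placement and mapping expiry are out of scope here:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// The gateway counts read requests per queue address and, past a threshold,
// records an extra replica mapping for the duration of the traffic.
// No inspection of originating IP addresses is needed.
public class HotQueueReplicator {
    private final Map<String, Integer> readCounts = new HashMap<>();
    private final Set<String> replicated = new HashSet<>();
    private final int threshold;

    public HotQueueReplicator(int threshold) { this.threshold = threshold; }

    // Called on every read against an existing queue address.
    public void onRead(String queueAddress) {
        int count = readCounts.merge(queueAddress, 1, Integer::sum);
        if (count >= threshold) replicated.add(queueAddress);
    }

    public boolean hasExtraCopy(String queueAddress) {
        return replicated.contains(queueAddress);
    }

    public static void main(String[] args) {
        HotQueueReplicator r = new HotQueueReplicator(3);
        r.onRead("scope/orders"); r.onRead("scope/orders"); r.onRead("scope/orders");
        System.out.println(r.hasExtraCopy("scope/orders"));
    }
}
```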



These advantages can improve the usability of message brokers and their queues by providing as many queues as needed, along with a scalable service that can translate incoming universal queue addresses to site-specific location information.

Wednesday, February 19, 2020

A message broker is supposed to distribute the messages it receives; if it sends them all to the same single point of contention, it is not very useful. When message brokers are used with a cache, performance generally improves over what would have been incurred by going to the backend to store and retrieve. That is why different processors behind a message broker could maintain their own caches. A dedicated cache service such as AppFabric may also be sufficient to distribute incoming messages; in this case, we are consolidating the processor caches into a dedicated cache. This does not necessarily mean a single point of contention: a shared cache may offer a service level agreement on par with an individual cache for messages. Since a message broker will not see a performance degradation when sending to a processor or to a shared dedicated cache, it works in both cases. Replacing a shared dedicated cache with a shared dedicated store such as a stream store is therefore also a practical option.
Some experts have argued that message brokers are practical only for small and medium businesses, which have small-scale requirements. This means they are stretched on large-scale deployments, whereas stream storage deployments are not necessarily restricted in size; yet message brokers are also a true cloud offering. These experts point out that stream storage is useful for continuous workloads. Tools like duplicity use the S3 API to persist to object storage, and in the same way, message brokers can perform backup and archiving for their journaling. These workflows do not require modification of data objects, which makes stream storage perfect for them. The same experts argue that the problem with a message broker is that it adds complexity and limits performance: it is not used with online transaction processing workloads, which are read- and write-intensive and do not tolerate latency.


Tuesday, February 18, 2020


Monday, February 17, 2020

Chained message brokers
This answers the question of whether, if the message broker is a separate instance, message brokers can be chained across object storage. Along the same lines, if the current message broker cannot resolve the queue for a message in its own queues, is it possible to distribute the query to another message broker? These questions imply that the resolver merely needs to forward the queries it cannot answer to a default pre-registered outbound destination. In a chained message broker, the queries can make sense simply from the naming convention, which says whether a request belongs to that broker or not. If it does not, the broker simply forwards it to another message broker. This is somewhat different from the original notion that the address is opaque to the user and has no interpretable part that can determine the site to which the object belongs. The linked message broker does not even need to take time to verify that the local instance is indeed the correct recipient. It can merely translate the address, with the help of a registry, to know whether the request belongs to it. This shallow lookup means a request can be forwarded faster to another linked message broker and ultimately to where it is guaranteed to be found. Linked message brokers have no requirement to be similar: as long as the forwarding logic is enabled, each message broker can have its own implementation for translation, lookup, and return. This could have been avoided entirely if the opaque addresses were hashes and the destination message broker were determined from a hash table. Whether we use routing tables or a static hash table, the networking over the chained message brokers can be its own layer facilitating routing of messages to the correct message broker.
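The static hash table variant mentioned at the end can be sketched as follows. The broker names and the hash choice are assumptions for the example; the point is only that a shallow hash makes ownership decidable without a deep lookup.

```java
import java.util.List;

// With opaque addresses hashed against a static table of peers, any broker
// can tell immediately whether a queue is local or must be forwarded.
public class ChainedBrokerRouter {
    private final List<String> brokers;   // static table of peer brokers

    public ChainedBrokerRouter(List<String> brokers) { this.brokers = brokers; }

    // Shallow lookup: hash the opaque address straight to a broker.
    public String ownerOf(String opaqueAddress) {
        int idx = Math.floorMod(opaqueAddress.hashCode(), brokers.size());
        return brokers.get(idx);
    }

    // A broker forwards the request unless this returns true for itself.
    public boolean isLocal(String opaqueAddress, String self) {
        return ownerOf(opaqueAddress).equals(self);
    }

    public static void main(String[] args) {
        ChainedBrokerRouter router =
            new ChainedBrokerRouter(List.of("brokerA", "brokerB", "brokerC"));
        System.out.println(router.ownerOf("queue-123"));
    }
}
```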

Sunday, February 16, 2020

The message broker is more than a publisher-subscriber. It has the potential to be a platform for meaningful insight into events.

The rules of a message broker need not be AMQP-endpoint-based address lookup. We are dealing with messages that are intended for multiple recipients. For that matter, instead of addresses, regexes and rich expressions may be allowed, and internally the same message may be relayed to different queues at lightning speed from queue-specific initiation. We are not just putting the message broker on steroids; we are also making it smarter by allowing the user to customize the rules. These rules can be authored in the form of expressions and statements, much like a program with lots of if-then conditions ordered by their execution sequence. The message broker works as more than a gateway server. It is a lookup of objects without sacrificing performance and without restrictions on the organization of objects within single or distributed stores. It works much like a router, and although we have referred to the message broker as a networking layer over storage, it provides a query execution service as well.
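A sketch of such user-authored rules, evaluated in the order they were written, matching regular expressions against an assumed message topic; the rule shape and topic names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// User-customizable routing: each rule pairs a regex with a destination
// queue, and a message may match several rules and be relayed to several
// queues, in the order the rules were authored.
public class RuleRouter {
    private final List<Pattern> patterns = new ArrayList<>();
    private final List<String> destinations = new ArrayList<>();

    public void addRule(String regex, String destinationQueue) {
        patterns.add(Pattern.compile(regex));
        destinations.add(destinationQueue);
    }

    // Returns every destination whose rule matches the message topic.
    public List<String> route(String messageTopic) {
        List<String> matched = new ArrayList<>();
        for (int i = 0; i < patterns.size(); i++) {
            if (patterns.get(i).matcher(messageTopic).matches()) {
                matched.add(destinations.get(i));
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        RuleRouter r = new RuleRouter();
        r.addRule("orders\\..*", "orders-queue");
        r.addRule(".*\\.audit", "audit-queue");
        System.out.println(r.route("orders.created"));
    }
}
```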

Why a gateway service cannot be implemented like this:
for (Zone zone : ZoneScanner.getZones()) {
    if (zone.equals(currentZone)) {
        // handle the matching zone
    }
}
Imagine if this were performed on every data path: the full scan would be repeated each time and would be costly.
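One way to avoid that cost, sketched here as an assumption rather than a prescription, is to resolve the zone once and cache the result so the scan does not run on every data-path call. ZoneScanner is stubbed with fixed zone names purely for illustration.

```java
import java.util.List;

public class ZoneCache {
    // Stand-in for the ZoneScanner of the snippet above.
    static List<String> getZones() { return List.of("zone-a", "zone-b", "zone-c"); }

    private static String cachedZone;    // resolved once, reused afterwards

    static String currentZone(String desired) {
        if (cachedZone == null) {
            // The full scan runs only on the first call, not per data path.
            for (String zone : getZones()) {
                if (zone.equals(desired)) { cachedZone = zone; }
            }
        }
        return cachedZone;
    }

    public static void main(String[] args) {
        // Many data-path calls, but only one scan.
        for (int i = 0; i < 1000; i++) ZoneCache.currentZone("zone-b");
        System.out.println(ZoneCache.currentZone("zone-b"));
    }
}
```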

Saturday, February 15, 2020

Message broker is also a networking layer that enables a P2P network of different local stores, which could be on-premise or in the cloud. A distributed hash table in this case determines the store to go to. The location information for the data objects is deterministic, as the peers are chosen with identifiers corresponding to the data object's unique key. Content therefore goes to specified locations, which makes subsequent requests easier. Unstructured P2P is composed of peers joining based on some rules, usually without any knowledge of the topology; in this case the query is broadcast, and peers that have matching content return the data to the originating peer. This is useful for highly replicated items. P2P provides a good base for large-scale data sharing. Some of the desirable features of P2P networks include selection of peers, redundant storage, efficient location, hierarchical namespaces, authentication, and anonymity of users. In terms of performance, P2P has desirable properties such as efficient routing; self-organizing, massively scalable, and robust deployments; fault tolerance; load balancing; and explicit notions of locality. Perhaps the biggest takeaway is that P2P is an overlay network with no restriction on size, and there are two classes: structured and unstructured. Structured P2P means that the network topology is tightly controlled and content is placed not on random peers but at specified locations, which makes subsequent requests more efficient.
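The deterministic placement of structured P2P can be sketched with a toy hash ring: a data object's key selects the peer clockwise from its hash, so content always goes to a specified location. Peer identifiers and the 16-bit hash space are illustrative; this is not a full DHT.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Toy consistent-hash ring: peers sit on a ring of hash values, and an
// object's key deterministically selects the first peer at or after its hash.
public class HashRing {
    private final SortedMap<Integer, String> ring = new TreeMap<>();

    public void addPeer(String peerId) {
        ring.put(hash(peerId), peerId);
    }

    // Deterministic placement: the same key always lands on the same peer.
    public String peerFor(String objectKey) {
        if (ring.isEmpty()) return null;
        SortedMap<Integer, String> tail = ring.tailMap(hash(objectKey));
        // Wrap around to the first peer when past the end of the ring.
        return tail.isEmpty() ? ring.get(ring.firstKey()) : tail.get(tail.firstKey());
    }

    private static int hash(String s) {
        return Math.floorMod(s.hashCode(), 1 << 16);
    }

    public static void main(String[] args) {
        HashRing ring = new HashRing();
        ring.addPeer("peer-1");
        ring.addPeer("peer-2");
        ring.addPeer("peer-3");
        System.out.println(ring.peerFor("object-42"));
    }
}
```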