Wednesday, April 30, 2014

Today I'm going to elaborate on a noninvasive method of reading MSMQ to trigger alerts. Let us say we keep a round robin of buffers to read from different queues - one buffer per queue or one buffer per rule/application. We will discuss the consumers of the round-robin buffers shortly, but first we mention what they are. Basically, we are reading the queues just as TCP maintains a sliding window. The queues are read in the order in which the messages are processed. As each message arrives, it is evaluated against the rules to invoke the corresponding action. The same could have been done by the queue processor; the only difference is that this is now handled externally to the queue processor. The user of the queue alerts module could directly subscribe to the events and provide the delegates as necessary, in which case there is no need for a singleton delegate. However, the queue alerts module facilitates the subscription to the events by letting the user focus exclusively on the rules. In addition, the queue alerts module provides more functionality. First, it translates all the registered rules to filter out the messages efficiently. Second, it manages the lifetime of the messages and the actions performed on them. Third, it makes the packets available in a streaming mode to the applications.
The rules-to-events mapping is avoided by having the rules evaluate against each message. This means that all the rules registered by a single application are evaluated at once on the arrival of a message. If a rule evaluates positively, the message is copied to the buffer for the application to read. The messages are copied only as fast as the applications are reading. Messages smaller than a configurable size are not copied to the applications right away; it is better to provide a delayed write in such a case. If the application provides a small buffer, the messages can be copied more often, as if in real time. There can also be a timeout value specified, which handles the case when messages are not available. The flow of data is unidirectional from the source to the application. The queue alerts module focuses on the buffers and the operations. If the buffers are per queue, then the module can handle bursts in traffic. As each message is pulled from the buffers, it is evaluated and acted upon either by the alerts module or by the application.
In both the invasive mode (where the queue processor calls the delegates) and the non-invasive mode (where the queue alerts module calls the delegates), the queue processor's activity is what raises the alerts. Additionally, the queue alerts module may mix and match delegates from different applications for each of the queues. As applications update the delegates or the conditions, the queue alerts module reassigns the delegates across the queue buffers. Otherwise it would have to evaluate the conditions of every application's delegates for every message.
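As a rough sketch of the non-invasive flow described above (the class and method names here are illustrative, not part of MSMQ or any real module), the module evaluates every registered rule once per arriving message and copies matches into the subscribing application's buffer:

```python
from collections import deque

class QueueAlerts:
    """Sketch of the queue alerts module: rules are evaluated on every
    arriving message, and matching messages are copied into a per-application
    buffer that the application drains at its own pace."""

    def __init__(self):
        self.rules = []  # (condition, buffer) pairs, one per registration

    def register(self, condition):
        buffer = deque()
        self.rules.append((condition, buffer))
        return buffer  # the application reads from this buffer

    def on_message(self, queue_name, message):
        # All the rules registered are evaluated at once on arrival.
        for condition, buffer in self.rules:
            if condition(queue_name, message):
                buffer.append(message)  # copied only for interested applications

alerts = QueueAlerts()
orders = alerts.register(lambda q, m: q == "orders" and m.get("priority") == "high")
alerts.on_message("orders", {"id": 1, "priority": "high"})
alerts.on_message("orders", {"id": 2, "priority": "low"})
```

The unidirectional flow is visible here: the module only writes into the buffer, and the application only reads from it.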

Tuesday, April 29, 2014

Today we will refine the solution to the scalable queue alert design. If we take the approach that we want to subscribe to the events generated by the queue processor, then we must connect the events to the delegates. The event handling mechanism works by subscribing to the events with the Observer design pattern. The observers of the events provide a callback called Notify(), and the subject for the observers has a method called NotifyObservers() that calls Notify() on each of the observers. Delegates are these callbacks. When the queue processor finds an event to raise at any point during its execution, the subscribers to the event know that the state changed because the raise method notifies all the registered subscribers. This is a behavior or interface that the events implement.
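The Observer wiring described above can be sketched minimally as follows (the Subject class and its method names are illustrative, not from any particular library):

```python
class Subject:
    """Minimal Observer pattern: observers register a Notify callback
    (the delegate), and notify_observers calls each one when an event
    is raised."""

    def __init__(self):
        self._observers = []

    def subscribe(self, notify):
        self._observers.append(notify)  # notify is the delegate (callback)

    def notify_observers(self, *args):
        for notify in self._observers:
            notify(*args)

# A queue would encapsulate a Subject and raise events on state changes.
queue_events = Subject()
seen = []
queue_events.subscribe(lambda state, msg: seen.append((state, msg)))
queue_events.notify_observers("arrival", "message-1")
```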
The events are not hierarchical. They are discrete. The events are encapsulated in the queues so that they are raised when the state of the queue or the message changes.
The delegates at this level of handling can then invoke the user-level delegates that are registered for certain events. Events are generic, but the queues that they belong to are specific. When the user specifies a filter, it may apply to one or more queues, and delegates may need to be added to all these queues. If the mapping between delegates and queues is not clear from the filter, such as when the filter is based on a message attribute, the delegate is added to all the queues, and the delegates then decide based on the messages whether to take any action. In other words, the user-level delegates may subscribe to as many or all events and then take the appropriate action given the queue, the message, and the state. This means there can be a single user-level delegate that takes different actions based on different rules. In such a delegate, there would be several successive conditional checks involved.
We say that the rules are encapsulated in a single user-level delegate that is wired to all the events raised. When an event is raised, we have the queue information, the message the queue was acting on, and the state such as arrival, process begin, process complete, depart, etc.
In the queue alerts module, if we take the approach that we select the messages and the queues and store the individual rules to map against, we have a different data structure altogether. Here we get, in a delta sweep, all the messages that are of interest to the rule evaluation along with their corresponding actions. So we store a copy of the messages and queues outside of the queue processor. The mapping between different sets of messages for different rules is the purpose of this data structure. As such, we could use a different list for each of the rules.
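That per-rule-list data structure might look like this sketch, assuming a hypothetical delta_sweep helper and dictionary-shaped messages (none of these names come from MSMQ):

```python
# One list of message copies per rule, filled by a delta sweep over the queues.
rules = {
    "large":  lambda m: m["size"] > 100,
    "orders": lambda m: m["queue"] == "orders",
}
matches = {name: [] for name in rules}  # a different list for each rule

def delta_sweep(messages):
    """Collect the messages of interest to each rule, copying them so they
    live outside the queue processor."""
    for message in messages:
        for name, condition in rules.items():
            if condition(message):
                matches[name].append(dict(message))  # store a copy

delta_sweep([
    {"queue": "orders", "size": 50},
    {"queue": "billing", "size": 200},
])
```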

The design criteria for the queue alert module mentioned in the previous post include:
Support for retries by the queue processor: The design of a queue alert module must consider the retries by the queue processor for the same message. All exception paths including dead letter queues should be considered for the same treatment.
Non-invasive: When possible, we should consider a non-invasive approach that doesn’t require instrumentation of the queues. In other words, it can work for any version of the queues and doesn’t affect the queue processing. It could work by sniffing the data changes or the logs.
Polling: Any polling approach must be robust and must avoid high CPU usage during its processing.
Support for transactional as well as non-transactional messages: The alerts module must work for both kinds of messages so that the user can specify the criteria and not be limited to only a set of messages. Concurrent processing of both messages must be supported.
Support for distributed transactions: When transactions involve messages across queues, this alert module should enable evaluating those messages as part of the transaction or at least log the transaction and the actions taken with the transactions so that the final state can be determined by the alerts module.
Support for clusters: The queues may not all be local to a machine and could be distributed on different nodes in a cluster or they may all be in a failover cluster. Alert module should target the messages and the queues and even the machines.
Scoping of alerts: Alerts need not be registered at the message level. They could be registered at the queue level or at the machine level. Whichever hierarchy is chosen, the outer scope takes care of all the alerts at the inner scope. This means that CRUD on the alerts at the queue scope automatically performs the same at the message scope.
Changes to the rules or registration of alerts: Alerts registered with the alerts module will not take effect until the system reconfigures. This enables the changes to the alerts to be picked up for processing by the module and gives time for setup and cleanup operations by the module.
Deployment: The alerts module should come in a standalone service or executable so that it can be an add-on to existing queue processing. The module itself could be deployable by copying or via an installer.
Automatic administration of rules, actions, messages and queues could be performed where possible.
The use of message format: When interacting with the queues to read the messages, the alerts module will evaluate the fields of the messages against the criteria specified in the rules by the user. The message format should not be opaque and as in the case of MSMQ should expose known fields for evaluation against the rules.
Control of concurrency: The alerts module could make the invocation of actions registered with the rules concurrent, so that evaluating an action for one message does not block others.
Full-text or key-value search over message body: The expressions to search over the text of the messages could be resource intensive and optionally enabled. Rules to perform such search could be outside the alerts mechanism and done with the help of an indexer. As such this may not be in scope for the alerts module.
Text messages versus binary messages: The alerts module should support both formats. The module should rely on the fields versus the contents. Subsequent processing of say JSON vs. XML text could be offloaded to other systems.
Asynchronous communication mechanism: This could be enabled between producers and consumers so that they don’t block each other.

Performance: Volumes of a hundred thousand transactions per submission, reaching millions of transactions per day and involving several messages across different queues, should be targeted. Working on a small set of messages, queues, rules, or alerts at a time could enable this.

Monday, April 28, 2014

Today we will look at a scalable queue alert design. I describe a method to evaluate multiple queues for processing. Queues can have multiple messages. Messages can have different attributes. Users would like to author rules for actions on queues based on attributes of both queues and messages. When a message arrives in a queue, it is evaluated against all the rules authored by the users so that the corresponding actions can be taken. Rules comprise conditions and actions. Conditions are expressions based on attributes and logical operators. An action can be any one of a set of predetermined actions, such as running a script or logging. The rules are specified in a user-defined function. This helps the user manage the rules. The rules are all evaluated against each message of each queue. This means that the attributes have to be deterministic, free of side effects, and easy to look up.
When we scale out the queues, we are going to evaluate these rules on each of the queues. When we process the messages, we may process them one after the other across queues. This means the user-defined rules can work the same across queues and messages.
Rule evaluation for any message in any queue results in one or more of the actions. The default action is a no-op, which is not specified explicitly. No-op in this case means that no additional actions will be triggered beyond the default message processing by the queue. The alerting mechanism is independent of the queue processing and is checked right after the message is processed. This could be done right before the message processing, but it is only when the message is processed that we know the current message has been handled.
The queue alert mechanism can live outside the queue processing service. This implies that the queue alert mechanism can be used for journaling in a non-invasive manner. The action corresponding to the queue processing could be to log the messages.
Another use of the queue alert mechanism is to enable different actions to be specified for these messages. For example, an action could be to launch a script for selected messages instead of all messages. Scripts could trigger additional workflows.
The trigger mechanism needs to be kept up to date with the queues. If queues are added or deleted, then the rules may need to be redefined. Evaluation of stale rules should default to a no-op. This ensures that message processing is never blocked by a stale rule.
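The condition-plus-action evaluation with a no-op default for non-matching or stale rules might be sketched like this (the rule shapes and the actions_log are assumptions for illustration, not part of any real trigger mechanism):

```python
actions_log = []

# A rule is a (condition, action) pair; the action here logs a script launch.
rules = [
    (lambda m: m["queue"] == "orders",
     lambda m: actions_log.append(("script", m["id"]))),
]

def evaluate(message):
    """Evaluate every rule against the message; unmatched messages get the
    implicit no-op, and a stale rule (e.g. one referencing a deleted
    attribute) also degrades to a no-op instead of blocking processing."""
    matched = False
    for condition, action in rules:
        try:
            if condition(message):
                action(message)
                matched = True
        except KeyError:
            pass  # stale rule: default to no-op
    return matched

evaluate({"queue": "orders", "id": 7})   # triggers the script action
evaluate({"queue": "billing", "id": 8})  # falls through to the no-op
```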

Sunday, April 27, 2014

Today we look at some more usages of random variables. We mentioned so far that random variables can be combined. We know that random variables can be independent, and over the different values that a random variable can take, the average value gives a good summary. Let us take an interesting application of this technique in a hiring problem. Let us say you wanted to hire an office assistant. We can use indicator random variables with this. Let us first describe the problem. When you hire an office assistant, you may have to interview some candidates. You want to hire a suitable candidate, but actually hiring somebody has additional costs: you have to fire the current assistant, and you must pay a large hiring fee to the employment agency that is sending the candidates. You are interested in estimating this cost. This is the hiring problem.
This is written as follows:
int HireAssistant(int[] candidates)
{
    int bestScore = -1; // score of a least-qualified dummy candidate
    int best = -1;      // the candidate currently hired
    foreach (var candidate in candidates)
    {
        int score = Interview(candidate);
        if (score > bestScore)
        {
            bestScore = score;
            best = candidate;
            Hire(candidate); // fire the current assistant and hire this one
        }
    }
    return best;
}
We now use probability to analyze this problem. In order to do that, we must make assumptions about the distribution of the inputs. Then we analyze our algorithm and compute an expected running time.
Since we take the distribution over the inputs, we are averaging the running time over all possible inputs.
We use probabilistic analysis when we can make assumptions about the distribution of inputs, i.e. we can assume something about the set of all possible inputs, both as a means for designing an efficient algorithm and for gaining insight into the hiring problem. For the hiring problem, we can assume that the candidates come in random order. This means we can compare any two candidates in any order and decide who is best. In fact, we can use this fact to establish a distinct ranking of the candidates.
An indicator random variable associated with an event A is defined as 1 if the event occurs and 0 otherwise.
Let us determine the expected number of successes for the interviews. Our sample space is S = {Success, Failure}, and we define a random variable Y which takes one of these two values with equal probability. We can then define an indicator random variable for the event Y = Success. The expected number of successes obtained in one interview is simply the expected value of our indicator variable, which equals Pr[Y = Success] = 1/2.
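To make the indicator-variable view of the hiring problem concrete, here is a small simulation sketch (the function name, trial count, and seed are illustrative assumptions). If X_i indicates that candidate i is hired, then with candidates in random order E[X_i] = 1/i, so the expected number of hires is the harmonic number H_n = 1 + 1/2 + ... + 1/n:

```python
import random

def count_hires(n, trials=2000, seed=1):
    """Average number of hires (new best candidates) over many random
    orderings of n distinctly ranked candidates."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        ranks = list(range(n))
        rng.shuffle(ranks)          # candidates arrive in random order
        best = -1                   # least-qualified dummy candidate
        for r in ranks:
            if r > best:            # strictly better than all seen so far
                best = r
                total += 1          # hire: fire the current assistant
    return total / trials

# Expected hires for n = 10 is H_10 = 1 + 1/2 + ... + 1/10.
harmonic = sum(1.0 / i for i in range(1, 11))
estimate = count_hires(10)
```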

Saturday, April 26, 2014

In today's post we discuss discrete random variables from the textbook we have been referring to. A random variable X is a function from a finite or countably infinite sample space S to the real numbers. It associates a real number with each possible outcome of an experiment, which allows us to work with the probability distribution induced on the resulting set of numbers. Random variables can also be defined for uncountably infinite sample spaces, but we will only look at random variables that are discrete.
For a random variable X and a real number x, we define the event X = x to be {s in S : X(s) = x}; thus Pr[X = x] = Sum over all s with X(s) = x of Pr[s].
The function f(x) = Pr[X = x] is the probability density function of the random variable X.
Per the axioms of probability, we know that Pr[X = x] >= 0
and that the sum of the individual probabilities over all x is equal to 1.
If we take the example of a pair of dice with six possible outcomes each, and we define a random variable X to be the maximum of the two values showing on the dice, then we have
Pr[X = 3] = 5/36
because there are 36 equally likely outcomes when we take the values in pairs,
and X assigns the value 3 to exactly
5 of them: (1,3), (2,3), (3,3), (3,2), (3,1).
It is common for several random variables to be defined on the same sample space.
If there are two random variables defined on the same sample space, say X and Y
then their co-occurrence has a probability distribution function that is
Pr [ X = x and Y = y] which is the joint probability distribution.
If we fix one of the values, we can vary the other and this can be summed.
For a fixed value y, Pr[Y = y] = Sum of all x Pr[X=x and Y = y]
The same goes for a fixed value of x, where we can vary y.
We can extend this to conditional probabilities as well. For example,
Pr[X = x | Y = y]  = Pr [ X = x and Y = y] / Pr [Y = y]
We say that two random variables X and Y are independent if for all x and y
the events X = x and Y = y are independent, which we can express as
Pr[X = x and Y = y] = Pr[X = x] * Pr[Y = y].
The simplest summary of the distribution of a random variable is the average of the values it takes, its expected value.
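The dice example and the expected value above can be checked by direct enumeration; this small sketch uses exact fractions so nothing is lost to rounding:

```python
from fractions import Fraction
from itertools import product

# All 36 outcomes of a pair of dice; X is the maximum of the two values.
outcomes = list(product(range(1, 7), repeat=2))

def pr(event):
    """Probability of an event over the uniform distribution on outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

p_max_3 = pr(lambda o: max(o) == 3)  # should agree with the 5/36 above

# The average of the values X takes, weighted by their probabilities.
expectation = sum(Fraction(x) * pr(lambda o, x=x: max(o) == x)
                  for x in range(1, 7))
```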

Friday, April 25, 2014

Today we look at some counting theories:
Counting theory explains how many objects there are without actually enumerating them. This is very helpful not only when counting a set of items is daunting but also when it is difficult even to construct the set.
Consider, for example: how many different ways can we arrange n distinct elements?
We review some of the elements of counting theory.
A set of items that we wish to count can sometimes be expressed as a union of disjoint sets or as a Cartesian product of sets.
The rule of sum says that the number of ways to choose an element from one of two disjoint sets is the sum of the cardinalities of the sets.
The rule of product says that the number of ways to choose an ordered pair is the number of ways to choose the first element times the number of ways to choose the second element.
We look at them in detail now.
If A and B are two finite sets with no members in common, then the number of ways to choose an item from one of the sets is the sum of the counts of items in the two sets. For example, each position of a license plate may hold either a letter or a digit. Since there are 26 letters and 10 digits, there are 26 + 10 = 36 choices for a single position. We can extend this to any number of sets, so long as they are mutually disjoint, because the answer depends only on the cardinalities.
Using the same sets A and B, the number of ways to choose an ordered pair is the number of ways to choose the first element times the number of ways to choose the second element. For example, an ice cream parlor with 28 flavors and 4 toppings can mix and match them to give 28 * 4 = 112 different ice creams.
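Both rules can be checked in a few lines; the sets here are just the examples from the text, with a small enumeration added as a cross-check of the product rule:

```python
import string
from itertools import product

# Rule of sum: a license-plate position holds a letter or a digit,
# and the two sets are disjoint.
position_choices = len(string.ascii_uppercase) + len(string.digits)  # 26 + 10

# Rule of product: ordered (flavor, topping) pairs.
flavors, toppings = 28, 4
combinations = flavors * toppings

# Cross-check the product rule by enumerating a small Cartesian product.
pairs = list(product(range(3), range(2)))  # 3 * 2 ordered pairs
```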