Tuesday, February 25, 2014

I'm going to blog about Splunk CLI commands. By the way, I still need to check whether FIFO input has been discontinued; meanwhile, let's talk about some basic CLI commands.
There are several basic commands and it may take a while to cover all of them. I'll try to go case by case, taking a given task at hand, so that we know how to use each one. Again, there's plenty of literature on docs.splunk.com, but my goal here is to mention the ones I've used.
Here's a command to register perfmon. You can modify the inputs.conf file with the details of the perfmon configuration:
splunk add exec scripts\splunk-perfmon.path -interval 60
and then enable it with splunk enable perfmon.
The CLI commands are based on verbs and objects.
You can start or stop Splunk with: splunk start splunkd --debug
but you can only do that with splunkd and splunkweb. Also, since we are talking about perfmon events, we can use the CLI to see what perfmon will be collecting:
splunk list perfmon
In this case, it will give you output such as:
Monitored Perfmon Collections:
        LogicalDisk
                _TCP_ROUTING:windowsIndex
                counters:*
                disabled:0
                host:RRAJAMANIPC
                index:windows_perfmon
                interval:10
                object:LogicalDisk
These are what we define in the inputs.conf file.
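For reference, a matching inputs.conf stanza for the listing above might look roughly like this; it's a sketch based only on the attributes shown, and the stanza name, index and routing values are whatever you configured:
[perfmon://LogicalDisk]
object = LogicalDisk
counters = *
interval = 10
index = windows_perfmon
disabled = 0
_TCP_ROUTING = windowsIndex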
Note that individual perfmon collections can also be enabled or disabled separately:
splunk enable perfmon LogicalDisk
and similarly we can disable them individually as follows:
splunk disable perfmon LogicalDisk
The CLI also lets us activate a configuration change with the reload command:
splunk reload perfmon
which makes the change effective immediately.
Program execution logging seems to be an art. While it can be dismissed as a chore, for sustaining engineering it is an invaluable diagnostic. What would make it easier to troubleshoot problems is a descriptive message when errors occur. Typically these messages describe at-the-moment errors without any indication of what the customer could do to mitigate them. I don't mean that error messages need to be expanded to include corrective actions in all cases. That would help, but perhaps an association between error messages and corrective actions could be maintained. If we keep all our error message strings in one place, it becomes easy to correlate the errors to the actions by keeping them side by side.
The corrective action strings need not even be in the logging, but the association could help support and sustaining engineering diagnose issues, especially when the workarounds are a matter of domain knowledge. This avoids a lot of back and forth and even helps the engineers in the field.
At the same time, this solution may not be appropriate in all cases, for example where we don't want to be too informative to our customers or where we don't want to confound them with too much detail. Even in such cases, being elaborate about the error conditions in the descriptive messages may help the appropriate audience target their actions.
Lastly, I want to add that many feature developers might already be aware of common symptoms and mitigations during the development phase. Capturing these artifacts will help with common troubleshooting of the feature at a later point in time. Building a history or a database of such knowledge via simple bug tracking would help immensely, since troubleshooters often search the bug database for similar problems that have been reported.
Another consideration is that the application maintains data structures exclusively for supportability. For example, if there is an enumeration of all the workers for a given component, along with their tasks, their objects and their states, and if these can be queried in a pull operation independent of the method the workers are executing, that would be great. These pull operations could be invoked by views specific to runtime diagnostics, so they can be exposed via methods specific to management. They are different from logging in the sense that they are actual calls into the product to retrieve enhanced runtime information.
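As a rough illustration, a pull-style supportability structure could be as small as the following sketch; the names WorkerSnapshot and DiagnosticsRegistry are made up for this example:

#include <mutex>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical snapshot of one worker, kept purely for supportability.
struct WorkerSnapshot {
    int         id;
    std::string task;     // what the worker is currently doing
    std::string object;   // the object it is working on
    std::string state;    // e.g. "idle", "running", "blocked"
};

// A registry that components update as workers change state and that a
// management/diagnostics view can query at any time, independent of the
// code paths the workers are executing.
class DiagnosticsRegistry {
public:
    void Update(const WorkerSnapshot& w) {
        std::lock_guard<std::mutex> lock(mutex_);
        workers_[w.id] = w;
    }
    std::vector<WorkerSnapshot> Pull() const {
        std::lock_guard<std::mutex> lock(mutex_);
        std::vector<WorkerSnapshot> result;
        for (const auto& kv : workers_) result.push_back(kv.second);
        return result;
    }
private:
    mutable std::mutex mutex_;
    std::unordered_map<int, WorkerSnapshot> workers_;
};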

Monday, February 24, 2014

I finally wrote a program that demonstrates high performance reading of message queues from MSMQ on Windows. No matter the number of queues or the number of messages on them, the program reads them via IO completion ports. In brief, the application has a monitor that collects and configures the queues to be read and creates a completion port for all of them. It then forks off threads that get notified on this completion port whenever messages arrive. The threads exchange the data read from the messages with the monitor, which can file the data away.
The completion port lets us specify multiple queues and tolerate any load.
The workers are spawned when the port is ready, and closing the port signals the threads to terminate.
This is convenient for initialization and cleanup. Memory usage is limited to the copying of messages in transit and is consequently very small compared with the overall number of messages.
Secondly, the application allows any thread to service any message from the completion port. The messages are tied back to their queue names via the overlapped key parameter that the threads set when reading a message. A thread knows which queue handle the data is coming from, and by flagging that queue when reading, the proper association can take place.
Another thing to note is that the task for all the threads is the same: get notified of messages, read them, and post them to the monitor. This way there is no restriction on concurrency from the application's perspective. That said, the concurrency value is typically determined by the number of processors. Since these are OS threads, we follow the operating system's recommendation to use a completion port, but the thread pool we pair with the completion port is something we can tweak based on what works. Lastly, I wanted to mention that the properties we use for message queue receiving are determined by the application. While we can retrieve a large number of properties on each receive, we are typically interested in the message buffer and its size, so we need to define these application-chosen properties before we make Receive calls. The threads assume the structure of this context when receiving.
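To make the pattern concrete, here is a minimal sketch of the receive path using the native MSMQ and completion port APIs. This is not the actual program; the ReceiveContext layout, buffer sizes and error handling are simplified for illustration.

#include <windows.h>
#include <mq.h>                      // native MSMQ API, link with mqrt.lib

// Per-queue context: ties a completed read back to the queue it came from.
struct ReceiveContext {
    OVERLAPPED    ov;                // first member so the OVERLAPPED* maps back to the context
    QUEUEHANDLE   hQueue;
    WCHAR         queueName[256];
    UCHAR         body[4096];
    MSGPROPID     propIds[2];
    MQPROPVARIANT propVars[2];
    MQMSGPROPS    props;
};

// Post an asynchronous receive; completion is reported on the associated port.
HRESULT PostReceive(ReceiveContext* ctx) {
    ctx->propIds[0]              = PROPID_M_BODY;       // application-chosen properties:
    ctx->propVars[0].vt          = VT_VECTOR | VT_UI1;  // the message buffer ...
    ctx->propVars[0].caub.pElems = ctx->body;
    ctx->propVars[0].caub.cElems = sizeof(ctx->body);
    ctx->propIds[1]              = PROPID_M_BODY_SIZE;  // ... and its size
    ctx->propVars[1].vt          = VT_UI4;
    ctx->props.cProp             = 2;
    ctx->props.aPropID           = ctx->propIds;
    ctx->props.aPropVar          = ctx->propVars;
    ctx->props.aStatus           = NULL;
    ZeroMemory(&ctx->ov, sizeof(ctx->ov));
    return MQReceiveMessage(ctx->hQueue, INFINITE, MQ_ACTION_RECEIVE,
                            &ctx->props, &ctx->ov, NULL, NULL, NULL);
}

// Every worker thread runs the same loop: wait, hand the data to the monitor, re-arm.
DWORD WINAPI Worker(LPVOID param) {
    HANDLE port = (HANDLE)param;
    DWORD bytes; ULONG_PTR key; LPOVERLAPPED pov;
    while (GetQueuedCompletionStatus(port, &bytes, &key, &pov, INFINITE)) {
        ReceiveContext* ctx = (ReceiveContext*)pov;
        // ... pass ctx->queueName and ctx->body to the monitor here ...
        PostReceive(ctx);            // post the next read on the same queue
    }
    return 0;                        // the port was closed: the thread terminates
}

// Monitor setup (sketch): one port for all queues, one context per queue.
//   HANDLE port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
//   MQOpenQueue(formatName, MQ_RECEIVE_ACCESS, MQ_DENY_NONE, &ctx->hQueue);
//   CreateIoCompletionPort((HANDLE)ctx->hQueue, port, (ULONG_PTR)ctx, 0);
//   PostReceive(ctx);
//   CreateThread(NULL, 0, Worker, port, 0, NULL);   // repeat per worker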
 

Sunday, February 23, 2014

Splunk 6 has a web framework, documented on the dev portal, that seems super easy to use. Among other things, it can help gain App Intelligence, i.e. improving semantic logging so that meaning can be associated via simple queries; integrate and extend Splunk, such as with business systems or customer facing applications; and build real time applications that add a variety of input to Splunk.
One such example could be a SQL Server Service Broker queue. Service Broker keeps track of messages based on a "conversation_handle", which is a GUID.
Using a SQL data reader and a SQL query, we can get these messages, which can then be added as input to Splunk. We issue RECEIVE commands like this:
RECEIVE TOP (@count) conversation_handle, service_name, message_type_name, message_body, message_sequence_number
                FROM <queue_name>
Unless the message type is "http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog", the message body can be read.
The queue listener that drives this should expose methods to configure it and to start and stop it.
Using a simple callback routine, a thread can actively get the messages as described and send them to a processor for completion. A transaction scope could be used in this routine.
Inbound and outbound processor queues could be maintained independently and invoked separately. Both should have methods to process messages and to save failed messages.
The processed messages can then be written to a file for input to Splunk, or the framework can be used to index this input directly.
There are several channels for sending data from SQL Server, and this is one that could potentially be served by a Splunk app.
In general, writing such apps in languages such as CSharp or JavaScript is documented, but it would not be advisable to push this any further down into the Splunk stack, because the systems are different and Splunk is not hosted on SQL Server.
If Splunk is hosted on, say, one particular operating system, then a form of input that is specific to that operating system could be considered, but in general the Splunk foundation on which the apps are built focuses on generic source types and leaves it to user discretion to send data through one of the established channels.
 
I'm taking a look at Windows IO completion ports today and writing about them. When an IO completion port is created by a process, a queue is associated with the port to service multiple asynchronous IO requests. This works well with a thread pool. The IO completion port is associated with one or more file handles, and when an asynchronous IO operation on a file completes, an IO completion packet is queued to the port in first-in, first-out order.
Note that the file handle can be any overlapped IO endpoint: a file, socket, named pipe or mailslot, etc.
The thread pool is serviced in last-in, first-out order. This is done so that the running thread can continuously pick up the next queued completion packet and no time is lost in context switches. That is hardly always the case though, since a thread may switch ports, be put to sleep or terminate, and the other workers get to service the queue. Threads waiting on a GetQueuedCompletionStatus call can process a completion packet when another running thread enters a wait state. The system also prevents any new active threads until the number of active threads falls below the concurrency value.
In general, the concurrency value is chosen as the number of processors, but this is subject to change and it is best to use a profiling tool to see the benefits and avoid thrashing. I have a case where I want to read from multiple mailslots, and these are best serviced by a thread pool. The threads from the pool can read the mailslots and place the data packets directly on the completion port queue. The consumer of the completion port will then dequeue them for processing. In this example, the threads are all polling the mailslots directly for messages and placing them on the completion port. This is fast and efficient, though polling can waste cycles on slots with the same or no current message. It is also not the same model as a completion port notification or a callback routine for that mailslot. The latter is a notification/subscription model and is better at utilizing system resources, which can be considerable if the number of mailslots or their message volume is high. We can make the polling model fast as well, with a timeout value of zero for any calls to read the mailslots and by skipping those that don't have actionable messages. However, the notification model means little or no time is spent on anything other than servicing the messages in the mailslots as and when they appear; the receive call has a built-in wait that relieves high CPU usage.
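A minimal sketch of the polling variant described above follows; the Packet layout, buffer size and setup are placeholders for illustration.

#include <windows.h>

// Hypothetical packet handed from the pollers to the consumers via the port.
struct Packet {
    OVERLAPPED ov;                   // lets the consumer cast the OVERLAPPED* back
    HANDLE     hSlot;                // which mailslot the data came from
    char       data[424];            // mailslot messages are small; size is illustrative
    DWORD      size;
};

// Poller: zero-cost check, skip slots that have no actionable message.
void PollMailslot(HANDLE hSlot, HANDLE port) {
    DWORD nextSize = 0, count = 0;
    if (GetMailslotInfo(hSlot, NULL, &nextSize, &count, NULL) &&
        nextSize != MAILSLOT_NO_MESSAGE) {
        Packet* p = new Packet();
        p->hSlot = hSlot;
        if (ReadFile(hSlot, p->data, sizeof(p->data), &p->size, NULL)) {
            // hand the packet to the consumers through the completion port
            PostQueuedCompletionStatus(port, p->size, (ULONG_PTR)hSlot, &p->ov);
        } else {
            delete p;
        }
    }
}

// Consumer: the familiar GetQueuedCompletionStatus loop.
DWORD WINAPI Consumer(LPVOID param) {
    HANDLE port = (HANDLE)param;
    DWORD bytes; ULONG_PTR key; LPOVERLAPPED pov;
    while (GetQueuedCompletionStatus(port, &bytes, &key, &pov, INFINITE)) {
        Packet* p = (Packet*)pov;
        // ... process p->data / p->size here ...
        delete p;
    }
    return 0;
}

// Setup: concurrency value chosen as the number of processors.
//   SYSTEM_INFO si; GetSystemInfo(&si);
//   HANDLE port = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0,
//                                        si.dwNumberOfProcessors);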

Friday, February 21, 2014

Yesterday I saw a customer report of a failure in our application, and at first it seemed like a disk space issue. However, file system problems are generally something that applications cannot work around.
Here the file system was an NFS mount even though it had the label of a GPFS mount. Further, disk space was not an issue, yet the application reported that it could not proceed because the open/read/write calls were failing. mount showed the file system mount point and the remote server it mapped to. Since the mount was for a remote file system, we needed to check both the network connectivity and the file system reads and writes.
A simple test that was suggested was to try writing a file outside the application to the remote file system with the dd utility, something like:
dd if=/dev/zero of=/remotefs/testfile bs=<blocksize> count=<count>
and, if that succeeds, read it back again as follows:
dd if=/remotefs/testfile of=/dev/null bs=<blocksize>
With a round trip like that, file system problems can be detected.
The same diagnostics can be made part of the application diagnostics.
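A rough sketch of how such a probe could be embedded in the application; the probe file name and sizes here are placeholders:

#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical probe: write a small test file to the mount and read it back.
// Returns true only if both the write and the read-back round trip succeed.
bool ProbeFileSystem(const std::string& mountPoint) {
    const std::string path = mountPoint + "/fs_probe_testfile";
    char out[4096];
    std::memset(out, 0, sizeof(out));           // same idea as dd if=/dev/zero

    FILE* f = std::fopen(path.c_str(), "wb");
    if (!f) return false;
    bool ok = std::fwrite(out, 1, sizeof(out), f) == sizeof(out);
    ok = (std::fclose(f) == 0) && ok;

    char in[4096];
    f = std::fopen(path.c_str(), "rb");
    if (!f) return false;
    ok = ok && std::fread(in, 1, sizeof(in), f) == sizeof(in);
    std::fclose(f);
    std::remove(path.c_str());                   // clean up the probe file
    return ok;
}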


Thursday, February 20, 2014

I'm not finding much time tonight, but I wanted to take a moment to discuss an application for data input to Splunk. We talked about user applications for Splunk, and sure, they can be written in any language, but when we are talking about high performance reading, such as for an MSMQ cluster, we want it to be efficient in memory and CPU. What better way to do it than to push it all the way down to the bottom of the Splunk stack? This is as close as it can get to the Splunk engine. Besides, MSMQ clusters are high volume queues and there can be a large number of such queues. While we could subscribe to notifications at different layers, there is probably nothing better than having something out of the box from the Splunk application.
I've a working prototype but I just need to tighten it. What is missing is the ability to keep the user configuration small. The configuration currently takes one queue at a time, but there is a possibility to scale that. One of the things I want to do, for example, is to enable a regular expression for specifying the queues. This way users can specify multiple queues, or all queues on a host or cluster, with .* like patterns. The ability to enumerate queues on clusters comes via name resolution, adding the resolved name as a prefix to the queue names. With an iterator-like approach, all the queues can be enumerated.
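For the pattern matching part, something along the lines of this sketch would do; the function name and the example pattern are just illustrative:

#include <regex>
#include <string>
#include <vector>

// Given the queue path names enumerated from a host or cluster, keep only
// those matching the user-supplied pattern (e.g. ".*" or "myhost\private$\order.*").
std::vector<std::wstring> FilterQueues(const std::vector<std::wstring>& queues,
                                       const std::wstring& pattern) {
    std::wregex re(pattern, std::regex::icase);
    std::vector<std::wstring> matched;
    for (const auto& q : queues) {
        if (std::regex_match(q, re)) matched.push_back(q);
    }
    return matched;
}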
One of the things that I want to do is to enable transactional as well as non-transactional message reading. This will cover all the queues on a variety of deployments. Other than the system reserved queues, most other queues, including the special queues, can be processed by the mechanism above. By making message queue monitoring a first class citizen of the input specifications for Splunk, we have the ability to transform and process this data as part of the different T-shirt size deployments and Splunk roles. This will come in very useful for scaling to different sizes, from small and medium to enterprise level systems.
I also want to talk about system processing versus app processing of the same queues. There are several comparisons to be drawn here and, consequently, different merits and demerits. For example, we talked about different deployments. Other comparisons include such things as performance, proximity to the pipelines and processors, shared transformations and obfuscations, indexing of data with no translation to other channels, etc.
Lastly, I wanted to add that as opposed to other channels, where there is at least one level of redirection, this directly taps into a source that forms a significant part of enterprise level systems.
Furthermore, journaling and other forms of input lack the same real time processing of machine data and are generally not turned on in production systems, whereas Splunk forwarders are commonly available to read machine data.