Tuesday, February 18, 2014

In terms of storage, we discussed that local storage is preferable for each worker. The resources are scoped to the lifetime of a worker, and no coordination is required between producers and consumers for access to them. Storage can hold a collection of data structures, and by partitioning those data structures we improve fault tolerance.
In our case we have n queues from a cluster, each with an arbitrary number of messages. To best process these, we can enumerate the queues and partition them across worker threads from a pool. The pool size is configurable, and the number of queues assigned to any worker can be determined by dividing the total number of queues by the number of threads.
The queues are identified by their names, so we work off a global list of queue names that the workers are allotted. This list is further qualified to select only those queues that are candidates for monitoring.
Whether a queue is a candidate for monitoring is determined by a regular-expression match between the pattern the user provides and the name of the queue. The regular expression is evaluated against each name in turn to build the filtered list of candidate queues.
The queues are enumerated with the Windows API, using the corresponding begin and get-next methods. Each queue retrieved has a name that can be matched against the regex provided.
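To make this concrete, here is a minimal C++ sketch assuming MSMQ as the queuing system, where MQLocateBegin/MQLocateNext are the begin and get-next lookup APIs; FindCandidateQueues and PartitionQueues are names introduced for illustration, and error handling is elided:

#include <windows.h>
#include <mq.h>
#include <regex>
#include <string>
#include <vector>

std::vector<std::wstring> FindCandidateQueues(const std::wregex& pattern)
{
    std::vector<std::wstring> candidates;
    PROPID propIds[] = { PROPID_Q_PATHNAME };
    MQCOLUMNSET columns = { 1, propIds };   // ask only for the queue path name
    HANDLE hEnum = NULL;
    if (FAILED(MQLocateBegin(NULL, NULL, &columns, NULL, &hEnum)))
        return candidates;
    MQPROPVARIANT result[1];
    DWORD count = 1;                        // fetch one queue per call
    while (SUCCEEDED(MQLocateNext(hEnum, &count, result)) && count > 0)
    {
        std::wstring name = result[0].pwszVal;
        if (std::regex_match(name, pattern))    // keep only monitoring candidates
            candidates.push_back(name);
        MQFreeMemory(result[0].pwszVal);
        count = 1;
    }
    MQLocateEnd(hEnum);
    return candidates;
}

// Queue i goes to worker i % numWorkers, which approximates
// total-queues-divided-by-threads per worker.
std::vector<std::vector<std::wstring>> PartitionQueues(
    const std::vector<std::wstring>& queues, size_t numWorkers)
{
    std::vector<std::vector<std::wstring>> parts(numWorkers);
    for (size_t i = 0; i < queues.size(); ++i)
        parts[i % numWorkers].push_back(queues[i]);
    return parts;
}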
The queues may have different numbers of messages, but each monitor thread works on only the current message of any one queue. When that message is read, or the wait times out, the thread moves on to the current message of the next queue. All candidate queues are treated equally, with the caveat that empty queues are a fixed cost, one we could try to reduce with, say, smaller timeouts.
With this round-robin method of retrieving the current message from each of the queues, there is fair treatment of all queues and a guarantee of progress. What we give up is the ability to accelerate on queues where the same message, or no message, remains current; if we could do that, we would drain the queues with more messages faster. But if we didn't do round robin, we wouldn't be fair to all queues. Therefore we do not prioritize queues based on the number of distinct messages they carry. The method we have will still get through the queues with more messages, and it scales without additional logic or complexity.
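A minimal sketch of the per-worker loop under the same MSMQ assumption: a cursor per queue lets us peek the current message, and a small timeout bounds the cost of an empty queue. MonitorLoop and kPeekTimeoutMs are illustrative names:

#include <windows.h>
#include <mq.h>
#include <vector>

const DWORD kPeekTimeoutMs = 50;    // small timeout keeps an empty queue cheap

// hQueues/hCursors are assumed to have been opened earlier with
// MQOpenQueue and MQCreateCursor, one cursor per queue.
void MonitorLoop(const std::vector<QUEUEHANDLE>& hQueues,
                 const std::vector<HANDLE>& hCursors)
{
    for (;;)
    {
        for (size_t i = 0; i < hQueues.size(); ++i)
        {
            // Peek the current message without removing it; NULL message
            // properties because we only test for presence here.
            HRESULT hr = MQReceiveMessage(hQueues[i], kPeekTimeoutMs,
                                          MQ_ACTION_PEEK_CURRENT, NULL,
                                          NULL, NULL, hCursors[i],
                                          MQ_NO_TRANSACTION);
            if (SUCCEEDED(hr))
            {
                // ... process the current message, then advance the cursor ...
                MQReceiveMessage(hQueues[i], 0, MQ_ACTION_PEEK_NEXT, NULL,
                                 NULL, NULL, hCursors[i], MQ_NO_TRANSACTION);
            }
            // On MQ_ERROR_IO_TIMEOUT we simply move on to the next queue.
        }
    }
}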
Each set of queues is partitioned to a single worker, so there is no contention to resolve and the load per worker stays close to optimal.
The number of threads could be taken as one more than the number of available processors.
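For example, in C++11 (a small sketch; hardware_concurrency can return 0 when the count is unknown, so we guard for that):

#include <thread>

unsigned NumWorkers()
{
    unsigned n = std::thread::hardware_concurrency();  // may be 0 if unknown
    return (n > 0 ? n : 1) + 1;    // one more than the available processors
}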

Monday, February 17, 2014

We review command line tools used for support of Splunk here.
The cmd tool can invoke other tools with the required environment variables preset. These variables can be displayed with the splunk envvars command.
The btool command can be used to view or validate the Splunk configuration files. It takes into account configuration file layering and user/app context, i.e., the configuration data visible to the given user and from the given app, or from an absolute path, or with extra debug information.
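For example (the inputs conf and the search app here are just placeholders):

splunk btool inputs list --debug
splunk btool inputs list --app=search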
btprobe queries the fishbucket for the file records stored by the tailing processor, given a directory or a file for which to compute the CRC. Using the given key or file, this tool queries the specified BTree.
The classify command is used for classifying files with source types.
fsck diagnoses the health of the buckets and can rebuild search data as necessary.
Hot, warm, thawed, or cold buckets can be specified separately or together with all.
The locktest command tests the locks.
The locktool command can be used to set and unset locks.
The parsetest command can be used to parse log files.
pcregextest command is a simple utility tool for testing modular regular expressions.
searchtest command is another tool to test search functionality of Splunk.
signtool is used for signing and verifying Splunk index buckets.
tsidxprobe will take a look at your time-series index (tsidx) files and verify the formatting or identify a problem file. It can look at each of the index files.
tsidx_scan.py is a utility script that searches for tsidx files at a specified starting location, runs tsidxprobe on each one, and outputs the results to a file.
Perhaps one more tool that could be added to this belt is one that helps with monitoring and resource utilization, to see whether the number of servers or their settings could be better adjusted.

Saturday, February 15, 2014

What's a good way to scale the concurrent processing of a queue on both the producer and the consumer side? This seems a textbook question, but think about cross-platform support and high performance. Maybe if we narrow the question down to the Windows platform, that would help. Anyway, it's about the number of concurrent workers we can have for a queue. The general rule of thumb is that you can have one more thread than the number of processors to keep everyone busy. And if the workers are lightweight, without the overhead of TLS storage, we can scale to as many as we want; the virtual workers can share the same pool of physical threads. I'm not talking about fibers, which don't have TLS storage either. Fibers are definitely welcome over OS threads. But I'm looking for a way to parallelize as much as we can in terms of the number of concurrent calls on the same system.
In addition, we want inter-worker communication to be failsafe and reliable. The OS provides mechanisms for waiting on thread completion via the handles returned from CreateThread, and on Windows there is also the I/O completion port, which can be shared by multiple threads. The threads can then exit when the port is closed.
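A minimal sketch of that completion-port pattern; Worker and kShutdownKey are names introduced here, and the sentinel-key shutdown is one possible convention, not the only one:

#include <windows.h>
#include <process.h>

const ULONG_PTR kShutdownKey = 0;   // hypothetical sentinel completion key

unsigned __stdcall Worker(void* pVoid)
{
    HANDLE hPort = static_cast<HANDLE>(pVoid);
    DWORD bytes = 0;
    ULONG_PTR key = 0;
    LPOVERLAPPED pOverlapped = NULL;
    // Block until a task (or the shutdown sentinel) is posted to the port.
    while (GetQueuedCompletionStatus(hPort, &bytes, &key, &pOverlapped, INFINITE))
    {
        if (key == kShutdownKey)
            break;              // one sentinel is posted per worker at shutdown
        // ... process the task identified by key / pOverlapped ...
    }
    return 0;
}

// One port shared by the whole pool:
//   HANDLE hPort = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
//   _beginthreadex(NULL, 0, Worker, hPort, 0, NULL);                // per worker
//   PostQueuedCompletionStatus(hPort, 0, taskKey, pTaskOverlapped); // hand out work
//   PostQueuedCompletionStatus(hPort, 0, kShutdownKey, NULL);       // per worker, to stop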
Maybe the best way to do this is to keep a pool of threads and partition the tasks among them instead of timeslicing.
Time-slicing or sharing does not really improve concurrent progress when the tasks can be partitioned.
Again, it helps if the tasks are partitioned. The .NET Task Parallel Library enables both declarative parallel queries and imperative parallel algorithms. By declarative we mean we can use notations such as 'AsParallel' to indicate that we want routines to be executed in parallel. By imperative we mean we can work with partitions, permutations, and combinations of linear data ourselves.
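A C++ analog of the imperative, partitioned style (a hedged sketch; the TPL itself is .NET, so this expresses the same idea with std::async, and ParallelSum is an illustrative name):

#include <algorithm>
#include <future>
#include <numeric>
#include <vector>

// Partitioned parallel sum: each worker owns a disjoint slice, so there is
// no communication between workers until the final combine step.
long long ParallelSum(const std::vector<int>& data, size_t numWorkers)
{
    std::vector<std::future<long long>> parts;
    size_t chunk = data.size() / std::max<size_t>(numWorkers, 1) + 1;
    for (size_t begin = 0; begin < data.size(); begin += chunk)
    {
        size_t end = std::min(begin + chunk, data.size());
        parts.push_back(std::async(std::launch::async, [&data, begin, end] {
            return std::accumulate(data.begin() + begin, data.begin() + end, 0LL);
        }));
    }
    long long total = 0;
    for (auto& f : parts)
        total += f.get();
    return total;
}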
In general a worker helps with parallelization when it has little or no communication and works on isolated data.
I want to mention diagnostics and boosting. Diagnostics on a worker's activity, whether to detect hangs or to identify one worker among a set, are enabled with such things as logging, tracing, and per-worker identifiers. Call-level logging and tracing make a worker's activity visible. Within a set of workers, IDs tell one worker apart from the rest, and logging can include this ID to pinpoint the worker with a problem activity.
There can also be a dedicated activity or worker to monitor others.
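A small sketch of the ID-in-every-log-line idea; Log and its format are illustrative:

#include <iostream>
#include <sstream>
#include <string>
#include <thread>

// Prefix every log line with the worker's ID (and the OS thread id) so a
// hung or misbehaving worker can be told apart from the rest.
void Log(int workerId, const std::string& message)
{
    std::ostringstream line;
    line << "[worker " << workerId << " tid " << std::this_thread::get_id()
         << "] " << message << '\n';
    std::cerr << line.str();   // a single write keeps the line intact
}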
Boosting a worker's performance comes down to CPU speed and memory; both are variables that depend on the hardware. Memory and caches are very helpful in improving the performance of a repeated activity by a worker.

Friday, February 14, 2014

Thread, Windows style

I found this Win32-style implementation on the web:

#ifndef _WIN_T_H_
#define _WIN_T_H_ 1
#include <iostream>
#include <cassert>
#include <memory>
#include <windows.h>
#include <process.h>

class Runnable {
public:
    virtual ~Runnable() = 0;
    virtual void* run() = 0;
};

// a pure virtual destructor still needs a definition for derived objects to destruct
inline Runnable::~Runnable() {}

class Thread {
public:
    Thread(std::auto_ptr<Runnable> run);
    Thread();
    virtual ~Thread();
    void start();
    void* join();
private:
    HANDLE hThread;
    unsigned wThreadID;
    // runnable object will be deleted automatically when run() completes
    std::auto_ptr<Runnable> runnable;
    Thread(const Thread&);
    const Thread& operator=(const Thread&);
    // called when run() completes
    void setCompleted();
    // stores return value from run()
    void* result;
    virtual void* run() { return 0; }
    static unsigned WINAPI startThreadRunnable(LPVOID pVoid);
    static unsigned WINAPI startThread(LPVOID pVoid);
    void printError(LPSTR lpszFunction, LPSTR fileName, int lineNumber);
};

class simpleRunnable : public Runnable {
public:
    simpleRunnable(int ID) : myID(ID) {}
    virtual void* run() {
        std::cout << "Thread " << myID << " is running" << std::endl;
        return reinterpret_cast<void*>(myID);
    }
private:
    int myID;
};

class simpleThread : public Thread {
public:
    simpleThread(int ID) : myID(ID) {}
    virtual void* run() {
        std::cout << "Thread " << myID << " is running" << std::endl;
        return reinterpret_cast<void*>(myID);
    }
private:
    int myID;
};
#endif

Thursday, February 13, 2014

We look at some more inputs to Splunk today. The SDK offers the following: the TcpSplunkInput, UdpInput, WindowsActiveDirectoryInput, WindowsEventLogInput, WindowsPerfmonInput, WindowsRegistryInput, and WindowsWmiInput classes. In addition, there are the ScriptInput and MonitorInput classes.
The Input class is the Splunk Input class from which all specific inputs are derived. The InputCollection is a collection of inputs; it is heterogeneous, with each member declaring its kind of input. The kind of input is identified by the InputKind class. The different InputKinds are monitor, script, tcp/raw, tcp/cooked, udp, ad, win-event-log-collections, registry, WinRegMon, and win-wmi-collections.
The monitor input monitors files, directories, scripts, or network ports for new data. It has a blacklist, a whitelist, and a crcSalt. The crcSalt is a string that Splunk adds to the input of its cyclic redundancy check, so that files whose beginnings would otherwise produce matching CRCs can still be told apart.
As with all inputs, the corresponding Args class are used to specify the arguments to the inputs.
A ScriptInput represents a scripted data input. There is no limit to the format or content of this kind of data. A TcpInput is for raw TCP data, as captured directly over the wire without any additional application-layer processing; the latter is handled by the TcpSplunkInput class.
The UdpInput represents the UDP data input. Note that there is no separate class for cooked UDP input. Can you guess why? It has to do with sessions and application logic.
A WindowsActiveDirectoryInput reads directly from Active Directory. Since organizations secure their resources via Active Directory, this is the best input for learning the hierarchy of the organization.
The Windows event log input reads directly from the Windows event log. All Windows and user applications can generate event logs, and these are helpful in troubleshooting.
The Windows perfmon input reads performance monitoring data, which helps operations see the load on the server in terms of utilization of critical resources such as memory and CPU.
The registry input is used to gather information on Windows registry keys and hives, where applications and Windows persist settings, state, and configuration data between server reboots.
The WMI input is different from the other inputs in that WMI is used by operations for server management and is a different kind of data provider.
What we could add is perhaps an MSMQ input, since this has access to all the messages in Windows message queuing. These messages could be from active private or public queues, dead-letter queues, poison queues, and even journal queues, i.e., everything except the non-readable internal reserved queues. When journaling is turned on, we get access to all the messages historically, versus getting notifications only as and when the messages arrive.
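As a hedged sketch of how such an input could reach a journal, MSMQ addresses a queue's journal by appending ';JOURNAL' to the queue's format name; the queue path below is hypothetical:

#include <windows.h>
#include <mq.h>

// Open the journal of a private queue to read its historical messages.
QUEUEHANDLE OpenJournal()
{
    QUEUEHANDLE hJournal = NULL;
    // The ";JOURNAL" suffix addresses the queue's journal rather than the queue.
    MQOpenQueue(L"DIRECT=OS:.\\PRIVATE$\\orders;JOURNAL",   // hypothetical queue path
                MQ_RECEIVE_ACCESS, MQ_DENY_NONE, &hJournal);
    return hJournal;
}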

Wednesday, February 12, 2014

Splunk has an SDK for programmability in different languages including C#. The object model is granular to enable the same kind of functionality as with the web interfaces. We briefly enumerate a few here:
There's an application class that represents the locally installed Splunk app.
There's an application archive that represents the archive of a Splunk app.
Both application and application archive derive from the Entity class. The Entity class is the base class for all Splunk entities. The EntityMetadata class provides access to the metadata properties of a corresponding entity and can be instantiated with the static GetMetadata() method.
The application args class extends Args with application creation properties. The Args class is a helper class so that the Splunk REST APIs can be called with key-value pair arguments.
ApplicationSetup class represents the setup information for a Splunk app.

The BaseService functionality is common to both Splunk Enterprise and Splunk Storm. The ConfCollection represents the collection of configuration options.

Alerts are represented by FiredAlert class and their groupings - FiredAlertGroup and FiredAlertGroupCollection.

The HttpService class is for the web access and uses both http and https protocols.
The Index class represents the Splunk DB/Index object. Index also comes with corresponding IndexArgs.
The Job class represents a search Job and comes with its own JobArgs, JobEventArgs, JobExportArgs, JobResultsArgs, and JobResultsPreviewArgs. The Message and MessageArgs and MessageCollection are used to represent Splunk messages.
The ResultsReaderJson and ResultsReaderXML are specific derivations of the abstract ResultsReader class used for reading search results.

The MonitorInput class represents a monitor input, which is a file, directory, script, or network port, and soon to include Windows message queuing; these are monitored for new data.
The WindowsActiveDirectoryInput, WindowsEventLogInput, WindowsPerfmonInput, WindowsRegistryInput, and WindowsWmiInput are the corresponding data input classes.

The Receiver class exposes methods to send events to Splunk via the simple or streaming receiver endpoint.