Monday, March 31, 2014

I want to talk about Splunk saved searches and their visibility, a.k.a. permissions. Splunk has the following permission levels:
A saved search can be private (visible only to its owner), visible to the current app, or visible to all applications (global).
The roles that Splunk defines are 'Everyone', 'admin', 'can_delete', 'power', 'splunk-system-role' and 'user'.
Each of these roles can be granted read or write access.
Saved searches are searches that can be scheduled to run. Permissions can also be specified in the configuration (.meta) file with the following format:
[object_type/object_name]
access = read : [ <comma-separated list of roles> ], write : [ <comma-separated list of roles> ]
We can also set permissions for all objects of a given type:
[eventtypes]
access = read : [ * ], write : [ admin, power ]
Objects can also be made globally available by adding the following line to the object's stanza in default.meta:
export = system
As an example, to make all event types in the business analytics app viewable in every app in the Splunk installation, we use the following:
[eventtypes]
access = read : [ * ], write : [admin, power ]
export = system
Every app and its objects are governed by a set of permissions, and these can be specified for every role within Splunk.
Every user has their own user directory. An object's visibility can be promoted from user level to app level and then to global across all apps. This is called sharing, and Splunk implements it by moving the object from the user directory to the app directory.
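As a concrete sketch, a single saved search can be shared globally by giving it its own stanza in the app's metadata file; the stanza name below is illustrative and follows the object_type/object_name format shown above:
[savedsearches/my_saved_search]
access = read : [ * ], write : [ admin ]
export = system
Dropping the export line keeps the search visible only within the app it belongs to.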




In today's post, we continue our discussion on Splunk. In addition to the earlier discussion on search, I wanted to bring up that search artifacts are maintained in the dispatch directory ($SPLUNK_HOME/var/run/splunk/dispatch). We can view all the processing information in these folders, including events, commands, arguments, previews and results.
We now look at alerts. Splunk Enterprise can be configured to send alert messages to anyone when real-time or historical search results meet a condition. The conditions can cover a variety of threshold- and trend-based scenarios.
There are three Splunk alert categories: per-result alerts, scheduled alerts and rolling-window alerts. Alerts are based on reports that run on a regular interval over a historical time range, or in real time if the report is a real-time search. When an alert triggers, its actions are executed. There are several alert actions, such as e-mail notification, which we will cover later.
The per-result alerts are real-time searches that trigger every time the base search returns a result.
These are usually authored to be invoked when a matching result comes in and are generally used in workflow-oriented applications. They can be throttled to ensure they don't fire too often.
Alerts based on historical searches usually run on a regular schedule. This alert type triggers whenever a scheduled run of a historical search returns results that meet a particular condition. These are generally lower-priority alerts, used more for monitoring over time. For example, trigger an alert whenever the number of 404 errors in any one-hour interval exceeds 100.
Real-time searches can also have alerts that monitor events within a rolling time window. These trigger when their conditions are met by events as they pass through the window in real time. For example, trigger an alert when there are three consecutive failed logins.
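To make the 404 example concrete, here is a rough sketch of how such a scheduled alert could look in savedsearches.conf; the search string, field names and e-mail address are placeholders, and the attribute values should be checked against the savedsearches.conf spec for your version:
[Excessive 404s]
search = sourcetype=access_combined status=404
dispatch.earliest_time = -1h
dispatch.latest_time = now
enableSched = 1
cron_schedule = 0 * * * *
counttype = number of events
relation = greater than
quantity = 100
action.email = 1
action.email.to = admin@example.com
Throttling, as mentioned for per-result alerts, is expressed with the alert.suppress settings in the same file.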


Sunday, March 30, 2014

We talk about distributed search in Splunk today. As we know from the discussion on distributed deployment, Splunk Enterprise performs three major functions as it moves data through the data pipeline. First, it consumes data from files, the network or elsewhere. Then it indexes the data. Finally, it runs interactive or scheduled searches on the indexed data. Each of these functions can be split out into dedicated Splunk instances that can number from a few to thousands.
The dedicated instances that consume data are called forwarders, and they can be lightweight or universal. The indexers do the heavy indexing, and they are usually the machines with the higher-performance configurations. Typically there is one indexer for every 150-200 GB of daily indexing volume, and a 12- to 24-core server for each indexer instance. Indexers can also be clustered and configured to replicate each other's data. This process is known as index replication; it helps prevent data loss and improves data availability. Clusters also have a built-in distributed search capability, which we will get to shortly. The other type of dedicated instance is the search head. Search heads coordinate searches across the set of indexers, consolidate their results and present them to the user. For the largest environments, several search heads sharing a single configuration set can be pooled together. With search head pooling, we can coordinate simultaneous searches across a large number of indexers. With these logical definitions, we can mix and match the components on the same or different physical machines and configurations to suit our needs. If several instances share one machine, it is usually a high-end machine with many cores and plenty of RAM. Deployment decisions are based on the amount of incoming data, the amount of indexed data, the number of concurrent users, the number of saved searches, the types of searches employed and the choice of Splunk apps to run.
Note that distributed search is not a deployment artifact; it is built-in Splunk functionality.
In distributed search we look at use cases such as the following:
As a Splunk administrator, I want to distribute indexing and searching loads across multiple Splunk instances so that it is possible to search and index large quantities of data in reasonable time.
As a Splunk administrator, I want to use distributed search to control access to indexed data, such that admins and security personnel have full access while other users see only their own data.
As a Splunk administrator, I want to have different Splunk instances in different geographical offices, such that local offices have access to their own data while the centralized headquarters has access at the corporate level.
In distributed search, the Splunk Enterprise instance that does the searching is referred to as the search head, and those that participate in the search are called search peers. Typically, the search head runs its search across all of its search peers unless otherwise limited, for example with the splunk_server field in the query (splunk_server=peer01 restricts the search to that peer).
When a distributed search is initiated, the search head replicates and distributes its knowledge objects to its search peers. Knowledge objects include saved searches, event types and other entities used in searching across indexes. They are packaged in a knowledge bundle and distributed to the peers.
Knowledge bundles typically reside under the $SPLUNK_HOME/var/run folder and have a .bundle extension. Bundles can contain almost all of the contents of the search head's apps. Instead of copying the bundles to the different peers, it is sometimes advisable to mount the knowledge bundles. User authorization information flows to the peers together with the bundle: the user running the search, that user's role and the location of the distributed authorize.conf file are all included.
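As a minimal sketch, the search head learns about its peers through distsearch.conf; the addresses below are placeholders, and the trust and credential exchange with each peer still has to be established separately (for example through Splunk Web or the CLI):
[distributedSearch]
servers = 10.1.0.2:8089,10.1.0.3:8089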



In today's post, we bring up a C# application with the SDK we talked about, but not the service broker example I gave earlier. We focus on the SDK itself, specifically on the inputs used to write an event. Did you know that you can color-code the event display based on how you specify the events? I didn't until recently. I will go over some of these features and explore them with the SDK.
But first I want to mention that most of Splunk's features are also available directly from the management service REST APIs. For example, we can issue:
curl -k -u admin:changeme https://localhost:8089/services/apps/local/myapp -d description="My application"
Splunk SDKs provide a layer over the REST APIs so we can directly instantiate the objects and prepare them. Knowing the REST APIs is still helpful when a feature is not exposed in a particular SDK, or when the SDK is marshalling parameters to the REST layer and you want to find out exactly what is set and what isn't. There is an SDK for most of the popular web-development languages, which also means that not all SDKs have the same capabilities. The REST APIs provide a uniform, consistent layer over all the features and are documented much like the SDKs; there is a reference PDF that describes most of the endpoints.
The endpoints are categorized as data, admin and so on. We will see a little more on these shortly, but I want to point out that the REST APIs are easy to exercise with tools like curl and Fiddler.
Splunk's REST APIs cover access control, applications, clusters, configurations, deployment, indexes, inputs, knowledge, licensing and outputs. Indexes, inputs and outputs are what we will cover first.
We will also look at the search REST API. Searches can be ad hoc or saved, and saved searches can be scheduled to run. There is auto-complete functionality as well.
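As a sketch of how an ad hoc search runs over REST (the credentials are the defaults used earlier, the search string is arbitrary, and <sid> is a placeholder for the search id returned by the first call):
# create a search job; the response contains the search id (sid)
curl -k -u admin:changeme https://localhost:8089/services/search/jobs -d search="search index=_internal | head 5"
# fetch the results once the job completes
curl -k -u admin:changeme https://localhost:8089/services/search/jobs/<sid>/results --get -d output_mode=json
Events can similarly be submitted over REST, for instance through the simple receiver endpoint (services/receivers/simple) with source and sourcetype passed as query parameters.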
I will return to this post shortly. I want to cover distributed search.

Friday, March 28, 2014

Today I will continue my discussion on thread-level software instrumentation.
On Windows, thread-local storage is available. Here we create a buffer in which we keep all the stack-based data we collect. To collect CPU ticks, we read the system clock in the constructor and again in the destructor of an object we place on the stack. The difference is the time spent executing that code, and it can be accumulated into the thread's running total to see how much time each section takes. Note that it is better to keep this counter on the scheduler, since threads could attach to and detach from the currently executing run.
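A minimal sketch of that stack-based object in C++, using std::chrono's steady clock as a stand-in for raw CPU ticks and a thread_local total instead of a per-scheduler counter; the names are illustrative:
#include <chrono>
#include <cstdio>

// Running total of time spent in instrumented scopes on this thread.
thread_local long long g_scopedNanos = 0;

class ScopedTimer {
public:
    ScopedTimer() : start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        // The difference between destruction and construction is the time
        // spent executing the enclosing scope; accumulate it for this thread.
        auto end = std::chrono::steady_clock::now();
        g_scopedNanos += std::chrono::duration_cast<std::chrono::nanoseconds>(end - start_).count();
    }
private:
    std::chrono::steady_clock::time_point start_;
};

void instrumentedWork() {
    ScopedTimer t;   // placed on the stack; the destructor records the elapsed time
    for (volatile int i = 0; i < 1000000; ++i) { }
}

int main() {
    instrumentedWork();
    std::printf("nanoseconds in instrumented scopes on this thread: %lld\n", g_scopedNanos);
    return 0;
}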
Similarly, it is easier to have a base method for allocations and deletions that takes a stack trace when the corresponding system calls are made. This gives us a histogram of stack traces for allocations, which we can use to find the biggest or most frequent users of memory.
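A sketch of the allocation side of this idea, Windows-only because it uses CaptureStackBackTrace for the stack capture; it is deliberately not thread-safe, and a real version would keep the histogram in the per-thread buffer described above:
#include <windows.h>
#include <cstdio>
#include <map>

// Histogram keyed by a hash of the allocating call stack.
struct AllocStats { size_t count = 0; size_t bytes = 0; };
static std::map<ULONG, AllocStats> g_allocHistogram;

void* trackedAlloc(size_t bytes) {
    void* frames[10];
    ULONG hash = 0;
    // Capture up to 10 frames of the caller's stack; the hash identifies the call path.
    CaptureStackBackTrace(1, 10, frames, &hash);
    AllocStats& s = g_allocHistogram[hash];
    s.count += 1;
    s.bytes += bytes;
    return ::operator new(bytes);
}

void trackedFree(void* p) { ::operator delete(p); }

int main() {
    for (int i = 0; i < 5; ++i) trackedFree(trackedAlloc(128));
    for (const auto& kv : g_allocHistogram)
        std::printf("stack %08lx: %lu allocations, %lu bytes\n",
                    (unsigned long)kv.first, (unsigned long)kv.second.count,
                    (unsigned long)kv.second.bytes);
    return 0;
}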
Having thread-local storage is desirable on any platform, especially given language-level support such as the thread_local keyword introduced in C++11. Creating a pointer-sized thread-local variable lets us reserve an arbitrary buffer for that thread, which enables thread-specific instrumentation that is not normally captured by routines that merely place a variable on the stack. The latter is undeniably required too; crash logs, for example, are written that way. But here we look at what else we can add to crash notifications via thread-local storage.
The first addition is historical stack traces, which are not captured by the as-of-this-point-in-time stack trace taken from stack-local variables.
When do we take this stack trace? On every exception encountered, even though not all of them are crashes. The suggestion is that by keeping track of the last few exceptions at any given time, we gain another data point of interest when we get a dump.
We add an exception handler to the chain of handlers that each thread maintains. The first of these we customize to take a stack trace, put it on our thread-local ring buffer, and return 'keep looking' so that other handlers, including the most recent frame's own handler, get a chance to handle the exception.
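One way to sketch this on Windows is with a vectored exception handler, which runs ahead of the frame-based handlers and can return EXCEPTION_CONTINUE_SEARCH to 'keep looking'; the ring-buffer layout and sizes here are only illustrative:
#include <windows.h>
#include <cstdio>

const int kMaxFrames = 10;   // frames kept per captured stack trace
const int kRingSize  = 16;   // last N exception traces kept per thread

struct TraceRing {
    void*  frames[kRingSize][kMaxFrames];
    USHORT depth[kRingSize];
    int    next;
};

// Each thread gets its own ring of the most recent exception stack traces.
thread_local TraceRing g_exceptionRing = {};

LONG CALLBACK RecordExceptionTrace(EXCEPTION_POINTERS* /*info*/) {
    int slot = g_exceptionRing.next;
    g_exceptionRing.depth[slot] =
        CaptureStackBackTrace(0, kMaxFrames, g_exceptionRing.frames[slot], nullptr);
    g_exceptionRing.next = (slot + 1) % kRingSize;
    // 'Keep looking': give every other handler a chance to handle the exception.
    return EXCEPTION_CONTINUE_SEARCH;
}

int main() {
    // Register ahead of the frame-based handlers so every exception is recorded.
    AddVectoredExceptionHandler(1 /* call first */, RecordExceptionTrace);

    __try {
        RaiseException(0xE0000001, 0, 0, nullptr);   // synthetic exception
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        std::printf("exception seen; %u frames recorded in the ring\n",
                    (unsigned)g_exceptionRing.depth[0]);
    }
    return 0;
}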
Why is this better than logging? It isn't better, it's complementary. Dumps and logs are both relevant for diagnosis, and often every bit of information helps.
One consideration for the ring buffer is that it could be implemented as a skip list for fast traversal.
Creating a ring buffer on each thread is one part; it can be used for different purposes.
Using it to store the exceptions encountered is one such purpose. The operating system usually meets that need already, so using the buffer for application-specific data is preferable; one example is recording the last few user queries or commands.
For now we will just focus on creating a ring buffer for every thread on all platforms.
We do this wherever we start a thread or have a callback for thread-initialization routines; there should be few such places throughout the source code.

After discussing MSMQ and how to monitor messages on the queues (without journaling and without requiring any additional switches or trigger services on a production instance), I stumbled on the MSMQ tracing provider under 'logman query providers' and confirmed that there does not appear to be a trace event for each message arrival. That is another argument in favor of what I would like to build for my application.
In any case, here is some more context on how I arrived at wanting to write an MSMQ monitor for my application.
First, I have been increasingly studying high memory usage in select scenarios with my application, and in this context I wanted to find non-invasive options for customers' production systems. That is how I came across the Windows tools logman, tracelog and xperf:
We can turn on private heap tracing on Windows either with tracelog/logman (out-of-box tools; 'logman query providers' lists the available providers, see http://msdn.microsoft.com/en-us/library/ms751538(v=vs.110).aspx) or with xperf from the Windows Performance Analysis Center (http://msdn.microsoft.com/en-us/performance/default.aspx):
tracelog -start heapTrace -f c:\temp\heapTrace.etl -heap -pids 1 <pid_of_splunkd>
tracelog -stop heapTrace
OR xperf -start HeapSession -heap -PidNewProcess "myapp.exe" -BufferSize 1024 -MinBuffers 128 -MaxBuffers 128 -stackwalk HeapAlloc+HeapRealloc
xperf -stop HeapSession -stop -d c:\temp\xperf_heap.etl
We could also explore stack walking with 'xperf -help stackwalk' and, in xperfview, 'Get Stack counts by type'.
Xperf allows us to view heap allocation information in addition to kernel info such as CPU usage, with stack traces at the time of allocation! More here: http://msdn.microsoft.com/en-us/library/windows/hardware/hh162920.aspx
And if tracing doesn't work, we can use ProcDump to take a full dump of our process on production systems without stopping the process: http://technet.microsoft.com/en-us/sysinternals/dd996900.aspx
If we had the luxury, we could also use gflags, Application Verifier, PageHeap and UMDH.

But anyway, getting back to my application, here's what I want to try writing next after the MSMQ monitor:
These will be thread-based monitors for resource usage, together with a circular ring buffer of stack-trace collections.
A circular ring buffer of stack-trace collections per scheduler or node is very helpful in diagnosing a variety of issues. In terms of memory, each frame pointer takes one pointer-sized word, a backtrace keeps about ten of those, and ten to fifty backtraces in the ring buffer does not add up to much: on a 64-bit system that is roughly 8 bytes x 10 frames x 50 entries, or about 4 KB per thread.
The stack-trace capture into the circular ring buffer could be triggered by anything: exceptions, memory allocation and deallocation, and so on.
In fact, CPU usage is also easy to watch: we read the CPU ticks and add to a running total the delta between going on and off the stack.
 

Thursday, March 27, 2014

Today I will cover a software component I'm looking into that consolidates the configuration and administration of the dynamic Atom feeds from the different model generators I mentioned two posts ago. When a user makes a REST request to the server, the server performs some checks to determine the response to return. In our case, we are looking at Atom feeds from different REST handlers, all of which conform to this security and configuration mechanism. We will study this component in some depth and try to come up with potential improvements.
First, this component is called the AdminManager. It is responsible for managing the server-side model generation that is made available for consumption by various interfaces such as the UI and the command-line interface. In a way, it automates access to and configuration of the models that the server generates.
Let us take a closer look at this component. There are several things to mention.
We begin with access control. This component lists some common actions for all the handlers that respond to the different REST endpoints: create, edit, remove, list, members, reload, new, move, enable, disable and doc. More than one action can be checked together with a bit mask. Access control is checked with capabilities, of which two are specified: read and write. Actions translate to capabilities that can be demanded or set by handlers. Handlers can be enabled or disabled, and results from the handlers can be filtered. Actions and capabilities/permissions determine the type of response; we don't block execution, we just return an error or results.
Handlers all derive from a base class. The base class has methods for the actions involved, and each derived handler can override the method corresponding to an action to produce the result it wants. If an override is not specified, it defaults to an error. There are flags for some of the actions to indicate whether they are possible, such as whether an item can be moved. There is also a custom action that handlers can use to extend their functionality. Actions can return results in JSON or XML format. The applications and users that can use these REST handlers can be specified with wildcards. A handler can also take a list of features, which can be considered prerequisites for the handler to execute.
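This is not the actual AdminManager code; the following is a hypothetical C++ sketch of the pattern just described: per-action methods that default to an error, a bit mask of supported actions, and a mapping from actions to the read or write capability they demand.
#include <cstdio>
#include <string>

// Common actions, combinable as a bit mask.
enum Action : unsigned {
    ActCreate = 1 << 0, ActEdit = 1 << 1, ActRemove = 1 << 2, ActList = 1 << 3,
    ActMembers = 1 << 4, ActReload = 1 << 5, ActNew = 1 << 6, ActMove = 1 << 7,
    ActEnable = 1 << 8, ActDisable = 1 << 9, ActDoc = 1 << 10
};

// The two capabilities; each action translates to one of them.
enum Capability : unsigned { CapRead = 1, CapWrite = 2 };

class AdminHandler {
public:
    virtual ~AdminHandler() {}
    // Derived handlers override only the actions they support; the rest return an error.
    virtual std::string handleList() { return error(); }
    virtual std::string handleEdit() { return error(); }

    unsigned supportedActions() const { return supported_; }
    // list, members and doc only read state; everything else demands write.
    static Capability requiredCapability(Action a) {
        return (a == ActList || a == ActMembers || a == ActDoc) ? CapRead : CapWrite;
    }
protected:
    explicit AdminHandler(unsigned supported) : supported_(supported) {}
    std::string error() const { return "<error>action not supported</error>"; }
private:
    unsigned supported_;
};

class SavedSearchHandler : public AdminHandler {
public:
    SavedSearchHandler() : AdminHandler(ActList | ActEdit) {}
    std::string handleList() override { return "<saved searches .../>"; }
    std::string handleEdit() override { return "<edited/>"; }
};

int main() {
    SavedSearchHandler h;
    unsigned userCaps = CapRead;          // the caller holds only the read capability
    Action requested = ActEdit;           // edit demands write, so this is denied
    bool allowed = (h.supportedActions() & requested) &&
                   (userCaps & AdminHandler::requiredCapability(requested));
    std::printf("%s\n", allowed ? h.handleEdit().c_str()
                                : "<error>insufficient capability</error>");
    return 0;
}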
Access control itself involves several checks. We discuss these now.
First, the supported actions are filtered. Whether a user can read or write is determined by the capability for the corresponding read or write permission; the user is checked for the permission involved. This is not necessarily full RBAC access control but merely a check against the privileges the user holds. The system user, for instance, has all the privileges. The capabilities a user has are determined from the license manager, for instance, or from explicitly added capabilities. This is also necessary for audit purposes: if auditing is not necessary, the check against capabilities could always return true; on the other hand, if we are to audit all denials, we should leave a trail in the audit log.
Actions and capabilities are mapped and determine the access for the most part.
Next we talk about ACLs and how they are set and determined. Setting ACLs is itself a custom action and requires the write capability. Owner, sharing and permissions are specified when setting the ACL. The rule of thumb here has been to have just enough checks in place, and nothing so complex or flexible that it becomes overkill. That shouldn't stop us from keeping it open for extension in the future. Today we require owner, sharing and permissions for a resource; tomorrow we could have not just read and write permissions but permissions such as full control, modify, read and execute, read and write, and special permissions. Today we have explicit privileges; tomorrow they could be chained or inherited. Today we audit users; tomorrow we could differentiate users, groups, apps, service accounts and so on. Today we have a built-in user for the system; tomorrow we could have many different built-in users and groups. Today, when we share, we share all permissions; tomorrow we could set the permission levels for those we share with.
Now, coming back to where the lines of authentication, authorization and auditing are drawn when processing REST requests.
First is the initialization step. The capabilities and evaluation rules are loaded from a configuration file. In this configuration file, the capabilities and evaluation rules for a particular endpoint are specified when the handler is first registered with this mechanism.
The capabilities could be, for example, read, write or search; the rule could be the endpoint qualifier to match.
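Purely as a hypothetical illustration (this is not the actual registration syntax, just the shape of the information described above), an entry tying an endpoint qualifier to the capabilities it demands might look like:
# hypothetical registration entry, names invented for illustration
[savedsearch_handler]
match = /data/savedsearches
capability.read = read_searches
capability.write = write_searches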
Note that this configuration file is a great starting point for re-organization. When there are enough handlers registered to see the kinds of capabilities, matches, actions and customizations being specified, a new architecture could make it easier for components to declare what they want.
Furthermore, most of the configuration variations will come in handy in provisioning the central governance that this mechanism provides. If there are enough handlers specified in the AdminManager configuration file, this is low-hanging fruit for a rewrite.
At this point we are able to filter the supported actions based on the configuration information, the passed-in user context and a lookup of that user's capabilities. This filtering of supported actions is where the authorization line is crossed.
After the initialization step, and depending on the authorization, we can set ACLs on resources as described earlier.
Next we evaluate the custom actions. Actions can have arguments. These can be used together with the actions to differentiate the kind of response required.
Next we validate the arguments, their presence and values as specified in the configuration.
At this point the authorization is complete and we are ready to execute the handler.