Wednesday, January 29, 2014

Today I'm going to talk about Splunk.
And perhaps I will first delve into one of its features. As you probably know, Splunk enables powerful analytics over machine data. It treats data as key-value pairs that can be looked up as niftily and as fast as in any Big Data store. This is the crux of Splunk: it lets you search machine data to find the relevant information when the data is otherwise difficult to navigate because of its volume. Notice that it eases the transition from organizing data to querying it better. Queries can be expressed in a select-like form in Splunk's search language.
While I will go into these in detail shortly, including the technical architecture, I first want to cover regex over the data. Regex is powerful because it allows both matching and extracting data. The patterns can be specified separately, in configuration, and they use the same metacharacters to describe a pattern as regex does anywhere else.
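To make the matching-and-extracting idea concrete, here is a minimal Python sketch that pulls key=value pairs out of a line of machine data. This is not how Splunk does it internally; the log line, the field names, and the `extract_fields` helper are all made up for illustration.

```python
import re

# Hypothetical log line; the field names are invented for this example.
EVENT = "2014-01-29 10:15:02 action=login user=alice status=failed src=10.0.0.7"

# A word-like key, an equals sign, then a run of non-space characters.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")


def extract_fields(event):
    """Return a dict of every key=value pair found in the event text."""
    return dict(KV_PATTERN.findall(event))


fields = extract_fields(EVENT)
print(fields["user"], fields["status"])
```

The same pattern both matches (does the line contain any pairs at all?) and extracts (what are the keys and values?), which is why regex fits data treated as key-value pairs so well.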
The indexer can selectively filter out events based on a regex. This is specified via two configuration files, props.conf and transforms.conf: one configures Splunk's processing properties and the other configures data transformations.
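The filtering idea can be sketched in a few lines of Python. This only mimics the behavior of dropping events that match a regex; it is not Splunk's implementation, and the `[sshd]` rule, the sample events, and the `filter_events` helper are assumptions for the sake of the example.

```python
import re

# Hypothetical rule: discard any event that mentions [sshd].
DROP_PATTERN = re.compile(r"\[sshd\]")


def filter_events(events, drop_pattern=DROP_PATTERN):
    """Keep only the events that do NOT match the drop pattern."""
    return [event for event in events if not drop_pattern.search(event)]


# Made-up sample events.
EVENTS = [
    "Jan 29 10:15:02 host1 [sshd] Failed password for root",
    "Jan 29 10:15:03 host1 [cron] job started",
    "Jan 29 10:15:04 host1 [sshd] Accepted password for alice",
]

kept = filter_events(EVENTS)
print(kept)
```

In Splunk the pattern would live in configuration rather than in code, so the rule can change without touching the data pipeline.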
props.conf is used for line-breaking multiline events, setting up character set encoding, processing binary files, recognizing timestamps, setting up rule-based source type recognition, anonymizing or obfuscating data, routing select data, creating new index-time field extractions, creating new search-time field extractions, and setting up lookup tables for fields from external sources. transforms.conf is used for configuring similar attributes, and each transformation defined there requires a corresponding setting in props.conf.
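Anonymizing is a good example of a regex-driven transformation. Here is a rough Python sketch that masks all but the last four digits of a 16-digit number in an event. The `card=` field, the sample event, and the `anonymize` helper are hypothetical, and this stands in for the idea only, not for Splunk's own configuration syntax.

```python
import re

# Exactly 16 digits on word boundaries; the last four are captured so they can be kept.
CARD_PATTERN = re.compile(r"\b\d{12}(\d{4})\b")


def anonymize(event):
    """Replace the first 12 digits of any 16-digit number with X characters."""
    return CARD_PATTERN.sub(r"XXXXXXXXXXXX\1", event)


# Made-up event for illustration.
masked = anonymize("user=alice card=1234567890123456 amount=10")
print(masked)
```

The capture group is what lets the transformation keep part of the original value while hiding the rest.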
This feature gives the user a powerful capability: transforming events, selectively filtering them, and enriching them with additional information. Imagine working not only with the original data but also with data that has been transformed into more meaningful representations. Such a feature helps not only with search and results but also with visualizing the data better.
 
