How do we implement tracer event data in Splunk?
First, we create a modular input that can take a special kind of data. We know the data structures needed to do this because we have seen how to use them before.
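As a rough sketch of the shape of such an input, and not the actual Splunk SDK interface, it could look like the following. The class name TracerInput and the reserved field name "tracer_id" are assumptions for illustration only:

    import json
    import sys

    TRACER_FIELD = "tracer_id"  # assumed name for the reserved field

    class TracerInput:
        """Sketch of a modular input that accepts only tracer data."""

        def stream_events(self, input_stream):
            for raw in input_stream:
                try:
                    event = json.loads(raw)
                except ValueError:
                    continue  # not structured data, so not a tracer
                if TRACER_FIELD not in event:
                    continue  # ordinary data; this input only takes tracers
                # Hand the tracer to Splunk by writing it to stdout.
                sys.stdout.write(json.dumps(event) + "\n")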
Second, we need a destination or sink where this data will be indexed. We should not use the null queue, since these tracers carry useful information, and gathering them over time will help with statistics.
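The routing decision itself is small. A sketch, assuming a dedicated index named "tracer" (the name and the choose_sink helper are hypothetical):

    TRACER_INDEX = "tracer"  # assumed dedicated index for tracer statistics

    def choose_sink(event):
        """Tracers are never dropped; they go to a dedicated index."""
        if "tracer_id" in event:
            return TRACER_INDEX  # keep tracers so they can feed statistics later
        return "main"            # everything else takes the normal path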
The information is added per node during transit, and different types of nodes, such as search peers, indexers, or forwarders, can stamp their node type.
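A minimal sketch of that per-node stamping, with the stamp function and field names assumed for illustration:

    import time

    def stamp(event, node_type, node_id, received_at):
        """Append this node's hop record to a tracer in transit."""
        event.setdefault("hops", []).append({
            "timestamp": time.time(),               # when this node handled it
            "node_type": node_type,                 # "forwarder", "indexer", "search_peer"
            "node_id": node_id,                     # which node stamped it
            "duration": time.time() - received_at,  # time spent on this node
        })
        return event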
The overall implementation is just a class declaration and definition that registers itself as a processor for all incoming data and kicks into action only when a specific kind of data is encountered.
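Splunk's real processor interface is internal, so the following is only Python pseudocode of that pattern; the register hook and TracerProcessor class are hypothetical:

    PROCESSORS = []

    def register(processor_cls):
        """Register a processor so that it sees all incoming data."""
        PROCESSORS.append(processor_cls())
        return processor_cls

    @register
    class TracerProcessor:
        """Sees every event, but kicks into action only for tracers."""

        def process(self, event):
            if "tracer_id" not in event:
                return event  # not a tracer; pass through untouched
            # Tracer encountered: stamp this node and pass it along.
            event.setdefault("hops", []).append({"node_id": "this-node"})
            return event

    def run_pipeline(event):
        """Push one event through every registered processor."""
        for processor in PROCESSORS:
            event = processor.process(event)
        return event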
The data is very simple and has a reserved field used to identify the tracer. The payload in the tracer consists of simple structured data: the timestamp, the node type, the node id, and the duration of time spent.
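For concreteness, the payload could be modeled as below; the field names follow the description above but are otherwise assumptions:

    from dataclasses import dataclass, field

    @dataclass
    class TracerHop:
        """One node's stamp: when, what kind of node, which node, how long."""
        timestamp: float  # epoch time when the node handled the tracer
        node_type: str    # e.g. "forwarder", "indexer", "search_peer"
        node_id: str      # unique identifier of the node
        duration: float   # seconds spent on this node

    @dataclass
    class TracerEvent:
        tracer_id: str                      # the reserved field identifying a tracer
        hops: list = field(default_factory=list)

    # Example: a tracer after passing through one forwarder.
    t = TracerEvent(tracer_id="abc-123")
    t.hops.append(TracerHop(timestamp=1700000000.0, node_type="forwarder",
                            node_id="fwd-01", duration=0.004))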
Also, in any topology, the tracer data will flow from one source to one destination. For multicast, many more copies of the tracer are made. Once they are all indexed, we can group them. Also, the final destination for the data can be one index or all indexes; in other words, we flood the topology to cover all the indexers.
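Grouping the indexed copies back together keys off the reserved tracer id. A sketch in plain Python (in a real deployment this grouping would be done with a Splunk search over the tracer index instead):

    from collections import defaultdict

    def group_tracers(indexed_events):
        """Group indexed tracer copies by their tracer id for route analysis."""
        groups = defaultdict(list)
        for event in indexed_events:
            groups[event["tracer_id"]].append(event)
        return groups

    # Two copies of the same multicast tracer, landing on different indexers.
    events = [
        {"tracer_id": "abc-123", "node_id": "idx-1"},
        {"tracer_id": "abc-123", "node_id": "idx-2"},
    ]
    assert len(group_tracers(events)["abc-123"]) == 2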
Where the tracer differs from the existing heartbeat functionality is that it covers the entire route rather than a single adjacent source-destination pair. A tracer is triggered to flow through a path consisting of one or more nodes, either manually by the user or by periodic runs.
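The periodic trigger can be very simple; the emit_tracer entry point and the interval below are assumptions for the sketch:

    import time
    import uuid

    def emit_tracer():
        """Inject a fresh tracer into the pipeline (hypothetical entry point)."""
        return {"tracer_id": str(uuid.uuid4()), "hops": []}

    def run_periodic(interval_seconds=300):
        """Trigger a tracer on a schedule; a user can also call emit_tracer() on demand."""
        while True:
            emit_tracer()
            time.sleep(interval_seconds)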
Today we look a little more at this feature and implement it further. We want to see how the parser, the processor, and the indexer will treat this data.
Let's start with the parser.