Friday, April 11, 2014

In today's post we continue our discussion of a Fiddler-like application that serves as a modular input to Splunk. One of the things we have to consider is that on production machines the type of network traffic may be very different from a desktop. So the first thing one has to do is determine the use cases. There is more inbound traffic on a production system than there is outbound. While there is a lot of information to gather on inbound traffic, such services are already being outsourced to third-party proxy providers. What this app does is put a monitoring tool in the hands of the individual users or administrators, who can run it on a desktop. Note that even the lightweight forwarder is only deployed to a handful of machines for each instance of an Enterprise-class Splunk server. What we are talking about can scale to several thousand machines with one instance on each, and at the same time be ubiquitous in that they can be on mobile devices as well.
Furthermore, we could argue that packet capture tools can be turned on by admins on all desktops and these could be logged to a common location from which Splunk can read and populate the events. In practice, we seldom enable such applications by default without user opt-in, on grounds of privacy and security, even for employees of an organization. Besides, it leads to more maintenance overhead with very little benefit for governance or security. It is more common, on the other hand, to selectively control the intranet and internet zones, proxies, etc. across the organization and not go after individual computers, with the exception of software updates and publishing. That said, central log appending from multiple sources is also not common, because it introduces a central point of failure and possibly slower responsiveness on the individual user's computer. That is why it is better to separate this overall workflow into two separate workflows - one for pushing an app onto the individual user's computer through software updates and push mechanisms - and another to collect the events/logs gathered by this app into a common file or database index. The former is where our app or Fiddler will come in useful. The latter is what Splunk is very good at with its own collect-and-index mechanism. Our app goes an extra step over Fiddler in that it collects the packet captures and forwards them to Splunk. This way we have absolutely no problem utilizing the best of both for any number of individual user computers in an organization.

We will now look at what a proxy does. We will try to see not only the relay behavior but also the filtering it can do. Proxies support promiscuous mode listening. In our case we have a transparent proxy that does not modify the requests or responses. Proxies can also be forward or reverse. A forward proxy helps with anonymity in that it retrieves resources from the web on behalf of the users behind the proxy. A reverse proxy is one that secures the resources of the corporation from outside access. This comes in helpful for maintaining quarantine lists to stage access to the protected resources. Network address translation is a useful concept in this regard. It is also referred to as fencing when used with virtual machines. A reverse proxy can do several things such as load balancing, authentication, decryption, or caching. By treating the proxy as another server, the clients don't know which server processed the request. The reverse proxy can distribute the load to several servers, which means the configuration is not just a failover but a load-balancing one.
SSL acceleration is another option where this proxy enables hardware acceleration and a central place for SSL connectivity for clients.
The proxy can also choose to serve/cache static content and facilitate compression, or to communicate with the firewall server, since the rules on the firewall server could now be simplified.

Thursday, April 10, 2014

I will be exploring the Netmon option for Splunk in this post. Hopefully, I should be able to cover most of the technical details. This application is expected to capture http or https traffic only, hence it is much more similar to Fiddler than to Wireshark or NetMon, since those are based on the lower layers of the networking stack. The principle in a packet capture tool such as this is to substitute the internet proxy with one from the application that has an address and port of, say, 127.0.0.1:8888. If we look at the System.Net.WebProxy class, we have the option to set the Address property and specify the web proxy for instances of the web request in an application. Global proxy settings are specified in the machine- and application-level configuration files. In Internet Explorer, when we make proxy settings per-machine rather than per-user, we are forcing all users to use the same proxy settings rather than their own. This works well for packet capture because when we switch the proxy we know that we will capture all traffic. In production environments, this is typically not an issue. Since our application sits between WinInet and the corpnet proxy on these machines, it should be able to capture all the http/s traffic. We may have to call the WinInet methods by PInvoke or use the poshHTTP class as available on the Internet. WinInet exposes four methods - InternetOpen, InternetCloseHandle, InternetSetOption, and InternetQueryOption - to set and retrieve Internet settings. With the WebBrowser data structure, we can push the settings to all applications accessing the internet.
Having talked about setting and removing our proxy, we now look at the other role of the proxy, which is to play the man-in-the-middle. This is done via the forward option. Note that Splunk has a built-in option to read network traffic at a TCP port on localhost. The same can be used from the proxy to log packets. However, the proxy we implement also needs to forward the packets outbound from the machine. And this is what we look at next.
Here we have to pay attention to three things:
First is that we should relay the packets hence we need to parse the destination and forward to that address and port (typically 80 or 443).
Second, we turn off the keep-alive and send the requests and responses as if new i.e. we use the connection-close on both sides.
Third, our forwarding is in both directions, so we have the same implementation for each source-destination pair.
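The relay steps above can be sketched in code. Below is a minimal sketch in Python (the post envisions C#, but the logic is the same); prepare_for_relay is a hypothetical helper, handles only plain-text HTTP requests, and simply parses the destination from the Host header and forces Connection: close:

```python
def prepare_for_relay(raw_request: str):
    """Return ((host, port), rewritten request) for a plain-text HTTP request."""
    head, sep, body = raw_request.partition("\r\n\r\n")
    request_line, *header_lines = head.split("\r\n")
    host, port = None, 80
    headers = []
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            # destination to relay to, e.g. "example.com:8080"
            host, _, p = value.strip().partition(":")
            port = int(p) if p else 80          # https would arrive via CONNECT
            headers.append(line)
        elif name.strip().lower() in ("connection", "proxy-connection"):
            continue                            # drop keep-alive negotiation
        else:
            headers.append(line)
    headers.append("Connection: close")         # send as if new, on both sides
    rewritten = "\r\n".join([request_line] + headers) + sep + body
    return (host, port), rewritten
```

A real proxy would then open a socket to (host, port), write the rewritten request, and pump the response back to the client, doing the symmetric rewrite on the response side.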
With the above, we have barely scratched the surface but we know this is viable.
One important thing to mention here is that the requests and responses will be parseable data because they have headers. By that we mean that no matter what the source and destination are, the requests and responses can be input to Splunk with a predetermined event schema. The value addition over regular packet capture is the Splunk search that we can now do to enable powerful analytics.
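As a sketch of such a predetermined event schema (the field names below are illustrative assumptions, not a Splunk-mandated format), each captured request/response pair could be serialized as one key=value event line, which Splunk indexes and searches readily:

```python
import time

def to_splunk_event(method, url, status, src, dst, nbytes):
    """Serialize one captured request/response pair as a key=value event line.
    The field names are illustrative assumptions, not a Splunk schema."""
    fields = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
        "method": method,
        "url": url,
        "status": status,
        "src": src,
        "dst": dst,
        "bytes": nbytes,
    }
    return " ".join('{}="{}"'.format(k, v) for k, v in fields.items())
```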

Another Splunk app could be to perform a network capture on any machine. This could be machine-local capture, or app-local, or even browser-page-local. In these cases, the functionality is to sniff the packets similar to what Netmon, Fiddler, or Wireshark does. We can listen on the network interface and log each packet. There's functionality available on Windows to do that with the WinInet library. We could even filter the packets per process. The application can be written in C# with the Splunk SDK library. All the packets captured can be a modular input to Splunk. The events written to Splunk can have the same format as the request/response details in any one of the network capture tools. In those tools there is an option to interact and replay the packets, or modify and send a new one. In this case, this is a passive listen-and-log mode. The packets may be binary or text, and we may choose to capture only text. We may have to switch the proxy for the duration of this application. And the packets logged can have all the header information even if it is https.
What this functionality provides is a way to search the packets on Splunk with powerful queries that no other application has. It will also give us the ability to transform data before it is indexed. Furthermore, the application could generate a great deal of machine data over time. Where it differentiates from the existing Splunk out-of-box ability to log tcp and udp packets is its scope. The scope can be a machine, an app, or a session. It should even be able to work on a headless server or on a device. In Splunk language, it would be able to work as a single lightweight forwarder that sends events to an indexer from which search heads can pull results for the user's queries.

Wednesday, April 9, 2014

We talk about a Splunk app today that reads installation logs on a machine. The app monitors and targets installation logs. Optionally, it could try to detect when an application is being installed and turn on verbose logging. The application itself can be packaged with Wixsharp. Also, the application works on Windows Installer technologies and is to be implemented in C#.
In addition, the application will send all classifications of messages from these files to the Splunk server as events to be indexed. So it could have a modular input for this machine.
As with the Octopus tool, this app could read from different machines.
The Octopus tool is known for facilitating MSI installations on datacenter machines because it can do that consistently for any subset of the machines in the data center. It repeats the same process that it does on one machine with others as well.
Octopus also has a web interface, and this is convenient for choosing the machines on which it deploys. For example, we can choose the package we want to deploy and the target machines, and Octopus can deploy on all these machines. Should there be a need to change the configuration on the machines, the settings can be parameterized and passed to the application. This is very convenient.
The application could follow the same approach as Octopus in the sense that it reads from multiple machines and collects the data together with the hostname. The hostname has to be granular in that it should resolve to the physical machine in, say, a cluster. This kind of granularity is important because we want to associate the logs with the machines.
The application could also look for different levels of detail, such as whether there were errors in the logs, whether registries were altered, whether files were touched, whether there were any settings changes, whether there were any custom actions, font files, etc. A lot of installations leave behind log files, and these are more handy than the installation information received from msiinv- and msiinfo-style tools. These tools can be run against any target machine to get information on the current state of all the applications and their installations. That also provides valuable input for feeding events into Splunk. The tools can be run at periodic intervals or on demand, although the former is recommended, and only the deltas may need to be fed into Splunk.
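As a sketch of that level-of-detail scan (the patterns below are illustrative assumptions; real msiexec verbose logs vary by product and logging flags), a simple log classifier might look like this:

```python
import re

# Illustrative patterns only; actual marker strings depend on the installer.
PATTERNS = {
    "error":         re.compile(r"return value 3|error", re.IGNORECASE),
    "registry":      re.compile(r"WriteRegistryValues|RemoveRegistryValues"),
    "files":         re.compile(r"InstallFiles|CopyFiles|RemoveFiles"),
    "custom_action": re.compile(r"CustomAction"),
}

def classify_log(lines):
    """Count how many log lines fall into each category of interest."""
    counts = {name: 0 for name in PATTERNS}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Each nonzero count could then become a field on the event sent to Splunk for that machine.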

Tuesday, April 8, 2014

We continue our discussion of the Ford-Fulkerson algorithm today. We talked about improving the efficiency of the basic Ford-Fulkerson algorithm with a data structure corresponding to the directed graph G' = (V, E'), where (u,v) is in E' if either (u,v) or (v,u) is in E. Given a flow f on G, the edges of the residual network Gf consist of all edges (u,v) of G' such that c(u,v) - f[u,v] > 0. The time to find a path in the residual network is now O(V + E') = O(E) if we use depth-first or breadth-first search. This is a significant improvement. Each iteration of the while loop now takes O(E) time, making the overall complexity O(E |f*|).
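A sketch of this residual-network construction, with the flow stored as a skew-symmetric map (f[u][v] = -f[v][u]) so that reverse residual edges fall out of the same test:

```python
def residual_edges(capacity, flow):
    """capacity, flow: dicts of dicts over the same vertex set.
    An edge (u,v) is in the residual network iff c(u,v) - f[u,v] > 0."""
    edges = []
    vertices = capacity.keys()
    for u in vertices:
        for v in vertices:
            c = capacity[u].get(v, 0)
            f = flow[u].get(v, 0)
            if c - f > 0:
                edges.append((u, v, c - f))   # (edge, residual capacity)
    return edges
```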
Since the algorithm works in incremental units of flow, we look at whether it scales for large networks. For simple networks, we know that it works efficiently since |f*| is small.
In a large network, the maximum flow can have a value on the order of 10^6, with 10^6 units of flow traversing a path s->u->t while another 10^6 units traverse the path s->v->t. If the first augmenting path found by the Ford-Fulkerson algorithm is s->u->v->t, then the flow has value 1 after the first iteration. If the second iteration uses the path s->v->u->t, then the flow has value 2. If we choose the former in odd-numbered iterations and the latter in even-numbered iterations, we would perform a total of two million augmentations, increasing the flow value by only 1 unit each time.
We can improve the bound on the Ford-Fulkerson algorithm if we implement the computation of the augmenting path p with a breadth-first search, that is, if the augmenting path is a shortest path from s to t in the residual network where each edge has unit distance (weight). This implementation of the Ford-Fulkerson method is called the Edmonds-Karp algorithm. The Edmonds-Karp algorithm runs in O(V E^2) time because it depends on the distances to vertices in the residual network Gf.
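The breadth-first path selection that turns Ford-Fulkerson into Edmonds-Karp can be sketched as follows; with a skew-symmetric flow map, reverse residual edges appear automatically:

```python
from collections import deque

def bfs_augmenting_path(capacity, flow, s, t):
    """Return a shortest s->t path (list of vertices) in the residual
    network Gf, or None if no augmenting path exists."""
    parent = {s: None}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        if u == t:                        # reconstruct the path back to s
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in capacity:                # every vertex is a candidate neighbor
            if v not in parent and capacity[u].get(v, 0) - flow[u].get(v, 0) > 0:
                parent[v] = u
                queue.append(v)
    return None
```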
This improvement is due to the shortest-path distances: the shortest-path distance in a residual network increases monotonically with each flow augmentation. We prove this lemma as follows:

Let us take a vertex v belonging to V - {s,t}, where s is the source and t is the sink. We will assume the shortest-path distance from s to v decreases with a flow augmentation and then derive a contradiction. Let f be the flow just before the first such augmentation and f' the flow just afterwards.
Let v be the vertex with the minimum shortest-path distance among those decreased by the augmentation, so that the shortest-path distance of v with the new flow is less than the one before.
Let p = s ~ u --> v be a shortest path from s to v in Gf', so that (u,v) belongs to Ef' and
the shortest-path distance from s to u in Gf' = the shortest-path distance from s to v in Gf' - 1.
Because of how we chose v, we know that the shortest-path distance of u did not decrease:
with the new flow, the shortest-path distance of u is at least as much as it was before.
The claim now is that the edge (u,v) did not belong to the previous Ef. We state this because,
if that edge did belong to Ef, then by the simple triangular inequality the shortest path to v with the old flow
would be at most the shortest path to u with the old flow plus the unit distance,
which is at most the shortest path to u with the new flow plus the unit distance,
which in turn equals the shortest path to v with the new flow,
overall contradicting our assumption that the distance to v decreased; so (u,v) was not in the earlier Ef.
Since (u,v) is absent from Ef but present in Ef', the augmentation must have increased the flow from v to u; and since the Edmonds-Karp algorithm always augments flow along shortest paths, the shortest path from s to u in Gf has (v,u) as its last edge.
Therefore, we say that
the shortest-path distance (s,v) in the old flow = the shortest-path distance (s,u) in the old flow - 1
                                                 <= the shortest-path distance (s,u) in the new flow - 1, by the inequality above
                                                  = the shortest-path distance (s,v) in the new flow - 2, since (u,v) is the last edge of p,
which contradicts our assumption that the distance to v decreased.
We have shown that no such vertex exists, and so the shortest-path distances never decrease with any flow augmentation.

Sunday, April 6, 2014

The basic Ford Fulkerson algorithm is given by the following:
Ford-Fulkerson(G, s, t)
for each edge (u,v) belonging to E[G]
   do f[u,v] = 0;
        f[v,u] = 0;
while there exists a path p from s to t in the residual network Gf
   do cf(p) = min[cf(u,v) : (u,v) is in p]
        for each edge (u,v) in p
             do f[u,v] = f[u,v] + cf(p)
                  f[v,u] = -f[u,v]
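A runnable sketch of the pseudocode above, in Python, assuming integral capacities given as a dict of dicts; the augmenting path is found here with an arbitrary (depth-first) search, matching the basic algorithm rather than Edmonds-Karp:

```python
def ford_fulkerson(capacity, s, t):
    """Basic Ford-Fulkerson: capacity is a dict of dicts c[u][v].
    Returns (maximum flow value, skew-symmetric flow map)."""
    vertices = list(capacity)
    flow = {u: {v: 0 for v in vertices} for u in vertices}   # f[u,v] = 0

    def find_path():
        # any s->t path in the residual network Gf (depth-first here)
        stack, visited = [(s, [s])], {s}
        while stack:
            u, path = stack.pop()
            for v in vertices:
                if v not in visited and capacity[u].get(v, 0) - flow[u][v] > 0:
                    if v == t:
                        return path + [t]
                    visited.add(v)
                    stack.append((v, path + [v]))
        return None

    value = 0
    path = find_path()
    while path is not None:
        # cf(p) = min[cf(u,v) : (u,v) in p]
        cf = min(capacity[u].get(v, 0) - flow[u][v]
                 for u, v in zip(path, path[1:]))
        for u, v in zip(path, path[1:]):
            flow[u][v] += cf
            flow[v][u] = -flow[u][v]     # skew symmetry, as in the pseudocode
        value += cf
        path = find_path()
    return value, flow
```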


The Ford Fulkerson algorithm expands on the Ford Fulkerson method mentioned in the previous post. The while loop repeatedly finds an augmenting path p in Gf and augments flow f along the residual capacity cf (p). When no augmenting path exists, the flow f is the maximum flow.
How the augmentation is done determines how well the algorithm performs.
If a breadth first traversal is chosen, the algorithm runs in polynomial time.
However let's first take the case when the augmentation path is chosen arbitrarily and the capacities are integral.
The complexity of the initialization steps is O(E).

The complexity of the while loop is O(E |f*|), where |f*| is the value of the maximum flow through the system. The while loop is executed at most |f*| times. The efficiency of the while loop can be increased with the choice of a suitable data structure.



Saturday, April 5, 2014

Today we discuss the Ford-Fulkerson method to solve the maximum flow problem. A maximum flow problem is one where we interpret a directed graph as a flow network. Each directed edge has a stated capacity. In the maximum flow problem, we wish to find the greatest rate at which material can flow through the graph without violating any capacity constraints.
The Ford Fulkerson method is defined as follows:
Ford-Fulkerson-method(G,s,t)
initialize flow f to 0
while there exists an augmenting path p
   do augment flow f along p
return f
As we can see, this method is iterative. At each iteration, we increase the flow by finding an augmenting path, which is simply a path along which we can send more flow, and then augmenting the flow along this path. We repeat this process until no augmenting path can be found. The max-flow min-cut theorem shows that this yields a maximum flow. In the Ford-Fulkerson method, there can be different implementations attaining the maximum flow, which is why it is named a method rather than an algorithm.
In a graph with a flow network and a flow, we will have some edges that can admit more flow. The additional flow that we can push through is called the residual capacity. The Ford-Fulkerson method repeatedly augments the flow along augmenting paths until a maximum flow has been found. The max-flow min-cut theorem tells us that a flow is maximum if and only if its residual network contains no augmenting path. To prove this theorem, the notion of a cut of the flow network is used.
A cut (S, T) of flow network is a partition of V into S and T = V - S such that s belongs to S and t belongs to T. If f is the flow, then the net flow across the cut is f(S,T). The capacity of the cut is C(S,T). A minimum cut of a network is a cut whose capacity is minimum over all cuts of the network.
The max-flow min-cut theorem states that if f is a flow in a flow network G = (V, E) with source s and sink t, then the following conditions are equivalent:
1. f is a maximum flow in G
2. The residual network Gf contains no augmenting paths
3. |f| = c(S,T) for some cut (S,T) of G
The first condition implies the second. This we can show by proof of contradiction: if f is a maximum flow and there is an augmenting path p, then the flow sum f + fp is a flow in G with value strictly greater than |f|, contradicting the assumption we just made. The second condition implies the third by taking S to be the set of vertices reachable from s in the residual network Gf and T = V - S; one can then show that |f| = c(S,T) for this cut.
The third condition implies the first because the value of any flow in network is bounded from above by the capacity of any cut of G. The condition |f|  = c(S,T) thus implies that f is a maximum flow.
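We can check the theorem numerically on a small network. By hand, the maximum flow of the network below is 5 (2 units along s->a->t, 1 along s->a->b->t, 2 along s->b->t); enumerating every cut (S, T) with s in S and t in T shows that the minimum cut capacity is also 5:

```python
from itertools import combinations

# A small example network; its maximum flow, worked out by hand, is 5.
capacity = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}, "t": {}}

def cut_capacity(S):
    """c(S,T): total capacity of the edges leaving S."""
    return sum(c for u in S for v, c in capacity[u].items() if v not in S)

# every cut (S, T) with s in S and t in T = V - S
inner = ["a", "b"]
cuts = [{"s", *extra} for r in range(len(inner) + 1)
        for extra in combinations(inner, r)]
min_cut = min(cut_capacity(S) for S in cuts)
```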
In the basic Ford-Fulkerson algorithm, in each iteration we find some augmenting path p and increase the flow f on each edge of p by the residual capacity cf(p). In this implementation, we make use of the fact that the net flow across a cut (S,T) is f(S,T) = |f|. Given that flow is maintained only between pairs of vertices connected by an edge (no edge implies no flow and no capacity), we update the flow f[u,v] for each pair of vertices connected by an edge of p by computing the residual capacity in a temporary variable and augmenting the flow with this capacity.
The residual capacity of the path is the minimum of the residual capacities of its edges. We update the flow in each step until there is no augmenting path left.