Cluster computing

Monday, March 7, 2016

Today we continue reading the paper "Horton: Online Query Execution Engine For Large Distributed Graphs" by Sarwat, Elnikety, He and Kliot.
This paper talks about large graphs. These graphs don't fit on a single server. Map-Reduce and Pregel are not good enough for processing interactive adhoc queries against large graphs. Horton is different. Let us take a look at its architecture. It has four components - the graph client library, the graph coordinator, graph partitions and the graph manager. The graph client library sends queries to the graph coordinator. It uses an asynchronous messaging system to receive results. The graph coordinator prepares an execution plan for the query. converts it to an automata and decides which partition to run. Each partition runs the executor and returns the results back to the client. The partitions are managed by the graph manager. The relevant partition is transfered to main memory and back to persistent storage.
The graph client library - sends queries to the graph coordinator in the form of regular expressions, and receives the query results directly from the graph partitions.
The graph coordinator provides the query interface for Horton. It receives the query from the client library, validates and parses the query. The coordinator optimizes the query plan and translates the query into a finite state machine. This is then sent for parallel processing over the partitions.
Every graph partition manages a set of graph nodes and edges.Horton is able to scale out only because of partitions. The automata received from the coordinator is executed on the local partition using a local query execution engine.
#codejam continued
Yesterday we were filling a linear array of size n with k elements of value 1. We called it NChooseK. Today we would like those arrangements of NChooseK where K are contiguous.

void NChooseKContiguous(int N, int K, ref List<List<int>> arrays)
{
debug.assert(N>K && N > 0 && K > 0 && arrays!= NULL);
for (int i = 0; i < N-K+1; i++)
{
var array = new List<int>(N);
array.AddRange(Enumerable.Repeat(0,N));
for (int j = i; j < K; j++)
array[i] = 1;
arrays.add(array);
}
}

Cluster computing

Monday, March 7, 2016

No comments:

Post a Comment