Monday, August 7, 2017

We continue to discuss ZooKeeper, a wait-free coordination service for Internet-scale systems. It allows distributed processes to coordinate with each other through a shared hierarchical namespace of data registers. It is also used for distributed cluster management and for distributed synchronization in the form of locks, barriers and queues. The ZooKeeper service is replicated over a set of machines, all of which store a copy of the data in memory. A leader is elected on service startup. A client connects to a single ZooKeeper server and maintains a TCP connection to it. Clients can read from any ZooKeeper server, but writes go through the leader and need majority consensus. The data model consists of a hierarchical namespace where each node is called a ZNode, and every node is identified by a path. ZNode paths are canonical, absolute and slash-separated, with no relative references. Sequential ZNodes have a monotonically increasing number appended to the path; in leader election, the ZNode whose path ends in the lowest number becomes the leader. Each node supports client-set watch operations, which ZooKeeper uses to send notifications to the client. All watches are time-ordered, but clients should account for notification latency. The APIs supported by all nodes come in both synchronous and asynchronous variants.
ZooKeeper introduces elements from group messaging, shared registers, and distributed lock services in a replicated centralized service.
#codingexercise
Yesterday we discussed an optimal solution for this problem: Given an integer array, find the maximum for each and every contiguous subarray of size k
Today we see a simple solution which works for small k
void PrintMaxInWindowSizeK(List<int> A, int k)
{
    for (int i = 0; i <= A.Count - k; i++)
    {
        int max = A[i];
        for (int j = 1; j < k; j++)
        {
            if (A[i + j] > max)
                max = A[i + j];
        }
        Console.WriteLine("{0},", max);
    }
}
The same output can be produced in only one pass over the list of integers by maintaining a deque of at most k indexes.

Sunday, August 6, 2017

We continue to discuss ZooKeeper, a wait-free coordination service for Internet-scale systems. It allows distributed processes to coordinate with each other through a shared hierarchical namespace of data registers. We can think of it as a distributed lock server, but it involves much more. It is available as open source and provides a simple interface for naming, configuration management, locking and synchronization, and group services, all of which can be customized. It is used for configuration management in an unattended manner, for simpler deployment or provisioning, and for distributed cluster management. It can be used for distributed synchronization in the form of locks, barriers and queues. We recall that the gossip protocol was used to propagate updates via message exchanges. The ZooKeeper service is replicated over a set of machines, all of which store a copy of the data in memory. A leader is elected on service startup. A client connects to a single ZooKeeper server and maintains a TCP connection to it. Clients can read from any ZooKeeper server, but writes go through the leader and need majority consensus. The data model consists of a hierarchical namespace where each node is called a ZNode, and every node is identified by a path. ZNode paths are canonical, absolute and slash-separated, with no relative references. Nodes have varying lifetimes, from persistent nodes that remain until deleted to ephemeral nodes that remain only until the session closes. Nodes can also be sequence-based, in which case a monotonically increasing number is appended to the end of the path. Each node supports creation and deletion, getting and setting data or ACLs, and syncing, and these operations can easily be issued from the command line. Each node also lets clients set watches, which ZooKeeper uses to send notifications to the client. All watches are time-ordered, but clients should account for notification latency.
APIs supported by all nodes are both synchronous and asynchronous.
Read requests are processed locally at the ZooKeeper server to which the client connects, whereas write requests are forwarded to the leader and go through majority consensus before a response is generated. Consistency is enforced via sequential updates, atomicity, a single system image, reliability and timeliness. To illustrate ZooKeeper as a cluster manager, consider the following operations: each client host sets a watch on all the cluster members, and each host, as a cluster member, is also an ephemeral node. Whenever a node joins or leaves, it generates an alert, and members keep getting updated periodically with node status changes.
In this setting we can also walk through the operations for leader election. Each znode has a path, and all participants in the election create an ephemeral-sequential node on the same election path. The node with the smallest sequence number is the leader, and each follower watches the node with the next lower sequence number. When the leader is removed, another one is elected; when a session expires, re-election is conducted. We see the same pattern when clients request a distributed lock. Each client creates a sequential ephemeral znode under a common path, and the client holding the lowest number wins the lock. When the lock is released, that znode is deleted, forcing a re-election. ZooKeeper ships client libraries in many languages, which expands the programmability options. It should be noted that watches are one-time triggers; continuous watching requires resetting the watch after every trigger. This leads to a design issue known as the "herd effect" when there are many watches on the same node. Some performance improvements therefore involve steps such as separating the ZooKeeper transaction log onto its own disk, setting the heap size to avoid swapping, keeping session timeouts long enough to allow proper cleanup, and carefully handling changes in znodes. Note that since a cluster is involved, addition or removal of hosts is easy, which comes in very handy for production hosts. Services deployed on hosts that change frequently can now easily be tracked with ZooKeeper; in fact, Twitter uses this very technique for its service discovery.
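As a small aside, the lock recipe above can be modeled without a live ZooKeeper ensemble. The SequentialLock class below is a hypothetical in-memory Python sketch, not ZooKeeper's actual API: it only mimics how ephemeral-sequential numbering decides the winner and how deleting the lowest node passes the lock to the next waiter.

```python
# A minimal in-memory model of the lock recipe described above: each
# client creates a "znode" with a monotonically increasing sequence
# number, and the client holding the lowest number owns the lock.
# SequentialLock and its method names are illustrative, not a real API.

class SequentialLock:
    def __init__(self):
        self.counter = 0
        self.holders = {}  # sequence number -> client id

    def acquire(self, client_id):
        """Create a sequential node for the client; returns its number."""
        self.counter += 1
        self.holders[self.counter] = client_id
        return self.counter

    def owner(self):
        """The client with the smallest sequence number wins the lock."""
        return self.holders[min(self.holders)] if self.holders else None

    def release(self, seq):
        """Deleting the lowest node passes the lock to the next waiter."""
        self.holders.pop(seq, None)

lock = SequentialLock()
a = lock.acquire("client-a")   # seq 1
b = lock.acquire("client-b")   # seq 2
assert lock.owner() == "client-a"
lock.release(a)                # releasing re-elects the next lowest
assert lock.owner() == "client-b"
```

Leader election follows the same shape; the only difference is that the "winner" stays leader until its ephemeral node disappears.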
#codingexercise
Given an integer array, find the maximum for each and every contiguous subarray of size k
void PrintMaxInWindowSizeK(List<int> A, int k)
{
    if (k > A.Count) return;
    // The deque holds indexes; the front always holds the index of the
    // maximum in the current window. LinkedList serves as the deque here.
    var d = new LinkedList<int>();
    for (int i = 0; i < k; i++)
    {
        while (d.Count > 0 && A[i] >= A[d.Last.Value])
            d.RemoveLast();
        d.AddLast(i);
    }
    for (int i = k; i < A.Count; i++)
    {
        Console.WriteLine("{0},", A[d.First.Value]);
        while (d.Count > 0 && d.First.Value <= i - k)
            d.RemoveFirst();
        while (d.Count > 0 && A[i] >= A[d.Last.Value])
            d.RemoveLast();
        d.AddLast(i);
    }
    Console.WriteLine("{0},", A[d.First.Value]);
}

Saturday, August 5, 2017

Today we start discussing ZooKeeper, a wait-free coordination service for Internet-scale systems. It allows distributed processes to coordinate with each other through a shared hierarchical namespace of data registers. We can think of it as a distributed lock server, but it involves much more. It is available as open source and provides a simple interface for naming, configuration management, locking and synchronization, and group services, all of which can be customized. It is used for configuration management in an unattended manner, for simpler deployment or provisioning, and for distributed cluster management. It can be used for distributed synchronization in the form of locks, barriers and queues. We recall that the gossip protocol was used to propagate updates via message exchanges. The ZooKeeper service is replicated over a set of machines, all of which store a copy of the data in memory. A leader is elected on service startup. A client connects to a single ZooKeeper server and maintains a TCP connection to it. Clients can read from any ZooKeeper server, but writes go through the leader and need majority consensus. The data model consists of a hierarchical namespace where each node is called a ZNode, and every node in ZooKeeper's namespace is identified by a path. ZNode paths are canonical, absolute and slash-separated, with no relative references.
#codingexercise
Find the product of first K magic numbers
int GetMagicN(int n)
{
    int power = 1;
    int result = 0;
    while (n > 0)
    {
        power = power * 5;
        if ((n & 1) == 1)
            result += power;
        n >>= 1;
    }
    return result;
}
long GetMagicProduct(int k)
{
    long result = 1; // multiplicative identity; starting at 0 would zero out the product
    for (int i = 1; i <= k; i++)
        result *= GetMagicN(i);
    return result;
}
It might be interesting to note that since the magic numbers map to the bit representations of consecutive integers, we can easily tell how far apart perfect powers of 5 appear in the magic number sequence.
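To make the bit-selector mapping above concrete, here is a small Python sketch of the same functions; the function names are mine, and the expected values follow directly from the mapping n = 1 (001) -> 5, n = 2 (010) -> 25, n = 3 (011) -> 30.

```python
# Treat n's binary digits as selectors over powers of 5: bit i of n
# selects 5^(i+1), so n = 3 (011) -> 5 + 25 = 30.
def magic(n):
    power, result = 1, 0
    while n:
        power *= 5
        if n & 1:
            result += power
        n >>= 1
    return result

def magic_product(k):
    result = 1  # multiplicative identity, not 0
    for i in range(1, k + 1):
        result *= magic(i)
    return result

print([magic(n) for n in range(1, 5)])  # [5, 25, 30, 125]
print(magic_product(3))                 # 5 * 25 * 30 = 3750
```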

Friday, August 4, 2017

In today's post we conclude the discussion of the Snowflake data warehouse architecture. The engine for Snowflake is columnar, vectorized and push-based. Columnar storage is suitable for analytical workloads because it makes more effective use of CPU caches and SIMD instructions. Vectorized execution means data is processed in a pipelined fashion, without materializing intermediary results as in map-reduce. Push-based execution means that relational operators push their results to their downstream operators rather than waiting for those operators to pull data, which removes control flow from tight loops. Data is encrypted in transit and before being written to storage. Key management is supported with a key hierarchy so that keys can be rotated and data re-encrypted. Encryption and key management together complete the security picture: by using a hierarchy, we reduce the scope of each key and of the data it secures.
Snowflake introduced three data types: variant, array and object. These enabled it to serve as a document store as well. With the help of these data types, it introduced massive efficiencies in data ingestion. This "schema later" approach also allowed parsing and transformation to be deferred.
Snowflake performs concurrency control using snapshot isolation implemented over multi-version concurrency control (MVCC), which means a copy of every changed database object is preserved for some duration. The table files are stored in S3 and are therefore immutable, so write operations on a table produce newer versions of the table, and file operations are tracked in the metadata. This makes MVCC a natural choice for concurrency control.
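The copy-on-write behavior can be sketched in a few lines of Python; VersionedTable is a toy model of my own, not Snowflake's implementation, but it shows why immutable files make snapshot reads trivial.

```python
# A toy model of the versioning described above: table "files" are
# immutable, so a write produces a new version entry in the metadata
# rather than mutating anything in place. Readers pin a version (a
# snapshot) and are unaffected by later writes.
class VersionedTable:
    def __init__(self, rows):
        self.versions = [tuple(rows)]  # version 0

    def write(self, rows):
        """Writes append a new immutable version; old ones survive."""
        self.versions.append(tuple(rows))
        return len(self.versions) - 1

    def read(self, version):
        return self.versions[version]

t = VersionedTable([1, 2, 3])
snap = 0                    # a reader's snapshot
t.write([1, 2, 3, 4])       # a concurrent write creates version 1
print(t.read(snap))         # (1, 2, 3) -- the snapshot is unchanged
print(t.read(1))            # (1, 2, 3, 4)
```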
To limit the data access that is relevant to a given query, Snowflake performs min-max based pruning, which is also known as small materialized aggregates, zone maps and data skipping. The system maintains data distribution for a given chunk of data as min and max values of the chunk so that the system can discard certain chunks that are not needed for a query. This is analogous to key ranges in B+ trees.
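A minimal Python sketch of min-max pruning, under the assumption that each chunk carries only (min, max) metadata; Snowflake's real chunk metadata is of course richer than this.

```python
# Each chunk carries (min, max) metadata, and a query with predicate
# "value between lo and hi" can skip any chunk whose range does not
# overlap [lo, hi]. The chunk layout here is purely illustrative.
chunks = [
    {"min": 1,   "max": 10,  "rows": [1, 4, 9]},
    {"min": 50,  "max": 80,  "rows": [50, 62, 80]},
    {"min": 200, "max": 300, "rows": [210, 250]},
]

def scan_between(chunks, lo, hi):
    hits = []
    for c in chunks:
        if c["max"] < lo or c["min"] > hi:
            continue  # pruned: the whole chunk falls outside [lo, hi]
        hits.extend(v for v in c["rows"] if lo <= v <= hi)
    return hits

print(scan_between(chunks, 60, 220))  # [62, 80, 210]
```

The first chunk is never touched for this predicate, which is the same saving a B+ tree's key ranges would give, but achieved with passive metadata instead of an index.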
With these features, Snowflake supports a pure software-as-a-service experience and continuous availability. It differs from other major vendors: for example, Google Cloud Platform offers the BigQuery service, but that service requires append-only data and a schema, whereas Snowflake offers full ACID transactions and does not require schemas.
Microsoft SQL Data Warehouse also separates compute and storage, but it requires administrative tasks and limits the number of concurrently executing queries. Moreover, it supports non-relational data through PolyBase, unlike Snowflake's built-in data types.
#codingexercise
Given a sorted array of words, find the order of characters
List<char> GetAlphabeticalOrder(List<string> words, int alphabetSize)
{
    // Graph and GetFirstMismatch are assumed helpers: the graph supports
    // addEdge and a topological sort, and GetFirstMismatch returns the
    // first pair of differing characters between two adjacent words.
    var g = new Graph(alphabetSize);
    for (int i = 0; i < words.Count - 1; i++)
    {
        var word1 = words[i];
        var word2 = words[i + 1];
        Tuple<char, char> t = GetFirstMismatch(word1, word2);
        if (t != null)
            g.addEdge(t.Item1, t.Item2); // t.Item1 precedes t.Item2 in the alphabet
    }
    var ret = new List<char>();
    g.topologicalSort(ref ret);
    return ret;
}
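Since Graph and GetFirstMismatch are left as helpers above, here is a self-contained Python sketch of the same idea: compare adjacent words from the sorted list, add a precedence edge at the first mismatching character, then topologically sort the precedence graph. The example word list is illustrative.

```python
# Derive the character order implied by a lexicographically sorted word
# list: the first differing characters of adjacent words give one
# precedence edge each, and a topological sort yields the alphabet.
from collections import defaultdict

def alphabetical_order(words):
    edges = defaultdict(set)
    letters = set(c for w in words for c in w)
    for w1, w2 in zip(words, words[1:]):
        for a, b in zip(w1, w2):
            if a != b:
                edges[a].add(b)  # a comes before b in the alphabet
                break
    order, seen = [], set()
    def visit(c):  # depth-first topological sort (assumes no cycles)
        if c in seen:
            return
        seen.add(c)
        for nxt in edges[c]:
            visit(nxt)
        order.append(c)
    for c in letters:
        visit(c)
    return order[::-1]

print(alphabetical_order(["baa", "abcd", "abca", "cab", "cad"]))
```

For this input the constraints are b before a, d before a, a before c and b before d, so a valid output is b, d, a, c.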

Thursday, August 3, 2017

In previous blog posts we have been discussing Snowflake computing. Today we conclude our discussion. The engine for Snowflake is columnar, vectorized and push-based. Columnar storage is suitable for analytical workloads because it makes more effective use of CPU caches and SIMD instructions. Vectorized execution means data is processed in a pipelined fashion, without materializing intermediary results as in map-reduce. Push-based execution means that relational operators push their results to their downstream operators rather than waiting for those operators to pull data, which removes control flow from tight loops. Data is encrypted in transit and before being written to storage. Key management is supported with a key hierarchy so that keys can be rotated and data re-encrypted. Encryption and key management together complete the security picture: by using a hierarchy, we reduce the scope of each key and of the data it secures.
Snowflake introduced three data types: variant, array and object. These enabled it to serve as a document store as well. With the help of these data types, it introduced massive efficiencies in data ingestion. This "schema later" approach also allowed parsing and transformation to be deferred.
#codingexercise
Find the length of the longest subsequence of consecutive integers in a given array
int GetLongest(List<int> A)
{
    if (A == null || A.Count == 0) return 0;
    if (A.Count == 1) return 1;
    A.Sort();
    int max = 1;
    int cur = 1;
    for (int i = 1; i < A.Count; i++)
    {
        if (A[i - 1] + 1 == A[i])
        {
            cur = cur + 1;
        }
        else if (A[i - 1] == A[i])
        {
            continue; // skip duplicates so they do not break a run
        }
        else
        {
            max = Math.Max(max, cur);
            cur = 1;
        }
    }
    max = Math.Max(max, cur);
    return max;
}
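Sorting makes the solution above O(n log n). As an aside, the same answer can be had in expected O(n) time with a hash set, by starting a count only at the left end of each run; this is an alternative technique, not the one used above, sketched here in Python.

```python
# Longest run of consecutive integers without sorting: put everything in
# a set, then extend a run only from values that have no predecessor.
def longest_consecutive(a):
    s = set(a)
    best = 0
    for x in s:
        if x - 1 in s:
            continue  # x is not the start of a run
        length = 1
        while x + length in s:
            length += 1
        best = max(best, length)
    return best

print(longest_consecutive([1, 9, 3, 10, 4, 20, 2]))  # 4 (the run 1, 2, 3, 4)
```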


Wednesday, August 2, 2017

Today we continue the discussion on Snowflake computing. The engine for Snowflake is columnar, vectorized and push-based. Columnar storage is suitable for analytical workloads because it makes more effective use of CPU caches and SIMD instructions. Vectorized execution means data is processed in a pipelined fashion, without materializing intermediary results as in map-reduce. Push-based execution means that relational operators push their results to their downstream operators rather than waiting for those operators to pull data, which removes control flow from tight loops. Data is encrypted in transit and before being written to storage. Key management is supported with a key hierarchy so that keys can be rotated and data re-encrypted. Encryption and key management together complete the security picture: by using a hierarchy, we reduce the scope of each key and of the data it secures.
Snowflake stores both semi-structured and schemaless data. Three data types were added for this: variant, array and object. The variant type includes all native data types as well as arrays and objects, and can therefore store documents. Arrays and objects are specializations of variant.
The variant data type is a self-describing binary serialization which supports fast key-value lookup as well as efficient type tests, comparison and hashing. The variant type also lets Snowflake perform Extract-Load-Transform instead of Extract-Transform-Load operations. We saw that this significantly reduced data ingestion time for a Snowflake customer compared to its existing processes using MongoDB. The ability to load JSON directly while deferring parsing and type inference is called the "schema later" approach. This approach decouples producers from consumers. In a traditional warehouse, changes to the schema required coordination between departments and time-consuming operations.
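The "schema later" idea can be shown in miniature with plain Python and JSON; the document shapes below are made up for illustration, and this is only an analogy for the variant type, not Snowflake's format.

```python
# "Schema later" in miniature: ingest raw JSON documents without
# declaring a schema, then project and type a field only when a query
# asks for it.
import json

raw_rows = [
    '{"id": 1, "price": "20.00", "tags": ["a", "b"]}',
    '{"id": 2, "price": "5.00"}',
]

# Load step: keep everything as a variant-like Python dict.
loaded = [json.loads(r) for r in raw_rows]

# Transform step, deferred to query time: cast price only when needed.
prices = [float(row["price"]) for row in loaded]
print(sum(prices))  # 25.0
```

Note that the second document lacks the "tags" field entirely, and ingestion still succeeds; the producer and consumer never had to agree on a schema up front.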
#codingexercise
Find the sum of the first n magic numbers
Magic numbers can be expressed as a power of 5 or a sum of unique powers of 5. They occur in a series whose members correspond to the binary representations of consecutive integers, such as 001, 010, 011, ... for 5, 25, 30, ...
Therefore,


int GetMagicN(int n)
{
    int power = 1;
    int result = 0;
    while (n > 0)
    {
        power = power * 5;
        if ((n & 1) == 1)
            result += power;
        n >>= 1;
    }
    return result;
}

long GetMagicSum(int k)
{
    long result = 0;
    for (int i = 1; i <= k; i++)
        result += GetMagicN(i);
    return result;
}

Tuesday, August 1, 2017

Today we continue the discussion on Snowflake architecture. The engine for Snowflake is columnar, vectorized and push-based. Columnar storage is suitable for analytical workloads because it makes more effective use of CPU caches and SIMD instructions. Vectorized execution means data is processed in a pipelined fashion, without materializing intermediary results as in map-reduce. Push-based execution means that relational operators push their results to their downstream operators rather than waiting for those operators to pull data, which removes control flow from tight loops. Data is encrypted in transit and before being written to storage. Key management is supported with a key hierarchy so that keys can be rotated and data re-encrypted. Encryption and key management together complete the security picture: by using a hierarchy, we reduce the scope of each key and of the data it secures.
#algorithm revisit
Show KMP matching algorithm
KMP-Matcher(T, P)
n = T.Length
m = P.Length
PI = Compute_prefix_function(P)
q = 0
for i = 1 to n
    while q > 0 and P[q + 1] != T[i]
        q = PI[q]
    if P[q + 1] == T[i]
        q = q + 1
    if q == m
        print "pattern occurs with shift" i - m
        q = PI[q]


Compute_prefix_function(P)
m = P.Length
PI = new array [1 .. m]
PI[1] = 0
k = 0
for q = 2 to m
    while k > 0 and P[k + 1] != P[q]
        k = PI[k]
    if P[k + 1] == P[q]
        k = k + 1
    PI[q] = k
return PI
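For a runnable version, here is the same pseudocode transcribed to 0-indexed Python; the shift of pi from PI[q] to pi[q - 1] is the only change from the 1-indexed CLRS form above.

```python
# KMP string matching, 0-indexed. pi[q] holds the length of the longest
# proper prefix of P that is also a suffix of P[:q+1].
def compute_prefix_function(p):
    pi = [0] * len(p)
    k = 0
    for q in range(1, len(p)):
        while k > 0 and p[k] != p[q]:
            k = pi[k - 1]
        if p[k] == p[q]:
            k += 1
        pi[q] = k
    return pi

def kmp_match(t, p):
    pi = compute_prefix_function(p)
    shifts = []
    q = 0
    for i, c in enumerate(t):
        while q > 0 and p[q] != c:
            q = pi[q - 1]
        if p[q] == c:
            q += 1
        if q == len(p):
            shifts.append(i - len(p) + 1)  # pattern occurs with this shift
            q = pi[q - 1]
    return shifts

print(kmp_match("abababca", "abab"))  # [0, 2]
```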