Thursday, July 27, 2017

We were discussing Snowflake's cloud services as described in their whitepaper. The Snowflake engine is columnar, vectorized and push-based. Columnar storage suits analytical workloads because it makes more effective use of CPU caches and SIMD instructions. Vectorized execution means data is processed in a pipelined fashion without materializing intermediate results, as MapReduce does. Push-based execution means that relational operators push their results to their downstream operators rather than waiting for those operators to pull data, which removes control flow from tight loops.
We now revisit Snowflake's multi-data-center software-as-a-service design. A web user interface is provided that supports not only SQL operations but also access to the database catalog, user and system management, monitoring and usage information. The web user interface is only one of the interfaces to the system, but it is convenient both for using Snowflake and for performing administrative tasks. Behind the web user interface, Snowflake is designed as a cloud service that operates on several virtual warehouse compute instances, all of which share a data storage layer where the data is replicated across multiple availability zones.
The Cloud Services layer is always on and comprises services that manage virtual warehouses, queries, transactions and all the metadata. The virtual warehouses consist of elastic clusters of virtual machines, which are instantiated on demand to scale query processing. The data storage layer spans availability zones and is therefore set up with replication to handle failures in those zones. If a node fails, other nodes can pick up its activities without much impact on end users. This differs from virtual warehouses, which do not span availability zones.
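To make the push-based idea concrete, here is a minimal sketch in C#. The operator names (IPushOperator, FilterOperator, SumSink) are hypothetical and are not taken from Snowflake's engine; the sketch only illustrates an upstream operator handing whole batches to its downstream operator instead of being asked for rows one at a time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical push-based operators; the names are illustrative only
// and do not come from Snowflake's engine.
interface IPushOperator
{
    void Push(int[] batch);   // receive one batch from the upstream operator
}

sealed class FilterOperator : IPushOperator
{
    private readonly Func<int, bool> _predicate;
    private readonly IPushOperator _downstream;

    public FilterOperator(Func<int, bool> predicate, IPushOperator downstream)
    {
        _predicate = predicate;
        _downstream = downstream;
    }

    public void Push(int[] batch)
    {
        // Tight loop over the batch, then hand the survivors downstream
        var kept = new List<int>(batch.Length);
        foreach (int v in batch)
            if (_predicate(v)) kept.Add(v);
        _downstream.Push(kept.ToArray());
    }
}

sealed class SumSink : IPushOperator
{
    public long Total { get; private set; }

    public void Push(int[] batch)
    {
        foreach (int v in batch) Total += v;
    }
}

static class PushDemo
{
    static void Main()
    {
        var sink = new SumSink();
        var filter = new FilterOperator(v => v % 2 == 0, sink);
        // The producer drives the pipeline by pushing batches downstream
        filter.Push(new[] { 1, 2, 3, 4 });
        filter.Push(new[] { 5, 6 });
        Console.WriteLine(sink.Total); // sums the even values 2, 4 and 6
    }
}
```

In a pull-based design the sink would instead call something like Next() on the filter for each row, so the per-row call and its branching would sit inside the innermost loop.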
#codingexercise
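// Counts the strictly increasing subsequences that end within the first
// subarraysize elements of A (needs System.Linq for Sum).
static int GetCountIncreasingSequences(List<int> A, uint subarraysize)
{
    // dp[i] = number of strictly increasing subsequences ending at A[i]
    int[] dp = new int[A.Count];
    for (int i = 0; i < A.Count; i++)
    {
        dp[i] = 1; // the single-element subsequence
        for (int j = 0; j < i; j++)
        {
            if (A[j] < A[i])
            {
                dp[i] += dp[j];
            }
        }
    }
    // GetRange takes int arguments, so the uint count must be cast
    return dp.ToList().GetRange(0, (int)subarraysize).Sum();
}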


Find the length of the longest consecutive number sequence in a given sequence
static int GetLongestContiguousSubsequence(List<uint> A)
{
    // Record every distinct value for constant-time membership tests
    var h = new HashSet<uint>(A);
    int max = 0;
    for (int i = 0; i < A.Count; i++)
    {
        // Count the run of consecutive values that ends at A[i],
        // stopping at the first missing predecessor
        int cur = 1;
        for (uint j = A[i]; j > 0 && h.Contains(j - 1); j--)
            cur++;
        max = Math.Max(max, cur);
    }
    return max;
}

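The nested for loops have overlapping subproblems, because the walk down from one value repeats the walks already done from smaller values in the same run, so we could at least memoize the results. Alternatively, we can sort the array and scan it once to find the longest span of consecutive integers for the whole array.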
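The sorting alternative could look like the sketch below. The class and method names (ConsecutiveRuns, LongestRunSorted) are hypothetical; duplicates are dropped first on the assumption that repeated values should not break or lengthen a run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ConsecutiveRuns
{
    // Hypothetical helper: length of the longest run of consecutive integers,
    // found by sorting the distinct values and scanning them once.
    public static int LongestRunSorted(List<uint> a)
    {
        if (a.Count == 0) return 0;
        var sorted = a.Distinct().OrderBy(x => x).ToList();
        int best = 1, cur = 1;
        for (int i = 1; i < sorted.Count; i++)
        {
            // Extend the run when the next value is exactly one larger,
            // otherwise start a new run of length one
            cur = (sorted[i] == sorted[i - 1] + 1) ? cur + 1 : 1;
            best = Math.Max(best, cur);
        }
        return best;
    }

    static void Main()
    {
        var sample = new List<uint> { 100, 4, 200, 1, 3, 2 };
        Console.WriteLine(LongestRunSorted(sample)); // longest run is 1, 2, 3, 4
    }
}
```

Sorting costs O(n log n), after which a single pass suffices; the hash-based version stays closer to linear time only if each run is walked once, for example by starting the walk only at values whose successor is absent.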
