Monday, December 28, 2015

AWS Metrics & Billing continued: 
The AWS metrics cited so far include the following: 
CPUCreditUsage – This metric is available at a 5-minute frequency and is reported as a count. It identifies the amount of time during which the physical CPUs were used for processing instructions by the virtual CPUs allocated to the instance. Multiplying the count by the duration gives the total instance-hours for that instance. 
CPUCreditBalance – This is the number of CPU credits that an instance has accumulated. This metric is used to determine how long an instance can burst beyond its baseline performance. Like CPUCreditUsage, it is available at a 5-minute frequency and is reported as a count. Multiplying the count by the duration gives the total instance-hours of credit for that instance. 
CPUUtilization – This is the percentage of allocated EC2 compute units that are currently in use on the instance. It identifies the processing power required to run an application on a selected instance. 
This data can be filtered by the InstanceId dimension, which is specific to one identified instance, or by InstanceType, which encompasses all instances running with the specified instance type and thus categorizes the data by instance type. Other filters are available as well, but these two suffice as examples for individual instances and for groups. 
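To make the filtering concrete, here is a minimal sketch (Python with boto3 naming conventions; the instance id is a placeholder) of the parameters a GetMetricStatistics call would take to pull CPUUtilization for one instance:

```python
from datetime import datetime, timedelta

def cpu_utilization_request(instance_id, hours=1, period=300):
    """Build GetMetricStatistics parameters for CPUUtilization,
    filtered by the InstanceId dimension at a 5-minute (300 s) period."""
    now = datetime.utcnow()
    return {
        'Namespace': 'AWS/EC2',
        'MetricName': 'CPUUtilization',
        'Dimensions': [{'Name': 'InstanceId', 'Value': instance_id}],
        'StartTime': now - timedelta(hours=hours),
        'EndTime': now,
        'Period': period,
        'Statistics': ['Average', 'Maximum'],
        'Unit': 'Percent',
    }

# These parameters would be passed to a CloudWatch client, e.g.
# boto3.client('cloudwatch').get_metric_statistics(**cpu_utilization_request('i-0123456789abcdef0'))
```

Swapping the dimension to {'Name': 'InstanceType', 'Value': 't2.micro'} is how the same request would aggregate over all instances of that type.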

Users need their own access keys to make programmatic calls to AWS from the AWS Command Line Interface (AWS CLI), Tools for Windows PowerShell, the AWS SDKs, or direct HTTP calls using the APIs for individual AWS services. To fill this need, you can create, modify, view, or rotate access keys (access key IDs and secret access keys) for IAM users, and new IAM users can be added as needed.
If we look at the bills generated by AWS, they typically have a cost breakout like this: 
AWS Service Charges: 
  Elastic Compute Cloud: 
    US West (Oregon) Region 
      Amazon Elastic Compute Cloud Running Linux/UNIX 
        $0.00 per Linux t2.micro instance-hour (or partial hour)   261 Hrs       $0.00 
        Total:                                                                   $0.00 
      EBS 
        $0.00 per GB-month of General Purpose (SSD)                2.785 GB-Mo   $0.00 
        Total:                                                                   $0.00 
  Region Total:                                                                  $0.00 
CT to be collected:                                                              $0.00 
GST to be collected:                                                             $0.00 
US Sales Tax to be collected:                                                    $0.00 
VAT to be collected:                                                             $0.00 
Note that this bill is specific to a user and a region, and is categorized by compute or storage. It can be drilled down further into reports based on instances and instance groups. 
With SNS notifications we can populate a table for staging usage data that can have the following attributes: 
- Customer 
- Region 
- Service (EC2/EBS for now) 
- Dimension 
- Instance (optional, if applicable) 
- Volume (optional, if applicable) 
- Current (value) 
- Cumulative (value) 
- Min (applicable only to current, in the absence of historical data) 
- Max (applicable only to current, in the absence of historical data) 
- Avg (applicable only to current, in the absence of historical data) 
- Units 
- Timestamp/sequence counter (optional, for historical data) 
- Created By 
- Created Date 
Notice that this staging table has a few salient points: 
 - It gathers usage data but can also be simplified to allocation data. For example, we need not collect the metrics provided by AWS CloudWatch; we can use only the vCPUs allocated for each instance. This way we can directly calculate the bill based on allocated vCPUs and timeslices. 
 - It attempts to be as granular as possible with the metrics available from the source usage data, so that services can aggregate and group them as appropriate, even though AWS may offer similar features itself. For example, we may have a way to get data based on instance groups, but the records only show per-instance data, with instance details stored in a separate table. If instance details are not available, the generic dimension attribute can store the relevant grouping factors. 
 - Historical data need not be preserved as long as current and cumulative values are available. This simplifies the table even more. If data for contiguous timeslices is available, it can be populated as historical data, in which case the min, max and avg attributes are not necessary. 
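As a sketch of how a consumer might fold one usage reading into such a staging record (Python; the field names mirror the attribute list above and the count helper field, used to maintain the running average, is my addition):

```python
def upsert_reading(record, value, timestamp):
    """Fold a new usage value into a staging record, maintaining the
    current, cumulative, min, max and avg attributes described above."""
    record['current'] = value
    record['cumulative'] = record.get('cumulative', 0) + value
    record['count'] = record.get('count', 0) + 1  # helper for the running avg
    record['min'] = min(record.get('min', value), value)
    record['max'] = max(record.get('max', value), value)
    record['avg'] = record['cumulative'] / record['count']
    record['timestamp'] = timestamp
    return record

rec = {'customer': 'C1', 'region': 'us-west-2', 'service': 'EC2',
       'dimension': 'InstanceId=i-0123', 'units': 'Percent'}
for value, ts in [(10.0, 1), (30.0, 2), (20.0, 3)]:
    upsert_reading(rec, value, ts)
```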
  
This staging data is filled by a service that reads from AWS SNS, either as a web server endpoint or as any consumer that subscribes to those messages. 
The populated data can then be used to display the bill for each customer when they log in. 
Sample subscription setup: 
<?php 
require 'vendor/autoload.php'; 
use Aws\CloudWatch\CloudWatchClient; 
$key = "your_key"; 
$secret = "your_secret"; 
$region = "us-west-2"; 
$version = "latest"; 
// Use the us-west-2 region and the latest version of each client. 
$sharedConfig = [ 
    'region'  => $region, 
    'version' => $version, 
    'credentials' => array( 
        'key'    => $key, 
        'secret' => $secret, 
    ), 
]; 
  
// Create an SDK class used to share configuration across clients. 
$sdk = new Aws\Sdk($sharedConfig); 
$client = $sdk->createSns(); 
$result  = $client->subscribe(array( 
    // TopicArn is required 
    'TopicArn' => 'arn:aws:sns:us-west-2:792498758900:BillingMetrics', 
    // Protocol is required 
    'Protocol' => 'email', 
    'Endpoint' => 'rravishankar@gmail.com', 
)); 
echo 'RESULT='.serialize($result); 
$result = $client->confirmSubscription(array( 
    // TopicArn is required 
    'TopicArn' => 'arn:aws:sns:us-west-2:792498758900:BillingMetrics', 
    // Token is required; it arrives in the subscription confirmation message sent to the endpoint 
    'Token' => '2336412f37fb687f5d51e6e241d7700bdc2fd53a7a19f21d5f50d0cbe0f8a467d10f83122e7eadb59d60bcfd8434754522c8c944e52b3592cd2a0b7f5cdaab36e1c7aad614f57e70ad78fa2e4ece8bb6a9b7b13fec0464a35d59864ae1ccee29fc873310f17eeb77ff5859a22c9f411496f7b60faeea6a8f56fea96038f5f7f9', 
//https://sns.us-west-2.amazonaws.com/confirmation.html?TopicArn=arn:aws:sns:us-west-2:792498758900:BillingMetrics&Token=2336412f37fb687f5d51e6e241d7700bdc2fd53a7a19f21d5f50d0cbe0f8a467d10f83122e7eadb59d60bcfd8434754522c8c944e52b3592cd2a0b7f5cdaab36e1c7aad614f57e70ad78fa2e4ece8bb6a9b7b13fec0464a35d59864ae1ccee29fc873310f17eeb77ff5859a22c9f411496f7b60faeea6a8f56fea96038f5f7f9&Endpoint=rravishankar@gmail.com 
    'AuthenticateOnUnsubscribe' => 'true', 
)); 
echo 'RESULT_CONFIRMATION='.serialize($result); 
?> 

Sunday, December 27, 2015

Q: Given a circular linked list, find the longest sequence of numbers, where a sequence starts with any number, for example 3, and ends when you see that number again, i.e., another 3.
Imagine the circular linked list
3 8 9 7 2 1 3 4 6 [3] (the bracketed element wraps around to the first element, i.e., 3)
The longest sequence would be 3 8 9 7 2 1, the other candidate being 3 4 6.
A span starting at 8, for instance, and wrapping all the way around to the same 8 wouldn't count as a valid sequence.
Courtesy : CareerCup
A:
We build a hashtable of the last encountered index of each element in the list. If an element is seen again, we calculate the difference between the new index and the previous one; as long as it is less than the total count of elements in the list and more than the best difference found so far, we update the difference and note down the start. After all the updates are made, we have a start index and the largest difference between two occurrences of the same element.

void PrintLongestSequence(List<int> numbers)
{
    int start = 0;
    int length = 0;
    // last encountered index of each element
    var h = new Dictionary<int, int>();
    for (int i = 0; i < numbers.Count; i++)
    {
        if (h.ContainsKey(numbers[i]))
        {
            int previous = h[numbers[i]];
            int difference = i - previous;
            // keep the longest span that does not cover the whole list
            if (difference > length && difference < numbers.Count)
            {
                start = previous;
                length = difference;
            }
        }
        h[numbers[i]] = i;
    }
    Console.WriteLine();
    for (int i = start; i < start + length; i++)
        Console.Write("{0} ", numbers[i]);
    Console.WriteLine();
}
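The C# above treats the list as linear, so a longest sequence that wraps past the stored end would be missed. A Python sketch of the same hashtable idea that handles the circular case by scanning the elements twice with modular indexing, while still rejecting spans that cover the whole list, might look like this:

```python
def longest_sequence(numbers):
    """Longest run that starts and ends at two occurrences of the same
    value in a circular list; spans covering the whole list don't count."""
    n = len(numbers)
    last_seen = {}
    start, length = 0, 0
    for i in range(2 * n):            # second pass covers wrap-around spans
        v = numbers[i % n]
        if v in last_seen:
            diff = i - last_seen[v]
            if length < diff < n:     # strictly shorter than the whole list
                start, length = last_seen[v], diff
        last_seen[v] = i
    return [numbers[j % n] for j in range(start, start + length)]
```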

Saturday, December 26, 2015

Given an 8 x 8 matrix, find all possible paths, moving one cell downwards or one cell to the right (one cell per movement) from cell (0,0) to cell (7,7)

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace MatrixWalk
{
    class Program
    {
        const int rows = 8;
        const int cols = 8;
        static void Main(string[] args)
        {
            var path = new List<Tuple<int,int>>();
            int totalPaths = 0;
            GetPaths(0, 0, ref path, ref totalPaths);
        }
        public static void GetPaths(int x, int y, ref List<Tuple<int, int>> path, ref int totalPaths)
        {
            if (x < rows && y < cols)
                path.Add(Tuple.Create(x, y));
            if (x == rows-1 && y == cols-1)
            {
                totalPaths += 1;
                PrintPath(path, totalPaths);
            }
            if (x + 1 < rows)
                GetPaths(x + 1, y, ref path, ref totalPaths);
            if (y + 1 < cols)
                GetPaths(x, y + 1, ref path, ref totalPaths);
            path.RemoveAt(path.Count - 1);
        }
        public static void PrintPath( List<Tuple<int, int>> path, int totalPaths)
        {
            Console.WriteLine("-------------------------------");
            Console.WriteLine("Path number: {0}", totalPaths);
            Console.WriteLine("Path Size: {0}", path.Count);
            System.Diagnostics.Debug.Assert(path.Count == rows+cols-1);
            for (int i = 0; i < path.Count; i++)
            {
                Console.Write("[{0},{1}]", path[i].Item1, path[i].Item2);
            }
            Console.WriteLine();
            Console.WriteLine("-------------------------------");
        }
    }
}
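As a sanity check on the enumeration above: every path from (0,0) to (7,7) consists of exactly 7 down moves and 7 right moves in some order, so the number of paths is the binomial coefficient C(14, 7). A quick check in Python:

```python
from math import comb

# 14 moves total; choose which 7 of them are "down"
total_paths = comb(14, 7)
print(total_paths)  # 3432
```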

Friday, December 25, 2015

Today we continue discussing the DMB algorithm from the paper titled Optimal Distributed Online Prediction Using Mini-Batches by Dekel, Gilad-Bachrach, Shamir and Xiao. One of the theorems they propose bounds the expected regret over m samples.
The theorem is stated this way: let f(w, z) be an L-smooth convex loss function in w for each input in the stream, and assume that the stochastic gradient computed at each input zi has variance bounded by sigma squared for all predictions. If the update rule phi has the serial regret bound psi as a function of this variance and m, then the expected regret of the algorithm over m samples is at most (b + mu) times psi evaluated at sigma squared over b and the ceiling of m over (b + mu).

Specifically, if psi is the function 2D²L + 2Dσ√m, which was established earlier as the bound for the ideal serial solution, and we set the batch size b = m^ρ for any ρ in the range (0, 1/2), the expected regret bound is 2Dσ√m plus a lower-order term.
This bound is asymptotically equal to the ideal serial bound, and even the constants in the dominant terms are identical. The bound scales nicely with the network latency and the cluster size k, because mu, which scales logarithmically with k, does not appear in the dominant √m term. In contrast, the naive no-communication approach gives a bound worse than this one by a factor of √k.
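Transcribing the statements above into formulas (my reconstruction in the paper's notation, with D the diameter of the feasible set, L the smoothness constant, σ² the gradient variance, b the batch size and μ the extra inputs per batch):

```latex
% Serial regret bound assumed for the update rule \phi:
\psi(\sigma^2, m) = 2 D^2 L + 2 D \sigma \sqrt{m}

% Theorem: expected regret of DMB over m samples:
\mathbb{E}\,[R(m)] \le (b + \mu)\,\psi\!\left(\frac{\sigma^2}{b},\;\Big\lceil \frac{m}{b+\mu} \Big\rceil\right)

% Choosing the batch size b = m^{\rho} with \rho \in (0, 1/2):
\mathbb{E}\,[R(m)] \le 2 D \sigma \sqrt{m} + o(\sqrt{m})
```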

#codingexercise
Question: Find if two binary trees have common subtrees.
Answer: We find the inorder traversals of the trees as strings and then find common substrings in the two representations.
        static void FindCommonSubTrees(Node tree1, Node tree2)
        {
            var substrs = GetSubstrings(GetInOrder(tree1));
            foreach (var str in substrs)
                Console.WriteLine(str);
            var otherstrs = GetSubstrings(GetInOrder(tree2));
            foreach (var str in otherstrs)
                Console.WriteLine(str);
            var intersection = substrs.Intersect(otherstrs);
            foreach (var str in intersection)
                Console.WriteLine(str);
        }
        static List<string> GetSubstrings(string input)
        {
            var ret = new List<string>();
            // collect every substring of length 1 through input.Length
            for (int len = 1; len <= input.Length; len++)
            {
                for (int pos = 0; pos + len <= input.Length; pos++)
                {
                    ret.Add(input.Substring(pos, len));
                }
            }
            return ret;
        }

Thursday, December 24, 2015

int GetFirstEvenCountOccurrence(List<int> numbers)
{
    // count of occurrences of each element
    var countable = new Dictionary<int, int>();
    for (int i = 0; i < numbers.Count; i++)
    {
        if (countable.ContainsKey(numbers[i]))
            countable[numbers[i]] += 1;
        else
            countable.Add(numbers[i], 1);
    }
    for (int i = 0; i < numbers.Count; i++)
    {
        if (countable[numbers[i]] % 2 == 0)
            return numbers[i];
    }
    return -1;
}

Today we continue discussing the DMB algorithm where a deterministic update rule is applied in a distributed environment. This technique resembles the serial mini-batch technique described earlier and is therefore called the distributed mini-batch algorithm (DMB).
The algorithm processes the input stream in batches j = 1, 2, ..., where each batch contains b + mu consecutive inputs, b being the batch size and mu the number of additional inputs. In each batch, all of the nodes use a common predictor wj. While observing the first b inputs in a batch, the nodes calculate and accumulate the stochastic gradients of the loss function f at wj. Once the nodes have accumulated b gradients altogether, they start a distributed vector-sum operation to calculate the sum of these b gradients. While the vector-sum operation completes in the background, mu additional inputs arrive and the system keeps processing them using the same predictor wj; the gradients for these additional mu inputs are discarded. Once the vector sum completes, each node holds the sum of the b gradients collected during batch j. Each node divides this sum by b to obtain the average gradient, denoted gj-bar, and feeds it to the update rule phi, which returns a new synchronized predictor wj+1. While only the first b gradients (b/k per node) are used to compute the next predictor, all b + mu inputs are counted in the regret calculation. If the DMB algorithm is applied to a spanning tree in the network, the aggregation is performed at the root and the result broadcast.
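A toy, single-process sketch of one DMB batch (Python; the gradient split across k nodes is simulated with list slices, and a plain gradient step stands in for a generic update rule phi, so everything other than the b/mu bookkeeping is illustrative):

```python
def dmb_batch(w, inputs, grad, k, b, mu, step=0.1):
    """Process one batch of b + mu inputs with a common predictor w:
    the k nodes accumulate gradients for the first b inputs, the
    gradients of the mu extra inputs are discarded, and the averaged
    gradient drives one synchronized update."""
    assert len(inputs) == b + mu
    # each "node" accumulates gradients for its share of the first b inputs
    node_sums = [sum(grad(w, z) for z in inputs[n:b:k]) for n in range(k)]
    g_bar = sum(node_sums) / b   # distributed vector sum, then average by b
    return w - step * g_bar      # stand-in update rule: one gradient step

# toy stream with loss f(w, z) = (w - z)^2, so grad(w, z) = 2 (w - z)
grad = lambda w, z: 2.0 * (w - z)
w = dmb_batch(0.0, [1.0] * 10, grad, k=2, b=8, mu=2)
```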

Wednesday, December 23, 2015

Today we continue with the paper Optimal Distributed Online Prediction Using Mini-Batches by Dekel, Gilad-Bachrach, Shamir and Xiao. We now review the distributed mini-batch approach for stochastic online prediction, where the authors show that the mini-batch idea can be exploited to obtain nearly optimal regret bounds. They begin with the assumption that any communication over the network incurs a latency. The network is an undirected graph G over the set of nodes, where each edge represents a bi-directional network link. The latency incurred between u and v is proportional to the graph distance between them, and is largest at the diameter of the graph. The network is also assumed to have limited bandwidth, but instead of quantifying it, the authors require that the communication load at each node remain constant and not grow with the number of nodes k or with the rate at which the incoming functions arrive.
 The communication model used with this network allows a distributed vector-sum operation. This operation begins with each node j holding a vector vj, and ends with each node holding the sum of the vectors v1 through vk. The messages are transmitted along a rooted minimum-depth spanning tree: the leaves send their vectors to their parents, who aggregate them and in turn send the result to their parents, until it reaches the root. The root computes the total sum and broadcasts it back to all the nodes.
 There are two advantages to this model. First, each up-link and each down-link in the spanning tree is used only once, so vector-sum operations can be started back to back and run concurrently without creating network congestion on any edge of the tree. Second, these network operations are non-blocking: each node can continue processing inputs while the vector-sum operations take place in the background. This lets us overcome the restriction imposed by network latency.
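The aggregation along the spanning tree can be sketched as a recursion (Python; the tree shape and vectors are illustrative): each node adds its children's aggregates to its own vector, the root ends with the full sum, and the broadcast back down the same tree leaves every node holding that total.

```python
def tree_vector_sum(tree, node, vectors):
    """Return node's vector plus the aggregates of all its children;
    called at the root, this yields the sum of every node's vector,
    which the root would then broadcast back down the tree."""
    total = list(vectors[node])
    for child in tree.get(node, []):
        child_sum = tree_vector_sum(tree, child, vectors)
        total = [a + b for a, b in zip(total, child_sum)]
    return total

# illustrative 4-node tree rooted at 0:  0 -> {1, 2},  1 -> {3}
tree = {0: [1, 2], 1: [3]}
vectors = {0: [1, 1], 1: [2, 0], 2: [0, 3], 3: [4, 4]}
total = tree_vector_sum(tree, 0, vectors)  # [7, 8]
```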
The DMB algorithm is a distributed mini-batch algorithm that runs in parallel on each node in the network, again utilizing the summation form we discussed with the communication model above.
#codingquestion
Given an integer array, find the first element with an even number of occurrences.
First of all, this problem is not the same as finding the first duplicate, because the total number of occurrences of that duplicate can still be odd. Hence we must process all the numbers in the array into a count table, with elements as keys and their counts of occurrences as values; a hashtable with constant-time lookup helps here. Given that the count of occurrences can be arbitrary, we must safeguard against edge conditions such as no element having an even count. The first pass over the integer array populates the count table, and the second pass checks for the first element with an even count. To conserve space, we could use a bitmap where the bit at the index given by the element is toggled to track parity, but we would then need another data structure to say whether an element was ever encountered.