Thursday, December 31, 2015

An interesting problem solving question.
This questions has been adapted from a codejam session.
Let us take a hacker who wants to identify passwords from fingerprints.
There are M keys with fingerprints and N letters in the password.
The hacker knows both M and N but there are different permutations for the password and the hacker does not know the order in which the fingerprints were made.
Take for instance M = 3 and  N = 4 and the fingerprints on the numbers 3,5 and 7.
This means that the following passwords are possible 3357, 3557 and 3577.
and the corresponding permutations.
The permutations for 3357 are 4! / 2  = 12 in number because two of the letters are identical and so the resulting permutations will be the same.
Since any of the M characters can appear in duplicate to make  a set of 4 in the password, we have 3 * 12 = 36 possible passwords with this configuration.
The permutations can be generated by a sample code below
Void Permute (String a, StringBuilder b, bool[] used) 
{ 
  If ( b.Length == a.Length { print b.ToString(); return;} 
  For (int I = 0; I < a.Length ; i++) 
  { 
    If (used[i]) continue; 
     used[i] = true; 
     B += A[i]; 
     Permute(a, b, used); 
     B [B.Length – 1] = ‘/0’; 
     used[i] = false; 
  } ms
} 
But let us look at how to find out the number of such permutations for different values of M and N
if M = 3 and N = 5, then we know that two letters are repeated or one letter appears thrice.
In general, repetitions are taken care of by dividing the permutation by the factorial of  the number of objects that are identical.
33557  = 5!/(2!2!) = 120/4 = 30
33357  = 5!/3! = 120/3! = 20
In the latter case 5 and 7 can each occupy one of the five positions times the remaining four positions for the other
now the above two cases can be repeated with 5 and 7 as well so we have a total of 3 * (30+20) = 150 cases.
As N increases we have further breakouts of the difference between M and N.
If M = 3 and N = 6, then we can have three letters repeating twice, two letters with one repeating thrice and another repeating twice and one letter or one letter repeating four times.

#billing in AWS
ImageType to vCPU

$imageTypeVCPU = array(
"t2.nano" => 1,
"t2.micro" => 1,
"t2.small" => 1,
"t2.medium" => 2,
"t2.large" => 2,
"m4.large" => 2,
"m4.xlarge" => 4,
"m4.2xlarge" => 8,
"m4.4xlarge" => 16,
"m4.10xlarge" => 40,
"m3.medium" => 1,
"m3.large" => 2,
"m3.xlarge" => 4,
"m3.2xlarge" => 8,
"c4.large" => 2,
"c4.xlarge" => 4,
"c4.2xlarge" => 8,
"c4.4xlarge" => 16,
"c4.8xlarge" => 36,
"c3.large" => 2,
"c3.xlarge" => 4,
"c3.2xlarge" => 8,
"c3.4xlarge" => 16,
"c3.8xlarge" => 32,
"r3.large" => 2,
"r3.xlarge" => 4,
"r3.2xlarge" => 8,
"r3.4xlarge" => 16,
"r3.8xlarge" => 32,
"g2.2xlarge" => 8,
"g2.8xlarge" => 32,
"i2.xlarge" => 4,
"i2.2xlarge" => 8,
"i2.4xlarge" => 16,
"i2.8xlarge" => 32,
"d2.xlarge" => 4,
"d2.2xlarge" => 8,
"d2.4xlarge" => 16,
"d2.8xlarge" => 36,
);

Sample bill generated with : https://github.com/ravibeta/PHPSamples/blob/master/billing.php

Wednesday, December 30, 2015

Today we review another coding question :
We have a grid with R rows and C columns in which every entry is either 0 or 1. We are going to perform N operations on the grid, each of which is one of the following:


  • Operation M: Change a number in one cell of the grid to 0 or 1
  • Operation Q: Determine the number of different connected regions of 1s.
    A connected region of 1s is a subset of cells that are all 1, in which any cell in the region can be reached from any other cell in the region by traveling between cells along edges (not corners).
This is the same as the problem I solved in the earlier post.

we solved it by identifying connected regions where a set of 1s appear together. The only variation here is that if two cells of 1 are adjacent only diagonally and not horizontally or vertically then they are not connected. There we identified the contiguous regions as extending the rightmost corner of the rectangle, here we bound the connected region by exploring top then right then bottom edges. A cell can be said to be in an already identified connected region if we can reach it from another point in the region horizontally or vertically where it shares one of the coordinates.

Now to continue with the AWS billing discussion, we were saying we could even prepare the bill based on allocations not usage. We get information on allocations as follows:
<?php
require 'vendor/autoload.php';
use Aws\CloudWatch\CloudWatchClient;

$key = "";
$secret = "";
$region = "us-west-2";
$version = "latest";
$interval = 15;

// Users/rajamani/Downloads/devserver/bill/vendor/aws/aws-sdk-php/src

// Use the us-west-2 region and latest version of each client.
$sharedConfig = [
    'region'  => $region,
    'version' => $version,
  'credentials' => array(
    'key' => $key,
    'secret'  => $secret,
  ),
    'key'    => $key,
    'secret' => $secret,
];

// Create an SDK class used to share configuration across clients.
$sdk = new Aws\Sdk($sharedConfig);
$client = $sdk->createEc2();
  date_default_timezone_set('America/Los_Angeles');
$result  = $client->describeInstances(array(
));
echo 'RESULT='.serialize($result);

$result = (array)$result;
foreach( $result as $reservations)
{
 echo 'RESER='.serialize($reservations)."\n";
 foreach($reservations['Reservations'] as $reservation)
 {
   echo 'RESERVATION='.serialize($reservation)."\n";
   foreach($reservation["Instances"] as $instance)
   {
        $attributes = $client->describeInstanceAttribute(array(
    'DryRun' => false,
    // InstanceId is required
    'InstanceId' => $instance["InstanceId"],
    // Attribute is required
    'Attribute' => 'instanceType',
));
       echo 'ATTRIBUTES='.serialize($attributes)."\n";
   }
 }

}

?>

Tuesday, December 29, 2015

Openstack billing
Ceilometer measures the usage of resources in terms of disk usage, network usage and CPU usage. For each instance the usage details for following meters will be shown: 
  1. Disk Read Bytes 
  1. Disk Write Bytes 
  1. Disk Write Requests 
  1. Disk Read Requests 
  1. Network outgoing bytes 
  1. Network incoming bytes 
  1. Network outgoing packets 
  1. Network incoming packets 
  1. CPU 
 
Ceilometer arranges all the data in terms of resources and their meters. 
We can get a list of all the meters using  
GET /v2/resources/ and  
GET /v2/resources/(resource_id)  
 Where resource contains links such as a url to identify itself and a url for its meters such as http://localhost:8777/v2/meters/volume?q.field=resource_id&q.value=bd9431c1-8d69-4ad3-803a-8d4a6b89fd36 
  
GET /v2/meters/ that returns all know meters and  
GET /v2/meters/(meter_name) which even come with statistics 
GET /v2/meters/(meter_name)/statistics 
  
Samples and statistics can also be drawn from  
GET /v2/samples 
And GET /v2/            samples/(sample_id) 
 
Tables can be created in the following manner: 
RatePlan
  1. Id 
  1. Name  
  1. CPU cost per second 
  1. Disk cost per MB 
  1. Network cost per MB 
  1. Cost for flavors 

Table for instance-bill will have the following attributes 
  1. Resource_id 
  1. Rateplan_id 
  1. Date and time at which the bill was calculated 
  1. Resource_bill 
Openstack provides three  types of meters:  
Cumulative -  those that increase over time 
Delta -  those that change over time  
and Gauge - those that are discrete items and fluctuating values 
When we pick the metrics for compute and storage, we are typically interested in the first and the last of the above. These will come in helpful to provide information on an instance basis. For example, compute has instance, number of vcpus and memory as gauge and cpu time as cumulative. This theme is common to Openstack as well as Amazon as we have seen.

#codingexercise
Given pairs of numbers which correspond to number of unit-heights on  two vertical lines  and the pair representing starting and ending points for a connecting line, find the number of intersection points in the list of pairs. At every intersection points there are only two connectors. There is only one connector at each starting or ending point

int GetNumberOfIntersections(List<Tuple<int,int>> pairs)
{
int total = 0;
for (int i = 0; i < pairs.Count; i++)
{
if (pairs[i].Item1 > pairs[i].Item2)
{
int start = pairs[i].Item2;
int end = pairs[i].Item1;
int count = 0;
for (int j = 0; j< pairs.Count; j++)
{
if (j == i ) continue;
if (pairs[j].Item2 >= pairs[start] &&
     pairs[j].Item2 < pairs[end])
    count++;
}
total += count;
}
else
{
int start = pairs[i].Item1;
int end = pairs[i].Item2;
int count = 0;
for (int j = 0; j< pairs.Count; j++)
{
if (j == i) continue;
if (pairs[j].Item1 > pairs[start] &&
     pairs[j].Item1 <= pairs[end])
    count++;
}
total += count;
}
return total;
}

 

Monday, December 28, 2015

AWS Metrics & Billing continued: 
AWS Metrics cited so far included the following: 
CPUCreditUsage This metric is available at a 5 minute frequency and are available as counts. This metric identifies the amount of time during which the physical CPUs were used for processing instructions by virtual CPUs allocated to the instance.  
Multiplying the count with the duration we get the total instance-hours for that instance 
CPUCreditBalance – This is the number of CPU credits that an instance has accumulated. This metric is used to determine how long an instance can burst beyond its baseline performance 
Like CPUCreditUsage – This metric is available at a 5 minute frequency and are available as counts. Multiplying the count with the duration we get the total instance-hours  of credit for that instance.  
CPUUtilization – This is the percentage of allocated EC2 compute units that are currently in use on the instance. This metric identifies the processing power required to run an application upon a selected instance. It is measured in percents. 
This data can be filtered based on the InstanceID that is specific to the identified instance or InstanceType that encompasses all instances running with the specified instance type.  This helps you categorize the data by the type of the running instance. There are other filters available as well but these suffice as examples for individual instances and groups. 

 Users need their own access keys to make programmatic calls to AWS from the AWS Command Line Interface (AWS CLI), Tools for Windows PowerShell, theAWS SDKs, or direct HTTP calls using the APIs for individual AWS services. To fill this need, you can create, modify, view, or rotate access keys (access key IDs and secret access keys) for IAM users.
IAM users can be added.
If we look at the bills generated by AWS, they typically have a cost breakout like this: 
AWS Service Charges: 
 Elastic Compute Cloud: 
    US West (Oregon) Region 
     Amazon Elastic Compute Cloud Running Linux/UNIX 
         $0.00 per Linux t2.micro instance-hour (or partial hour)      261 Hrs     $0.00 
         Total:                                                                                                                            $0.00 
     EBS 
        $0.00 per GB-month of General Purpose (SSD)                  2.785 GB-Mo    $0.00 
        Total:                                                                                                                             $0.00 
  Region Total:                                                                                                                     $0.00 
CT to be collected:                                                                                                             $0.00 
GST to be collected:                                                                                                           $0.00 
US Sales Tax to be collected:                                                                                          $0.00 
VAT to be collected:                                                                                                          $0.00 
Note that this bill is specific to User to region and categorized by compute or storage. It can show further drill down as reports based on instances and instance groups. 
With SNS notifications we can populate a table for staging usage data that can have the following attributes 
- Customer 
- Region 
- Service (ECS/EBS for now) 
- dimension  
- Instance (optional and if applicable) 
- Volume (optional and if applicable) 
- Current ( value ) 
- Cumulative (value) 
- min ( applicable only for current in the absence of historical data) 
- max ( applicable only for current in the absence of historical data) 
- avg ( applicable only for current in the absence of historical data) 
- units 
- timestamp/sequence-counter  ( optional for historical data) 
- Created by 
- Created Date 
Notice that this staging table has a few salient points 
 - It gathers usage data but can also be simplified to allocation data. For example we need not collect the metrics provided by the AWS CloudWatch but only use the vcpus allocated for each instance. This way we can directly calculate the bill based on vcpus allocated and timeslices. 
 - It attempts to be as granular as possible for its metric as available from the source usage data so that services can aggregate and group by as appropriate even though there might be similar features from AWS itself. For example, we may have a way to get data based on instance groups but the records only show per instance with instance details stored in a separate table. Otherwise if the instance details are not available, then use a generic dimension for storing relevant grouping factors. 
- Historical data need not be preserved as long as current and cumulative are available. This way we simplify the table even more. If contiguous timeslices data are available we can populate them as historical data so the attributes min, max and avg are not necessary. 
  
This staging data is filled by a service reading the AWS SNS as a web server or any consumer that subscribes to those messages. 
The populated data can then be used to display the bills for each customer when they login. 
Sample subscription setup: 
<?php 
require 'vendor/autoload.php'; 
use Aws\CloudWatch\CloudWatchClient; 
$key = "your_key"; 
$secret = "your_secret"; 
$region = "us-west-2"; 
$version = "latest"; 
$interval = 15;   
// Use the us-west-2 region and latest version of each client. 
$sharedConfig = [ 
    'region'  => $region, 
    'version' => $version, 
  'credentials' => array( 
    'key' => $key, 
    'secret'  => $secret, 
  ), 
    'key'    => $key, 
    'secret' => $secret, 
]; 
  
// Create an SDK class used to share configuration across clients. 
$sdk = new Aws\Sdk($sharedConfig); 
$client = $sdk->createSns(); 
$result  = $client->subscribe(array( 
    // TopicArn is required 
    'TopicArn' => 'arn:aws:sns:us-west-2:792498758900:BillingMetrics', 
    // Protocol is required 
    'Protocol' => 'email', 
    'Endpoint' => 'rravishankar@gmail.com', 
)); 
echo 'RESULT='.serialize($result); 
$result = $client->confirmSubscription(array( 
    // TopicArn is required 
    'TopicArn' => 'arn:aws:sns:us-west-2:792498758900:BillingMetrics', 
    // Token is required 
    'Token' => '2336412f37fb687f5d51e6e241d7700bdc2fd53a7a19f21d5f50d0cbe0f8a467d10f83122e7eadb59d60bcfd8434754522c8c944e52b3592cd2a0b7f5cdaab36e1c7aad614f57e70ad78fa2e4ece8bb6a9b7b13fec0464a35d59864ae1ccee29fc873310f17eeb77ff5859a22c9f411496f7b60faeea6a8f56fea96038f5f7f9', 
//https://sns.us-west-2.amazonaws.com/confirmation.html?TopicArn=arn:aws:sns:us-west-2:792498758900:BillingMetrics&Token=2336412f37fb687f5d51e6e241d7700bdc2fd53a7a19f21d5f50d0cbe0f8a467d10f83122e7eadb59d60bcfd8434754522c8c944e52b3592cd2a0b7f5cdaab36e1c7aad614f57e70ad78fa2e4ece8bb6a9b7b13fec0464a35d59864ae1ccee29fc873310f17eeb77ff5859a22c9f411496f7b60faeea6a8f56fea96038f5f7f9&Endpoint=rravishankar@gmail.com 
    'AuthenticateOnUnsubscribe' => 'true', 
)); 
echo 'RESULT_CONFIRMATION='.serialize($result); 
?>