Wednesday, December 17, 2014

In the post before last, I talked about securing AWS S3 artifacts for a corporate subscriber.

The following are the mechanisms available to implement a corporate governance strategy:
IAM Policy:
These are global policies attached to IAM users, groups, or roles, which are then subject to the permissions we have defined. They apply to principals, and hence in their definition the application to a principal is implicit and usually omitted.
For programmatic access to the resources created by Adobe users, we will require an account that has unrestricted access to those resources. Creating such a user lets us request the AWS Access Key and AWS Secret that are necessary for programmatic access.
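As a sketch, the IAM policy for such a governance account could look like the following (the bucket name is a placeholder); note that, as described above, no principal appears in the policy itself:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "GovernanceFullAccess",
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::my_corporate_bucket",
        "arn:aws:s3:::my_corporate_bucket/*"
      ]
    }
  ]
}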

Bucket Policy:
These are policies that are attached to the S3 buckets themselves. They are S3-specific, as opposed to IAM policies, which apply across all AWS usage by the principals. Bucket policies are applied one per bucket. A sample bucket policy looks like this (the principal has to be given as a full IAM ARN; the account id below is a placeholder):
{
  "Version": "2012-10-17",
  "Id": "Policy1418767581422",
  "Statement": [
    {
      "Sid": "Stmt1418761739258",
      "Action": "s3:*",
      "Effect": "Allow",
      "Resource": "arn:aws:s3:::*",
      "Principal": {
        "AWS": [
          "arn:aws:iam::<account-id>:user/rravishankar"
        ]
      }
    }
  ]
}
Notice the mention of a principal and a resource in this bucket policy. The ARN is specified so that it covers all the resources we want to govern. Note that these resource names have to be specified with wildcards, for example:
arn:aws:s3:::my_corporate_bucket/*
arn:aws:s3:::my_corporate_bucket/Development/*
 

ACLs:
These are applied by the user or admin to specific folders or objects within a bucket, or to the bucket itself, and give fine-grained control over individual resources. The access control can be expressed in terms of pre-canned ACLs such as:
'private', 'public-read', 'public-read-write', 'authenticated-read', 'bucket-owner-read', 'bucket-owner-full-control'
Or grants can be made explicitly to an individual, group, or e-mail based principal with permissions such as:
FULL_CONTROL | WRITE | WRITE_ACP | READ | READ_ACP
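As a sketch with the AWS SDK for .NET (the bucket and key names here are hypothetical), a canned ACL can be applied like this:
using Amazon.S3;
using Amazon.S3.Model;

class AclExample
{
    static void Main()
    {
        var client = new AmazonS3Client();   // credentials come from the environment or config
        client.PutACL(new PutACLRequest
        {
            BucketName = "my_corporate_bucket",
            Key = "Development/report.txt",  // omit Key to set the ACL on the bucket itself
            CannedACL = S3CannedACL.BucketOwnerFullControl
        });
    }
}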

In this regard, the set of actions that the governance needs to take includes the following:
1) Create a corporate governance service principal or account.
2) Request an access key and secret for this account.
3) Set up a distribution list with storage team members for this account.
4) Prefer to have only one bucket deployed for all corporate usage.
5) Base user access on folders, as described by the prefix in the keys of their objects (see the listing sketch after this list).
6) Define a path as the base for all users, just as we do on any other file storage.
7) Permit cross-user access based on ACLs. No additional action is necessary except for provisioning a web page on the portal that lists the buckets and objects with their ACLs and applies those ACLs.
8) If step 1) is not possible or there are other complications, then have a bucket policy generated for the account we created to have full access (initially, and later reduced to changing permissions) to all the objects in that bucket.
9) Policies are to be generated from:
10) Apply the generated policies to their targets, IAM or buckets.
11) Provision the webpage.
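Here is the listing sketch referenced in step 5: a minimal example, assuming the single-bucket, folder-per-user convention above, of enumerating one user's objects by key prefix with the AWS SDK for .NET (names are placeholders):
using System;
using Amazon.S3;
using Amazon.S3.Model;

class ListUserFolder
{
    static void Main()
    {
        var client = new AmazonS3Client();
        var response = client.ListObjects(new ListObjectsRequest
        {
            BucketName = "my_corporate_bucket",  // the single corporate bucket
            Prefix = "rravishankar/"             // the user's base path
        });
        foreach (S3Object entry in response.S3Objects)
            Console.WriteLine(entry.Key);
    }
}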


#codingexercise
Decimal GetOddNumberRangeStdDev(Decimal[] A)
{
    if (A == null) return 0;
    return A.OddNumberRangeStdDev();
}
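The exercise assumes an OddNumberRangeStdDev extension method; here is one possible sketch of it, under the assumption that it means the standard deviation of the odd values in the array:
using System;
using System.Linq;

static class OddNumberExtensions
{
    // Assumed meaning: standard deviation of the odd values in the array.
    public static Decimal OddNumberRangeStdDev(this Decimal[] A)
    {
        var odds = A.Where(x => x % 2 != 0).ToArray();
        if (odds.Length == 0) return 0;
        Decimal mean = odds.Average();
        double variance = odds.Average(x => Math.Pow((double)(x - mean), 2));
        return (Decimal)Math.Sqrt(variance);
    }
}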
Today I will be reading from the WRL research report on Shared Memory Consistency Models. The topic concerns a specification of memory semantics that helps write correct and efficient programs on multiprocessors. Sequential consistency is one such model, but it is severely restricting for multiprocessors. That is why system designers introduce relaxed consistency models, but these are sometimes so different from one another that a general specification becomes altogether more complicated. In this paper, the authors describe issues related to such models. Here the models are for hardware-based shared memory systems. There are optimizations made on each system, but this paper describes the models with the same terminology. Furthermore, the paper also describes the models from the point of view of a programmer.
Shared memory models provide a single address space abstraction over parallel systems. It simplifies difficult programming tasks such as data partitioning and dynamic load distribution. Having a precise understanding of how memory behaves is important for programming. Consider a producer consumer queue where memory is written by one and read by another. If the memory gives the old value prior to the write for the reader, then there is an inconsistency between the programmer's expectation and the actual behaviour.
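To make the expectation concrete, here is a minimal sketch in C# (not from the paper) of that producer/consumer handshake; the consistency question is whether the consumer can ever observe ready as true while data still holds its old value:
using System;
using System.Threading;

class ProducerConsumer
{
    static int data = 0;
    static volatile bool ready = false;  // 'volatile' is how C# asks for ordering here

    static void Main()
    {
        var consumer = new Thread(() =>
        {
            while (!ready) { }           // spin until the producer publishes
            Console.WriteLine(data);     // the programmer expects 42, never the old 0
        });
        consumer.Start();
        data = 42;     // write the payload...
        ready = true;  // ...then signal that it is available
        consumer.Join();
    }
}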
On a single processor, a read returns the last write because instructions execute in program order, which isn't the case for multiprocessors. We can extend the notion to multiprocessors, in which case it is called sequential consistency, and reads will now be consistent because each processor executes in program order. However, this simple and intuitive model severely restricts hardware and compiler optimizations.


A consistency model like the one described is an interface between programmability and system design. The model affects programmability because programmers use it to reason about the correctness of their programs. It affects performance because system designers use it to determine which optimizations are allowed in hardware and software.

Tuesday, December 16, 2014

In today's post we talk about security and policies as applicable to AWS S3 artifacts. Policies are mappings that dictate which users have access to which resources. Typically, policies are declared and applied to containers. In the .NET security model, for instance, there is mention of different tiers of policies, including an enterprise-level policy, which comes in handy for corporate accounts. Policies don't necessarily have to be expressed the same way on different systems, but there is a reason for the consistency we see across many of them. For example, permissions set at a folder level flow by inheritance to its contents. Another common principle is cross-account authorization: the owner of a folder may grant another user permission to read and write it. Policies and their applications are implemented differently, but there is a lot in common in their design, because the requirements they serve are common. Coming back to S3 artifacts, there are two ways enterprise policies can be applied.
One is to create a corporate-level bucket with user-level folders. The other is to apply a policy granting a corporate governance account full access to all the buckets.
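As a sketch of the first option, the user-level folders are carved out of the one corporate bucket by key prefix, for example (bucket and user names here are hypothetical):
arn:aws:s3:::my_corporate_bucket/alice/*
arn:aws:s3:::my_corporate_bucket/bob/*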

Monday, December 15, 2014

#codingexercise
Decimal GetEvenNumberRangeStdDev(Decimal[] A)
{
    if (A == null) return 0;
    return A.EvenNumberRangeStdDev();
}

Today we continue to discuss the WRL long address trace generation system. We were reviewing the effects of line size in the first and second level caches in this study and the contribution of the write buffer. We summarize the study now. This is a method to generate and analyze very long traces of program execution on RISC machines. The system is designed to allow tracing of multiple user processes and the operating system. Traces can be stored or analyzed on the fly. The latter is preferred because the storage required is very large even for a very short duration; besides, it allows the trace to be compacted and written to tape. The system requires that the programs be linked in a special way.
From the trace analysis using this system, it was found that while a second-level cache is necessary, very large second-level caches provide little or no additional benefit; a program with a large working set can, however, benefit from a large cache. Block size plays a big role in the first-level instruction cache, but sizes above 128 bytes had little effect on overall performance. If the proportion of writes in the memory references is large, then the contribution of the write buffer to the CPI is significant. There is little or no benefit from associativity in a large second-level cache.

#codingexercise
Decimal GetEvenNumberRangeVariance(Decimal[] A)
{
    if (A == null) return 0;
    return A.EvenNumberRangeVariance();
}


Decimal GetEvenNumberRangeMode(Decimal[] A)
{
    if (A == null) return 0;
    return A.EvenNumberRangeMode();
}

Today we will continue to discuss the WRL system for trace generation. We discussed the effects of direct-mapped versus associative second-level caches. We now review line size in the first and second level caches. When the first-level line size was doubled or quadrupled with a 512K second-level cache, there were reductions in the cumulative total CPI, and this was independent of the second-level size. The improvement was due to the decreased contribution of the instruction cache; the data cache contribution remained constant. The improvement from doubling the line size was the same as the improvement from increasing the second-level cache size eight times. Doubling the length of the lines in the second-level cache from 128 bytes to 256 bytes made no difference. This may be because there is too much conflict for long lines to be beneficial.
We next review the write buffer's contribution. The performance of the write buffer varied from program to program, mainly because the write buffer is sensitive to both the proportion and the distribution of writes in the instructions executed. A write buffer entry is retired every six cycles; if the writes are any more frequent, or bunched, performance degrades. The relationship between the proportion of writes in a program and write buffer performance is clear: the CPI contributed by the write buffer shows a corresponding jump.
There was a case where the percentage of writes was frequently in the 15-20% range yet the CPI contributed by the write buffer was usually zero. If the writes are uniformly distributed below a threshold, the write buffer never fills and a subsequent write is never delayed; above the threshold, there may be some delays. Since the second-level cache is pipelined into three stages, the processing of a single miss gives the write buffer a chance to retire two entries. If enough misses are interspersed between writes, the write buffer may work smoothly. Thus the first-level data cache miss ratio is the third factor that comes into play.
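To make the arithmetic concrete, here is a small sketch, not from the report, that simulates the behaviour just described: one entry retires every six cycles, so writes spaced more widely than that never stall, while more frequent or bunched writes fill the buffer and get delayed. The buffer depth and write spacing are assumptions.
using System;

class WriteBufferSim
{
    static void Main()
    {
        const int depth = 4;         // assumed buffer depth
        const int retireEvery = 6;   // one entry retires every six cycles
        int occupancy = 0, stalls = 0;

        for (int cycle = 1; cycle <= 600; cycle++)
        {
            if (cycle % retireEvery == 0 && occupancy > 0) occupancy--;
            bool isWrite = cycle % 3 == 0;  // a write every 3 cycles: faster than retirement
            if (isWrite)
            {
                if (occupancy == depth) stalls++;  // buffer full: this write is delayed
                else occupancy++;
            }
        }
        Console.WriteLine("stall cycles: " + stalls);
    }
}
With writes every 6 or more cycles the stall count stays at zero; at every 3 cycles, the buffer fills and stalls accumulate, mirroring the threshold behaviour above.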
Although this report doesn't study it, the effects of entropy on the cache distribution hierarchy could be relevant. The Shannon entropy is defined as H(X) = -Σ P(x) log2 P(x).

Sunday, December 14, 2014

Data forwarders rely on one of the following means:
file monitoring
pipe monitoring
network port monitoring
settings/registry monitoring
etc.
However, there are some protocols that also exchange data, such as ftp, ssh, and http, and although these run over network ports, they have to be filtered out.
Take ssh, for example: if there were a daemon that could forward relevant information from a remote computer via ssh, then we wouldn't need a data forwarder on the remote machine at all, and the footprint of the forwarder would be even smaller. A forwarder's reach can thus be considered extended by ssh data forwarders.
Now let us consider the ssh forwarder code. This may be set up as a listener script so that, on every incoming connection using a particular login, we echo the relevant data back. In this case, the script can be as small as a bash script that reads up the relevant data, spews it out as a response to the incoming connection, and closes the connection.
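Here is a minimal sketch of such a responder in C#; the text imagines a bash script, but any program that writes the relevant data to stdout works, since ssh carries stdout back to the caller. The monitored file path is a hypothetical placeholder, and wiring the program up on the remote side (for example as the login shell of a dedicated account) is assumed rather than shown.
using System;
using System.IO;

class SshResponder
{
    static void Main()
    {
        // Read the relevant data and echo it back over the ssh channel (stdout).
        foreach (string line in File.ReadLines("/var/log/app/metrics.log"))
            Console.WriteLine(line);
    }
}
The monitoring side can then poll with something like ssh forwarder@remotehost and read whatever comes back on stdout.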
Since the data is forwarded only on an ssh connection, this model works well with monitoring mechanisms that can poll periodically. Most monitoring mechanisms are set up either on a publisher-subscriber model or on a polling model.
The ssh connections have the additional benefit that they are generally considered more secure than plain TCP connections, because the data is not sent in the clear. Furthermore, the script deployed to the remote computer follows the usual mechanisms for securing such access, so it is likely to meet less resistance in deployment. On the other hand, consider installing a binary as part of a product package on the remote machine: that may raise a lot of questions in some cases, such as licensing, fees, etc.
Lastly, the commands that determine the data to be forwarded and send it are typical shell commands, so there is more transparency in what is deployed.
#codingexercise
Decimal GetEvenNumberRangeMax(Decimal[] A)
{
    if (A == null) return 0;
    return A.EvenNumberRangeMax();
}
#codingexercise
Decimal GetEvenNumberRangeMean(Decimal[] A)
{
    if (A == null) return 0;
    return A.EvenNumberRangeMean();
}

Saturday, December 13, 2014

Today we continue to review the WRL system for trace generation. We read up on the effect of a direct-mapped versus an associative second-level cache. When associativity is increased, the miss ratio can decrease, but overall performance may not improve and may even degrade. If the cache is very large, it is likely to have a low miss ratio already, so increasing the associativity may provide little or no benefit. Associativity helps the miss ratio, but it adds to the cost of every reference to the cache.
If we ignore this cost, we can bound the benefits of associativity by comparing direct-mapped and fully associative caches. The results showed that full associativity in the second level, measured across four different cache sizes, was minimally effective for large caches and more effective for small ones.
 A more useful measure for the benefits of associativity may be to see the effect on the total CPI as a function of the cost per reference of the associativity.
We know that the CPI for the associative case will be a delta more than the CPI for the direct-mapped case. This delta is the product of the first-level miss ratio and the difference in the average cost per reference to the second-level cache. This difference can be elaborated in terms of the cost of a hit in the associative cache (ha), the cost of a hit in a direct-mapped cache (hd), the cost of a second-level cache miss, which is the same for direct-mapped and associative (m), the miss ratio for a second-level associative cache (ma), and the miss ratio for a second-level direct-mapped cache (md).
The delta in the average cost per reference is then the difference between the average cost per reference in the associative case and in the direct-mapped case. The first term is the sum of the cost of access to the associative second level and the cost of a miss from that level: the cost of access is the cost of a hit weighted by how often the cache hits, and the miss ratio times the cost of a second-level cache miss, which is the same for direct-mapped and associative, is the cost of a miss from this level. The same elaboration holds for the second term, the direct-mapped second-level cache. Writing the costs of a hit as a ratio of one another, the delta in the average cost per reference comes out weighted by the difference between the miss ratios of the second-level associative cache and the second-level direct-mapped cache.
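Written out (a reconstruction from the definitions above, with m1 as the first-level miss ratio):
delta CPI = m1 * [ ((1 - ma) * ha + ma * m) - ((1 - md) * hd + md * m) ]
The m terms combine to (ma - md) * m, which is the weighting by the difference in the second-level miss ratios described above.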
We can now plot the graph of the cumulative CPI in both cases and see that there is a crossover. The distance between the points at which the crossover occurs is the maximum gain to be expected from the associativity.

#codingexercise
Decimal GetEvenNumberRangeMin(Decimal[] A)
{
    if (A == null) return 0;
    return A.EvenNumberRangeMin();
}