Monday, February 1, 2016

Paas, CloudFoundry and UI

User interfaces and portals for web applications have traditionally been deployed to dedicated virtual machines and remain bound to them. Migrating such applications involves manual intervention and planning. But both applications and services can benefit from CloudFoundry and PaaS: they are no longer restricted to the physical machine they were once deployed on. With this software, applications and services can be rolled from one host to another seamlessly without any loss of functionality, and they require no updates to either code or configuration when they move between hosts.
User interfaces are no exception. APIs and services can avail themselves of the advantages of PaaS, which include automated code deployments from source control (using, say, Jenkins), continuous monitoring of applications and services, consistent maintenance across applications and services, and the availability of multiple buildpacks and server software to host the applications. We will go through these benefits in detail shortly, but let us first take a look at the differing hosting requirements between an API and a UI in a PaaS.
API software usually exposes an http/https endpoint and port for connectivity. While it may involve authentication, encryption and the use of api-keys, all of these are done over a basic authentication scheme or OAuth, and they are contained within the api implementations and external service consolidators such as a gateway. A user interface, on the other hand, may have to integrate with an identity provider and authentication mechanism such as SAML or OAuth that is external and probably central to an organization, and not necessarily within the control of the applications and services framework. Fortunately these are facilitated with a route from the PaaS. The application merely has to handle the redirects from the said authentication mechanisms, and the redirects are to a specific name:port that can be ported between PaaS hosts as routes. The external ip address of the host and the port can be retrieved in code as CF_INSTANCE_ADDR and CF_INSTANCE_PORT.
Most UI code is written either as static files dropped into the /var/www folder or as server side software such as node.js or django. In either case, the provisioning of the UI can be attempted on CloudFoundry and PaaS by deploying the code and setting environment variables such as HOME and WEBDIR.
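For instance, a server-side UI can discover its externally visible address from the environment variables mentioned above. Here is a minimal sketch in Python; the helper takes the environment as a parameter (my choice, to keep it testable) and the variable names follow the CloudFoundry convention:

```python
import os

def instance_endpoint(env=os.environ):
    # CF_INSTANCE_ADDR and CF_INSTANCE_PORT are set by CloudFoundry on each
    # application instance; they may be absent when running locally.
    addr = env.get("CF_INSTANCE_ADDR")
    port = env.get("CF_INSTANCE_PORT")
    return addr, port
```

The UI can then build its redirect URLs for the external authentication mechanism from this endpoint, and the values follow the application as it is rolled between hosts.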

#codingexercise
Yesterday we discussed Hamiltonian cycles from complete graphs.
Permutations of a string:
void Permute(string a, StringBuilder b, bool[] used, List<string> edges)
{
    if (b.Length == a.Length)
    {
        // print only permutations that contain no forbidden edge
        if (!edges.Any(x => b.ToString().Contains(x))) Console.WriteLine(b);
        return;
    }
    for (int i = 0; i < a.Length; i++)
    {
        if (used[i]) continue;
        used[i] = true;
        b.Append(a[i]);
        Permute(a, b, used, edges);
        b.Length -= 1; // backtrack: remove the last character
        used[i] = false;
    }
}

Note that the forbidden edges can actually be enumerated in different combinations and excluded from the permutations before a permutation reaches full length.
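That pruning can be sketched in Python: abandon a partial permutation as soon as it already contains a forbidden edge, rather than waiting for it to reach full length. This is an illustrative sketch, not the exercise's C# code:

```python
def permute_excluding(s, forbidden):
    """Permutations of the characters of s containing no forbidden substring."""
    results = []

    def backtrack(prefix, used):
        # prune: stop as soon as a forbidden edge appears in the prefix
        if any(f in prefix for f in forbidden):
            return
        if len(prefix) == len(s):
            results.append(prefix)
            return
        for i, ch in enumerate(s):
            if not used[i]:
                used[i] = True
                backtrack(prefix + ch, used)
                used[i] = False

    backtrack("", [False] * len(s))
    return results
```

For example, `permute_excluding("abc", ["ab"])` yields only acb, bac, bca and cba, because abc and cab are pruned as soon as "ab" appears.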

Sunday, January 31, 2016

We continue discussing the paper "Shielding applications from an untrusted cloud with Haven,” written by Andrew Baumann, Marcus Peinado, and Galen Hunt. We discuss the various issues encountered. We now review  SGX optimizations that were recommended by this Haven study. One was Exception Handling and the other was Demand Loading.
When an exception is reported to the user, SGX provides the register context and information to the shield module. The shield then performs sanity-checks to ensure that the exception is valid, that it occurred within the shield module, and that it should be reported to the LibOS. When the shield prepares to deliver the exception to the LibOS, it copies the context and cause into a format defined by the Drawbridge ABI. The context is modified to run the LibOS exception handler, and Haven resumes the modified context. But the ERESUME instruction is only available outside the enclave; in fact both ERESUME and IRET are illegal in an enclave. The shield, therefore, EEXITs to a small untrusted stub that immediately resumes the enclave, restoring the context. The LibOS then runs and the application exception handlers get a chance, and they typically finish by executing IRET to restore the original context. But IRET is an illegal instruction, and so another exception is triggered. A single application exception thus results in two exceptions and eight enclave crossings. If these instructions were permitted, or replacements provided, it would be a significant performance win.
Demand loading is the other scenario where optimization has been made possible. Haven's shield loads application and LibOS binaries, but initially it did not load them lazily. A lazy load is so called because the virtual address space is reserved but the pages are allocated and filled only on first access. As loading occurs, other threads may also access the same pages. Demand paging is done using a private memory mapping before remapping pages with appropriate permissions in the final location. But SGX does not support moving an existing page, so Haven loads all the binaries up front, even before a page fault occurs. This adds overhead to startup because there may be many DLLs to be loaded, costing both time and space. An optimization would be to include an instruction that allows a new page to be both allocated and initialized with a copy of the data from the private memory mapping. Then this new page can be made available to the application. Thus demand loading can be another win.
#codingexercise
In a complete undirected graph, there are n nodes numbered 1 to n and there are k forbidden edges. Find the number of Hamiltonian cycles in this graph that don't contain the forbidden edges. Assume that order of traversal does not matter.

A Hamiltonian cycle is one which visits all the vertices exactly once. A forbidden edge that connects two vertices must not be in this cycle. The graph is complete and undirected, so every pair of vertices has an edge and every node can reach every other node. When the nodes are listed in the order of the Hamiltonian cycle, those nodes can be arranged in any way because each node can reach every other node. Consequently the problem reduces to finding the permutations of the sequence 1 to n such that the forbidden edges don't appear.

Permutations of a string:
void Permute(string a, StringBuilder b, bool[] used, List<string> edges)
{
    if (b.Length == a.Length)
    {
        // print only permutations that contain no forbidden edge
        if (!edges.Any(x => b.ToString().Contains(x))) Console.WriteLine(b);
        return;
    }
    for (int i = 0; i < a.Length; i++)
    {
        if (used[i]) continue;
        used[i] = true;
        b.Append(a[i]);
        Permute(a, b, used, edges);
        b.Length -= 1; // backtrack: remove the last character
        used[i] = false;
    }
}
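To make the reduction concrete, here is a small Python sketch that counts the Hamiltonian cycles of a complete graph while avoiding forbidden edges. It fixes node 1 as the starting point and divides by two because the order of traversal does not matter; this brute force is only feasible for small n:

```python
from itertools import permutations

def count_hamiltonian_cycles(n, forbidden):
    # forbidden is a list of (u, v) pairs; edges are undirected
    forb = {frozenset(e) for e in forbidden}
    count = 0
    for rest in permutations(range(2, n + 1)):
        cycle = (1,) + rest
        edges = [frozenset((cycle[i], cycle[(i + 1) % n])) for i in range(n)]
        if not any(e in forb for e in edges):
            count += 1
    # each cycle was counted twice, once per direction of traversal
    return count // 2
```

For n = 4 with no forbidden edges this returns (4-1)!/2 = 3, the well-known count of Hamiltonian cycles in a complete graph on four vertices.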

Saturday, January 30, 2016

Governance, Regulations and Compliance: 
Cloud computing has proliferated VMs while security standards have been trying to catch up, moving from the resource-centric IT environments of earlier times to the more datacenter-oriented environments of today. Fortunately, these standards have evolved based on how clouds are offered – public, community, private and hybrid. A public cloud has the highest risk due to lack of security control, multi-tenancy, data management concerns, limited SLAs and the lack of common regulatory controls. A community cloud has moderate risk due to multi-tenancy; however, it has less risk than a public cloud due to shared legal/regulatory compliance issues. A private cloud has the least risk due to single ownership and strong shared mission goals along with common legal/regulatory requirements. A hybrid cloud has risk that depends upon the combined models: a combination of private/community is lowest risk, while a combination of public/community poses the greatest risk. The scope also matters. A public cloud serves several hundreds of organizations. A community cloud works within the private network of two or more organizations. A private cloud is entirely internal to the organization’s private network. A hybrid cloud can have a private/community cloud entirely within the organization’s private network with spillover capacity to a public/community cloud.
The Information security governance framework is primarily Plan, Do, Check, Act cycle of continuous improvement and is comprised of seven management processes. These are strategy & planning, policy portfolio management, Risk management, management overview, Communication & outreach, compliance and performance management, awareness & training.  The management processes govern the implementation and operation of the functional processes, which vary based on the cloud environment. 
Central to the implementation of the functional processes is the scheduled sweep of resources for GRC purposes. These sweeps involve tightening the configurations of virtual machines in all forms and flavors, covering such things as network connectivity configurations and System Security Services. When a user logs into the VM, whether his password has expired or not, whether he is still an active employee or not, whether a login can be granted or not, etc., are all part of the server hardening requirements. Yet the boilerplate configurations at each virtual machine often escape the scrutiny that otherwise falls on the public cloud. When a public cloud is set up to run an Active Directory, it is usually done as a managed service. The connectivity from the virtual machines depends on their configurations. The access provider, the id provider and the change-password provider specified in the sssd configuration determine how the virtual machines enable accounts. A careful scrutiny of this configuration can itself eliminate several vulnerabilities and maintenance activities. The cost of workflows and implementations increases significantly as the ripples reach downstream systems or a later point in time. Therefore early and proactive mitigation by compliance and governance processes is immensely beneficial. When done right, it does not even require frequent changes.
#codingexercise
we discussed counting the ways to generate groups of K from N elements using recursion, as follows:
int assignGroups(int n, int k)
{
  if (k == 1) return 1; // all elements in one group
  if (n == k) return 1; // each element in its own group
  return k * assignGroups(n - 1, k) + assignGroups(n - 1, k - 1);
}
From the above, We know the k sets can have sizes in one of the following configurations
N-k+1, 1, 1,  ... 1 (k-1 times)
N-k, 2, 1  ... 1 (k-2 times )
.
.
N/k, N/k, N/k .... (k times)

These sequences of sizes don't change while the contents can change.
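The size configurations listed above are exactly the partitions of N into k positive parts. A small Python sketch (the function name is mine) can enumerate them in non-increasing order:

```python
def size_configs(n, k, max_part=None):
    """Partitions of n into exactly k positive parts, parts non-increasing."""
    if max_part is None:
        max_part = n
    if k == 1:
        # base case: one part takes whatever remains, if it fits
        return [[n]] if n <= max_part else []
    configs = []
    # the largest part can be at most n-k+1 (the others need 1 each)
    for first in range(min(n - k + 1, max_part), 0, -1):
        for rest in size_configs(n - first, k - 1, first):
            configs.append([first] + rest)
    return configs
```

For example, `size_configs(5, 2)` returns `[[4, 1], [3, 2]]`, matching the N-k+1-first ordering above.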

But we know we can generate combinations of any length m for one of the sets in k with:
void Combine(List<Point> a, List<Point> b, int m, int start, int level)
{
    for (int i = start; i < a.Count; i++)
    {
        b[level] = a[i];
        if (level + 1 == m) Print(b);
        if (i < a.Count - 1)
            Combine(a, b, m, i + 1, level + 1);
        b[level] = null;
    }
}
In fact we already know all the combinations of all sizes from above. This helps us make one set in k. The remaining sets can be made with remaining elements with the same code.
So instead of taking permutations, we could also solve this problem exclusively with combinations.




Thursday, January 28, 2016

We continue discussing the paper "Shielding applications from an untrusted cloud with Haven,” written by Andrew Baumann, Marcus Peinado, and Galen Hunt. We discussed the performance study earlier. Today we discuss the trusted computing base and the various issues encountered. The LibOS includes a large subset of Windows, but all the code is in the enclave. Therefore it is all in the user's trusted computing base, and its sanctity can be maintained by scanning for malware, code inspection and other such means. This allows the application to move from a private data center to a public cloud. In this regard, Haven addresses two real threats: a malicious employee of the cloud provider with either admin privileges or hardware access, and a government subpoena. Finally, Haven also relies on the processors' correctness.
There is some future work planned for Haven. Haven does not currently prevent rollback of filesystem state beyond the enclave's lifetime, so it cannot avoid the following attack: the enclave is terminated (e.g., the host fakes a crash) and its in-memory state is lost. A new instance of the enclave accessing the VHD can read consistent data, but not the latest. To mitigate such attacks, non-volatile storage is needed, but every write then carries a network communication cost that is far too expensive. However, critical writes could be safeguarded this way.
Haven also relies on the system time and timeouts provided by the untrusted host, but both of these can be changed by a malicious host. Haven plans to mitigate this by ensuring the clock always runs forward and by using a cycle counter as an alternative time source.
VMs can be saved, resumed and migrated; the host has to capture and recreate guest state. However, Haven explicitly prevents this. Haven plans to support these operations in the future with checkpoint and resume at the Drawbridge ABI level. In the simplest form, the host could request the Haven guest to suspend itself, which it would do by capturing its own state to an encrypted image. The host can then spin up a new enclave to resume execution on another node. Before gaining access to the encrypted image, the new guest would perform an attestation step, which gives it the keys necessary to access the encrypted checkpoint image.

#square fields problem ( covering N points with K squares on XY plane )

Another way to solve this problem would be to generate the permutation of the N points as follows:
void Permute(List<Point> a, List<Point> b, bool[] used)
{
    if (b.Count == a.Count) { Print(b); return; }
    for (int i = 0; i < a.Count; i++)
    {
        if (used[i]) continue;
        used[i] = true;
        b.Add(a[i]);
        Permute(a, b, used);
        b.RemoveAt(b.Count - 1); // backtrack: remove the last point
        used[i] = false;
    }
}
and for each permutation assign them to different squares where squares range from 1 to K and each square can have 1 to N-K+1 points. 
List<List<Point>> assignSquares(List<Point> b, int N, int K)
{
    int i = 0;
    var squares = new List<List<Point>>();
    for (int m = 0; m < K; m++)
    {
        // initialize each of the K squares; a square holds at most N-K+1 points
        var square = new List<Point>();

        // assign
        for (int t = 0; t < N - K + 1 && i < N; t++)
        {
            square.Add(b[i]);
            i++;
        }
        squares.Add(square);
    }
    return squares;
}
For example, the permutations of abc split into two squares as: ab|c, ba|c, ac|b, ca|b, bc|a, cb|a.
How do we meet the compliance and security requirements for a Cloud?
Here are some cloud provider readiness checks:
Does the cloud have the ability to encrypt data at rest and in transit?
Does the cloud have the ability to pull audit information via logs?
Does the cloud include role-based access control?
Does the cloud have the ability to map roles according to enterprise hierarchy, or to a facsimile of the enterprise organizational structure?
Can the cloud authenticate against a central system of record based on user roles and assignments?
Can the cloud integrate with existing command-and-control systems?
Can the cloud back up data off the cloud?
Does the cloud have built-in disaster recovery capabilities?
courtesy : Shriram Natarajan
In addition, the organization supporting cloud services will need to honor audit requirements and the Health Insurance Portability and Accountability Act (HIPAA). Payment Card Industry customers will need to re-certify yearly to ensure they are still complying with regulations, and the cloud provider should be able to meet these requirements. We can guarantee cloud compliance once we have fully understood and documented the data flow. That includes not only data ownership and auditability, but cloud and storage configuration as well. If anything is not included or not documented in the contract, it doesn't exist, so organizations that want to maintain governance, regulation and compliance can't use a contract template. A customized contract is a must to ensure their needs are met.
Another approach is to use the Plan, Do, Check and Act cycle, also known as the Deming cycle, for cloud security and compliance management. In this case, the Plan phase determines the scope of security and compliance requirements, including regulatory and business requirement evaluations, and designs the deployment accordingly. The Do phase defines the security controls and risk management framework. This includes choosing encryption, security tokens, identity and access management, identity management options, and controls to detect and prevent intrusion. The last category includes security incident and event management and data leakage protection products. In the Check phase, organizations define auditing objectives. These involve not just SOX but also fine-grained historical action logging. In the Act phase they mitigate vulnerabilities. This includes not only audits, but also continuous monitoring and security improvements.
Courtesy: Kanwarjeet Panesar


Tuesday, January 26, 2016

Automated configurations:
In today’s cloud computing, the unit of compute and storage offering is a virtual machine. Although limited in cpu power and hard disk, a virtual machine suffices as a platform to host several applications, services and their workloads. However, cloud computing taught us to make the application more modular and host each module in its own virtual machine. This poses two new requirements – we need more compute and storage, and we need a container around these. We solved some of these by being able to create clusters and with containers such as Docker. However, clusters are often homogeneous in their constituents, and Docker enables application isolation rather than compute and storage containers.
There is another challenge as well, in that we cannot use cloud computing for storage of confidential or private data. This we have worked around so far by hosting storage solutions external to the cloud, such as a database server with encryption and a massive disk to support it. In other cases, we have even looked at providing trusted computing on specialized hardware. But data and logs are not the only things persisted, and they cannot always be secured on common servers unless they go to their own database and log indexes. Many times we don’t even have maintenance on these shared solutions, or a feedback cycle to improve resource utilization.
This is where I propose a set of configurations I call templates that consists of a set of VMs each earmarked for a purpose. For example, we can take a set where one VM is designated ‘app1’ and hosts applications, another designated ‘app2’ also hosts applications, one designated as data1 for data storage, one dedicated as cache1 for cache and one designated as ‘message queue’ server etc. Templates vary in their size and configurations and are formed by eliciting patterns from existing virtual machine usages for a cloud that caters to many users.
Templates then become a unit of distribution, and like virtual machines they can be added and released. Templates differ from so-called ‘projects’ or other forms of containers in that they are pre-built, preloaded and come configured with a topology and software. The way we build and roll out templates is similar in nature to the way we build images. Images are central to how VMs are cut and released to end-users. Templates too can be used to create instances and be assigned to users.
#coding exercise
N points are given on a plane that need to be covered with K squares that are symmetrical, aligned to the coordinate axes and can overlap. What is the minimum length of an edge of the squares?
We know that N cannot be less than K.
If N == K, each square covers a single point, so the square lengths are known and nothing needs to be done.
If N > K, there will be some points that share the same square.
Our task is to put the points within the squares.
Let us take a look at how to put points within a square.

assume left[K], right[K], top[K] and bottom[K] give the boundaries of the i'th square of the K squares
int placePointInSquare(int i, int x, int y)
{
int newSize = 0;
int origleft = left[i];
int origright = right[i];
int origtop = top[i];
int origbottom = bottom[i];
left[i] = Math.min(origleft, x);
right[i] = Math.max(origright, x);
top[i] = Math.max(origtop, y);
bottom[i] = Math.min(origbottom, y);
newSize = Math.max(right[i]-left[i], top[i]-bottom[i]);
return newSize;
}
Initially we form a square with one point, with left=right=x and top=bottom=y, and then we can add more points to it.
The idea in the above function is merely that we update the bounds along both the x and y axes.
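A Python rendering of the same bounding-box update may make the idea clearer (a sketch under the same assumptions as the function above; the tuple layout is my choice):

```python
def place_point(box, x, y):
    """Grow an axis-aligned bounding box to include (x, y).

    box is a (left, right, top, bottom) tuple; returns the new box and the
    edge length a square would need to cover all of its points.
    """
    left, right, top, bottom = box
    left, right = min(left, x), max(right, x)
    top, bottom = max(top, y), min(bottom, y)
    size = max(right - left, top - bottom)
    return (left, right, top, bottom), size
```

Starting from a degenerate box at (0, 0) and adding the point (3, 2) grows the box to (0, 3, 2, 0) and requires an edge length of 3.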
The groups of K can be formed with combinations. We know combinations is recursive - an example of combining k elements from n is given by the combine() method as below

Combinations of a string:
void Combine(List<Point> a, List<Point> b, int start, int level)
{
    for (int i = start; i < a.Count; i++)
    {
        b[level] = a[i];
        Print(b); // every prefix of length level+1 is a valid combination
        if (i < a.Count - 1)
            Combine(a, b, i + 1, level + 1);
        b[level] = null;
    }
}
However, in this case we have to distribute n elements into k squares.
we do this with Stirling numbers of the second kind, which can be defined recursively as:
s2(n, k) = 1, if n > 0 and k = 1
         = 1, if n = k
         = k * s2(n-1, k) + s2(n-1, k-1), if 0 < k < n
This recurrence formula follows from the enumerations.
For example, if A = {1, 2, 3, 4} and k = 2, we would get the following subsets (s2(4, 2) = 7):

{1, 2, 3}, {4}
{2, 3, 4}, {1}
{1, 3, 4}, {2}
{1, 2, 4}, {3}
{1, 2}, {3, 4}
{1, 3}, {2, 4}
{1, 4}, {2, 3}
 
In other words, we can combine the elements to form different lengths and take the Cartesian product of the combinations of lengths as above.
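The corrected recurrence can be checked directly with a small Python sketch:

```python
def stirling2(n, k):
    # number of ways to partition n labeled items into k non-empty groups
    if k == 1 or n == k:
        return 1
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)
```

`stirling2(4, 2)` returns 7, matching the seven subsets enumerated above.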

When we create different subsets of k groups from the n points, for each arrangement we can find the newSize – one for each of the k subsets in the arrangement – and choose the max over all the k subsets.

For this arrangement we have this resulting size of the squares available.
Similarly from each enumeration from the Stirling number, we can calculate the resulting size of square. Finally we choose the minimum resulting size from all such enumerations.
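Putting the pieces together, here is a brute-force Python sketch: enumerate the set partitions into k non-empty groups (the Stirling enumerations), compute the square size each group needs, and take the minimum over all partitions of the maximum within a partition. This is only feasible for small N, and the helper names are mine:

```python
def set_partitions(items, k):
    """Yield all partitions of items (a list) into exactly k non-empty groups."""
    if k == 1:
        yield [list(items)]
        return
    if len(items) == k:
        yield [[x] for x in items]
        return
    first, rest = items[0], items[1:]
    # case 1: the first element sits alone in its own group
    for part in set_partitions(rest, k - 1):
        yield [[first]] + part
    # case 2: the first element joins each group of a k-partition of the rest
    for part in set_partitions(rest, k):
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]

def square_side(group):
    # edge length of the smallest axis-aligned square covering the group
    xs = [p[0] for p in group]
    ys = [p[1] for p in group]
    return max(max(xs) - min(xs), max(ys) - min(ys))

def min_square_side(points, k):
    return min(max(square_side(g) for g in part)
               for part in set_partitions(points, k))
```

For the points (0,0), (1,1), (10,10), (11,11) with k = 2, the best partition pairs the nearby points, and the answer is an edge length of 1.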

We continue discussing the paper "Shielding applications from an untrusted cloud with Haven,” written by Andrew Baumann, Marcus Peinado, and Galen Hunt. We were discussing the limitations of SGX that Haven had to work around and which were subsequently fixed in the revised version of SGX. We now read up on the performance evaluation. Haven was developed and tested using a functional emulator for SGX provided by Intel. Since an SGX CPU was not available and the emulator is not cycle-accurate, a performance model was formed. The approach was to measure Haven's sensitivity to SGX parameters. This model assumed that an SGX implementation will perform the same as a regular CPU, except for two drawbacks – first, the additional costs of SGX instructions and asynchronous exits, and second, an additional memory penalty (latency/bandwidth of cache misses) for memory encryption when accessing the EPC. This ignores the cost at cold start and considers the performance after a warmup period, since many of the SGX instructions are executed only at enclave startup. The EPC was considered large enough to hold the working set of the applications. The other overheads include dynamic memory allocation and the transitions into and out of enclave mode, including asynchronous exits. The dynamic allocation instructions only check and update the page protection metadata. The transitions necessitate a TLB flush and also perform a series of checks and updates.
As a control to study the performance of Haven on SGX, a second version of Haven was implemented that does not use SGX. It simulates SGX performance for the above critical instructions by busy-waiting the CPU for a configurable number of processor cycles. In addition, for each enclave transition a system call is used to flush the TLB. This adds a cost that is not present on real SGX, and hence this study gives a conservative estimate of the performance of Haven. There are two disallowed instructions in SGX – IRET and CPUID – whose overhead cannot be simulated; fortunately, they happen only at startup or are very rare. To simulate memory penalties and a slower EPC, the system's DRAM frequency was artificially reduced.
Two kinds of applications were run with this performance model: the first a database server (SQL Server) and the second an http server (Apache). The experiments were run on a 4-core Intel processor running at 2.4 GHz with 8 GB of 1600 MHz DDR3 RAM, a 240 GB SSD and gigabit ethernet, running Windows 8.1. This mobile-optimized platform was used because it let the DRAM frequency and timings be adjusted.
#typewritermonkey
Yesterday we were discussing a typewriter monkey problem. We said the solution depended on the maximum possible number of occurrences of the target word as well as the average expected number of target words.
The max gives the number of bananas that should be brought, and the average gives the number that should be handed out, leaving the rest to keep. But let us take a few examples of how to get the average.
If the typewriter can only type A or B (K = 2), the target word is B (L = 1), and the monkey types two letters (S = 2), then we have four possible outputs with equal probability – AA, AB, BA and BB – giving 0, 1, 1 and 2 target occurrences, and the average pay is 4/4 = 1. Consequently we get to keep 1.
Let us assume there's a function that pattern matches. These are well known and readily available, say as regex.
Now we just have to compute the score for the monkey:
max = 0
total = 0
for each possible output of length S (there are K^S of them)
     count = 0
     for i from 0 to S-L
         if the substring of length L starting at i matches the target, increment count
     max = max(count, max)
     total = total + count

print output = max - (total / K^S)
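The loop above can be sketched in Python by brute-forcing all K^S outputs (fine only for small S); substring comparison stands in for the regex here:

```python
from itertools import product

def monkey_score(alphabet, target, s):
    """Bananas to keep: max occurrences minus expected occurrences."""
    best = 0
    total = 0
    L = len(target)
    for seq in product(alphabet, repeat=s):
        text = "".join(seq)
        # occurrences of the target word in this output
        count = sum(1 for i in range(s - L + 1) if text[i:i + L] == target)
        best = max(best, count)
        total += count
    # total / K^S is the average (expected) number of occurrences
    return best - total / len(alphabet) ** s
```

For the example above, `monkey_score("AB", "B", 2)` gives max 2 minus average 1, so we keep 1.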