Cluster computing

Friday, November 14, 2014

One more coding exercise just like the previous:
#codingexercise
int GetCountDuplicatesMax (int [] A)
{
if (A == null) return 0;
return A.CountDuplicatesMax();
}
In today's post we continue our discussion on the convergence of the conjugate gradient method. We have seen that at each step of CG, the value of the current error term is chosen from the initial error term plus the current subspace. The current subspace is the span of the vectors found so far, each obtained by applying the matrix A once more to a vector. That vector is the initial residual, which is (up to sign) A applied to the initial error term. These subspaces, called Krylov subspaces, are built iteratively. Recall that when we don't know what the search vector should be, we choose a search subspace to find it in. Fortunately this subspace grows incrementally: the next residual is orthogonal to the current search subspace and therefore A-orthogonal to the previous one, so it is already A-orthogonal to all the previous search vectors except the current one. This makes it easier to find the next search vector, which is a linear combination of the next residual and the current search vector.
For a given iteration, the error term can be found by applying to the initial error term the sum of the identity matrix and a linear combination of powers of A. This sum can be expressed as a polynomial of degree equal to the number of iterations so far. It doesn't matter whether the polynomial is expressed in terms of a scalar or a matrix; what we want is to evaluate it. This means that we can evaluate it with an eigenvalue as a scalar or with the matrix A. This flexible notation lets us apply the polynomial P to a vector in the same way whether P is in terms of A or in terms of a scalar eigenvalue. Note that by definition an eigenvalue is a scalar constant such that applying the matrix to its eigenvector is the same as multiplying that eigenvector by the eigenvalue. If we keep applying the matrix to one side of this equation and the eigenvalue to the other, the equality still holds, so P(A) applied to an eigenvector equals P(eigenvalue) times that eigenvector. This lets us write the error term as a polynomial applied to the initial error term. If we require that the polynomial evaluates to one at zero, and express the initial error term as a linear combination of the orthogonal unit eigenvectors, we find that CG picks the polynomial that minimizes the error, and that convergence depends on how close a polynomial of degree i can be to zero on each eigenvalue, given the constraint that P(0) = 1.
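The argument in this paragraph can be written compactly. The symbols here are assumptions of mine rather than notation from the post: e_(i) is the error after i iterations, ψ_j are the coefficients CG chooses, v_j are orthonormal eigenvectors of A with eigenvalues λ_j, and ξ_j are the components of the initial error along them.

```latex
\begin{align*}
e_{(i)} &= \Bigl(I + \sum_{j=1}^{i} \psi_j A^j\Bigr) e_{(0)} = P_i(A)\, e_{(0)}, \qquad P_i(0) = 1,\\
A v = \lambda v \;&\Longrightarrow\; P_i(A)\, v = P_i(\lambda)\, v,\\
e_{(0)} = \sum_{j} \xi_j v_j \;&\Longrightarrow\; e_{(i)} = \sum_{j} \xi_j\, P_i(\lambda_j)\, v_j,\\
\|e_{(i)}\|_A^2 &= \sum_{j} \xi_j^2\, P_i(\lambda_j)^2\, \lambda_j .
\end{align*}
```

The last line is what makes the closing remark concrete: the energy norm of the error is small exactly when P_i is close to zero at every eigenvalue, subject to P_i(0) = 1.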

Thursday, November 13, 2014

#codingexercise
int GetCountUnique (int [] A)
{
if (A == null) return 0;
return A.CountUnique();
}
To complete our post on the Oracle 11g identity access management service, we next look at identity federation. Identity federation is required when there is a need for single sign-on beyond a single internet domain. It consists of federation services that utilize both AuthN modules and service provider modules. The language for talking to identity federation has been SAML, which is an industry standard. SAML is an open framework for sharing security information on the internet through XML documents. SAML provides a standard way to transfer cookie information across multiple internet domains. Thus it's a way to implement SSO. A SAML assertion can be used with web services security frameworks such as WS-Security. This identity federation service is also interoperable with CardSpace. By using proprietary directories or databases, or SAML assertions with internet directories, it can federate identity. In addition, it offers auditing, logging and monitoring. OpenSSO implements a security token service and supports the web services trust language. Trust, in this case, is usually established with WS-Trust and by exchanging SOAP/WS-Security messages. Since trust is represented as tokens, there is a service to manage the issuing of tokens. This service issues, renews, cancels and validates security tokens, allows customers to write their own plug-ins, and provides a WS-Trust based API for client and application access. The tokens issued are ones that can be authenticated via username, X.509, SAML and Kerberos.
Next we discuss Entitlement Server, which is a fine-grained authorization engine that externalizes, simplifies and unifies the entitlement policies. It offers a sophisticated delegated administration model to create, modify or report on the entitlement policies. The administration server is layered above the authorization engine. While the administration server concerns itself with resource management, policy lifecycle, and policy distribution, the authorization engine is the policy framework or decision kernel that works with a publisher-subscriber model.
The administration server acts as the policy administration point.
The security modules that implement the PDP communicate with the authorization server, and the ones that implement the PEP communicate with the authorization engine. The engine may talk to one or more attribute authorities and the policy store.
The Adaptive Access Manager provides real-time fraud detection, multifactor authentication and unique authentication strengthening. These are implemented as two modules: one is the authenticator and the other is the risk manager. The authenticator protects against malware attacks. The risk manager looks at various risk factors simultaneously.
Information Rights Management manages the content produced by the subject. If a user signs into one application and writes one document, then signs into another and writes another, both documents are secured by the IRM. Typically they are sealed and encrypted.
Governance of all of the above services can be facilitated with a common user interface.

Today I want to discuss the whitepaper from Oracle on identity access management. Oracle 11g provides a middleware service complete with identity management as an SOA. It secures the application grid in cloud computing. Resources, as well as the processes acting on those resources, are protected. Identity access management includes such things as directory services, identity administration, access control, platform and web services security, identity and access governance, operational manageability, and service integration with suites both proprietary and external. The benefits include comprehensive identity services, integration with other services, and a standards-based architecture where modules can be written as plug-ins. Comprehensive here implies access control, single sign-on, role governance, multi-factor authentication, identity analytics, audits and reports. The integration benefit means that each identity management and access control requirement arising in a business transaction, from applications such as other middleware modules, is met seamlessly. This offering leverages and integrates the Oracle database through its own directory and identity virtualization services. It also offers Information Rights Management to secure content. The standards-based benefit implies that data transfer via Security Assertion Markup Language and WS-Federation makes it possible for any vendor to customize with plug-ins.
The Oracle 11g identity access management service provides services for authentication, authorization, roles and entitlements, auditing, directory services, user provisioning, policy store, and session data management, all in an SOA model. It includes an Oracle Authorization Policy Manager for managing authorization policies. It manages both global and application-specific artifacts. Global artifacts include users, external roles, and system policies. Application-specific policies are kept as a logical subset, called a stripe, in this policy store. Application-specific artifacts include the resource catalog, application policies, application roles, and role categories. The identity manager and the authorization manager both utilize these authorization policies. The only difference is that the policies are chained to the identity store: the identity manager modifies the identity store while the authorization manager modifies the policy store. The User and Role API helps manage the identities using the identity governance framework hosted at projectliberty.org, and even allows developers to create their own virtual identity database while retaining the ability to interconnect with enterprise identity services. The Authorization API is described by another ongoing project at Project Liberty called OpenAZ, which uses the Extensible Access Control Markup Language (XACML) to represent attribute values along with an interoperable policy language to interpret them. The Authorization API is used for policy enforcement points, policy information points and policy decision points, which respectively issue authorization requests, obtain attributes from an attribute authority, and wrap the functionality of existing authorization providers. The directory services include an internet directory, an enterprise directory, and a virtual directory that provides identity aggregation and transformation without copying.
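The division of labor among enforcement, information and decision points can be sketched in a few lines. This is a minimal illustration under my own assumptions: the function names, the POLICIES table and the USER_ROLES map are all hypothetical and are not the Oracle or OpenAZ API.

```python
# Hypothetical names throughout; this is not the Oracle or OpenAZ API.

# Stand-in for the policy store: (required role, action, resource) permits access.
POLICIES = [
    ("manager", "approve", "expense-report"),
    ("employee", "read", "expense-report"),
]

# Stand-in for an attribute authority that knows each user's roles.
USER_ROLES = {"alice": {"manager", "employee"}, "bob": {"employee"}}


def policy_information_point(user):
    """PIP: obtain attributes (here, roles) from the attribute authority."""
    return USER_ROLES.get(user, set())


def policy_decision_point(user, action, resource):
    """PDP: evaluate the request against the policy store."""
    roles = policy_information_point(user)
    return any(role in roles and action == a and resource == r
               for role, a, r in POLICIES)


def policy_enforcement_point(user, action, resource):
    """PEP: intercept the call, issue the authorization request, enforce the answer."""
    if not policy_decision_point(user, action, resource):
        raise PermissionError(f"{user} may not {action} {resource}")
    return f"{user} performed {action} on {resource}"
```

Keeping the decision separate from the enforcement is the point of the design: the application only calls the enforcement function, so policies can change without touching application code.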
The internet directory provides 1) scalability in terms of, say, the LDAP servers running on a node, 2) high availability, which is designed to enable continuous service availability at the process and storage level, 3) security in terms of both password and certificate based authentication, including encryption, 4) identity management and monitoring, which is streamlined around two complementary components, the enterprise manager and the directory services manager, 5) a directory integration platform, which includes a set of services enabling customers to synchronize data between various third party data sources, 6) external authentication, which enables seamless authentication against third party directories, and 7) an SDK for internet directory providers.
Next we discuss the access management components, which include a consolidated SSO architecture, a policy simulator, an access manager, a session manager, an administration console, centralized diagnostics, and snapshots. The access management component concerns itself with authentication and identity assertion. The policy service works against all these modules.

Tuesday, November 11, 2014

When discussing the conjugate gradient method in the previous post, we mentioned several items. Let's put them all together now. First, this is in fact a conjugate direction method. We don't have to keep track of previous search vectors. Each subspace is incremental and is found by applying a matrix to a vector. The name of the method is a misnomer: the gradients are not conjugate, and the conjugate search directions are not all gradients.
We will next evaluate the convergence of the conjugate gradient method. We saw that the method takes at most n iterations in exact arithmetic. Why should we then care about convergence? This is because there are a few errors encountered along the way that impede convergence. One is accumulated floating point error, and another is that the search vectors lose their A-orthogonality. The first problem could just as well occur in Steepest Descent, and there are ways to overcome it. The second is not as easy, but it can be handled by treating CG as an iterative procedure where we apply corrections. The convergence analysis also helps us evaluate running the method on datasets so large that it is not even feasible to complete n iterations.
The first iteration of CG is identical to the first iteration of Steepest Descent, so we can apply the same conditions here too. If we look at the iteration equation, the initial search direction is also the initial residual, which is formed from A applied to x0. All subsequent iterations result from the next residual and the current search vector. For convergence, let's start with the simplest case: if the initial direction lies along one of the axes of the ellipsoid, the residual points to its center, giving one-step convergence to the solution. For a more general analysis, we can rely on the fact that we can find n A-orthogonal search vectors and express the error term as a linear combination of them. With that we can find the next residual with some compromise. In fact, that error term is a weighted average from the step-lengths of previous residuals. Because it's weighted, some of the contributions from the shorter step-lengths might increase, and that is why this method is called a roughener. By contrast, the Jacobi method is a smoother because there every eigenvector component is reduced on every iteration.
#codingexercise
int GetCount (int [] A)
{
if (A == null) return 0;
return A.Count();
}
int GetCountDuplicates (int [] A)
{
if (A == null) return 0;
return A.CountDuplicates();
}

In this post, I want to take a short detour to discuss OAuth API calls. This is a widely used pattern of API authorization. The API server merely issues a redirect to a web page where the user can log in. The API server knows nothing about the user until it receives a callback from the referral site. When the callback comes, a module called the OAuth token provider, which lives either in the API server or in the referral site, looks up the associated user and client to issue a token. Once issued, the token is sent to the user, who can then use the api_key and token in making API calls. Each API call validates the token and the api_key prior to sending a response.
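The issue-then-validate cycle described above can be sketched as follows. This is an illustration under stated assumptions, not any particular server's implementation: TokenProvider, handle_api_call and the in-memory dictionary are hypothetical names standing in for the token provider and an API endpoint.

```python
import secrets
import time


class TokenProvider:
    """Hypothetical in-memory OAuth token provider (illustration only)."""

    def __init__(self, ttl_seconds=3600):
        self.ttl_seconds = ttl_seconds
        self._tokens = {}  # token -> (user_id, api_key, expires_at)

    def issue(self, user_id, api_key):
        """Called after the login callback: bind a fresh token to user and client."""
        token = secrets.token_urlsafe(32)
        self._tokens[token] = (user_id, api_key, time.time() + self.ttl_seconds)
        return token

    def validate(self, token, api_key):
        """Return the user id if the token is known, unexpired and bound to api_key."""
        entry = self._tokens.get(token)
        if entry is None:
            return None
        user_id, bound_key, expires_at = entry
        if bound_key != api_key or time.time() >= expires_at:
            return None
        return user_id


def handle_api_call(provider, api_key, token):
    """Every API call checks the token and the api_key before responding."""
    user_id = provider.validate(token, api_key)
    if user_id is None:
        return 401, "invalid token or api_key"
    return 200, "hello " + user_id
```

A caller would obtain a token once with provider.issue("alice", "key-1") and then pass both the api_key and the token on each call; a token presented with a different api_key is rejected.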
Note that the API server is also allowed to be a referral to the OAuth token provider for mere convenience. This is the password grant (grant_type=password), sometimes loosely called an implicit workflow, where the password is passed directly to the token provider. As sample code for this less secure method of getting tokens, here is the OAuth call:
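function requestToken(username) {
   // XMLHttpRequest is used because XDomainRequest has no readyState, status or setRequestHeader.
   var xhr = new XMLHttpRequest();
   xhr.open("POST", "https://apiserver.com/v1/token/", true);
   xhr.timeout = 10000;
   // The body below is form-encoded, so the content type has to say so.
   xhr.setRequestHeader("Content-type", "application/x-www-form-urlencoded");
   xhr.onreadystatechange = function()
   {
      if (xhr.readyState == 4 && xhr.status == 200)
      {
         var resp = JSON.parse(xhr.responseText);
         // Assumes the page has an element with id "access_token" to display the token.
         document.getElementById("access_token").innerHTML = resp.access_token;
      }
   };
   // Read a form field by id and escape it for use in the request body.
   var field = function(id) { return encodeURIComponent(document.getElementById(id).value); };
   xhr.send("api_key=" + field("apiKey") + "&grant_type=password&client_id=" + field("apiKey") +
            "&username=" + encodeURIComponent(username) + "&password=" + field("password") +
            "&scope=scope&client_secret=" + field("apiSecret"));
}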
Another OAuth API token grant sequence is the authorization code grant. Here a code is issued first and the code is then exchanged for a token. From MSDN:
public static function getAuthenticationHeaderFor3LeggedFlow($code){
       // Construct the body for the STS request
        // Every value is URL-encoded so that reserved characters survive the form post
        $authenticationRequestBody = "grant_type=authorization_code" . "&".
                                     "client_id=".urlencode(Settings::$clientId) . "&".
                                     "redirect_uri=".urlencode(Settings::$redirectURI) . "&".
                                     "client_secret=".urlencode(Settings::$password). "&".
                                     "code=".urlencode($code);

        //Using curl to post the information and get back the authentication response
        $ch = curl_init();
        // set url
        $stsUrl = 'https://apiserver.com/oauth2/token';
        //curl_setopt($ch, CURLOPT_PROXY, '127.0.0.1:8888');
        curl_setopt($ch, CURLOPT_URL, $stsUrl);
        // Get the response back as a string
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        // Mark as Post request
        curl_setopt($ch, CURLOPT_POST, 1);
        // Set the parameters for the request
        curl_setopt($ch, CURLOPT_POSTFIELDS, $authenticationRequestBody);
        //curl_setopt($ch, CURLOPT_PROXY, '127.0.0.1:8888');
        // Peer verification is disabled so the sample runs without a CA bundle; enable it outside of testing.
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        // read the output from the post request
        $output = curl_exec($ch);
        // close curl resource to free up system resources
        curl_close($ch);

        //Decode the json response
        $tokenOutput = json_decode($output);
        $tokenType = $tokenOutput->{'token_type'};
        $accessToken = $tokenOutput->{'access_token'};
        $tokenScope = $tokenOutput->{'scope'};

        print("\t Token Type: ".$tokenType."\n AccessToken: ".$accessToken);
        // Add the token information to the session header so that we can use it to access Graph
        $_SESSION['token_type']=$tokenType;
        $_SESSION['access_token']=$accessToken;
        $_SESSION['tokenOutput'] = $tokenOutput;
       // it is possible to decode (base64) the accessToken and search claims, such as the user's upn
       // value.
       // However, this is not recommended because in the future, the access token maybe
       // encrypted.
       // $tokenOutput = base64_decode($accessToken);
       // $subString = strstr($tokenOutput,'"upn":');
       // $subString = strstr($subString, ',',TRUE);
       // $upn = rtrim(ltrim($subString,'"upn":"'), '"');
       // $_SESSION['upn']=$upn;
}
Similarly, another OAuth API token grant sequence is intended only for clients, with no user involved; this grant_type is called client_credentials.
As an example:

<?php
	function cURLcheckBasicFunctions()
	{
	// All four functions must exist for cURL to be usable.
	return function_exists("curl_init") &&
	function_exists("curl_setopt") &&
	function_exists("curl_exec") &&
	function_exists("curl_close");
	}

	// declare
	if( !cURLcheckBasicFunctions() ) print_r('UNAVAILABLE: cURL Basic Functions');
	$apikey = 'your_api_key_here';
	$clientId = 'your_client_id_here';
	$clientSecret = 'your_client_secret_here';
	$url = 'https://apiserver.com/v1/oauth/token?api_key='.$apikey;
	$ch = curl_init($url);
	$fields = array(
	'grant_type' => urlencode('client_credentials'),
	'client_id' => urlencode($clientId),
	'client_secret' => urlencode($clientSecret),
	'scope' => urlencode('test_scope'),
	'state' => urlencode('some_state'));

	//url-ify the data for the POST
	$fields_string = '';
	foreach($fields as $key=>$value) { $fields_string .= $key.'='.$value.'&'; }
	// rtrim returns the trimmed string, so the result must be assigned back
	$fields_string = rtrim($fields_string, '&');
	curl_setopt($ch, CURLOPT_URL, $url);
	// curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC ) ;
	// curl_setopt($ch, CURLOPT_USERPWD, $credentials);
	// curl_setopt($ch, CURLOPT_SSLVERSION, 3);
	curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
	curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
	curl_setopt($ch, CURLOPT_POST, count($fields));
	curl_setopt($ch, CURLOPT_POSTFIELDS, $fields_string);

	//execute post
	if(curl_exec($ch) === false)
	{
	echo 'Curl error: ' . curl_error($ch);
	}
	else
	{
	echo 'Operation completed without any errors';
	}

	//close connection
	curl_close($ch);
	?>
Now that we have completed our post on this topic, I will continue my discussion from previous posts shortly. But first, another coding exercise:
#codingexercise
int GetMode (int [] A)
{
if (A == null) return 0;
return A.Mode();
}

Monday, November 10, 2014

int getMin ( int [] A)
{
if (A == null) return 0;
return A.Min();
}

int getMax ( int [] A)
{
if (A == null) return 0;
return A.Max();
}

int getAvg ( int [] A)
{
if (A == null) return 0;
return A.Avg();
}

int getMedian ( int [] A)
{
if (A == null) return 0;
return A.Median();
}
Today we continue to discuss conjugate gradient methods. We discussed the method of conjugate directions earlier, but we knew its limitations. In this post, we cover why conjugate gradient has become more popular. All of the methods we have discussed so far have been using the linear equation or the quadratic form. In fact, the method of conjugate gradients is simply the method of conjugate directions where the search directions are constructed by conjugation of the residuals. We saw that using the residuals as directions worked for the steepest descent method. The residuals also have the nice property that each one is orthogonal to the previous search directions. This worked well for us because we were guaranteed a new search direction unless the residual was zero, in which case we had arrived at the final answer. There is an even better reason to choose the residual, as we will see in this method.
First, since the search vectors are built from the residuals, the span of the residuals is equal to the subspace Di. As each residual is orthogonal to the previous search directions, it is also orthogonal to the previous residuals. From the recurrence for the residual, as in the Steepest Descent method, we already know that each residual is just a linear combination of the previous residual and the matrix A applied to the previous search direction. Since the search directions belong to the subspace Di, each new subspace is formed from the union of the previous subspace and the subspace ADi. We thus transform the subspace of search directions into a subspace of residuals. This subspace is called a Krylov subspace. It is formed by incrementally applying a matrix to a vector. Because it is incremental, it has the nice property that the next residual is orthogonal to the current subspace. This also implies that the next residual is A-orthogonal to the previous subspace. With this, Gram-Schmidt conjugation becomes simpler, because r(i+1) is already A-orthogonal to all of the previous search directions except di. What this means is that we no longer need to store the old search vectors to ensure A-orthogonality of new search vectors. This major advance is what makes CG as important an algorithm as it is, because both the space complexity and time complexity per iteration are reduced from order N^2 to linear in the number of nonzero entries of A.
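The recurrences described above fit in a short routine. What follows is a minimal sketch in plain Python, assuming A is a symmetric positive-definite matrix given as a list of rows; the function name, the tolerance and the dense list-of-lists storage are my own choices for illustration. Only the current residual r and the current search direction d are kept, which is the storage saving the paragraph describes.

```python
def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Solve A x = b for a symmetric positive-definite A given as a list of rows."""
    n = len(b)
    x = list(x0) if x0 is not None else [0.0] * n
    max_iter = n if max_iter is None else max_iter

    def matvec(M, v):
        return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

    def dot(u, v):
        return sum(a * c for a, c in zip(u, v))

    r = [bi - ai for bi, ai in zip(b, matvec(A, x))]  # initial residual r0 = b - A x0
    d = list(r)                                        # first search direction is r0
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr ** 0.5 < tol:
            break
        Ad = matvec(A, d)
        alpha = rr / dot(d, Ad)                        # step length along d
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]  # residual recurrence
        rr_new = dot(r, r)
        beta = rr_new / rr                             # only the current d is needed
        d = [ri + beta * di for ri, di in zip(r, d)]
        rr = rr_new
    return x
```

For example, conjugate_gradient([[3.0, 2.0], [2.0, 6.0]], [2.0, -8.0]) reaches the solution x = (2, -2) in two iterations, up to rounding. A production version would use a sparse matrix type so that each matrix-vector product costs time proportional to the number of nonzero entries.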

Sunday, November 9, 2014

#codingexercise
bool SequenceEqual(int[] A, int[] B)
{
   if ( A == null || B == null) return false;
   if ( A == B ) return true;
   if (A.Length != B.Length) return false;
   for (int i = 0; i < A.Length; i++)
        if (A[i] != B[i]) return false;
   return true;
}

int getSum ( int [] A)
{
if (A == null) return 0;
return A.Sum();
}