Cluster computing

Thursday, November 13, 2014

Today I want to discuss the whitepaper from Oracle on Identity access management. Oracle 11g provides middleware service complete with identity management as an SOA. It secures the application grid in cloud computing. Resources as well as the processes acting on the management of the processes acting on those resources are protected. Identity access management includes such things as directory services, identity administration, access control, platform and web services security, identity and access governance, operational manageability, and service integration with suites both proprietary and external. The benefits include comprehensive identity services, integration with other services, standards based architecture where modules can be written as plug-ins. By comprehensive, it implies access control, single sign-on, role governance, multi-factor authentication, identity analytics, audits and reports. Integration benefits means each identity management and access control met through a business transaction from applications such as other middle-ware modules which works seamlessly. This offering leverages and integrates Oracle database through its own directory and identity virtualization services. It also offers Information Rights management to secure content. The standards based benefit implies that the data transfer via Security Services Markup Language and WS-Federation makes it possible for any vendor to customize with plugins.
The Oracle 11g identity access management service provides services for authentication, authorization, roles and entitlements, auditing, directory services, user provisioning, policy store, and session data management - all in a SOA model. It includes an Oracle Authorization Policy Manager for managing authorization policies. It manages both global and application-specific artifacts. Global artifacts include users, external roles, and system policies. Application specific policies are kept as a logical subset called a stripe in this policy store. Application specific artifacts include resource catalog, application policies, application roles, and role categories. The identity manager and the authorization manager both utilize these authorization policies. The only difference is that the policies are chained to identity store and while the identity manager modifies the identity store the authorization manager modifies the policy store. The User and Role API helps manage the identities using the identity governance framework hosted at projectliberty.org and even allows the ability for developers to create their own virtual identity database while retaining the ability to interconnect with enterprise identity services. The Authorization API is mentioned at another ongoing project called OpenAZ at the project liberty which uses the Extensible Access Control Markup Language that can represent attribute values along with an interoperable policy language to interpret them. The Authorization API is used for policy enforcement points, policy information points and policy decision points which issue authorization requests, obtain attributes from an attribute authority or the functionality of existing authorization providers. The Directory services include internet directory and enterprise directory and a virtual directory to provide identity aggregation and transformation without copying. The internet directory provides 1) scalability in terms of say the LDAP servers running on a node, 2) high-availability which is designed to enable continuous service availability at the process and storage level, 3) security in terms of both password and certificate based authentication and including encryption, 4) identity management and monitoring which is streamlined around two complementary components - enterprise manager and directory services manager, 5) directory integration and platform which includes a set of services enabling customers to synchronize data between various third party data sources 6) External authentication which enables seamless authentication against third party directories, 7) and an SDK for internet directory providers.
Next we discuss access management components which includes a consolidated SSO architecture, a policy simulator, an access manager, a session manager, administration console, a centralize diagnostics, and snapshots. The access management component concerns itself with authentication and identity assertion. The policy service works against all these modules.

Tuesday, November 11, 2014

When discussing conjugate gradient method in the previous post, we mentioned several items. Let's put them all together now. First, this is in fact a conjugate direction method. We don't have to keep track of previous search vectors. Each subspace is incremental and is found by applying a matrix to a vector. The name of the method has nothing to do with gradients because the search directions are not all gradients. It's just a misnomer.
We will next evaluate the convergence of conjugate gradient method. We saw that the progression for the method was linear. Why should we then care for convergence ? This is because there are a few errors encountered during the progress that impedes convergence. One for example is the floating point errors and the second for example is that the search vectors lose their A-orthogonality. The first problem could just as well occur in Steepest Descent methods and there are ways to overcome it. However, the second is not easy but there are ways to use this as an iterative procedure where we apply corrections. The convergence analysis therefore helps us evaluate running the method on sample datasets where it is not even possible to complete n iterations. The first iteration of CG is identical to the first iteration of the Steepest Descent method. We can apply the same conditions here too. If we look at the iteration equation, the initial search direction is also the initial residual direction formed from the initial subspace A applied to x0. All subsequent iterations result from the next residual and the current search vector. For convergence, let's start with the initial direction along the axes, the residual points to the center of the ellipsoid, giving a one step convergence to the solution. For a more general convergence, we can rely on the fact that we can find n A-orthogonal search vectors and we can express the error term as a linear combination of these vectors. With that we can find the next residual with some compromise. In fact, that error term is a weighted average from the step-lengths of previous residuals. Because its weighted, some of the contributions from the shorter step-lengths of residuals might increase and that is why this method is called a rougher. By contrast, the Jacobi method is a smoother because there every eigen vector is reduced on every iteration.
#codingexercise
Int GetCount (int [] A)
{
If (A == null) return 0;
Return A.Count();
}
Int GetCountDuplicates (int [] A)
{
If (A == null) return 0;
Return A.CountDuplicates();
}

In this post, I want to take a short detour to discuss OAuth API calls. This is a widely used pattern of API authorization. The API server merely issues a redirect to a web page where the user can login. The API server knows nothing about the user until it receives a callback from the referral site. When the callback comes, there is a module either in the API server or in the referral site called the OAuth Token provider that looks up the associated user and client to issue a token. When the token is issued, it is sent to the user who can use the api_key and token in making the API calls. Each API call validates the token and the api_key prior to sending a response.
Note that the API server is also allowed to be a referral to the OAuth token provider for mere convenience. This is called implicit workflow where the API server passes in the password directly to the token provider. As a sample code for this less secure method of getting tokens, here is the OAuth call:
var xdr = new XDomainRequest();
xdr.timeout = 10000;
xdr.onreadystatechange=function()
   {
   if (xdr.readyState==4 && xdr.status==200)
   {
   var resp = parseJSON(xdr.responseText);
   document.getElementById(params[access_token]).innerHTML=resp.access_token;
   }
}
xdr.open("POST","https://apiserver.com/v1/token/",true);
xdr.setRequestHeader("Content-type","application/json");
xdr.send("api_Key="+ document.getElementById("apiKey").value +"&grant_type=password&client_id=" + document.getElementById("apiKey").value + "&username=" + username + "&password=" + document.getElementById("password").value + "&scope=scope&client_secret=" + document.getElementById("apiSecret").value);
}
Another OAuth API token grant sequence is the authorization code grant. Here a code is issued and then the code is translated to token. From MSDN:
public static function getAuthenticationHeaderFor3LeggedFlow($code){
       // Construct the body for the STS request
        $authenticationRequestBody = "grant_type=authorization_code" . "&".
                                     "client_id=".urlencode(Settings::$clientId) . "&".
                                     "redirect_uri=".Settings::$redirectURI . "&".
                                     "client_secret=".urlencode(Settings::$password). "&".
                                     "code=".$code;

        //Using curl to post the information and get back the authentication response
        $ch = curl_init();
        // set url
        $stsUrl = 'https://apiserver.com/oauth2/token';
        //curl_setopt($ch, CURLOPT_PROXY, '127.0.0.1:8888');
        curl_setopt($ch, CURLOPT_URL, $stsUrl);
        // Get the response back as a string
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        // Mark as Post request
        curl_setopt($ch, CURLOPT_POST, 1);
        // Set the parameters for the request
        curl_setopt($ch, CURLOPT_POSTFIELDS, $authenticationRequestBody);
        //curl_setopt($ch, CURLOPT_PROXY, '127.0.0.1:8888');
        // By default, HTTPS does not work with curl.
        curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
        // read the output from the post request
        $output = curl_exec($ch);
        // close curl resource to free up system resources
        curl_close($ch);

        //Decode the json response
        $tokenOutput = json_decode($output);
        $tokenType = $tokenOutput->{'token_type'};
        $accessToken = $tokenOutput->{'access_token'};
        $tokenScope = $tokenOutput->{'scope'};

        print("\t Token Type: ".$tokenType."\n AccessToken: ".$accessToken);
        // Add the token information to the session header so that we can use it to access Graph
        $_SESSION['token_type']=$tokenType;
        $_SESSION['access_token']=$accessToken;
        $_SESSION['tokenOutput'] = $tokenOutput;
       // it is possible to decode (base64) the accessToken and search claims, such as the user's upn
       // value.
       // However, this is not recommended because in the future, the access token maybe
       // encrypted.
       // $tokenOutput = base64_decode($accessToken);
       // $subString = strstr($tokenOutput,'"upn":');
       // $subString = strstr($subString, ',',TRUE);
       // $upn = rtrim(ltrim($subString,'"upn":"'), '"');
       // $_SESSION['upn']=$upn;
Similarly another OAuth API token grant sequence is merely for clients and this grant_type is called client_credentials.
As an example:

<?php
	function cURLcheckBasicFunctions()
	{
	if( !function_exists("curl_init") &&
	!function_exists("curl_setopt") &&
	!function_exists("curl_exec") &&
	!function_exists("curl_close") ) return false;
	else return true;
	}

	// declare
	if( !cURLcheckBasicFunctions() ) print_r('UNAVAILABLE: cURL Basic Functions');
	$apikey = 'your_api_key_here';
	$clientId = 'your_client_id_here';
	$clientSecret = 'your_client_secret_here';
	$url = 'https://apiserver.com/v1/oauth/token?api_key='.$apikey;
	$ch = curl_init($url);
	$fields = array(
	'grant_type' => urlencode('client_credentials'),
	'client_id' => urlencode($clientId),
	'client_secret' => urlencode($clientSecret),
	'scope' => urlencode('test_scope'),
	'state' => urlencode('some_state'));

	//url-ify the data for the POST
	$fields_string = '';
	foreach($fields as $key=>$value) { $fields_string .= $key.'='.$value.'&'; }
	rtrim($fields_string, '&');
	curl_setopt($ch, CURLOPT_URL, $url);
	// curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC ) ;
	// curl_setopt($ch, CURLOPT_USERPWD, $credentials);
	// curl_setopt($ch, CURLOPT_SSLVERSION, 3);
	curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
	curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
	curl_setopt($ch, CURLOPT_POST, count($fields));
	curl_setopt($ch, CURLOPT_POSTFIELDS, $fields_string);

	//execute post
	if(curl_exec($ch) === false)
	{
	echo 'Curl error: ' . curl_error($ch);
	}
	else
	{
	echo 'Operation completed without any errors';
	}

	//close connection
	curl_close($ch);
	?> Now that we have completed our post on this topic, I will continue my discussion from previous posts shortly.but first another coding exercise: #codingexercise Int GetMode (int [] A) { If (A == null) return 0; Return A.Mode (); }

Monday, November 10, 2014

int getMin ( int [] A)
{
if (A == null) return 0;
return A.Min();
}

int getMax ( int [] A)
{
if (A == null) return 0;
return A.Max();
}

int getAvg ( int [] A)

{

if (A == null) return 0;

return A.Avg();

}

int getMedian ( int [] A)

{

if (A == null) return 0;

return A.Median ()

}
Today we continue to discuss conjugate gradient methods. We discussed conjugate gradient direction earlier but we knew the limitations there. In this post, we cover why conjugate gradient have become more popular. All of the methods we have discussed so far have been using the linear equation or quadratic form. In fact, the method of conjugate gradients is simply the method of conjugate directions where the search directions are constructed by conjugation of the residuals. We saw that when we replaced the position vectors with the residuals, it worked for the steepest descent method. And the residuals have the nice property that it is orthogonal to the previous search directions. This worked well for us because we were guaranteed a new search direction and if the residual was zero, we had arrived at the final answer. There is an even better reason to choose the residual as we will see in this method.
First the search vectors are built from the residuals, the subspace span is equal to Di. As each residual is orthogonal to the previous search directions, it is also orthogonal to the previous residuals. From the recurrence to find the residual as in the Steepest descent method, we already know that each residual is just a linear combination of the previous residual and the matrix component along the search direction. Since the search directions belong to the subspace span, each new subspace is formed from the union of the previous subspace and the current subspace ADi. We transform the subspace of search directions to a subspace of search direction to a subspace of residuals. This subspace is called Krylov subspace. It is formed by incrementally applying a matrix to a vector. Because it is incremental, it has the nice property that the next residual is orthogonal to the current subspace, This also implies that the next residual is A-orthogonal to the the previous subspace. With this the iteration becomes simpler as with Graham Schmidt conjugation because r(i+1) is already A-orthogonal to all of the previous search directions except di. What this means is we no longer need to store the old search vectors to ensure A-orthogonality of new search vectors. The major advance is what makes CG as important an algorithm as it is, because both the space complexity and time complexity per iteration are reduced from order of N ^2 to linear.

Sunday, November 9, 2014

#codingexercise
bool SequenceEqual(int[] A, int[] B)
{
   if ( A == null || B == null) return false;
   if ( A == B ) return true;
   if (A.Length != B.Length) return false;
   for (int i = 0; i < A.Length; i++)
        if (A[i] != B[i]) return false;
   return true;
}

Int getSum ( int [] A)
{
If (A == null) return 0;
Return A.Sum();
}

Saturday, November 8, 2014

In this blog post, I discuss a few implementations of authentication for web applications, particularly one that involves python language. There are several available for NodeJs and many companies already use them for the application and often they rely on OAuth and SSO. I haven't seen the use of CAS in companies but I have seen it in educational institutions. Today I want to quickly review some of the design behind the registration packages for django. There are quite a few available namely:
django - social auth - which makes social authentication simpler.
Pinax - which makes it popular for websites
django-allauth which integrates authentication, addressing, registration, account management as well as 3rd party social account
django-userena which makes user accounts simpler
django-social registration which combines OpenID, OAuth and FacebookConnect
django-registration which is probably the most widely used for the framework
django-email-registration which claims to be very simple to use and other such packages.
These implementations are essentially to facilitate the user account registration via templated views and a database or other membership provider backends.

There are other implementations as well such as EngineAuth, SimpleAuth and AppEngine-OAuth-Library. EngineAuth does the multiprovider authentication and saves the userid to a cookie.
SimpleAuth supports OAuth and OpenID. AppEngine-OAuth now provides user authentication against third party websites.

However, one of the things I'm looking for is a NodeJs style implementation that uses the providers as strategy. If we look at the passport implementation for example, I like the fact that we can easily change the strategy to direct against the provider of choice. In fact the interface is something that makes it quite clear.
You use methods like
app.get('/login', function(req, res, next)) {
passport.authenticate('AuthBackendOfChoice', function (req,res, next) {
:
etc.
You also use methods like the following:
var passport = require('passport') , OAuthStrategy = require('passport-oauth').OAuthStrategy; passport.use('provider', new OAuthStrategy({ requestTokenURL: 'https://www.provider.com/oauth/request_token', accessTokenURL: 'https://www.provider.com/oauth/access_token', userAuthorizationURL: 'https://www.provider.com/oauth/authorize', consumerKey: '123-456-789', consumerSecret: 'shhh-its-a-secret' callbackURL: 'https://www.example.com/auth/provider/callback' }, function(token, tokenSecret, profile, done) { User.findOrCreate(..., function(err, user) { done(err, user); }); } ));
I haven't found a django-passport implementation in the repo or for that matter any python-passport implementation.
but Netor technologies has a mention for something that's same name but is also an interesting read.
For example, they create a table to keep the application_info and user_info. The application_info is similar to the client in the OAuth protocol. In that it keeps track of the applications as well as the user information. The user information is keeping track of usernames and passwords. The user_applications is the mapping between the user and the applications.
The authentication is handled using a Challenge Response scheme. The server responds with the user's password salt along with a newly generated challenge salt and a challenge id. The client sends back a response with the hash resulting from hash(hash(password+salt) + challenge). These are read by the server and deleted after use. There's no need to keep them.
The code for create user looks like this:
def create(store, user, application = None):
if application is None:
application = Application.findByName(unicode('passport'))
result = UserLogin.findExisting(user.userid, application.applicationid)
if result is None:
result = store.add(UserLogin(user, application))
store.commit()
return result
and the authentication methods are handled in the controllers. The BaseController has the method to get user login and the servicecontroller has the method to authenticate via a challenge.
This seems a clean example for doing a basic registration of user accounts and integrating with the application.
However, I'm wondering why we don't have the passport library ported to python yet.
The implementation for that is also relatively easy.
One we have to define the class for Strategy and Passport. Next we implement the authenticate method and target the appropriate strategy. An out of box SessionStrategy may also be provided. If we look at the authenticate method, we are issuing a challenge and attempting to validate against one of the Strategies until there is none left. In fact, the passport framework implements just the initialize and the authenticate method.
I've listed EngineAuth, SimpleAuth and AppEngine-OAuth but they are still not the same to implement.

Friday, November 7, 2014

We now look at Gram-Schmidt conjugation technique. This helps us generate a set of A-orthogonal search directions say di. This technique provides a simple way to generate them. We begin with a set of n-linearly independent vectors ui which are say co-ordinate axes or something better. To construct di, take ui and subtract out any components that are not A-orthogonal to the previous vectors. We begin with two linearly independent vectors u0 and u1. Set d0 = u0. The vector u1 is composed of two components one u+ that is along the direction of u0 and another it's A-orthogonal u*. After conjugation, only the A-orthogonal portion remains and d1 = u*. In the next step we will have two A-orthogonal vectors d0 and d1 and we construct the new one based on the previous two. We find the components along each of the previous A-orthogonal vectors so far. The components are found by taking some length in the direction of the previously found vectors. This length is determined the same way as we mentioned earlier with a pair of A-orthogonal vector. After conjugation, only the A-orthogonal portion remains to be included to our set. We repeat until we have generated a full set of n orthogonal vectors.

def gram-schmidt(set_ui, A):
set_di = []
for i in range(len(set_ui)):
components = 0
for k in range(i):
components += component(ui,dk) # - (utranspose A dk) / (dk-transpose A dk)
di = ui + components
set_di = di.append(component)
return set_di

Let us take a closer look at the conjugate directions method. We see that if the search vector is constructed by using the above method with axial unit vectors, it becomes equivalent to performing Gaussian elimination. If we take the orthogonal directions method with the same axial vectors and stretched it, it would look very much like the one with the conjugate directions.

def gaussian-elimination(A,n,b):
     # augment the matrix A with b as the final column
     # decomposition to upper triangular matrix by repeatedly multiplying the first row and subtracting from that row.
     # back-substitution
for j in range(0:n-2): # start from the first column (we have to go column order first)
for i in range(n-1:j+1): # start from last row, repeat for each row above except for first
# pick a row to calculate the ratio
rows = [ r in range(0:i-1) if A[r,j] != 0 ]
if len(rows) == 0:
raise ('bad input')
row = rows[0]

if A[i,j] != 0:
                 ratio = A[i,j] / A[row,j]
     for k in range(0:n-1): # set first column to zero, repeat
                                                 # and don't forget the augmented column
if ratio > 0:
A[i,k] = A[i,k] - ratio * A[row,k]
b[i,0] = b[i,0] - ratio * b[row,0]
else
A[i,k] = A[i,k] + ratio*A[row,k]
b[i,0] = b[i,0] + ratio * b[row,0]
# we now have a matrix with lower triangular zero
# substitute the last value in the last row to the row above to solve each line