Cluster computing

Sunday, November 23, 2014

Sample python flask server to issue tokens based on OTT mentioned in earlier post:

# coding: utf-8

from datetime import datetime, timedelta

from flask import Flask

from flask import session, request

from flask import render_template, redirect, jsonify

from flask_sqlalchemy import SQLAlchemy

from werkzeug.security import gen_salt

from flask_oauthlib.provider import OAuth2Provider

from tokenprovider import TokenProvider

from json import dumps

import random

import string

app = Flask(__name__, template_folder='templates')

app.debug = True

app.secret_key = ''.join([random.choice(string.digits +

                   string.ascii_uppercase) for x in range(0, 32)])

app.config.update({

    'SQLALCHEMY_DATABASE_URI': 'mysql://root:thepassword@127.0.0.1/token',

})

db = SQLAlchemy(app)

oauth = OAuth2Provider(app)

class User(db.Model):

    id = db.Column(db.Integer, primary_key=True)

    username = db.Column(db.String(40), unique=True)

    salt = db.Column(db.String(4))

class Token(db.Model):

    id = db.Column(db.Integer, primary_key=True)

    user_id = db.Column(

        db.Integer, db.ForeignKey('user.id')

)

    user = db.relationship('User')

    # currently only bearer is supported

    token_type = db.Column(db.String(40))

    access_token = db.Column(db.String(6))

@app.route('/', methods=('GET', 'POST'))

def home():

    if request.method == 'POST':

        username = request.args.get('username','')

        user = User.query.filter_by(username=username).first()

        if not user:

            user = User(username=username,

             salt=''.join([random.choice(string.digits + string.ascii_uppercase)

            for x in range(0, 4)]

))

            db.session.add(user)

            db.session.commit()

        session['id'] = user.id

        print str(session)

        return redirect('/')

    user = None

    if 'id' in session:

         uid = session['id']

         user= User.query.get(uid)

    return jsonify({'msg': 'welcome,'+ str(user), 'success':'success'}), 200

@ott.tokengetter

def check_token(access_token=None):

    if access_token:

        return Token.query.filter_by(access_token=access_token).first()

    return None

@app.route('/session/', methods=('GET'))

def session():

     sessions = list()

     [sessions.append(i) for i in db.session]

     return jsonify ({'sessions': json.dumps(sessions)}), 200

@ott.tokensetter

def save_token(token, request, *args, **kwargs):

    id = User.query.filter_by(username=request.args.get('username','')).first().id

    toks = Token.query.filter_by(

       user_id = id

    ).all()

    # make sure that every client has only one token connected to a user

    db.session.delete(toks)

    tp = TokenProvider()

    tok = Token(user=request.args.get('username',''),

                token_type='bearer', access_token=tp.set_token())

    tok.save()

    tok.user_id = request.user.id

    db.session.add(tok)

    db.session.commit()

    return tok

if __name__ == '__main__':

    db.create_all()

    app.run(debug=True,port=8787)

In today's post we continue our discussion on how an OTP provider can replace the tokens used by an API for checking authorization. In this case, the we will call it an OTT provider. An OTT provider converts all users/ digests/sessions/cookies to track users into tokens. A TOTP 6 digit number as usually provided by OTP providers. These tokens not only replace the credentials but also serve to maintain sessions. With the API only accepting tokens, there is no need to track anything else. In a way there is no repository required fro the API implementations to see if there is a token associated with a user or a client. The token itself has both pieces - 'what you know' and 'what you have' and the API implementation can validate the token the same way as when the caller did after requesting it.
There are only three players:
A token provider that issues and validates token
An API client that requests tokens from the token provider and validates it
An API server that accepts only tokens and validates it from the same token provider.
Tamper proofing the request or encrypting the transport layer is not part of this discussion but considered necessary.
Although the OTT is enclosed with the request and sent over the wire, it is temporary and gives the api what is needed.
OTP is covered in RFC 2289 which states that there are two entities in the operation of the OneTimePassword system. The generator must produce the appropriate one-time password from the user's secret pass phrase, and from the information provided in the challenge from the server. The server must send a challenge that includes the appropriate generation parameters to the generator, must verify the one time password received, must store the last one time password it received, and must store the corresponding one-time password sequence number. The server must also facilitate the changing of the users' secret passphrase in a secure manner.
Int GetDistinctMean (int [] A)
{

if (A == null) return 0;

return A.GetDistinctMean ();

}

Saturday, November 22, 2014

Int GetDistinctMedian (int [] A)

{

if (A == null) return 0;

return A.GetDistinctMedian ();

}
Here's what a token provider server would implement:

Class TokenProvider ()
Def login(self, request):
Token = OneTimeToken ()
Session.Add (token)
Return token
Def logout (self, request, token):
token= Token.lookup (request)
If token:
Token.remove (token)
Session.remove (token)
Def refresh_token (token):
Return Token.refresh (token)
Def validateToken (token):
Return OTT.validateToken
OTT defined at https://github.com/ravibeta/pythonexamples/tokenprovider.py
and https://github.com/ravibeta/PythonExamples/blob/master/tokenserver.py

Friday, November 21, 2014

In today's post I continue discussing the method of Conjugate Gradients. We talked about preconditioning which is a technique for improving the condition number of a matrix. We tried to do this by first approximating A with M. M is a symmetric positive definite matrix which when inverted and applied to A helps indirectly solve the linear matrix equation we started with. However applying the inverse of M to A doesn't always result in either a symmetric or a definite matrix. Hence we brought in transformation where we substitute M with the E.E-Transposed. Such an E can be obtained by say Cholesky Factorization. In this case the substitution helps with making the component symmetric and positive definite. This process of using the CG is called the Transformed Preconditioned Conjugate Gradient Method. This method has an undesirable characteristic that E must be computed. Instead if we do a few careful variable substitutions, it can eliminate E. If we choose the new residual as E-inverse applied to the original residual and if we choose the new search vector as E-transpose applied to the original search direction, then we can derive what is called the untransformed preconditioned conjugate gradient method. The derivation works out like this: we start by finding the initial residual based on its definition and a starting position x0. we also set the initial search direction again based on its definition and using the initial residual. We write down the step length in terms of the above and we write down the next position and the next residual in terms of this step length and their iterative definition. This lets us calculate the factor to use with the iterative defintion for the search vector. The matrix E does not appear in this method.
Since it works for the CG, we can use the same technique to derive a pre-conditioned Steepest Descent Method that does not use E.
Let us take a quick look how.
In the steepest descent method, we use the iterative definition involving the residual and the step-length. We start by finding the intial residual and a starting postion. We write down the step-length in terms of the residual and then we calculate the next position.
The effectiveness of a preconditioner M is determined by the condition number of Minverse A and occassionally by its clustering of eigenvalues. The problem is to find a preconditioner. that approximates A well enough such as to make up for the cost of calculating the product M-inverse current residual once per iteration. The simplest pre-conditioner is a diagonal matrix. whose diagonal entries are identical to those of A. The process of applying this pre-conditioner, known as diagonal pre-conditioning or Jacobi pre-conditioning, is equivalent to scaling the quadratic form along the coordinate axes.
Another trivial pre-conditioner is to take a matrix M as A which gives us the quadratic form to be perfectly spherical. and so the solution takes only one iteration. The catch is that we aren't solving the linear equation we begin with so this isn't helpful.
There are many other pre-conditioners that have been studied. Some have been much more sophisticated.

#codingexercise
int GetCountOfLinesInSourceCode(string text)
{
var lines = text.split('\r\n');
int count = 0;
bool skip = false;
foreach ( var line in lines )
{
if (String.IsWhiteSpace(line)) continue;
if (line.startswith(single_line_comment)) continue;
if (line.startswith(multi_line_comment_begin)) { skip = true; continue;}
if (line.startswith(multi_line_comment_end)) {skip=false; continue;}
if (!skip) count++;
}
return count;
}

Today we continue to discuss some more of the authentication and authorization frameworks. In particular, we look at information rights management process such as in Oracle IRM. The process involved is as follows:
An author creates a document or an email on her desktop or notebook.
The document is sealed by the author when she publishes it. The document is then sent to the receiver
The receiver click on the document. The document is encrypted and cannot be opened without an installed IRM application. If the application is available, it can then be decrypted and shown to the receiver.
The components involved in this process include:
- an identity manager to centrally provision IRM users and entitlements.
- a virtual directory to synchronize IRM users and groups
- a Single sign on for desktop single sign on.
All user interfaces are consistent through out the users use of the applications and built on the same platform framework. The users have a seamless and familiar interface for access management. The identity management is provided by the Oracle Identity Manager. The identity management architecture is built on a data tier that supports an operational db, an ID store, and an audit DB. The Oracle web logic server is the layer above the data tier. This layer is responsible for all of domain template management, configurations, and installing, upgrading and patching. The weblogic server maintains a unified, consolidated and consistent view over the ID store which includes membership providers such as Active Directory. The layer above the web logic server is the Fusion middleware services provides such functionality as
the ability to configure new metadata models
to provision new users or export user information to other systems
for users to administer their own profiles with fine grained entitlements
for policy management based on profiles
for creating and provisioning workflows that enables automation of IT tasks along with a Workflow visualizer and a designer.
a password management that includes self-service.
and audit and compliance management


decimal GetCountDuplicatesStandardDeviation (int [] A)

{


if (A == null) return 0;


return A.GetCountDuplicatesStandardDeviation();



}

Int GetDistinctMode (int [] A)
{



if (A == null) return 0;



return A.GetDistinctMode ();



}

Thursday, November 20, 2014

This is a writeup on how to collect and visualize data that indicates who talks to whom and represent it as a neat little graph alongside your favorite application with which you browse this communication. Let us take an example of an e-mail application that has mails sent to you. Inevitably you will be in all of the e-mails. That’s what pigeonholing might do to you. But even with this data, you may be seeing mails where you were included just to be in the loop. These from and to labeled people may give indications not only to who is chatty but also whose mail is more important than others especially when it comes from outside your organization. Although this is a complex domain and it has many subjective variables, we can begin with this use case.

Let us say we iterate through our list of mails and populate the following table:

- - - - - - - - - - - -

- From | To |

- - - - - - - - - - - -

- A B

- B C

- D E

This may start to look like a graph where we can quickly form nodes with participants such as A, B, C, D and E. We also form edges between them for each entry that we make in this table. Notice that the edges have weights and they are directed.

We can write this graph as an adjacency list or matrix. We could prefer Matrix but for our purposes we simply want to illustrate a visualization of the information we have learned. Notice though that although we have decided to layout a graph, it can be drawn in several different ways. Some can be messier than others. Perhaps a better way to draw them would be to have them spaced apart and try not to have the edges cross over too much. This will make it easier to see the graph.

Now let us try the same with this graph. We have a technique by which we can optimize this graph layout.

# The following program lays out a graph with little or no crossing lines.

# This is adapted from a sample in "Programming Collective Intelligence" by OReilly Media

from PIL import Image, ImageDraw

import math

import random

vertex = ['A','B','C','D','E']

links=[('A', 'B'),

('B', 'C'),

('C', 'D'),

('D', 'E'),

('E', 'A'),

('C', 'E'),

('A', 'D'),

('E', 'B')]

domain=[(10,370)]*(len(vertex)*2)

def randomoptimize(domain,costf):

best=999999999

bestr=None

for i in range(1000):

# Create a random solution

r=[random.randint(domain[i][0],domain[i][1]) for i in range(len(domain))]

# Get the cost

cost=costf(r)

# Compare it to the best one so far

if cost<best:

best=cost

bestr=r

return r

def annealingoptimize(domain,costf,T=10000.0,cool=0.95,step=1):

# Initialize the values randomly

vec=[float(random.randint(domain[i][0],domain[i][1]))

for i in range(len(domain))]

while T>0.1:

# Choose one of the indices

i=random.randint(0,len(domain)-1)

# Choose a direction to change it

dir=random.randint(-step,step)

# Create a new list with one of the values changed

vecb=vec[:]

vecb[i]+=dir

if vecb[i]<domain[i][0]: vecb[i]=domain[i][0]

elif vecb[i]>domain[i][1]: vecb[i]=domain[i][1]

# Calculate the current cost and the new cost

ea=costf(vec)

eb=costf(vecb)

p=pow(math.e,(-eb-ea)/T)

# Is it better, or does it make the probability

# cutoff?

if (eb<ea or random.random( )<p):

vec=vecb

# Decrease the temperature

T=T*cool

return vec

def crosscount(v):

# Convert the number list into a dictionary of person:(x,y)

loc=dict([(vertex[i],(v[i*2],v[i*2+1])) for i in range(0,len(vertex))])

total=0

# Loop through every pair of links

for i in range(len(links)):

for j in range(i+1,len(links)):

# Get the locations

(x1,y1),(x2,y2)=loc[links[i][0]],loc[links[i][1]]

(x3,y3),(x4,y4)=loc[links[j][0]],loc[links[j][1]]

den=(y4-y3)*(x2-x1)-(x4-x3)*(y2-y1)

# den==0 if the lines are parallel

if den==0: continue

# Otherwise ua and ub are the fraction of the

# line where they cross

ua=((x4-x3)*(y1-y3)-(y4-y3)*(x1-x3))/den

ub=((x2-x1)*(y1-y3)-(y2-y1)*(x1-x3))/den

# If the fraction is between 0 and 1 for both lines

# then they cross each other

if ua>0 and ua<1 and ub>0 and ub<1:

total+=1

for i in range(len(vertex)):

for j in range(i+1,len(vertex)):

# Get the locations of the two nodes

(x1,y1),(x2,y2)=loc[vertex[i]],loc[vertex[j]]

# Find the distance between them

dist=math.sqrt(math.pow(x1-x2,2)+math.pow(y1-y2,2))

# Penalize any nodes closer than 50 pixels

if dist<50:

total+=(1.0-(dist/50.0))

return total

def drawnetwork(loc):

#create the image

img = Image.new('RGB', (400,400),(255,255,255))

draw=ImageDraw.Draw(img)

#create the position dict

pos=dict([(vertex[i],(loc[i*2],loc[i*2+1])) for i in range(0, len(vertex))])

#Draw Links

for (a,b) in links:

draw.line((pos[a],pos[b]), fill=(255,0,0))

#Draw vertex

for (n,p) in pos.items():

draw.text(p,n,(0,0,0))

img.save('graph.jpg', 'JPEG')

img.show()

sol=randomoptimize(domain,crosscount)

crosscount(sol)

sol=annealingoptimize(domain,crosscount,step=50,cool=0.99)

crosscount(sol)

drawnetwork(sol)

Now this draws the graph as :

This is a neat little graph and shows that the participant A is more popular.

Wednesday, November 19, 2014

#codingexercise
Int GetCountDuplicatesMode (int [] A)
{
if (A == null) return 0;
return A.CountDuplicatesMode();
}

In today's post we continue to look at the PHP Implementation of a class based views which are not tied to model. A PHP controller that takes a single input for say a string to do CRUD operation may look something like this:


<?php

require  __DIR__ . '/crud.php';

#require_once 'crud.php'

class ExportsController extends AppController {

        public $helpers = array('Html', 'Form');

        public $exports = NULL;

        public function index() {

            $orm = new ORM();

            $result = json_decode($orm->list(), true); 

            $exports = $result['exports'];

            $ret = array();

            foreach($exports as $export) {

                   array_push($ret, $export['input']); 

            }

            $this->set('exports', $ret);

        }

        public function add($export = null) {        

             if(isset($this->request->data['input'])) {            

                $orm = new ORM();            

                $result = json_decode($orm->create($this->request->data['input']));            

                if (strpos($result, 'success') !== false){                

                           $this->Session->setFlash(__('Your export has been saved.'));

                           return $this->redirect(array('action' => 'index'));            

                }


               $this->Session->setFlash(__('Unable to add your export.'));

             }        





}

: similar for modify or delete

}

Notice we are not using ids of the items in the list and this is usually not recommended. The RESTAPI url comes out to be /exports/modify/9 for example. This is preferable way for REST based API. It doesn't mean that we cannot get the exports working the way we have it. It merely suggests that conforming to the ID based helps with keeping track of the collection when it changes.


Int GetCountDuplicatesMedian (int [] A)

{

if (A == null) return 0;

return A.CountDuplicatesMedian();

}