Sunday, November 16, 2014

#codingexercise

int GetCountDuplicatesCount(int[] A)
{
    if (A == null) return 0;
    // CountDuplicatesCount is an extension method assumed to be defined elsewhere.
    return A.CountDuplicatesCount();
}


#codingexercise

int GetCountDuplicatesAvg(int[] A)
{
    if (A == null) return 0;
    return A.CountDuplicatesAvg();
}

We continue to discuss the CG method. We were talking about starting and stopping the iterations. We next discuss preconditioning. Preconditioning is a technique for improving the condition number of a matrix. Let us say M is a symmetric, positive-definite matrix that approximates A but is easier to invert. Applying M-inverse to both sides of the linear system we solve with Steepest Descent or CG, we can solve it more quickly because the eigenvalues of M-inverse A are better clustered than those of A. The problem, however, is that M-inverse A is not generally symmetric or positive-definite, even when M and A both are. We can avoid this difficulty by factoring M into a pair of matrices E and E-transpose whose product is M; such a factorization exists for every symmetric, positive-definite M. Now M-inverse A and E-inverse A E-transpose-inverse have the same eigenvalues, and the latter is symmetric and positive-definite, so CG can be applied to the transformed system. This process of using CG to solve the transformed system is called the Transformed Preconditioned Conjugate Gradient Method.
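To make this concrete, here is a minimal sketch of the (untransformed) preconditioned CG loop, which applies M-inverse directly instead of forming the factor E, using a simple diagonal (Jacobi) preconditioner M = diag(A). The diagonal choice, the tolerance, and the helper names are illustrative assumptions, not part of the discussion above.

// Solves A x = b for symmetric, positive-definite A using preconditioned CG.
// M is taken to be diag(A), so applying M-inverse is just a division per entry.
double[] PreconditionedCG(double[,] A, double[] b, int maxIter, double tol)
{
    int n = b.Length;
    var x = new double[n];                       // start from x = 0
    var r = (double[])b.Clone();                 // r = b - A x = b
    var z = ApplyJacobiPreconditioner(A, r);     // z = M-inverse r
    var d = (double[])z.Clone();                 // first search direction
    double rz = Dot(r, z);
    for (int i = 0; i < maxIter && Math.Sqrt(Dot(r, r)) > tol; i++)
    {
        var Ad = MatVec(A, d);
        double alpha = rz / Dot(d, Ad);          // step length along d
        for (int j = 0; j < n; j++) x[j] += alpha * d[j];
        for (int j = 0; j < n; j++) r[j] -= alpha * Ad[j];
        z = ApplyJacobiPreconditioner(A, r);
        double rzNew = Dot(r, z);
        double beta = rzNew / rz;                // conjugation against the previous direction
        for (int j = 0; j < n; j++) d[j] = z[j] + beta * d[j];
        rz = rzNew;
    }
    return x;
}

double[] ApplyJacobiPreconditioner(double[,] A, double[] r)
{
    var z = new double[r.Length];
    for (int i = 0; i < r.Length; i++) z[i] = r[i] / A[i, i];
    return z;
}

double Dot(double[] a, double[] b)
{
    double s = 0.0;
    for (int i = 0; i < a.Length; i++) s += a[i] * b[i];
    return s;
}

double[] MatVec(double[,] A, double[] v)
{
    int n = v.Length;
    var y = new double[n];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) y[i] += A[i, j] * v[j];
    return y;
}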
Today we look at the complexity and convergence of the CG method. The dominating operation during an iteration is the matrix-vector multiplication, which requires O(m) operations, where m is the number of non-zero entries in the matrix. In many cases, however, A is sparse and m is in O(n). Suppose we want to perform just enough iterations to reduce the norm of the error to within a factor epsilon of the initial error; we can bound the maximum number of iterations required to achieve this. In the case of the Steepest Descent method, this maximum number of iterations is bounded by a constant k (the condition number of A) times the natural logarithm of the inverse of epsilon. In the case of the Conjugate Gradient method, it is bounded by the square root of k times a similar natural logarithm of the inverse of epsilon. Steepest Descent therefore has a time complexity of O(mk), whereas Conjugate Gradient has a time complexity of O(m square-root(k)). Both algorithms have a space complexity of O(m).
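To make the O(m) cost concrete, here is a small sketch of a sparse matrix-vector product in compressed sparse row (CSR) form; the storage layout and names are illustrative assumptions rather than anything prescribed above. The loop visits each of the m non-zero entries exactly once.

// y = A * x for a sparse matrix A stored in CSR form.
// values[k] holds the k-th nonzero, colIndex[k] its column, and
// rowStart[i] .. rowStart[i+1]-1 index the nonzeros of row i.
// The cost is O(m), where m is the total number of nonzeros.
double[] SparseMatVec(double[] values, int[] colIndex, int[] rowStart, double[] x)
{
    int n = rowStart.Length - 1;
    var y = new double[n];
    for (int i = 0; i < n; i++)
        for (int k = rowStart[i]; k < rowStart[i + 1]; k++)
            y[i] += values[k] * x[colIndex[k]];
    return y;
}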
By going to polynomials of higher degree, we can generalize this to second-order elliptic boundary value problems. Finite difference and finite element approximation methods are used for solving these problems on d-dimensional domains. Recall that the finite difference method applies a local Taylor expansion to approximate the differential equation. The method solves partial differential equations numerically, using a topological square network of lines to construct the discretization of the PDE. Partial differential equations are classified as elliptic, parabolic, or hyperbolic. This regular grid gives us the convenience of applying Taylor expansions. When there are multiple dimensions, however, this topology does not approximate the domain well; instead, the finite element technique is used. The finite element method discretizes the region of interest into N-1 subdomains and assumes that the approximation is represented by an expansion over basis functions, which are piecewise continuous polynomials described by one of a few possible shape functions. As an aside, the finite volume method is another such approach, where we take the region of integration to be a control volume associated with a point of coordinate xi and taken around its immediate vicinity. Here the integral is taken over this small volume stretched on the unit length around xi, which can then be translated into a semi-discrete form. Peiro and Sherwin mention that this approach produces a conservative scheme if the flux on the boundary of one cell equals the flux on the boundary of the adjacent cell. In our case, we look at elliptic boundary value problems, and for both finite difference and finite element methods the constant k is of the order O(n^(2/d)). Thus for two-dimensional problems, Steepest Descent has a time complexity of O(n^2) and the Conjugate Gradient method has a time complexity of O(n^(3/2)). For three-dimensional problems, Steepest Descent has a time complexity of O(n^(5/3)) while the Conjugate Gradient method has a time complexity of O(n^(4/3)).
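As a minimal concrete example of the finite difference idea (one-dimensional only, and with illustrative names), discretizing -u'' = f on a uniform grid with the usual Taylor-expansion stencil yields a symmetric, positive-definite tridiagonal system A u = b that Steepest Descent or CG can then solve.

// Builds the dense tridiagonal system for -u''(x) = f(x) on (0, 1) with
// u(0) = u(1) = 0, using n interior grid points and spacing h = 1/(n+1).
// The stencil (-u[i-1] + 2 u[i] - u[i+1]) / h^2 comes from a local Taylor expansion.
(double[,] A, double[] b) BuildPoisson1D(int n, Func<double, double> f)
{
    double h = 1.0 / (n + 1);
    var A = new double[n, n];
    var b = new double[n];
    for (int i = 0; i < n; i++)
    {
        A[i, i] = 2.0 / (h * h);
        if (i > 0) A[i, i - 1] = -1.0 / (h * h);
        if (i < n - 1) A[i, i + 1] = -1.0 / (h * h);
        b[i] = f((i + 1) * h);
    }
    return (A, b);
}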
When discussing iterations using these methods, it is helpful to discuss how they start and stop. Starting is usually a good guess: if we know an approximate value of the solution, we can use it as our starting value; otherwise we begin at zero, and both the Steepest Descent method and the Conjugate Gradient method will find the solution of the linear system. The same cannot be said for non-linear systems, which may or may not converge and may have several local minima. For stopping, both Steepest Descent and Conjugate Gradient stop when they reach the minimum point, which we took to be when the residual becomes zero. However, accumulated round-off errors in the recursive formulation of the residual (using step lengths and the matrix-vector product) may lead to a false zero. This can be corrected by periodically recomputing the exact residual from its definition and restarting from it.
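A small sketch of that stopping logic, with a relative tolerance and a restart interval chosen purely for illustration: stop when the recursively updated residual is small compared with the initial residual, and every few iterations recompute the exact residual b - Ax so that round-off cannot produce a false zero.

// Returns true when the iteration should stop. Every 'restartEvery' iterations
// it replaces the recursively updated residual r with the exact residual
// b - A x to correct accumulated round-off error.
bool CheckStopAndRestart(double[,] A, double[] b, double[] x, double[] r,
                         int iteration, int restartEvery, double tol, double r0Norm)
{
    int n = b.Length;
    if (iteration % restartEvery == 0)
    {
        for (int i = 0; i < n; i++)
        {
            double Ax_i = 0.0;
            for (int j = 0; j < n; j++) Ax_i += A[i, j] * x[j];
            r[i] = b[i] - Ax_i;
        }
    }
    double norm = 0.0;
    for (int i = 0; i < n; i++) norm += r[i] * r[i];
    return Math.Sqrt(norm) <= tol * r0Norm;
}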
 

Friday, November 14, 2014

#codingexercise

int GetCountDuplicatesMin(int[] A)
{
    if (A == null) return 0;
    return A.CountDuplicatesMin();
}

We continue to look at picking perfect polynomials for the convergence of conjugate gradients. We have seen that at each step of CG we can express the error term as a polynomial, in a matrix or in a scalar, applied to the initial error term, and that, as with the Steepest Descent method, the initial error term is a linear combination of orthogonal unit eigenvectors. We also noticed that with each iteration the convergence is only as good as the worst eigenvector. For the first iteration we have a horizontal line, since the polynomial must evaluate to a constant 1. At this point we have two eigenvalues. In the second iteration, we have a line whose slope is fixed by the two eigenvalues. Let us say that in the third iteration the eigenvalue is 1 and the iterations terminate; we now have a parabola, a polynomial of degree two, fitting these three points. The computed error terms may be plotted close to these eigenvalues on these polynomial representations.
We see that the polynomial converges after n iterations, and more quickly if there are duplicated eigenvalues. If floating point precision were infinite, it would take at most n iterations to converge to an exact solution; in fact, it might take fewer if the initial position vector is already A-orthogonal to some of the eigenvectors of A. Floating point round-off errors, however, may re-introduce these eigenvectors. We also find that convergence is quicker when the eigenvalues are clustered together, and if we know something about these eigenvalues, we could construct the polynomial faster. The more general case is when the eigenvalues are evenly distributed between Lambda min and Lambda max, the number of distinct eigenvalues is large, and floating point round-off occurs.
This range leads to another interesting idea: the Chebyshev polynomial. This is the polynomial that minimizes its maximum value over the whole range from Lambda min to Lambda max, rather than at a finite number of points. On the interval from -1 to 1 the Chebyshev polynomial stays at most one in magnitude, and outside that interval it grows faster than any other polynomial of the same degree under that bound. When we use a Chebyshev polynomial to bound the error of CG, we get an expression in terms of the condition number which, when plotted, is an increasing curve. This further shows that the first step of CG is identical to a step of Steepest Descent. After that, the convergence is usually faster than that of Steepest Descent because of a good eigenvalue distribution or good starting points; this does not imply that every step of the iteration enjoys faster convergence, but the rate of convergence is much higher initially and more gradual later.
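For reference, here is a small sketch of the two quantities mentioned above: the Chebyshev polynomial evaluated through its three-term recurrence, and the resulting bound on the CG error in terms of the condition number kappa = Lambda max / Lambda min. The function names are illustrative; the bound itself is the standard one that the Chebyshev argument yields.

// T_n(x) via the recurrence T_0 = 1, T_1 = x, T_{k+1} = 2 x T_k - T_{k-1}.
// On [-1, 1] its magnitude stays at most 1; outside that interval it grows
// as fast as any polynomial of the same degree with that bound.
double Chebyshev(int n, double x)
{
    double prev = 1.0, curr = x;
    if (n == 0) return prev;
    for (int k = 1; k < n; k++)
    {
        double next = 2.0 * x * curr - prev;
        prev = curr;
        curr = next;
    }
    return curr;
}

// The Chebyshev-based bound on the CG error after i iterations:
// ||e_i||_A <= 2 * ((sqrt(kappa) - 1) / (sqrt(kappa) + 1))^i * ||e_0||_A.
double CgErrorBoundFactor(double kappa, int i)
{
    double s = Math.Sqrt(kappa);
    return 2.0 * Math.Pow((s - 1.0) / (s + 1.0), i);
}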
One more coding exercise just like the previous:
#codingexercise
int GetCountDuplicatesMax(int[] A)
{
    if (A == null) return 0;
    return A.CountDuplicatesMax();
}
In today's post we continue our discussion of the convergence of the conjugate gradient method. We have seen that at each step of CG, the current iterate is chosen from the initial point plus the current subspace. The current subspace is formed from the span of the previous subspaces, each found by applying the matrix A to a vector; the starting vector is the initial residual, which is also the initial search direction. These subspaces, called Krylov subspaces, are built iteratively. Recall that a search subspace means that when we don't know what the search vector should be, we choose a subspace to find it in. Fortunately this subspace grows incrementally, and the next residual turns out to be A-orthogonal to all of the previous search vectors except the current one, so conjugating against the current search vector alone is enough. This makes it easy to find the next search direction: the new search vector is a linear combination of the next residual and the current search vector.
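Here is a minimal dense-matrix sketch of that iteration, assuming A is symmetric and positive-definite and starting from x = 0; the names and the fixed tolerance are illustrative. Note how each step uses only the current residual and the current search vector, and how the new search direction is a linear combination of the two.

// A minimal CG loop for symmetric, positive-definite A.
double[] ConjugateGradient(double[,] A, double[] b, int maxIter, double tol)
{
    int n = b.Length;
    var x = new double[n];
    var r = (double[])b.Clone();            // r0 = b - A x0 = b, since x0 = 0
    var d = (double[])r.Clone();            // d0 = r0: first search direction is the first residual
    double rDotR = 0.0;
    for (int j = 0; j < n; j++) rDotR += r[j] * r[j];
    for (int i = 0; i < maxIter && rDotR > tol * tol; i++)
    {
        var Ad = new double[n];              // the one matrix-vector product per iteration
        for (int row = 0; row < n; row++)
            for (int col = 0; col < n; col++) Ad[row] += A[row, col] * d[col];
        double dAd = 0.0;
        for (int j = 0; j < n; j++) dAd += d[j] * Ad[j];
        double alpha = rDotR / dAd;          // step length along the current search direction
        for (int j = 0; j < n; j++) { x[j] += alpha * d[j]; r[j] -= alpha * Ad[j]; }
        double rDotRNew = 0.0;
        for (int j = 0; j < n; j++) rDotRNew += r[j] * r[j];
        double beta = rDotRNew / rDotR;      // conjugation coefficient
        for (int j = 0; j < n; j++) d[j] = r[j] + beta * d[j];  // new direction: residual plus beta times old direction
        rDotR = rDotRNew;
    }
    return x;
}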
For a given iteration, the error term can be written by applying to the initial error term a combination of the identity matrix and powers of the matrix A. This combination can be expressed as a polynomial whose degree equals the number of iterations so far. It doesn't matter whether the polynomial is expressed in terms of a scalar or of a matrix; what we want is to evaluate it, which we can do either at a scalar eigenvalue or at the matrix A. This flexible notation lets us apply the polynomial P to a vector in the same way whether P is written in terms of A or in terms of a scalar eigenvalue. Note that, by the definition of an eigenvalue, applying the matrix to an eigenvector is the same as scaling that eigenvector by its eigenvalue, and if we keep applying the matrix and the eigenvalue to both sides of this relation, the relation does not change. This lets us write the error term as a polynomial applied to the initial error term. If we start with a polynomial that evaluates to one, and express the initial error term as a linear combination of the orthogonal unit eigenvectors, we find that the convergence of CG depends on how close a polynomial of degree i can be to zero on each eigenvalue, given the constraint that the polynomial evaluates to one at zero.
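Written out in the usual notation (with v_j the orthogonal unit eigenvectors of A, lambda_j the corresponding eigenvalues, and xi_j the coefficients of the initial error; this is just the standard statement of the result summarized above):

e_{(i)} = P_i(A)\, e_{(0)}, \qquad P_i(0) = 1

e_{(0)} = \sum_j \xi_j v_j \;\Longrightarrow\; e_{(i)} = \sum_j \xi_j\, P_i(\lambda_j)\, v_j

\lVert e_{(i)} \rVert_A^2 \;\le\; \Big( \min_{P_i} \max_{\lambda \in \Lambda(A)} P_i(\lambda)^2 \Big)\, \lVert e_{(0)} \rVert_A^2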

Thursday, November 13, 2014

#codingexercise
int GetCountUnique(int[] A)
{
    if (A == null) return 0;
    return A.CountUnique();
}
To complete our post on the Oracle 11g identity access management service, we next look at identity federation. Identity federation is required when there is a need for single sign-on beyond a single internet domain. It consists of federation services that utilize both AuthN modules and service provider modules. The language for identity federation has been SAML, which is an industry standard. SAML is an open framework for sharing security information on the internet through XML documents, and it provides a standard way to transfer cookie information across multiple internet domains; thus it is a way to implement SSO. A SAML assertion can be used with web services security frameworks such as WS-Security. This identity federation service is also interoperable with CardSpace. By using proprietary directories, databases, or SAML assertions with internet directories, it can federate identity. In addition, it offers auditing, logging, and monitoring. OpenSSO implements a security token service and supports the web services trust language. Trust in this case is usually established with WS-Trust by exchanging SOAP/WS-Security messages. Since trust is represented as tokens, there is a service to manage the issuing of tokens: it issues, renews, cancels, and validates security tokens, allows customers to write their own plugins, and provides a WS-Trust based API for client and application access. The tokens issued are ones that can be authenticated via username, X.509, SAML, and Kerberos.
Next we discuss the Entitlement Server, a fine-grained authorization engine that externalizes, simplifies, and unifies entitlement policies. It offers a sophisticated delegated administration model to create, modify, or report on the entitlement policies. The administration server is layered above the authorization engine. While the administration server concerns itself with resource management, policy lifecycle, and policy distribution, the authorization engine is the policy framework or decision kernel that works with a publisher-subscriber model.
The administration server acts as the policy administration point.
The security module that implements the policy decision point (PDP) communicates with the authorization server, and the one that implements the policy enforcement point (PEP) communicates with the authorization engine. The engine may talk to one or more attribute authorities and a policy store.
The Adaptive Access Manager provides real-time fraud detection, multifactor authentication, and unique authentication strengthening. These are implemented as two modules - one is the authenticator and the other is the risk manager. The authenticator protects against malware attacks, while the risk manager looks at various risk factors simultaneously.
Information Rights Management (IRM) manages the content produced by the subject. If a user signs into one application and writes one document, then signs into another and writes another, both documents are secured by IRM; typically they are sealed and encrypted.
Governance of all of the above services can be facilitated with a common user interface.
Today I want to discuss the whitepaper from Oracle on identity access management. Oracle 11g provides a middleware service complete with identity management as an SOA. It secures the application grid in cloud computing. Resources, as well as the processes that act on those resources and their management, are protected. Identity access management includes such things as directory services, identity administration, access control, platform and web services security, identity and access governance, operational manageability, and service integration with suites both proprietary and external. The benefits include comprehensive identity services, integration with other services, and a standards-based architecture where modules can be written as plug-ins. Comprehensive implies access control, single sign-on, role governance, multi-factor authentication, identity analytics, audits, and reports. Integration means that the identity management and access control needs of a business transaction are met seamlessly across applications, including other middleware modules. This offering leverages and integrates the Oracle database through its own directory and identity virtualization services. It also offers Information Rights Management to secure content. The standards-based benefit implies that data transfer via the Security Assertion Markup Language (SAML) and WS-Federation makes it possible for any vendor to customize with plug-ins.
The Oracle 11g identity access management service provides services for authentication, authorization, roles and entitlements, auditing, directory services, user provisioning, policy store, and session data management - all in a SOA model. It includes an Oracle Authorization Policy Manager for managing authorization policies. It manages both global and application-specific artifacts. Global artifacts include users, external roles, and system policies. Application-specific policies are kept as a logical subset, called a stripe, in this policy store. Application-specific artifacts include the resource catalog, application policies, application roles, and role categories. The identity manager and the authorization manager both utilize these authorization policies, which are tied to the identity store; the difference is that the identity manager modifies the identity store while the authorization manager modifies the policy store. The User and Role API helps manage identities using the identity governance framework hosted at projectliberty.org and even allows developers to create their own virtual identity database while retaining the ability to interconnect with enterprise identity services. The Authorization API comes from another ongoing Project Liberty effort called OpenAZ, which uses the Extensible Access Control Markup Language (XACML) to represent attribute values along with an interoperable policy language to interpret them. The Authorization API is used by policy enforcement points, policy information points, and policy decision points, which issue authorization requests, obtain attributes from an attribute authority, or wrap the functionality of existing authorization providers. The directory services include an internet directory, an enterprise directory, and a virtual directory that provides identity aggregation and transformation without copying. The internet directory provides 1) scalability in terms of, say, the LDAP servers running on a node, 2) high availability designed to enable continuous service availability at the process and storage level, 3) security in terms of both password and certificate based authentication, including encryption, 4) identity management and monitoring streamlined around two complementary components - enterprise manager and directory services manager, 5) directory integration and platform, a set of services enabling customers to synchronize data between various third party data sources, 6) external authentication enabling seamless authentication against third party directories, 7) and an SDK for internet directory providers.
Next we discuss the access management components, which include a consolidated SSO architecture, a policy simulator, an access manager, a session manager, an administration console, centralized diagnostics, and snapshots. The access management component concerns itself with authentication and identity assertion. The policy service works across all of these modules.

Tuesday, November 11, 2014

When discussing the conjugate gradient method in the previous post, we mentioned several items. Let's put them all together now. First, this is in fact a conjugate direction method, and we don't have to keep track of previous search vectors. Each subspace is incremental and is found by applying a matrix to a vector. The name of the method has nothing to do with gradients, because the search directions are not all gradients; it's just a misnomer.
We will next evaluate the convergence of the conjugate gradient method. We saw that the progression of the method was linear, so why should we care about convergence? Because a few errors encountered during the progress impede convergence: one, for example, is floating point error, and another is that the search vectors lose their A-orthogonality. The first problem can just as well occur in Steepest Descent, and there are ways to overcome it. The second is not as easy, but there are ways to run the method as an iterative procedure where we apply corrections. Convergence analysis therefore helps us evaluate running the method on sample datasets where it is not even possible to complete n iterations. The first iteration of CG is identical to the first iteration of the Steepest Descent method, and we can apply the same conditions here too. If we look at the iteration equation, the initial search direction is the initial residual, b - A x0, and all subsequent iterations build the new search direction from the next residual and the current search vector. For convergence, start with the initial direction along the axes: the residual points to the center of the ellipsoid, giving one-step convergence to the solution. For more general convergence, we can rely on the fact that we can find n A-orthogonal search vectors and express the error term as a linear combination of these vectors; with that we can find the next residual with some compromise. In fact, that error term is a weighted average of the step lengths of previous residuals. Because it is weighted, some of the contributions from the shorter residual step lengths might increase, which is why this method is called a rougher. By contrast, the Jacobi method is a smoother, because there every eigenvector component is reduced on every iteration.
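For comparison, here is a small sketch of a single Steepest Descent step, which is exactly what the first CG iteration performs; the names are illustrative and A is assumed symmetric and positive-definite.

// One Steepest Descent step: move along the residual direction r = b - A x
// with step length alpha = (r.r) / (r.A r). Updates x in place.
void SteepestDescentStep(double[,] A, double[] b, double[] x)
{
    int n = b.Length;
    var r = new double[n];
    for (int i = 0; i < n; i++)
    {
        double Ax_i = 0.0;
        for (int j = 0; j < n; j++) Ax_i += A[i, j] * x[j];
        r[i] = b[i] - Ax_i;                  // residual, which is also the search direction
    }
    double rDotR = 0.0, rDotAr = 0.0;
    for (int i = 0; i < n; i++)
    {
        double Ar_i = 0.0;
        for (int j = 0; j < n; j++) Ar_i += A[i, j] * r[j];
        rDotR += r[i] * r[i];
        rDotAr += r[i] * Ar_i;
    }
    double alpha = rDotR / rDotAr;           // step length that minimizes along r
    for (int i = 0; i < n; i++) x[i] += alpha * r[i];
}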
#codingexercise
int GetCount(int[] A)
{
    if (A == null) return 0;
    return A.Count();
}

int GetCountDuplicates(int[] A)
{
    if (A == null) return 0;
    return A.CountDuplicates();
}