Wednesday, September 18, 2013

This post talks about the Enterprise Library Data Access Application Block (DAAB). I was able to go over the documentation on MSDN. Applications that manage their own connection state and pooling usually run into scalability and resource problems. The Enterprise Library opens and closes connections as needed, leaving you to focus on the DataReader and the data. You can edit the configuration with a visual tool, which stores the settings in the block's configuration section. A Database object is created using the DatabaseFactory.CreateDatabase method. The database instances node is used to associate an instance type, such as SQL Server or Oracle, with one of the connection strings. The connection string is stored in the configuration file, which provides mechanisms to configure security for the connection; connection strings can be encrypted so that they are not in clear text. As mentioned, the Database object is created first, and this is done with a factory pattern. In ADO.NET, you open a connection and then fill a DataSet or retrieve data through a data reader that is usually typed to the provider. In this application block, all of that is abstracted away: you just instantiate the Database object and execute the reader through a command wrapper. The Database class has dozens of methods, most notably ones that execute a stored procedure or SQL statement and return a DataSet, a DataReader, a scalar value, an XmlReader, or nothing; allow specific parameters to be created and passed in; determine which parameters a command needs, create them, and cache them; and enlist commands in transactions.
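
A minimal sketch of that flow, assuming a connection string named "OrdersDb" in the configuration file and a hypothetical stored procedure (on Enterprise Library 6 you would first call DatabaseFactory.SetDatabaseProviderFactory):

using System;
using System.Data;
using System.Data.Common;
using Microsoft.Practices.EnterpriseLibrary.Data;

class ReaderExample
{
    static void Main()
    {
        // The instance name maps to a connection string in the config file.
        Database db = DatabaseFactory.CreateDatabase("OrdersDb");

        // Command wrapper for a stored procedure (the name is hypothetical).
        DbCommand cmd = db.GetStoredProcCommand("GetRecentOrders");

        // The block opens and closes the connection; we only read the data.
        using (IDataReader reader = db.ExecuteReader(cmd))
        {
            while (reader.Read())
                Console.WriteLine(reader["OrderId"]);
        }
    }
}
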
The Database object provides methods such as GetStoredProcCommand, GetParameterValue, AddInParameter, and AddOutParameter that give detailed control over the command to be executed. ExecuteScalar and ExecuteReader are among the execution methods. Since the results of the execution can be read from a DataReader, we can populate objects directly with the results without having to create an object graph, which removes much of the complexity that comes with object graph refreshes. Direct manipulation of data is possible with methods like LoadDataSet and UpdateDataSet, where you can specify the CRUD operation commands and, if necessary, a transaction. You can also directly get the data adapter that implements the CRUD operations on the data source.
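
A hedged sketch of that parameter control and of the LoadDataSet/UpdateDataSet pair; all procedure, parameter, and table names here are placeholders, and the insert/update/delete commands would additionally need their parameters bound to DataSet columns:

using System;
using System.Data;
using System.Data.Common;
using Microsoft.Practices.EnterpriseLibrary.Data;

class DataSetExample
{
    static void Run()
    {
        Database db = DatabaseFactory.CreateDatabase("OrdersDb");

        // Detailed control over the command and its parameters.
        DbCommand countCmd = db.GetStoredProcCommand("CountOrdersByState");
        db.AddInParameter(countCmd, "State", DbType.String, "WA");
        int total = (int)db.ExecuteScalar(countCmd);
        Console.WriteLine(total);

        // Load a table, change it, and push the changes back through
        // explicit insert/update/delete commands.
        var ds = new DataSet();
        db.LoadDataSet(db.GetStoredProcCommand("GetOrders"), ds, "Orders");
        ds.Tables["Orders"].Rows[0]["Status"] = "Shipped";
        db.UpdateDataSet(ds, "Orders",
            db.GetStoredProcCommand("InsertOrder"),
            db.GetStoredProcCommand("UpdateOrder"),
            db.GetStoredProcCommand("DeleteOrder"),
            UpdateBehavior.Standard);
    }
}
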

Tuesday, September 17, 2013

Today I tried out the EnterpriseLibrary.Data framework, and it was a breeze. To those familiar with Entity Framework, this provides more streamlined access to data. For example, you can wire up stored procedure results to the collection of models you define, so that you work with those instead of the entire object graph. There is a lot of debate around the performance of Entity Framework, and in earlier blogs I may have alluded to the different levers we have to improve it. The Enterprise Library, however, comes with the block pattern these libraries have become popular for. Blocks are reusable patterns across applications, so your development time is cut down, and they come with the reliability and performance these libraries have come to be known for.
I want to bring up the fact that we associate the database by using the convenient DatabaseFactory.CreateDatabase method to work with existing databases in SQL Server. Some data access extensions may need to be written to translate the DataReader columns to objects, and this helps because you can translate the results of a stored procedure execution directly into a collection of the objects you have already defined as models, without the onus of the object graph. A sketch of such an extension follows.
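
A sketch of such an extension, assuming a hypothetical Order model and a matching stored procedure:

using System.Collections.Generic;
using System.Data;
using Microsoft.Practices.EnterpriseLibrary.Data;

public class Order
{
    public int OrderId { get; set; }
    public string Status { get; set; }
}

public static class OrderData
{
    // Streams each DataReader row out as a model object; no object graph.
    public static IEnumerable<Order> GetOrders(Database db)
    {
        using (IDataReader reader = db.ExecuteReader(db.GetStoredProcCommand("GetOrders")))
        {
            while (reader.Read())
            {
                yield return new Order
                {
                    OrderId = reader.GetInt32(reader.GetOrdinal("OrderId")),
                    Status = reader.GetString(reader.GetOrdinal("Status"))
                };
            }
        }
    }
}
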
In addition, there are no configuration sections involved in the config files, and the assemblies can be installed and added to the solution using the Visual Studio NuGet package manager.

OAuth bearer tokens practice

OAuth bearer tokens may currently be passed in the URL, but the RFC clearly calls out that this should not be done. Therefore, checks and other mechanisms to safeguard these tokens should be in place. As an example, the token could be passed in the request body instead, or the authorization server may handle client validations. The rest is implementation-specific.
In general, if the server and the clients communicate via TLS and have verified the certificate chain, there is little chance of the token falling into the wrong hands. URL logging and HTTPS proxies are still vulnerabilities, but a man-in-the-middle attack is less of an issue if the client and the server exchange session IDs and keep track of each other's session ID. As an API implementation, session IDs are largely a site or application concern and not the API's, but it is good to validate based on the session ID if one is available.
Sessions are unique to the application. Even the client uses refresh tokens or re-authorizations to keep the session alive. If the API kept track of sessions, that tracking would not be tied to OAuth revokes and re-authorizations, so relying on the session ID alone is not preferable. At the same time, using the session ID as an additional parameter to confirm along with each authorization helps tighten security. It is safe to assume the session stays the same until the next authorization or an explicit revoke. By tying the checks exclusively to the token, we keep this streamlined to the protocol.
OAuth can be improved upon, but it certainly enables redirections that make things easier for the user. In addition, tokens with expiry dates let clients reduce their chatter with the authorization server.
Many applications can now redirect to each other for same-user authorizations, so the user has to sign in far less often than before. If the user is signed in to a few sites, that existing signed-in status can be used to gain access to other sites. This is not merely a convenience: it lets the same user float between sites and lets applications integrate and share user profile information for a richer user experience.

Monday, September 16, 2013

Tests for the client validation changes include the following (a sketch of the first appears after the list):
1) specify a token grant to one client and access by another client
2) specify a token grant to one client, revoke by the same client, and reuse of the revoked token by the same client
3) specify a token grant to one client and revoke by a different client
4) specify a token grant to one client, revoke by a different client, and reuse by the original client
5) specify a low-privileged token grant to one client, specify a high-privileged token grant to the same client, and use of both tokens by the same client
6) specify a low-privileged token grant to one client and access of the low-privileged token by another client
7) specify a user-privileged token grant to one client, specify a token grant by the same user to another client, and have the clients exchange tokens
8) specify a user-privileged token grant to one client, specify a token grant by a different user to the same client, and have the client swap tokens
9) specify a user-privileged token grant to one client and have the client request tokens until a large number is reached
10) specify user-privileged token grants to multiple clients until a large number of clients is reached
11) specify a user-privileged token grant and revoke to the same client a large number of times
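
A sketch of the first test, assuming a hypothetical TestHarness with helpers that grant a token to a client and call a protected API on its behalf:

using System.Net;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class ClientValidationTests
{
    [TestMethod]
    public void TokenGrantedToOneClientIsRejectedForAnother()
    {
        // Grant a token to client A (TestHarness is a hypothetical helper).
        string token = TestHarness.GrantToken(clientId: "client-A");

        // Replay the token from a different client; expect 401 Unauthorized.
        HttpStatusCode status = TestHarness.CallProtectedApi("client-B", token);

        Assert.AreEqual(HttpStatusCode.Unauthorized, status);
    }
}
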

Delegated tokens or bearer tokens
The RFC makes special provisions for bearer tokens. Any party in possession of a bearer token can use it to access the resources it protects, so these tokens should be stored and transmitted with care.
For example, these tokens can be sent in the following ways:

1) When the access token is sent in the Authorization header of the HTTP request, a predefined syntax of the form "Bearer 1*SP b64token" is used, where b64token is base64-like.
As an aside, a b64token consists of one or more characters from ALPHA, DIGIT, and "-", ".", "_", "~", "+", "/", optionally followed by "=" padding characters.
2) The bearer token can be sent in the request body as the "access_token" parameter using "application/x-www-form-urlencoded" encoding.
3) The URI query parameter "access_token=" can also carry the token; however, it should then be sent over TLS, along with a "Cache-Control" header specifying the private option.
Since URIs are logged, this method is vulnerable, and the RFC discourages it: it documents current usage but goes so far as to say, with the reserved keyword, that it "SHOULD NOT" be used.
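
A sketch of the first two methods using HttpClient from .NET 4.5; the URL and token are placeholders:

using System.Collections.Generic;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

class BearerTransmission
{
    static async Task SendAsync(string token)
    {
        var client = new HttpClient();

        // 1) Authorization request header, the method the RFC recommends.
        client.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", token);
        await client.GetAsync("https://api.example.com/resource");

        // 2) Form-encoded body parameter named "access_token".
        var body = new FormUrlEncodedContent(new Dictionary<string, string>
        {
            { "access_token", token }
        });
        await client.PostAsync("https://api.example.com/resource", body);
    }
}
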

If a request is authenticated, it can be answered with error codes such as invalid_request, invalid_token, and insufficient_scope, as opposed to unauthenticated requests, which should not be given any error information.
Threats can be mitigated if:
1) tokens are tamper-proof
2) tokens are scoped
3) tokens are sent over TLS
4) TLS certificate chains are validated (see the sketch after this list)
5) tokens expire in a reasonable time
6) the token exchange is not vulnerable to an eavesdropper
7) the client verifies the identity of the resource server (known as securing the ends of the channel)
8) tokens are not stored in cookies or passed in page URLs
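
As a sketch of items 3 and 4, a process-wide callback can make the chain-validation policy explicit (the .NET default already rejects invalid chains; the point is never to override it with a blanket accept):

using System.Net;
using System.Net.Security;

static class TlsPolicy
{
    public static void Enforce()
    {
        // Accept a server certificate only when its chain validated cleanly.
        ServicePointManager.ServerCertificateValidationCallback =
            (sender, certificate, chain, errors) => errors == SslPolicyErrors.None;
    }
}
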
In this post, we talk about client registrations. OAuth specifies that clients be given a set of credentials that they can use to authenticate with the server. This is much like a username and password, except that the client's password is called the client secret (and the id:secret pair is Base64-encoded when sent in a basic authentication header). The client ID and secret are issued at the time of registration. The authorization server, which also has the WebUI, could therefore host the registration site and reduce the dependency on the proxy. Besides, this would integrate developers with the regular users of the site.
All the information the client provides is important. The access token that is issued has some parameters we talked about earlier, such as scope and state. One field I would like to bring up in this post, however, is the URI field. This is the redirection URI, sent with the state from the client. It is seldom used but is a great way to enforce additional security.
The list of things to move from the proxy to the provider includes the token mapping table, the validation in each API to ensure that the caller is known and that the token is the one issued to that caller, and the checks for a valid user in each of the authorization endpoints where user authorization is requested.
WebUI redirection tests are important, and for this a sample test site can be written that redirects to the OAuth WebUI for all users and handles the responses coming back from the WebUI. A test site makes the redirects visible in the browser.
The test site must exercise the WebUI for all kinds of user responses to the OAuth UI, in addition to testing the requests and responses from the WebUI.
WebUI testing also involves a test where the user sees more than one client that has been authorized. Updates to this list are part of WebUI testing, so the registration and removal of apps from the list have to be tested. This can be done by issuing authorization requests to the server with different clientId and clientSecret values. The list of clients comes up in HTML, so the HTML may have to be parsed to check for the names associated with the different registered clientIds.
Lastly, WebUI error message handling is equally important. If appropriate error messages are not provided, the user may not be able to take corrective steps. Moreover, the WebUI properties matter to the user in that they provide additional information or self-help. None of the links on the WebUI should be broken or misspelled. The WebUI should provide as much information about its authenticity as possible; this provides additional deterrence against forgery.

Sunday, September 15, 2013

This post discusses changing the APIs to remove all user/me resource qualifiers from the API config routes. If the OAuth implementation does not prevent a client from using the notion of a superuser who can access other user profiles via /user/id, that would mean the protocol is flexible.
Meanwhile, this post also talks about adding custom validation via ActionFilterAttributes (a sketch follows below).
For performance, should we be skipping token validation on all input parameters? This question is important because it lowers security in favor of performance, and the tradeoff may have implications beyond the customer.
That said, even on the critical code path, security has to be applied to both sets of endpoints: administration as well as token granting.
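
A hedged sketch of such a filter for ASP.NET Web API; TokenStore and its IsValid check are hypothetical stand-ins for the real token lookup:

using System.Net;
using System.Net.Http;
using System.Web.Http.Controllers;
using System.Web.Http.Filters;

public class RequireValidTokenAttribute : ActionFilterAttribute
{
    public override void OnActionExecuting(HttpActionContext actionContext)
    {
        var auth = actionContext.Request.Headers.Authorization;

        // Short-circuit the call before the action runs unless a known
        // bearer token is presented.
        if (auth == null || auth.Scheme != "Bearer" || !TokenStore.IsValid(auth.Parameter))
        {
            actionContext.Response = actionContext.Request.CreateResponse(
                HttpStatusCode.Unauthorized);
        }
    }
}

static class TokenStore
{
    public static bool IsValid(string token)
    {
        // Hypothetical: look the token up in the token mapping table.
        return !string.IsNullOrEmpty(token);
    }
}
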
The token granting mechanisms also need to make sure of the following (see the sketch after this list):
1) tokens are not rotated or reused
2) the token hash is generated using the current timestamp
3) the token hash is not based on the userId and clientId
If the tokens are instead encrypted, then they could embed the userId and clientId, since those can be recovered on decryption.
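
A sketch consistent with these rules: the token itself is random, and the stored hash mixes in the issue time (recorded alongside the hash so it can be recomputed on validation) rather than anything derived from the userId or clientId:

using System;
using System.Security.Cryptography;
using System.Text;

static class TokenIssuer
{
    public static string NewToken(out byte[] hash, out DateTime issuedAtUtc)
    {
        // Random, unguessable token; nothing about the caller is encoded in it.
        var bytes = new byte[32];
        using (var rng = new RNGCryptoServiceProvider())
            rng.GetBytes(bytes);
        string token = Convert.ToBase64String(bytes);

        // Hash over token + issue timestamp; both hash and timestamp are stored.
        issuedAtUtc = DateTime.UtcNow;
        using (var sha = SHA256.Create())
            hash = sha.ComputeHash(
                Encoding.UTF8.GetBytes(token + issuedAtUtc.Ticks));
        return token;
    }
}
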
The third post will talk about client registrations separately, since they are currently tied to the proxy and are not in immediate scope.


In this post, we will describe the implementation in a bit more detail.
First, we will describe the database schema for the OAuth implementation.
Then we will describe the logic in the controllers that validates and filters out bad requests and those explicitly prohibited.
Then we will describe the tests for the WebUI integration. Throughout the implementation, we will use the features available from the existing proxy and stage the changes needed to remove it.
First among these is the schema for the token table. This table requires a mapping of the userId and the clientId together with the issued token. The table exists entirely for our compliance with OAuth security caveats; users, clients, and the proxy are unaware of it and will not need to make any changes on their side due to any business rules associated with it, as long as they are OAuth compliant. Since the token is issued by the proxy, we will need to keep the token request and response information in this table. In addition, we will record the apiKey and clientId from the client along with the token, even though the proxy may be enforcing this already. (Note that the clientId is internal and the apiKey is public; they are different.) As described in the previous post, this helps us know who the token was originally intended for and whether any misuse occurs by a third client. We will keep the user mapping optional, or require a dummy user, since some clients may request credentials only to access non-privileged resources. It is interesting to note that the userId is entirely owned by our API and retail company, but the check that an access token issued on behalf of one user is used only with that user's resources is currently enforced by the proxy. That means the proxy is keeping track of the issued tokens with their user context, or is passing the user context back to the API with each incoming token.
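
A sketch of the row this table might hold; every column name here is an assumption, not the actual schema:

using System;

public class TokenGrant
{
    public int TokenId { get; set; }          // primary key
    public string TokenHash { get; set; }     // hash of the issued token
    public string ApiKey { get; set; }        // public key the client presents
    public int ClientId { get; set; }         // internal client identifier
    public int? UserId { get; set; }          // optional: client-credential grants
    public DateTime IssuedAtUtc { get; set; } // from the proxy's token response
    public DateTime ExpiresAtUtc { get; set; }
    public string RequestInfo { get; set; }   // token request, kept for audit
    public string ResponseInfo { get; set; }  // token response, kept for audit
}
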
But we have raised two questions already. First: should we treat the user context as nullable, or should we default to a dummy user? Second: does the proxy pass back the user context in all cases, or should the API implement another way to look up the user given a token for non-proxy callers?
Let us consider the first. The requirement to have a non-nullable field with a default value for client-credential-only calls certainly improves validation and governance. Moreover, we can then establish a foreign key to the user table so that we can look up user profile information directly off the token after token validation. This leads the way to removing the ugly "/user/me" resource qualifier from all the APIs that access user-privileged resources. The userId is for internal usage anyway, so the APIs look cleaner, and we can internally catalog and map the APIs to the type of access they require. This means having another table with all the API routes listed and classified by access, such as user-privileged or general public. That table is not just an API security table; it also provides a convenient placeholder for generating documentation and checking the correctness of listings elsewhere. Such additional dedicated resources for security could be considered overhead, but we will try to keep it minimal here. Without this table, we will assume that each API that accesses privileged resources internally applies the UserId retriever ActionFilterAttribute, and that the retriever applies and enforces the necessary security.
We will answer the second question this way. The proxy provides user information via the "X-Mashery-Oauth-User-Context" request header (a sketch of reading it follows). This lets us know that the token has been translated to a user context and that the proxy has looked it up in a token database. That token database does not serve our API security; otherwise we would not be discussing our schema in the first place. So let's first implement the schema, and then we will discuss steps 2 and 3.
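
A small sketch of reading that header in Web API; the header name comes from the proxy, while the shape of its value is left alone here:

using System.Collections.Generic;
using System.Linq;
using System.Net.Http;

static class ProxyContext
{
    public static string GetUserContext(HttpRequestMessage request)
    {
        // Returns the proxy-supplied user context, or null when absent.
        IEnumerable<string> values;
        return request.Headers.TryGetValues("X-Mashery-Oauth-User-Context", out values)
            ? values.FirstOrDefault()
            : null;
    }
}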