Cluster computing

Saturday, August 31, 2013

The APIs we mentioned in the previous post for OAuth have resource qualifiers based on clients, users and tokens, since there can be several of each. We have not talked about claims. Claims provide access to different scopes such as account, payment and rewards. We leave this as an internal granularity that we can expose later without any revisions to the initial set. For the first cut of the APIs we want to keep it simple and exhaust the mapping between the clients, the users and their tokens.
The tokens are themselves a hash and have an expiry time of around an hour, so we keep track of the tokens issued for the different clients and user authorizations. The table for the tokens could have a mapping to the user table and the client table based on the user and client ids respectively. The table is populated for every authorization grant. When a token is issued, its issue time is recorded so that on every access we can check whether the token has expired. Since the tokens are hashes or strings, it's useful to index them, which helps lookup time. An alternative would be to look them up by date of issue, so that we retrieve only the tokens issued in the last hour and check for the presence of the token presented. The lookup of the token needs to be fast since tokens will most likely be used many more times than they are issued.
Tokens will have different privileges depending on whether the userId and the clientId fields are populated. There could also be a field for the token type, say "bearer". Bearer tokens matter because OAuth treats them differently; RFC 6750 describes them in detail. A cryptographic hash is sufficient for the token since we don't want to tie it to any information other than the mapping we hold internally, and we do want to make it hard for attackers to generate or break the hash. This is straightforward because the .NET libraries make it easy to generate such a hash.
A token, once generated, should not be removed from the table unless the user requests it. Revoking on user request keeps the table consistent internally and externally, because the APIs return the list of tokens that the clients can keep track of. Revocation deserves its own attention because the user or client can choose to revoke one or more tokens at once, which makes this operation different from the insert or the lookup. Tokens should not be generated more than once, so retries have to be error-proof.
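A minimal sketch of the token table described above, written in Python with sqlite3 rather than the .NET libraries the post mentions. The schema, the function names and the request_id column (used here to make retried grants return the same token) are assumptions for illustration, not part of any actual implementation. Revocation is modelled as a flag; deleting the row would fit the text equally well.

```python
import hashlib
import secrets
import sqlite3
import time

TOKEN_TTL_SECONDS = 3600  # "around an hour", per the text


def open_store():
    """Create an in-memory token table; the schema is illustrative only."""
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE tokens (
               token      TEXT PRIMARY KEY,  -- primary key gives an index for lookups
               client_id  TEXT NOT NULL,
               user_id    TEXT,              -- NULL for client-only grants
               request_id TEXT UNIQUE,       -- hypothetical: makes retries idempotent
               issued_at  REAL NOT NULL,
               revoked    INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return db


def issue_token(db, client_id, user_id, request_id, now=None):
    """Issue a token once per request_id; a retry returns the same token."""
    row = db.execute(
        "SELECT token FROM tokens WHERE request_id = ?", (request_id,)
    ).fetchone()
    if row:
        return row[0]
    # Hash of random bytes: the token itself carries no information.
    token = hashlib.sha256(secrets.token_bytes(32)).hexdigest()
    issued = time.time() if now is None else now
    db.execute(
        "INSERT INTO tokens (token, client_id, user_id, request_id, issued_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (token, client_id, user_id, request_id, issued),
    )
    return token


def lookup_token(db, token, now=None):
    """Return (client_id, user_id) if the token is live, else None."""
    now = time.time() if now is None else now
    row = db.execute(
        "SELECT client_id, user_id, issued_at, revoked FROM tokens WHERE token = ?",
        (token,),
    ).fetchone()
    if row is None or row[3] or now - row[2] > TOKEN_TTL_SECONDS:
        return None
    return row[0], row[1]


def revoke_tokens(db, tokens):
    """Revoke one or more tokens in a single call."""
    db.executemany(
        "UPDATE tokens SET revoked = 1 WHERE token = ?", [(t,) for t in tokens]
    )
```

Passing now explicitly keeps the expiry check testable without waiting an hour.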

Friday, August 30, 2013

In the previous post, we began by describing how the proxy and the API interact today and moved on to how we want them to interact. Here we revisit the current interactions. The proxy may work across API providers. However, we will keep our discussion to the API implementation. In the API implementation, we have the following methods provided by the OAuth provider:
1) Get the description of a client
2) Create an access token when the API calls the provider
3) Create an authorization code when the API calls the provider
4) Get the applications for a user
5) Revoke an access token
When tokens or codes are created, a user context is passed in that is used to track the client authorizations and the responses. This is how the tokens and codes are associated with a client and a user. Whether the provider is implemented by a proxy or by the retail company, the methods mentioned above are required to establish the mapping between the user, the client and their tokens/codes. The methods also take additional parameters that shape the token, such as grantType, scope, response type, inclusion of a refresh token in the response, a URI and a state. These parameters are what tailor the tokens to the request.
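As a rough illustration, the parameters listed above could be carried in a single request object. The class and field names below are hypothetical; only the parameter list itself comes from the text, and the privilege() helper simply reflects whether a user context was passed in.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GrantRequest:
    """Hypothetical container for the parameters of a token/code request."""
    client_id: str
    grant_type: str                       # e.g. "client_credentials"
    scope: str = ""
    response_type: Optional[str] = None   # e.g. "code" or "token"
    include_refresh_token: bool = False
    redirect_uri: Optional[str] = None
    state: Optional[str] = None
    user_id: Optional[str] = None         # the "user context", if any

    def privilege(self) -> str:
        """Grants without a user context are client-only."""
        return "user" if self.user_id else "client-only"
```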
However, let's take a closer look at these APIs to see what we could add or improve. The first thing is testability. Do we have all the information exposed so that the data can be tested? Not quite: we could do with a list of tokens associated with a user. If we have a list of tokens issued for a client and a list of tokens issued for a user, we should be able to see which tokens are issued for clients only, and verify that those client-only tokens are for low-privilege uses as opposed to those issued with a user context. This test tells us which tokens have more privilege than others. By separating out the tokens this way and knowing the full set of tokens issued, we should be able to ensure that all access is accounted for.
Next we could do with a list of clients authorized by user and a list of users that a client has been authorized for. This could help us determine whether the authorization updates both lists. This is relevant because a token for user resources should be issued only when both are present.
Lastly, the list of authorized clients for a user and the list of users for a client should be updated with the corresponding revokes. When all the tokens for a user with a given client are revoked, it could be considered equivalent to revoking the client authorization, so that the client can ask the user to re-authorize via OAuth for the next session. That is up to the discretion of the client and could be facilitated for the user with a checkbox for keeping the client authorized the first time the user signs in with the OAuth WebUI.
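The list APIs proposed above can be sketched with an in-memory registry. This is an assumption-laden illustration, not the provider's API: the class and method names are invented, and both the per-user client list and the per-client user list are derived from the token set, so revoking the last token for a user/client pair also removes the authorization.

```python
class AuthorizationRegistry:
    """In-memory stand-in for the user/client/token mapping (hypothetical)."""

    def __init__(self):
        self.tokens = {}  # token -> (client_id, user_id or None)

    def grant(self, token, client_id, user_id=None):
        self.tokens[token] = (client_id, user_id)

    def revoke(self, token):
        self.tokens.pop(token, None)

    def tokens_for_client(self, client_id):
        return {t for t, (c, _) in self.tokens.items() if c == client_id}

    def tokens_for_user(self, user_id):
        return {t for t, (_, u) in self.tokens.items() if u == user_id}

    def client_only_tokens(self):
        """Tokens issued without a user context (the low-privilege set)."""
        return {t for t, (_, u) in self.tokens.items() if u is None}

    def clients_for_user(self, user_id):
        return {c for c, u in self.tokens.values() if u == user_id}

    def users_for_client(self, client_id):
        return {u for c, u in self.tokens.values()
                if c == client_id and u is not None}
```

A test can then grant a few tokens, revoke them one at a time, and check that both lists change together.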

Wednesday, August 28, 2013

Discussion continued

In the previous post we discussed interactions for OAuth. In this post we focus on the provider-side interactions. The proxy and the API both participate in token and code grants. The proxy maintains the token database and the API maintains the user database. When the client calls in without a user context, the grants have fewer privileges. When the client calls in with a user context, the grants have more privileges. The API implements the OAuth endpoints that the WebUI or the clients call, and internally there are handshakes between the proxy and the API before a grant is issued. As you can see, there are several interactions between the proxy, the client, the user and the API, including cross communication. In the client credentials case, the client first talks to the token endpoint, then the API talks to the token provider, then the API returns the token. If the user is involved, the user first talks to the WebUI, then the WebUI talks to the API, which then talks to the token provider, and then the token is passed to the client.
When the user signs on at the WebUI through any client, the API gets a call to an internal or external endpoint for OAuth. At that point, it is a mere convenience for the API to rely on an external token provider to return a token. It is better for the API to own the mapping between userId and clientId, since such provisioning is easy and gives control over which clients can be treated differently. For example, the company's mobile application might be a client that should have higher availability than other clients. The company's mobile application could also have access to internal endpoints. Internal endpoints differ from external ones in many ways, but I would like to draw attention to one item. External endpoints could be hosted by any proxy or in a third-party cloud hosted by yet another network, so there could be significant delays in the responsiveness of the APIs. This is not just a maintenance issue; it's actually a significant usability issue. User attention can be assumed to span a handful of seconds, and even on mobile devices frequent round trips may need to be avoided.
Network delays, heterogeneous networks and cloud services provide a significant challenge to the responsiveness of the APIs to various clients.
When we talk about the company's mobile application, we could consider diverting the traffic to networks within direct control of the company or a direct lease with a cloud provider.
This argument is made in favor of differentiating clients, but in general the point here is that the network responsiveness of the APIs may be even more important to API providers than resource management such as CPU or memory. CDNs could also help, but they serve a very different purpose from what APIs are used for.
Moreover statistics and call history of APIs are better grouped with clients rather than users.

Monday, August 26, 2013

Membership providers

Membership providers have their own validation routines. For example, MembershipProvider.ValidateUser checks if the user specified is valid. The membership provider changes in OAuth, and hence the corresponding validations should be performed by the OAuth provider or, in this case, the website. In OAuth, the sessions are maintained by the other website that requests OAuth. This causes the user to reauthenticate after a specified time. For the most part, refresh tokens are sufficient to keep accessing protected resources.
UserId lookup is associated with the WebUI login process. All callers to the APIs could be required to use the alias "me" for the user id. This way they don't need to know the user id. Internally, however, for APIs that are secured through OAuth, the access token is associated with a user and a client. The OAuth provider should be able to find out what the userId is based on the token and either grant access to protected resources or flag the token as inappropriate.
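A small sketch of the "me" alias resolution described above. The function name and the dictionary used as a token table are hypothetical stand-ins for whatever the OAuth provider actually stores; the point is only that the user id comes from the token, never from the caller.

```python
def resolve_user_id(requested_user_id, access_token, token_table):
    """Resolve the "me" alias (or an explicit id) against the token's user.

    token_table maps access_token -> (client_id, user_id or None).
    Raises PermissionError when the token is inappropriate for the request.
    """
    entry = token_table.get(access_token)
    if entry is None:
        raise PermissionError("unknown or expired token")
    _client_id, user_id = entry
    if user_id is None:
        raise PermissionError("token has no user context")
    if requested_user_id == "me":
        return user_id
    if requested_user_id != user_id:
        raise PermissionError("token does not belong to the requested user")
    return user_id
```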
So let's consider the access between each quadrant of players:

Client (Ravi.com) | User (Vivian)
----------------------------------------------
Provider (Proxy)  | API (Retail.com)

I made the User directly access the API through the retail company's primary website or over the company's mobile application. Users are able to access their resources.
All calls crossing the dotted line must be made with registration so that the implementation knows who the caller is. The bottom parts constitute the implementation components; they are things we assume we can control and are therefore private in nature. The User and the client can be considered public. The public region is largely where security attacks originate, typically launched by a malevolent user or client. The interaction between a user and a client is one of authorization. Without both of them, there is no access for the client to the protected resources in the API. For the non-protected resources, the client has to get tokens from the OAuth provider. In this case the client has to register with the OAuth provider. The retailer has no knowledge of who the client is and does not need to keep track of the client. This can be strict. Virtually all calls to the API are for some user or a guest, and the API need not even bother about the clients. If the strictness were to be enforced, the retailer would look for some kind of userId from the proxy for all client calls. Therefore, if the proxy does not present a token that maps to a userId, there is no access to protected resources through the proxy.
Notice that Retail.com could become a client by itself. This is a scenario we can talk about in the improvements. This does not change the fact that the API is the true representation of the retail company.
When all websites, including the retailer's, access data through the API as registered clients, the API implementation is guaranteed that all accesses are unified: each one arrives with a token that represents both the client and the user. The retailer's website does not have to be a special client. It just needs to keep its client id and secret confidential, just like any other web server. In practice, a .com website for a retailer carries a considerable legacy, so registering itself as a client adds no customer win and moreover poses risks during the change, which the business may not allow. Furthermore, the .com website and the API provide mechanisms to compare the data between the two.

We will also talk about proxy and/or API sharing user and client mapping.
Finally, I want to list what I see as candidate improvements I can consider for OAuth
1) Token and management endpoints to be redesigned such that one provides token or code by all grant methods and another provides management functionalities
2) the management portion could be exclusively for the WebUI presented to the user for management of the client registrations. Grants and revokes of clients or new user registrations and redirects are what the WebUI provides
3) API implementation should not rely on OAuth to treat it merely as a user based access that could have otherwise been established with a single sign on web server. It was to enable richer experiences customized for different users and clients. OAuth was also not merely about using tokens and codes in place of passwords and sessions but for different clients to provide seamless sessions.
4) when the clients are fully supported and are treated the same as users, then we could even do away with the notion of users in the implementation.
5) resource management becomes easier with client only management and representing users as groups.

We can then think of a stack such as the following:
User
Client
Proxy
API
and eliminate the proxy with better functionalities within the API

OAuth testing discussion continued

In this post we look at performance considerations for OAuth testing. Much of this testing targets the OAuth provider. The authorization endpoints, both token and management, rely on the token database maintained by the OAuth provider.

Saturday, August 24, 2013

OAuth testing continued

9) landing page for user authorization
a) users must be able to see client description
b) user acceptance must result in return url with token/code
c) user denial must result in return url with error parameters in query string
d) response type, user_id, client_id, redirect_uri, scope and state parameters should be validated.
e) tokens retrieved should exist in the provider database.
10) user and client mapping
a) client access provisioned on a user by user basis, otherwise only client credential provisioning possible
b) check against cross user profile access via common clients
c) check against admin access clients
d) check correctness of user list maintained by client
e) check correctness of clients authorized by user
11) resource management policies enforcement
a) provision minimal scope authorization and check for external access
b) check against all scope parameters or access range.
c) specify full access range and bearer token to see if card balances can be read.
d) set the state and callbacks to see if scope changes
e) check which apis or methods are to be protected with access tokens and if they are all enforced.
f) check mashery or OAuth providers api for token to user or client mapping
12) security validations
a) check for phishing attacks
b) check the http headers for leak of securables
c) check that TLS is required for all APIs
d) check that server authentication by way of certificates is provisioned.
e) check that client ids and secrets are not leaked
f) check that cross-site request forgery attacks can be thwarted by callbacks and state.
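Items 9b, 9c and 12f above can be exercised with a pair of helpers along these lines. This is only a sketch for the authorization code case: the function names and the client.example callback are made up, while the code, error and state query parameters follow RFC 6749 (a denial is reported as error=access_denied).

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_redirect(redirect_uri, state, code=None, error=None):
    """Build the return URL the provider sends the user back to."""
    params = {"state": state}
    if error is not None:
        params["error"] = error   # user denied, or the request failed
    else:
        params["code"] = code     # user accepted
    separator = "&" if urlsplit(redirect_uri).query else "?"
    return redirect_uri + separator + urlencode(params)


def check_redirect(url, expected_state):
    """Classify a return URL the way a client-side test would."""
    query = parse_qs(urlsplit(url).query)
    if query.get("state") != [expected_state]:
        return "state mismatch"   # possible cross-site request forgery
    if "error" in query:
        return "denied: " + query["error"][0]
    if "code" in query:
        return "granted"
    return "malformed"
```

The state comparison is what lets the client reject a callback it did not initiate.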

Friday, August 23, 2013

OAuth testing continued

Let's close on the OAuth test matrix here:

1) Implicit Grant

a. missing user id

b. missing client id

c. any user id but Valid client id

d. Valid user id and client id

e. Valid user id but invalid client id

f. Error codes – 400, 403, 404, invalid_request, invalid_token and insufficient_scope

g. Use invalid uri to not get 302 (new)

h. Performance (new)

i. XML and JSON responses

2) Authorization Code Grant

a. Similar to Implicit grant but responseType=code so 1a to 1i will be repeated. (new)

b. Code will be translated to token.

c. Code expiry will not be tested but code revoke will be tested to validate token

3) Client credentials grant

a. Targets token endpoint to get token using client id, client secret, scope (new)

b. Checks for error message for invalid grant (new)

4) Revoke access

a. Revoke token will be tested but not revoke client

b. Revoke an already revoked token

c. Revoke an already revoked client (new)

d. Revoke all tokens for a client ( Get all tokens and validate each) (new)
5) Claim information
a) Get claims based on default scope (null)
b) Get claims based on specific scope (not null)
6) Client Information
a) Get name of client application and check access tokens
b) Get client without name, description or image to see the default rendered to the user
c) Get all access tokens and add or remove tokens to see if the client information is updated
d) Check if revoke all removes all access tokens.
7) Get allowed clients for a user
a) Check if all the clients are listed for the user.
b) Add or remove a client to see the corresponding update to the list
c) Authorize a client for the user but delete the client to check for orphaned entries
8) Check response types
a) check the code
b) check the token
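
Rows 1a to 1e of the matrix lend themselves to table-driven checks. The sketch below uses a stub in place of the real authorization endpoint. The stub, the known ids and the mapping of outcomes to 302, 400 and 404 are all assumptions, and 1c is read here as "an unknown user id with a valid client id". Against a real provider the stub would be replaced by an HTTP call and the expected codes adjusted to what the provider documents.

```python
KNOWN_USERS = {"vivian"}        # hypothetical test data
KNOWN_CLIENTS = {"ravi.com"}


def authorize_implicit(user_id, client_id):
    """Stub endpoint: return the HTTP status an implicit grant would yield."""
    if not user_id or not client_id:
        return 400              # missing parameter
    if user_id not in KNOWN_USERS or client_id not in KNOWN_CLIENTS:
        return 404              # unknown user or client
    return 302                  # redirect back to the client with a token


CASES = [
    ("1a missing user id", None, "ravi.com", 400),
    ("1b missing client id", "vivian", None, 400),
    ("1c unknown user id, valid client id", "nobody", "ravi.com", 404),
    ("1d valid user id and client id", "vivian", "ravi.com", 302),
    ("1e valid user id, invalid client id", "vivian", "bogus", 404),
]


def run_matrix(cases=CASES):
    """Return the labels of failing cases; an empty list means all passed."""
    return [label for label, user, client, expected in cases
            if authorize_implicit(user, client) != expected]
```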