Saturday, September 7, 2013

One of the tests for an OAuth provider could be to use the access tokens with an API that takes user information as a parameter. If none of the APIs take a user parameter and rely only on the access token, this test does not apply. However, using the user parameter for the current user, whose access token has been retrieved, should work, and using the same access token with another user's parameter should not.
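A minimal sketch of such a check is below; the endpoint, user ids, and token value are hypothetical and stand in for whatever the provider actually exposes.

    // Hedged sketch: the endpoint, user ids and token value are hypothetical.
    using System;
    using System.Net.Http;
    using System.Net.Http.Headers;

    class TokenUserParameterTest
    {
        static void Main()
        {
            var client = new HttpClient { BaseAddress = new Uri("https://api.example.com/") };
            // Access token issued to user "alice".
            client.DefaultRequestHeaders.Authorization =
                new AuthenticationHeaderValue("Bearer", "alice_access_token");

            // The same user as the token owner: expected to succeed.
            var own = client.GetAsync("v1/profile?userId=alice").Result;
            Console.WriteLine((int)own.StatusCode);   // expect 200

            // Another user with the same token: expected to be denied.
            var other = client.GetAsync("v1/profile?userId=bob").Result;
            Console.WriteLine((int)other.StatusCode); // expect 401 or 403
        }
    }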
The same user could sign in from multiple clients. Since tokens, once issued, are usually valid for a duration of an hour, the provider does not limit the number of calls made in that duration. For the same reason, the number of clients used by the user should not be limited. Since the APIs in question are stateless, the number of such calls doesn't matter. That said, a single client may end up hogging the provider. Throttling in both cases could be helpful, but it should not be done on a per-user or per-client basis; it could be done on an API-by-API basis. This is one of the functions of the proxy provider. Authorization denials are severe measures and should generally be a last resort.
Performance testing for an OAuth provider is also important. If the existing load tests cover authorizations and revokes from a user on a client repeated, say, a thousand times at one-minute and five-minute intervals, it should work. The token expiry time is not specified by the user, so a test that involves a revoke prior to a re-authorization should work as an equivalent. The load test could have a variation for the different grant types used for authorization. The functional tests or the build verification tests cover these, and they might have other tests that could be thrown into the mix. However, a single authorization and revoke of a token should be targeted in a separate run if possible. This should involve the authorization grant type that is most common. The test run that includes other kinds of tests could not only include hand-picked cases from the existing test suites but also include a capture of peak load traffic from external clients.
Tests could also target authorization denials from the API provider and the proxy independently. This should be visible from the responses to the authorization requests; the Server property carries the source of the response. This is useful to know whether the token was invalidated because of an invalid user or an invalid client. Testing the status code alone is not enough; the error message, if any, should also be tested. In the case of OAuth, providing an error message that mentions an invalid user or an invalid client could be helpful. A common error message is a "developer inactive" message. This one is interesting because there seems to be an activation step involved.
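As a rough illustration, a denial test could assert the status code, the error body, and the Server header together; the token endpoint path and the error fields shown here are assumptions, not the provider's actual contract.

    // Hedged sketch: the token endpoint path and the error JSON fields are assumptions.
    using System;
    using System.Collections.Generic;
    using System.Net.Http;

    class DenialResponseTest
    {
        static void Main()
        {
            var client = new HttpClient();
            var form = new FormUrlEncodedContent(new Dictionary<string, string>
            {
                { "grant_type", "password" },
                { "username", "unknown_user" },
                { "password", "wrong" },
                { "client_id", "test_client" }
            });
            var response = client.PostAsync("https://provider.example.com/oauth/token", form).Result;

            // The status code alone is not enough; check the error body and the Server header too.
            Console.WriteLine((int)response.StatusCode);                    // e.g. 400 or 401
            Console.WriteLine(response.Content.ReadAsStringAsync().Result); // e.g. {"error":"invalid_grant"}
            Console.WriteLine(response.Headers.Server);                     // which component answered: proxy or provider
        }
    }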
Tests could cover spoofing identity, tampering with data, repudiation, information disclosure, denial of service and elevation of privilege.
One of the weaknesses of this mechanism is that the APIs have to comply in a certain way. For example, none of the APIs should expose the userId parameter. If an API exposes a user parameter, it should be enforced with an alias for client usage, even if those aliases are translated internally. Separating the user parameter of the API from the security mechanism that validates the user is important because security is generally considered a declarative aspect and not part of the code of the API.
If the two were tied together, where the user information for the API is looked up via security token translation inside the API implementation instead of outside it as a parameter, each API requiring that may need to do the same. Instead, it is probably more convenient to maintain a list of APIs secured by privileged access tokens. For example, if an endpoint is marked internal, that should be enforced. It could be enforced by making sure that the callers are internal or that it is packaged in an assembly that is not exposed, etc. Tests should verify that all APIs are marked for use only with an access token, even if the tokens are not user-privileged.
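One way to keep the requirement declarative rather than coded into each API is an authorization filter attribute. The sketch below uses ASP.NET Web API, and the attribute name, the UserPrivileged flag, and the token check are all assumptions for illustration, not a particular provider's implementation.

    // Hedged sketch for ASP.NET Web API: the attribute and its token check are assumptions.
    using System.Net;
    using System.Net.Http;
    using System.Web.Http.Controllers;
    using System.Web.Http.Filters;

    public class TokenRequiredAttribute : AuthorizationFilterAttribute
    {
        // Whether the API additionally requires a user-privileged (not client-only) token.
        public bool UserPrivileged { get; set; }

        public override void OnAuthorization(HttpActionContext actionContext)
        {
            var auth = actionContext.Request.Headers.Authorization;
            if (auth == null || auth.Scheme != "Bearer" || !IsValid(auth.Parameter))
            {
                actionContext.Response =
                    actionContext.Request.CreateResponse(HttpStatusCode.Unauthorized);
            }
        }

        private bool IsValid(string token)
        {
            // Placeholder: validate the token with the provider or proxy here.
            return !string.IsNullOrEmpty(token);
        }
    }

    // Usage: the user for the call is resolved from the token, never taken as an API parameter.
    // [TokenRequired(UserPrivileged = true)]
    // public class WorkItemsController : ApiController { ... }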

Friday, September 6, 2013

pivoting pushed to SQL

In the previous post, we talked about pivoting logic to be implemented in the .Net layer. If we had sufficient access, and if this were a small-scale deployment where privileges to the TFS database can be granted, then the same pivoting operation described in the earlier post can be pushed down to SQL.
This is how it would look:

CREATE TABLE [dbo].[Source](
        [ID] [int] IDENTITY (1, 1) NOT NULL,
        [Owner] [nvarchar](50) NOT NULL,
        [WorkItemID] [int] NOT NULL,
        [Priority] [int] NOT NULL) ON [PRIMARY]

INSERT INTO [dbo].[Source] ([Owner], [WorkItemID], [Priority]) VALUES ('abc', 1, 1);
INSERT INTO [dbo].[Source] ([Owner], [WorkItemID], [Priority]) VALUES ('def', 2, 2);
INSERT INTO [dbo].[Source] ([Owner], [WorkItemID], [Priority]) VALUES ('ghi', 3, 2);
INSERT INTO [dbo].[Source] ([Owner], [WorkItemID], [Priority]) VALUES ('jkl', 4, 3);

SELECT [Owner],
                [1],
                [2],
                [3]
FROM
        (SELECT [Owner], [WorkItemID], [Priority] FROM [dbo].[Source])
        AS Src
PIVOT
(
        COUNT([WorkItemID])
        FOR [Priority] IN ([1], [2], [3])
) AS PivotTable;
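
With the four sample rows above, the pivoted result would look roughly like this, a count of work items at each priority for every owner:

    Owner   1   2   3
    abc     1   0   0
    def     0   1   0
    ghi     0   1   0
    jkl     0   0   1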



TFS queries over the web

TFS provides web access. Queries can be specified in the following manner.
                     http://Server:Port/q.aspx?pname=Project&wiql=Work Item Query Language.
The Work Item Query Language does not need to be specified by hand. The TFS query can be edited in, say, Visual Studio 2012 and then saved as a file. This file is XML-like and contains the Work Item Query Language. The whitespace in the URI may have to be substituted with %20 and the @aliases with the project names. By the way, if you right-click on the query name tab in Visual Studio and click Copy Full Path, you get the http URI for the query. An example looks like this:
                     http://server:8080/tfs/web/qr.aspx?puri=vstfs%3a%2f%2f%2fClassification%2fTeamProject%2f260d948e-6a1e-4841-97a9-805b79c39cf8&path=Project_name%2fShared+Queries%2fAllActiveBugs
However, if you were to try any of these URIs with, say, Fiddler, you get an Unauthorized or resource NotFound error.
Note that the server resources are named q.aspx and qr.aspx. The error messages I have encountered when using
http://server:8080/tfs/web/q.aspx are:
TF400907: The URL specified appears to be from an earlier version of Web Access.  The format was not recognized : http://server:8080/tfs/web/q.aspx
There doesn't seem to be any documentation on the web for this error.
If you give the URI as
http://teamsystem:8080/tfs/web/qr.aspx?pname=ProjectName&wiql=SELECT%20[System.Id],%20[System.CreatedDate],%20[System.Title],%20[Microsoft.VSTS.Common.Priority],%20[Microsoft.VSTS.Common.Severity],%20[System.AssignedTo],%20[System.State],%20[System.AreaPath],%20[System.History]%20FROM%20WorkItems%20WHERE%20[System.TeamProject]%20=%20ProjectName%20and%20[System.WorkItemType]%20=%20'Bug'%20and%20[System.State]%20=%20'Active'%20and%20not%20[System.Title]%20contains%20'Build%20Failure'%20ORDER%20BY%20[Microsoft.VSTS.Common.Priority]
TF400367: The request could not be performed due to a host type mismatch. Please check any connection information and verify the information is correct. The request was executed against a Deployment, Application.
The Team Foundation version control tool also does not seem to have an option to execute a query.
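One thing to rule out when reproducing the Unauthorized case is missing credentials on the request. A minimal sketch of issuing the same request from code is below; the server, port, and project names are placeholders, and whether the q.aspx/qr.aspx endpoints accept it depends on the TFS version.

    // Hedged sketch: server, port and project names are placeholders; an on-premises
    // TFS server typically expects Windows integrated authentication.
    using System;
    using System.Net;

    class TfsQueryFetch
    {
        static void Main()
        {
            var client = new WebClient { UseDefaultCredentials = true };
            // Or, with explicit credentials:
            // client.Credentials = new NetworkCredential("user", "password", "DOMAIN");
            string uri = "http://server:8080/tfs/web/qr.aspx?pname=ProjectName&wiql=SELECT%20[System.Id]%20FROM%20WorkItems";
            string html = client.DownloadString(uri);
            Console.WriteLine(html.Length);
        }
    }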






Thursday, September 5, 2013

I will describe another way to achieve the task in the previous post. This assumes that you want to package the logic to meet the objective with as few dependencies as possible. One way to do this is to rely on the source data as it refreshes each day and handle all the steps, from transforming to a pivot table to sending out the daily mail. This could be written as an executable in C# and deployed on a server to be invoked as a scheduled job. Since the logic is self-contained, it provides a single point of maintenance.
The steps involved are as follows: 
1) Read or fill a data set directly with a WebClient request to the TFS server or the database (a fetch sketch appears after step 3's code below). If there is access to the database, a stored procedure to pivot the data can be authored. Alternatively, there could be an Excel layer, available via the Excel APIs, that generates the pivot table and saves it on a worksheet which can be opened. The pivot table has to be marked for refresh on each opening. If the pivot table alone could be published from the workbook to a web page, the web page could be read to take the contents as HTML. This way we could skip step 2.
2) If the data is not already in the pivoted format, it can be pivoted in the .Net layer. Here is the code to transform the data to a pivot:

            // Requires System.Data and System.Collections.Generic.
            // Source data: work item ID, owner and priority.
            var input = new DataTable("Item-Priority");
            input.Columns.Add("ID", typeof(int));
            input.Columns.Add("Owner", typeof(string));
            input.Columns.Add("Priority", typeof(int));
            input.Rows.Add(new object[] { 1111, "abc", 1 });
            input.Rows.Add(new object[] { 2222, "def", 2 });
            input.Rows.Add(new object[] { 3333, "def", 2 });
            input.Rows.Add(new object[] { 4444, "jkl", 3 });

            // Pivot: one row per owner, one column per priority, cells hold counts.
            var pivot = new DataTable();
            var ownerRowIndex = new Dictionary<string, int>();
            pivot.Columns.Add("Owner", typeof(string));
            int index = 0;

            // First pass: add a column per distinct priority and a row per distinct owner.
            foreach (DataRow row in input.Rows)
            {
                var owner = row["Owner"].ToString();
                var priority = row["Priority"].ToString();
                if (!pivot.Columns.Contains(priority))
                    pivot.Columns.Add(priority, typeof(int));
                if (!ownerRowIndex.ContainsKey(owner))
                {
                    ownerRowIndex.Add(owner, index);
                    pivot.Rows.Add(new object[] { owner });
                    index++;
                }
            }

            // Second pass: count work items per owner and priority.
            foreach (DataRow row in input.Rows)
            {
                var owner = row["Owner"].ToString();
                var priority = row["Priority"].ToString();
                var cell = pivot.Rows[ownerRowIndex[owner]][priority];
                int old = 0;
                if (cell != DBNull.Value) int.TryParse(cell.ToString(), out old);
                pivot.Rows[ownerRowIndex[owner]][priority] = old + 1;
            }
            pivot.AcceptChanges();
            if (pivot.HasErrors) throw new Exception("pivot table has errors");

3) Finally, we can prepare the HTML with the insertions for the pivot table and send it out by mail as follows:
var client = new SmtpClient("your_smtp_server");
var from = new MailAddress("your_email_address");
var to = new MailAddress("your_recipient_address");
var message = new MailMessage(from, to);
message.Body = "your_prepared_html";
message.Subject = "your_title";
message.IsBodyHtml = true;
client.Send(message);
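
As referenced in step 1, the fetch itself can be small. The sketch below uses a placeholder query URL; reading from the warehouse database instead would use a SqlConnection and a data adapter.

// Hedged sketch for step 1: the query URL is a placeholder; requires System.Net.
var web = new WebClient { UseDefaultCredentials = true };
string sourceHtml = web.DownloadString("http://server:8080/tfs/web/qr.aspx?pname=ProjectName&wiql=your_query");
// sourceHtml, or a DataSet filled from a SQL stored procedure, then feeds the pivoting in step 2.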

Excel and SharePoint can be used to create reports. Excel can publish items, sheets, and workbooks to SharePoint, and they can be accessed over the Internet.

Wednesday, September 4, 2013

Today I want to make a post aside from the ones I have been writing lately. Much of this was learned in an effort to automate a report, and I want to write it down here. First, an introduction to the objective, then a choice of steps and a discussion of their tradeoffs, followed by a possibly expedient solution.
The objective in this case was to generate daily automated mails of a pivot table. The table comes from a TFS query that lists work items assigned to owners, with item priorities of, say, 1, 2, and 3. The generated pivot table has the count of work items at each priority listed for every owner. This table should flow daily to a distribution list of subscribers so that they can get it in their inbox.
Although there could be several ways to do this task, I will list the things that need to be considered.
The data could change every day.
The pivot table format could change.
TFS queries may not support both the listing of the data and the pivoting.
This is a custom report. By that I mean we cannot use the canned reports from a TFS connection because they don't pivot the data.
The items in a pivot table should be in HTML format for mail out.
The transformation of the data from tables to XSLT or any other reporting format, or the invocation of tools on dedicated local servers, is subject to maintenance and adds to costs.
Ease of use, such as with Excel to pivot the table or to add charts, will help with changes in the format of the output over time.
The server that sends out the mails should preferably do this daily. It's alright to do it more than once and to a designated person, so that one of the mails can be selected to send out to a wider audience. This allows some amount of manual control if desired. This is optional.
The solutions could include, in no particular order:
1) write SQL Server Reporting Services RDL reports so that they can be viewed from the TFS server
2) write a custom tool to generate the output and the mail using libraries and plugins; the executable can be scheduled to be invoked with a task scheduler
3) save TFS queries and make them accessible over the Internet, at which point the source table is already in HTML and available via web requests
Out of the three, the tradeoffs are in terms of flexibility to changes, availability of the data and report online, and the cost of ownership involved. Users who are familiar with these will spot them right away; for those who are new, it could be explained as: the more moving pieces there are, the more it costs to own them and to transform the data flow between them.
Lastly, the solution that could be considered expedient is to have the Excel application publish just the pivot table item, and not the sheets or the workbook. This item could appear directly on, say, a SharePoint website, which provides HTTP access to the pivot tables. This can be requested with a web request and mailed out with a script that is invoked by the task scheduler or a reporting server.

SHA-1 hashing

SHA-1 hashes are 160 bits, or 20 bytes, long and are usually rendered as 40 hexadecimal digits. The message digest design is similar to Rivest's design for MD4 and MD5. The state is five 32-bit words, unsigned and big-endian. The message is preprocessed first: append the bit 1 to the message, then append up to 512 bits of zero padding so that the length is congruent to 448 modulo 512, and then append the length of the original message as a 64-bit unsigned number. Processing then works on successive 512-bit chunks. For each chunk, break the chunk into sixteen 32-bit big-endian words and extend them into eighty 32-bit words this way: for each of the words 16 to 79, XOR the words that appear 3, 8, 14, and 16 positions earlier and left-rotate the result by 1. Initialize the working values for this chunk from the five hash words h0 through h4. In the main loop, for i from 0 to 79, the iterations fall into four equal ranges, and a predefined bitwise function (a combination of AND, OR, NOT, and XOR) and constant, specified differently for each range, are applied to the working values. Each iteration then shuffles the working values, re-assigning the first from a combination of the others and left-rotating the third by 30. At the end of the loop, the chunk's working values are added back into the hash values so far. The final hash value is the concatenation of the five hash words.
The .Net library packages this up with the ComputeHash method on the hash algorithm classes. For example, a keyed HMAC-SHA256 hash can be computed like this:
            // Requires System.Text and System.Security.Cryptography;
            // content is the string to sign and secretKeyBytes is the shared key.
            var bytesToSign = Encoding.UTF8.GetBytes(content);
            HMAC hmacSha256 = new HMACSHA256(secretKeyBytes);
            byte[] hashBytes = hmacSha256.ComputeHash(bytesToSign);
            return Convert.ToBase64String(hashBytes);
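
For the plain SHA-1 digest described above, as opposed to the keyed HMAC in the snippet, the same ComputeHash pattern applies; this is a minimal sketch assuming content is the input string:

            // Plain SHA-1 digest: 20 bytes, usually shown as 40 hex digits.
            using (var sha1 = System.Security.Cryptography.SHA1.Create())
            {
                byte[] digest = sha1.ComputeHash(Encoding.UTF8.GetBytes(content));
                string hex = BitConverter.ToString(digest).Replace("-", "").ToLowerInvariant();
            }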