Wednesday, March 20, 2013

Test harness and drivers

Frequently, we develop modules for automted test execution but don't integrate all of them. We do this mostly because we are short on time or resource. A test framework that's useful for manual tests can also be useful for automated tests. Just the road-map of the integration needs to be worked out. This is where a technical specification and planning tools help. Because it lists all the dependencies, the layers, the object model and mappings, the translations and operations, the stages of development, the costs for development and test and the trade-offs and the deliverables from sprint planning. With a knowledge of the means and methods to integrate test tools and framework captured in written format, resources can pick them up or execute independent of each other. The benefits of integration is mutual to the individual software products being integrated and adds overall value to existing products. I want to be able to do this.

Tuesday, March 19, 2013

automated cars

Why are automated cars better than manual ?
There are several reasons why automated cars are better than manual.
First, the driver has less things to remember and practice for the ride. There are fewer chances for driver manoeuvre errors and hence more safe to drive. Driver weariness and machine faults are a significant contributor to accidents. If the transmission is automated, it reduces a factor in the cause for accidents and improves safety when driving.
Second, the bulk of city driving is in stop and go traffic and these are better done with automatic transmission. The stop and go traffic requires the gears to be shifted by reaching neutral before going to higher gears and this adds to constant wear and tear.  Besides, the driving is smoother when the gear shifts are not noticeable.
Third, with the gear shifting is smoother in automated cars as opposed to manual. There are less chances of improper use or damage. Even in the case of emergency responses, the driver may occassionally wear out the transmission since it takes more time to respond. Experimental studies have shown that the first 1.8 seconds of an impending collision detection and response are crucial in avoiding a fatal accident.
Fourth, the car has less associated costs in terms of insurance and maintenance.  Insurance companies love the norm and automated transmission has become popular in sales. Car prices are affected when the transmission is automatic and so is the resale value of the car.
Fifth, the control over the torque that comes with different gear shift is not necessary in most city driving  in the United States. This control is great for say racing or driving on rugged terrains but usually can be avoided for city driving.

Monday, March 18, 2013

.net source for standard query operators

The standard query operators implementation is available for download from .Net. For a collection of query clauses, a grouping can be made. These are treated as subtrees in an expression tree. Each item of the result is run through the decision tree to be included in the result.  The choice to include an item depends on the success of a predicate and the logical operation with the current evaluation. 

expression tree

An expression tree is an efficient data representation of a query  lambda expression. In terms of LINQ to SQL, the expression tree gets converted to SQL to ensure the filtering and ordering are performed on the server. For lambda expressions, IL code or an expression tree can be generated. If an operator accepts an expression for a method delegate, IL code is emitted.If an operator accepts an expression of a method delegate, the expression tree is returned. LINQ to SQL objects generate IL code because it takes in a generic delegate, whereas LINQ to SQL implementation takes an expression tree that gets converted to SQL to be executed by SQL Server.

Sunday, March 17, 2013

Longest common subsequence

When there is a recursive algorithm based on the subproblems but the total number of subproblems is not too great, it is possible to cache all solutions in a table. In the case of a longest common subsequence detection, running through all the subsequences would take exponential time. Instead we can define a recursion based on the following :
Let c[i, j] be the length of the longest common subsequence of the prefix  of X[1..i] and Y[1..j]. Recursion:
c[i, j] = { 0 if i =  0 or j = 0
          = { c[i-1, j-1] + 1 if i,j > 0 and xi = yj
          = { max( c[i,j-1], c[i-1,j] ) otherwise
This way we utilize the solutions we computed. We compute the table c[i,j] bottom-up and also store the value b[i,j] that captures whether c[i,j-1], c[i-1,j] or c[i-1,j-1] + 1is optimum.  We therefore find the longest common subsequence by walking backwards on the path from the last cell of b[i,j] to the first.

Saturday, March 16, 2013

decision trees


From the work of some sorting algorithms, we can derive a decision tree.  A decision tree can be visualized as nodes for each comparison and left and right sub-trees for yes or no choices. Each execution of an algorithm gives a downward path in the tree. Worst-case time is lower-bounded by longest downward path (height) in this binary tree. A tree of height h has at most 2h leaves. If a tree has L leaves, then its height is at least log L.

Greedy algorithms have no general recipe. In general, greedy algorithms assume that there is an objective function f(x1, x2, …, xn) to optimize ( say maximize) that depends on some choices. Second, there is a way to estimate the contribution of each choice to the final value but without taking into account how our choices will constrain later choices. Finally, the algorithm is greedy if it still makes the choice with the best contribution. Greedy algorithm is frequently not the best choice but sometimes it is.

There are cases of algorithm problems when nothing is better than trying out all possibilities. This is especially helpful when the number of possibilities is bounded. For example we can use exhaustive search to find elements of a maximum independent set in a graph. A maximum independent set is one in which no two vertices have a connecting edge. Independent set is NP-complete.

Courtesy study material  uploaded by Peter Gacs, Boston university.

Friday, March 15, 2013

How do you design a text mining system ?


We begin by encapsulating the core concept of a text mining system which is a concordance. Next we support a management layer over the concordance. We streamline the insertion, deletions and updates. We encapsulate every parameter we let the system depend on. We build supportability and stack trace capture for diagnosis. We write unit-tests side by side with the existing implementation. Sounds exciting.