Wednesday, October 22, 2014

#codingexercise
int MaxR(List<int> items, int start, int end)
{
    if (items == null || items.Count == 0 || start < 0 || end >= items.Count || start > end) throw new ArgumentException();
    if (start == end) return items[start];
    int mid = start + (end - start) / 2; // avoids integer overflow for large indices
    int left = MaxR(items, start, mid);
    int right = MaxR(items, mid + 1, end);
    return left > right ? left : right;
}

In continuation of our posts on the applications of machine learning from the book Programming Collective Intelligence, we now look at an interesting application of optimization to real flight searches. The Kayak API is a popular vertical search engine for travel. After registering for developer access to the Kayak package, which gives us a developer key, we request a new Kayak session from the API server. With this session, we can run different flight searches. Note that the need for a session, the XML-based responses, and the extra-long URLs with query strings comprising several search parameters are all reminiscent of the APIs of those times. With just the query parameters for session id, destination, and depart date, the search call returns a search id with no results, which is then used for polling. With the search id and the session id, we can write the polling function as one that requests results until there are no more: a tag called morepending keeps the value true until the search is done. The prices, departures, and arrivals lists retrieved from the query are zipped into a list of tuples. The flights come back ordered primarily by price and secondarily by time.
When we create the schedule for the arrivals and departures of the Glass family, whose members visit from different places and leave after the holiday, this flight search is simply performed over and over for the outbound and return flights of each individual. We can then optimize the flights by using the actual flight data in the cost function. Since Kayak searches can take a while, we consider this search only for the first two passengers. The cost function now takes as its domain a range multiplied by the count of the outbound and return flights of these two passengers, and we run the same optimization as discussed earlier over this domain. The actual flight data gives us more interesting ways to expand our optimization.
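The polling function described above can be sketched roughly as follows, along the lines of the book's flightsearchresults; the exact URL and parameter names here follow the book's examples from that era and are assumptions, not a current API:

import time
import urllib2
import xml.dom.minidom

def flightsearchresults(sid, searchid):
    # poll until the morepending tag is no longer 'true'
    while 1:
        time.sleep(2)
        url = 'http://www.kayak.com/s/basic/flight?'
        url += 'searchid=%s&apimode=1&_sid_=%s&version=1' % (searchid, sid)
        doc = xml.dom.minidom.parseString(urllib2.urlopen(url).read())
        morepending = doc.getElementsByTagName('morepending')[0].firstChild
        if morepending == None or morepending.data == 'false': break
    # zip the departures, arrivals and prices lists into a tuple list
    prices = doc.getElementsByTagName('price')
    departures = doc.getElementsByTagName('depart')
    arrivals = doc.getElementsByTagName('arrive')
    return zip([d.firstChild.data for d in departures],
               [a.firstChild.data for a in arrivals],
               [float(p.firstChild.data[1:].replace(',', '')) for p in prices])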
So far we have discussed optimization based on costs. However, there is also such a thing as preferences, which may or may not be directly tied to costs. Preferences are user-attributed: the goal is a ranking that makes people happy, or at least as little annoyed as possible. The information is taken from the users and combined to produce the optimal result. For example, everyone may want a window seat on the same flight, but that may not be possible. If we make the cost function return very high values for invalid outcomes, we can resolve such conflicts, as in the sketch below.
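As a minimal sketch of that idea (all the names here are made up for illustration), the cost function adds a huge penalty for any invalid assignment and a small one for mere annoyance:

# sol[i] is the seat assigned to person i, prefs[i] is the seat person i
# asked for, and capacity[s] is how many people seat choice s can hold
def preferencecost(sol, prefs, capacity):
    cost = 0
    taken = {}
    for person, seat in enumerate(sol):
        taken[seat] = taken.get(seat, 0) + 1
        if taken[seat] > capacity[seat]:
            cost += 10000    # very high value: an invalid outcome
        elif seat != prefs[person]:
            cost += 1        # small annoyance: not the preferred seat
    return cost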

Tuesday, October 21, 2014

#codingexercise
Convert numbers to Roman
public string GetRoman(int t)
{
    var numbers = new List<int> { 1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1 };
    var numerals = new List<string> { "M", "CM", "D", "CD", "C", "XC", "L", "XL", "X", "IX", "V", "IV", "I" };
    string ret = string.Empty;
    int i = 0;
    while (t > 0)
    {
        int digit = t / numbers[i];
        for (int j = 0; j < digit; j++)
        {
            ret += numerals[i];
        }
        t -= digit * numbers[i];
        i = i + 1;
    }
    return ret;
}

#codingexercise
public int GetInteger(string numeral, Dictionary<string, int> dict)
{
    int i = 0;
    int number = 0;
    while (i < numeral.Length)
    {
        // a smaller value before a larger one means a subtractive pair such as "IV"
        if (i + 1 < numeral.Length && dict[numeral[i + 1].ToString()] > dict[numeral[i].ToString()])
        {
            number += dict[numeral.Substring(i, 2)];
            i = i + 2;
        }
        else
        {
            number += dict[numeral[i].ToString()];
            i = i + 1;
        }
    }
    return number;
}

Monday, October 20, 2014

#codingexercise
int RomanToInt(string s)
{
    if (String.IsNullOrEmpty(s)) return 0;
    // optionally validate s against the regex ^[IVXLCDM]+$
    var dict = new Dictionary<char, int>();
    dict['I'] = 1;
    dict['V'] = 5;
    dict['X'] = 10;
    dict['L'] = 50;
    dict['C'] = 100;
    dict['D'] = 500;
    dict['M'] = 1000;
    int ret = 0;

    var st = new Stack<char>();
    st.Push(s[0]);

    for (int i = 1; i < s.Length; i++)
    {
        // a larger value following a smaller one means the smaller was subtractive
        if (st.Count > 0 && dict[s[i]] > dict[st.Peek()])
        {
            char temp = st.Pop();
            ret = ret - dict[temp];
            st.Push(s[i]);
        }
        else
        {
            st.Push(s[i]);
        }
    }

    while (st.Count > 0)
    {
        ret += dict[st.Pop()];
    }
    return ret;
}
// there are other ways of doing this but I'm just listing the earliest one.

int RomanToInt(string s, int index, Dictionary<char, int> dict)
{
    // validate s beforehand as non-null, non-empty, and Roman numerals only
    if (index == s.Length) return 0;
    // if the current numeral is larger than the previous one, the previous one
    // was subtractive: undo its earlier addition (hence -2x) and add the current value
    if (index - 1 >= 0 && dict[s[index]] > dict[s[index - 1]])
        return (-2 * dict[s[index - 1]]) + dict[s[index]] + RomanToInt(s, index + 1, dict);
    else
        return dict[s[index]] + RomanToInt(s, index + 1, dict);
}

Sunday, October 19, 2014

Non-negative matrix factorization: This is a technique for extracting the important features of a dataset. It involves matrix multiplication, where a matrix with r columns is multiplied by another that has r rows, and each cell of the resulting matrix is the scalar product of the corresponding row of the first matrix with the corresponding column of the second. It also involves transposition, where the matrix is rewritten so that its rows become columns and its columns become rows.
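For instance, with numpy (which the factorization code below also relies on), a small illustration:

from numpy import matrix
a = matrix([[1, 2, 3],
            [4, 5, 6]])      # 2 rows, 3 columns
b = matrix([[1, 0],
            [0, 1],
            [1, 1]])         # 3 rows, 2 columns
print a * b                  # 2x2 result; each cell is a row of a dotted with a column of b
print a.T                    # transpose: rows become columns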
Given a set of articles containing words, we can write the usual article matrix, where the articles are the rows and the words appearing in them are the columns. The words are the properties of the articles, and we keep words that are common but not too common:
wordvec = []
for w, c in allw.items():
    if c > 3 and c < len(articlew) * 0.6:
        wordvec.append(w)
This technique factorizes the article matrix into two matrices: a features matrix and a weights matrix. The features matrix has a row for each feature and a column for each word. The weights matrix tells us how much each feature applies to each article: each row is an article and each column is a feature.
The articles matrix may be large. The purpose of this technique is to reduce the matrix to a smaller set of common features. This smaller set of features could be combined with different weights to get back the article matrix. Although it is unlikely to get back the original, this technique attempts to get back as closely as possible.
It's called non-negative because it returns features and weights with no negative values. Since we count the occurrences of words, a negative value, which would amount to counting exclusions of words, has no meaning here. This approach doesn't look for the best possible factorization, but its results are often easier to interpret.
The algorithm uses a utility function to determine how close the reconstruction is: it sums the squares of the differences between two equal-sized matrices. We then gradually update the weights and features matrices to reduce this cost.
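The factorize code below calls this utility difcost; following the book, it can be written as:

from numpy import shape

def difcost(a, b):
    dif = 0
    # loop over every cell of the two equal-sized matrices
    for i in range(shape(a)[0]):
        for j in range(shape(a)[1]):
            # sum the squares of the differences
            dif += pow(a[i, j] - b[i, j], 2)
    return dif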
Here we could use a number of optimization techniques, such as annealing, but we discuss multiplicative update rules.
Given the original matrix, which this technique calls the data matrix, we make four new update matrices:
hn = the transposed weights matrix multiplied by the data matrix.
hd = the transposed weights matrix multiplied by the weights matrix multiplied by the features matrix.
wn = the data matrix multiplied by the transposed features matrix.
wd = the weights matrix multiplied by the features matrix multiplied by the transposed features matrix.
The weights and features matrix are initialized with random values.
To update the features and weights matrices, all these matrices are converted to arrays.
Every value in the features matrix is multiplied by the corresponding value in hn and divided by the corresponding value in hd. Likewise, every value in the weights matrix is multiplied by the corresponding value in wn and divided by the value in wd.
This is repeated until the reconstruction cost reaches zero or the maximum number of iterations is exceeded.
from numpy import *
import random

def factorize(v,pc=10,iter=50):
    ic=shape(v)[0]
    fc=shape(v)[1]
    # Initialize the weight and feature matrices with random values
    w=matrix([[random.random( ) for j in range(pc)] for i in range(ic)])
    h=matrix([[random.random( ) for i in range(fc)] for i in range(pc)])
    # Perform operation a maximum of iter times
    for i in range(iter):
        wh=w*h
        # Calculate the current difference
        cost=difcost(v,wh)
        if i%10==0: print cost
        # Terminate if the matrix has been fully factorized
        if cost==0: break
        # Update feature matrix
        hn=(transpose(w)*v)
        hd=(transpose(w)*w*h)
        h=matrix(array(h)*array(hn)/array(hd))
        # Update weights matrix
        wn=(v*transpose(h))
        wd=(w*h*transpose(h))
        w=matrix(array(w)*array(wn)/array(wd))
    return w,h
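As a quick usage check (with a small made-up matrix), the product w*h should come close to the original:

v = matrix([[22., 28., 12.],
            [49., 64., 31.],
            [13., 11., 17.]])
w, h = factorize(v, pc=2, iter=100)
print w * h    # approximates v; with only pc=2 features the reconstruction is not exact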

Recall that an eigenvector and its corresponding eigenvalue give another way to factor a matrix into vectors, and since we know that det(A - lambda*I) = 0, we can solve for the eigenvalues and eigenvectors. This may be useful for finding a single feature.
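numpy can solve this directly; a small illustration:

from numpy import array, linalg
A = array([[2., 1.],
           [1., 2.]])
evals, evecs = linalg.eig(A)    # solves det(A - lambda*I) = 0
print evals                     # 3 and 1 (order may vary)
print evecs                     # columns are the corresponding eigenvectors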
Another way to do the optimization could be to use updates of the matrix via the Laplace transform of a matrix-valued function, as mentioned here (http://see.stanford.edu/materials/lsoeldsee263/10-expm.pdf). This is useful when we don't want to find the eigenvalues and can use the resolvents instead. Moreover, the state transitions are linear.
For example, the harmonic oscillator function can be written as
x' = [[0, 1], [-1, 0]] x
and the state transitions are
x(t) = [[cos t, sin t], [-sin t, cos t]] x(0)

Similarly, the matrix exponential solution is x(t) = e^(tA) x(0),
where the matrix exponential is e^M = I + M + M^2/2! + ...

For t = 1 and A = [[0, 1], [0, 0]], the power series has M^2 = M^3 = ... = 0,
and e^A = I + A = [[1, 1], [0, 1]]
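scipy's expm (assuming scipy is available) confirms the hand computation:

from numpy import array, eye
from scipy.linalg import expm
A = array([[0., 1.],
           [0., 0.]])
# A is nilpotent (A*A = 0), so the series e^A = I + A + A^2/2! + ... stops after two terms
print expm(A)        # [[1. 1.] [0. 1.]]
print eye(2) + A     # the same result by hand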

We can also explore the weighted Laplacian matrix for the partitioning.
• Larry Hardesty, "Short algorithm, long-range consequences", MIT News, 1 March 2013, is an interesting read on the subject.
#codingexercise
Get the last local minimum in an array of positive integers
int GetLastLocalMin(int[] v)
{
    if (v == null || v.Length == 0) return -1;
    // scan from the end for the first descent; its lower end is the last local minimum
    for (int u = v.Length - 2; u >= 0; u--)
        if (v[u] > v[u + 1]) return v[u + 1];
    return v[0];
}