Monday, October 13, 2014

#codingexercise
Int memcmp  (char* src, char* dest, size_t n)
{
  If (!src || !dest) return 0;
    If (src == dest) return 1;
    While (n)
      {
         If (*src != *dest) return 0;
         n--;
       }

    Return 1;
}
hi-level implementation of Inverted Lists in Key-Value Stores:
(Python and MongoDB)
import re
def getwords(doc):
   splitter = re.compile('\\W*')
   #split the words by non-alpha characters
   words = [s.lower() for s in splitter.split(doc)]
   return words;

class Vocabulary:
def makeInvertedIndexSortedByKey(doc):
  i = 0;
  for w in getwords(doc):
   Position = i++
   Frequency = (doc.id, freq(w)++, offset(Position))
   PostingsList = (Field, w, freq(doc.id), offset(Frequency))
   FieldDef = (Field,  indexed=true, stored=true, offset(PostingsList))




#codingexercise
Median of two sorted arrays

int GetMedian(List<int> list1, List<int> list2)
{
  if (list1 == null || list1.Length <= 0) throw new Exception();
  if (list2== null || list2.Length <= 0) throw new Exception();
  int first = 0; // index for first
  int second = 0; // index for second;
  var merged = new List<int>();

 while (first <list1.Length && second < list2.length)
     if (list1[first] < list2[second])
     {
       merged.Add(list1[first]);
       first++;
      }
     else
       {
        merged.Add(list2[second])
        second++;
       }

  while (first<list1.Length){
        merged.Add(list1[first]);
        first++;
   }

   while (second < list2.Length) {
        merged.Add(list2[second]);
        second++;
   }
   
 int mid = (first + second) / 2;
 if  (mid %2 == 0)
  {
     if (mid -1 > 0)  return (merged[mid-1] + merged[mid]) / 2;
  }
 return merged[mid];
}

Sunday, October 12, 2014

#codingexercise
TwoSum is a method that finds the indexes of two numbers in ascending order in an array that add upto a target. Numbers are positive integers.

List<int> TwoSum(List<int> numbers, int target)
{
if (numbers.Length <= 0 || target == 0) return null;
var ret = new List<int>();
for (int i = 0; i < numbers.Length, i++)
{
  int index = numbers.IndexOf(i+1, target-numbers[i]);
  if (index != -1)
 {
   ret.Add(i);
   ret.Add(index);
 }
}
return ret;
}

DBExplorer investigates the use of keyword search in relational databases.  It mentions several interesting problems that are encountered in keyword search as opposed to structured queries. For example, SQL applications require knowledge of schema and the keyword search doesn’t. Secondly, given a set of keywords, a match is found by joining several tables on the fly. Thirdly, Indexes need to be leveraged for efficient keyword search. Fourthly, common data structures used for document collections such as inverted lists are replaced by symbol table. For each keyword, a list of rows is kept in the symbol table that contains the keywords. Alternatively, for each keyword, a list of columns can be kept that contains the keywords.  Search is performed across multiple tables using a graph where the nodes are the tables and the edges are the foreign-key relationships. When looking for a co-occurrence, the tables are joined on the fly by exploiting the schema as well as the content.

Saturday, October 11, 2014


#codingexercise
Text justification
This problem requires the text in sentences to be broken and laid out in a way where each line has L characters except for maybe the last line. Words that break on a line move on to the next and padding introduced evenly between the remaining words.
Let X and Y denote sentences with count of characters less than or equal to L and more than L respectively
The test cases are
X
Y
XX
XY
YX
YY

List<string> Justify(string text, int L)
{
 if (string.IsNullOrEmpty(text) || L < 1) return null;

var ret = new List<string>();
string candidate = string.Empty;

var words = text.split();
 for (int I = 0; I < words.Count; I++)
 {
   // add word  
   if (candidate.Length + words[I].Length  +  1 <= L)
   {
     candidate += words[I] + " ";
     continue;
    }

   // add padding
   if (candidate.Length > 0)
   {
   int padLen = L - candidate.Length;
   if (padLen > 0)
   {
   var w = candidate.Split();
   candidate = string.Empty;
   for (int k = 0; k < w.Count; k++)
   {
       candidate += w[k] + " ";
       for (int l = 0; l <  padLen / w.Count && candidate.Length < L; l++)
          candidate += " ";    
   }

   if (w.Count > 0 && padLen > 0)
   for (int l = 0; l < padLen % w.Count && candidate.Length < L; l++)
         candidate += " ";
   ret.Add(candidate);
   candidate = string.Empty;
   }

    if (candidate.Length + words[I].Length  +  1 <= L)
   {
     candidate += words[I] + " ";
    }
   else
   {
     candidate += words[I].substring(0,L-1) + "-";
     ret.Add(candidate);
     candidate += words[I].substring(L) + " ";
   }

 }
ret.Add(candidate);
return ret;
}