Thursday, April 18, 2013

Interview question answers continued

Q: design a file system object model
A: a file system exposes files and folders in a tree form. Both these are file system objects that are manageable in terms of resolving the location, securing their access and iterating over their contents. Internally, their is a layer of file manager, file, block and disk based access.

class File : private FileSystemObject
{
public:
 // data operations
byte[] ReadAllBytes();
string[] ReadAllLines();
fstream Read(FileMode m);
void Encrypt();
void Decrypt();

// metadata operations
void SetAccessControl(FileSecurity s)
void SetCreationTime(DateTime t)
}

class Directory : private FileSystemObject
{
public:
 // access operations
void GetFiles();
void GetDirectories();
 // metadata operations
void SetAccessControl(DirectorySecurity s);
void SetCreationTime(DateTime t)
}
 

Wednesday, April 17, 2013

interview question answers continued

Q: How do you determine if a binary tree is binary search tree ?
A: bool IsBST( node* root, int* prev)
{
bool ret = true;
if (root == null) return ret;

ret = IsBST(root->left, prev);
if (ret==false) return ret;

if (*prev >= root->val) return false;
*prev = root->val;

ret = IsBST(root->right, prev);
if (ret==false) return ret;

return ret;
}

Tuesday, April 16, 2013

Chapter 24 DB systems textbook

To build a decision tree:
Input : node n, parition D, split selection method S
Output: decision tree for D rooted at node n

Top-Down decision tree induction schema:
BuildTree(Node n, data partition D, split selection method S)
   Apply S to D to find the splitting criterion
   if ( a good splitting criterion is found)
       Create two children nodes n1 and n2 of n
       Partition D into D1 and D2
       BuildTree(n1, D1, S)
       BuildTree(n2, D2, S)
   endif

Clustering:
Input : node n, partition D, split selection method S
Output: decision tree for D rooted at node n

Top-Down decision Tree induction Schema:
BuildTree(Node n, data partition D, split selection method S)
(1a) Make a scan over D and construct the AttributeValue Classlabels (AVC) group of n in-memory.
(1b) Apply S to the AVC group to find the splitting criterion

A partitioning clustering algorithm partitions the data into k groups such that some criterion that evaluates the clustering quality is optimized. k is specified by the user. A hierarchical clustering algorithm generates a sequence of partitions of the records. Starting with a partition in which each cluster consists of one single record, the algorithm merges two partitions in each step until one single partition remains in the end.

Monday, April 15, 2013

keyword clustering

Wartena 08 describes topic detection by clustering methods. The paper suggests that the topic for the whole text can be found based on a probabilistic latent semantic analysis. Keywords are clustered based on statistical similarity measure. The conditional probability distributions that indicate the latent topic are the clusters that this approach finds. Clustering is based on a hierarchical top down method which requires that the center is recomputed in each step. Two elements are selected that have the largest distance and are used as seeds for the clustering. Next all other items are assigned to the center closest to one of the two seeds. After all the items are assigned, the centers are recomputed. The new centers serve as new seeds and the process is repeated until two centers are converged. If the clusters are larger than a threshold, the whole procedure is repeated on that cluster and thus we find a binary tree of clusters. The choice of distance functions used in this paper for two terms t and s are the cosine similarity of the document distribution, the cosine similarity of vector tf, idf values of keywords, the Jensen Shannon distributions between the document distributions and the Jensen Shannon distributions between the term distributions pt and ps.
When clustering using any distance function, the cluster centers could be chosen among the data points and then centers recomputed in each step.

Saturday, April 13, 2013

interview question answers continued

Q: How do you update the game of a black and white coin game such as Othello ?
A: In the update mode, find the longest sequence of the coin opposite of the color to be played. If this sequence has options to be bounded, place the coins at one of the ends. Choice of ends can be based on the side that has odd number of open spaces. The set of sequence can be based on the state of the game. and can generally be identified by expanding out from the center.

 Q: Given the root of a binary search tree along with a node inside it, find the successor node for the given node.
A: The structure of a binary search tree lets us determine the successor of a node without ever comparing keys. If the right subtree of node x is non-empty, then the successor of x is the leftmost node of the right subtree.If the right subtree of node x is empty and x has a successor y, then y is the lowest ancestor of x whose left child is also an ancestor of x.

Q: Given a lot of intervals [ai, bi], you have to find  a point with the intersection of most number of intervals.
A: We can maintain a count of inclusions in intervals for each point as we traverse the numberline from the minimum ai to the maximum bi.

Friday, April 12, 2013

interview question answers continued

Q: Give the four basics of object oriented Programming.
A: These are some of the tenets of object oriented programming
1. Encapsulation - define a concept by keeping the data and its operations together
2. Inheritance - defines an "is-a" relationship between derived and base class.
3. Composition - defines a "has-a" relationship between a container and contained class.
4. Polymorphism - defines the same method for different contexts

Q: Given two integers m, n, find the first integer r that is divisible by both m and n.
A: Find the factors of m and n. From the factors choose two or more factors such that the product is bigger than both m and n and less than or equal to m * n.

Q: List and compare the various sorting algorithms
A: 1. Bubble sort : make multiple pass and fix the order of elements. It has an average O(n^2) complexity/
2. Selection sort : finds the minimum and swaps with the first and repeat for consecutives. Another O(n^2) algorithm
3. Insertion sort : played like the cards in a deck, take elements one by one and insert them in position in the sorted list. Average O(n^2)
4. Heap sort : Use the array as a heap and use the heap algorithm to sort. O(nlogn) complexity
5. Merge sort : bottom up merge between pairs O(nlogn) complexity
6. Quick sort : divide and conquer based on partitioning on a pivot value. O(nlogn) complexity

Wednesday, April 10, 2013

interview question answers continued

Q: Given an integer array find if there exists three integers that add up to zero.
A:  This is another integer array problem where we are interested in a particular sum. Since we are looking for three integers the naiive solution looks O(n^3) but this is not the case. First we could just sort the elements, then for each element we encounter we know the complement that makes the sum zero.For each complement, we know the fractions has to be earlier than the number we have just encountered. Thus we can proceed better in this case if the elements are sorted.

Q: Given a set of login and logout pairs  of users, find the number of users online at any given point of time.
A: From the start of the list of logins and logouts, populate a dictionary of users and their active logins by incrementing a counter for every login of that user and decrement for every logout of that user until the last entry. This gives the number of active logins for each user.

Q: Another integer array is given where there are repeating zeros. Compact the array by erasing the zeros.
A: The problem hints on using the same storage to get the resultant array. So we can lean on multiple passes on the array if we want. Hence, we could make one pass to find the index of all occurances of zero. Then for each occurance, shift all the elements following it. We could shift more than one at the same time if the occurances are consecutive. Another improvement could be to shift only the elements upto the next occurance and keep the zeros consecutive at the next occurance.

Q: Given a set of non-overlapping integer range and an input integer, now design a data structure to store them and have operations of insert, delete and search
A: A double dimensional array seems sufficient for the operations where the integer ranges could traverse one or more of the jagged array elements. For each element we keep track of the end of a range and the start of another.

Q:How can you tell if a given string is of the form aZb where a is the reverse of b ?
A: If a and b can be empty strings almost all strings can satisfy that form otherwise compare start and end index elements and update the indexes on a match.