Cluster computing

To continue the coding exercise, there are several other ways to find a number in a matrix like the one we have described. The approach we described used a binary chop to search for the number in each row of the matrix. The binary chop on a single row of N entries takes O(log N) time, and repeating it for each of N rows brings the total complexity to O(N log N). This approach makes no use of the sorted order of the numbers in the columns, and its cost grows linearly with the number of rows.
We could instead binary chop each column, because the elements are sorted there too, and iterate across the columns. That would, however, give us no real benefit over the approach we already have. It would be pure chance if one worked better than the other, unless we knew which of the two, rows or columns, there are fewer of: the outer loop is the linear part, so it should run over the smaller count.
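The choice between the two directions can be sketched as follows. This is only an illustration, assuming a rectangular `int[,]` whose rows and columns are both sorted in ascending order; the class name `ShorterDimensionSearch` is hypothetical and not part of the original solution.

```csharp
using System;

static class ShorterDimensionSearch
{
    // Loops over the dimension with fewer lines and binary chops along the
    // longer one, for roughly min(R, C) * log(max(R, C)) comparisons.
    public static bool Exists(int[,] matrix, int number)
    {
        int rows = matrix.GetLength(0);
        int cols = matrix.GetLength(1);

        // Fewer rows than columns: iterate over rows and chop within each row.
        // Otherwise iterate over columns and chop within each column.
        bool chopAlongRows = rows <= cols;
        int outer = chopAlongRows ? rows : cols;
        int inner = chopAlongRows ? cols : rows;

        for (int i = 0; i < outer; i++)
        {
            int lo = 0, hi = inner; // half-open range [lo, hi)
            while (lo < hi)
            {
                int mid = lo + (hi - lo) / 2;
                int value = chopAlongRows ? matrix[i, mid] : matrix[mid, i];
                if (value == number) return true;
                if (value < number) lo = mid + 1;
                else hi = mid;
            }
        }
        return false;
    }
}
```

For a matrix with 3 rows and 1000 columns this runs three binary chops of about ten steps each, where the column-wise version would run a thousand very short ones.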
Another technique, which relies on divide and conquer rather than on scanning in sorted order, is to partition the matrix. One example of this would be to split the matrix into two or more sub-matrices and then recursively apply the solution to each of the sub-matrices. We can eliminate a sub-matrix outright if the number we are looking for is greater than its bottom-right entry, since this is the largest entry per the description of the matrix.
If we choose, for example, a point in the center of the matrix, say mid = (row / 2, col / 2), we get four sub-matrices of different sizes, and we can exclude some of those regions quickly. By excluding sub-matrices we attempt to converge on the number in the bigger matrix logarithmically. This is much better when it works, but it involves parallel and recursive computation and consequently more resources. The approach aims for O(log N). However, note the termination condition for a sub-matrix. Knowing that the bottom-right number is greater than the search number and the top-left number is smaller does not mean that the number can be found by checking along the diagonal. On the contrary, take the example
53 78
87 92
and look for the number 62: it lies between 53 and 92, yet going along the diagonal is not sufficient. We have to exhaust the rows one after another anyway. While it is easy to exclude a matrix by looking at its top-left and bottom-right numbers, the same cannot be said when the number is bounded by them. In that case there is the additional cost of going linearly from row to row, which may take O(N) time anyway.
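A minimal sketch of the partitioning idea, assuming the same sorted `int[,]` layout. The class name `QuadrantSearch` and its helper `Search` are hypothetical, and the four recursive calls run one after another here rather than in parallel. A quadrant is dropped only when the number falls outside its top-left and bottom-right corners; every quadrant that still bounds the number has to be searched, which is why the worst case is no longer logarithmic.

```csharp
using System;

static class QuadrantSearch
{
    public static bool Exists(int[,] matrix, int number)
    {
        return Search(matrix, number, 0, 0,
                      matrix.GetLength(0) - 1, matrix.GetLength(1) - 1);
    }

    // Searches the sub-matrix with inclusive corners (top, left) and (bottom, right).
    private static bool Search(int[,] m, int number,
                               int top, int left, int bottom, int right)
    {
        // Empty region: nothing to find.
        if (top > bottom || left > right) return false;

        // The top-left entry is the smallest and the bottom-right the largest,
        // so a number outside that range cannot be in this region.
        if (number < m[top, left] || number > m[bottom, right]) return false;

        // A single cell that passed the range check must equal the number.
        if (top == bottom && left == right) return true;

        // Split at the center and try each quadrant that is not excluded.
        int midRow = top + (bottom - top) / 2;
        int midCol = left + (right - left) / 2;
        return Search(m, number, top, left, midRow, midCol)
            || Search(m, number, top, midCol + 1, midRow, right)
            || Search(m, number, midRow + 1, left, bottom, midCol)
            || Search(m, number, midRow + 1, midCol + 1, bottom, right);
    }
}
```

On the 2 x 2 example with 62, the whole matrix passes the corner test, but each of the four single-cell quadrants is then excluded, so the search reports that 62 is absent.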
This leads us to a potentially faster solution: enumerate the elements of the matrix in sorted order until we reach the search number. At each step we advance along a row or down a column, whichever gives us the next higher number, and compare it with the previous position. For example,
if we took a matrix
23 34 45 46 68 89
27 35 47 69 97 110
32 46 48 65 98 112
then we start from the top left and progress to the next higher number each time, as if we were snaking our way up the progression. We move down from the top left, comparing in row order as well as column order, and, keeping track of the starting point, we return to the number adjacent to the starting point. For example, we progress down through 23, 27 and 32, and come back to 34 after we have exhausted the column. Similarly, we start from 34, note the next number in its row, 45, and traverse down the column reading 34 and 35 until we encounter a number that is higher than 45. Notice that the front of the propagation is a diagonal, which at this point is 46, 47 and 46. If we keep track of the elements coming out of the front, we can easily emit all the entries of the matrix in sorted order. Each row contributes one number to the front, and we advance the front on that row as we add its number to our sorted list. We therefore burn through the matrix across the front, and this is another way to solve the problem in O(N) time, where N counts the entries consumed and the number of rows is treated as a constant. Note that if all the numbers on the front are higher than the desired number, we stop right then instead of burning through the rest of the matrix. Thus there is some optimization possible as well.
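The front can be sketched as one column index per row, as below. This is an illustration under assumptions rather than a tuned implementation: the names `FrontSearch`, `InSortedOrder` and `front` are hypothetical, and the smallest entry on the front is found by a plain scan over the rows, so each step costs time proportional to the number of rows.

```csharp
using System;
using System.Collections.Generic;

static class FrontSearch
{
    // Yields every entry of the matrix in ascending order by repeatedly
    // taking the smallest entry on the front. Only the sorted order
    // within each row is needed for this to be correct.
    public static IEnumerable<int> InSortedOrder(int[,] matrix)
    {
        int rows = matrix.GetLength(0);
        int cols = matrix.GetLength(1);
        int[] front = new int[rows]; // front[r] = column of the next unconsumed entry in row r

        while (true)
        {
            int best = -1; // row holding the smallest entry on the front
            for (int r = 0; r < rows; r++)
            {
                if (front[r] >= cols) continue; // this row is burnt out
                if (best < 0 || matrix[r, front[r]] < matrix[best, front[best]])
                    best = r;
            }
            if (best < 0) yield break; // every row is exhausted

            yield return matrix[best, front[best]];
            front[best]++; // advance the front on that row
        }
    }

    public static bool Exists(int[,] matrix, int number)
    {
        foreach (int value in InSortedOrder(matrix))
        {
            if (value == number) return true;
            // The smallest entry on the front already exceeds the number,
            // so everything left on the front does too: stop early.
            if (value > number) return false;
        }
        return false;
    }
}
```

On the matrix above the enumeration starts 23, 27, 32, 34, 35, 45, at which point the front holds 46, 47 and 46, matching the walk described in the text. With many rows, a priority queue over the front would be a natural replacement for the scan.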
Lastly, I want to add that there are perhaps modifications possible where, instead of keeping track of the front, we simply update the start and the end for each row based on the findings from the preceding row. Thus we optimize our original solution so that it does not scan the entirety of each row but takes triangularly smaller portions as we go down the rows one after another.

bool Exists(int[,] matrix, int number)
{
    // assuming parameter validation
    // we go row wise and do a binary chop

    // end is an exclusive column bound carried from one row to the next
    int end = matrix.GetLength(1);
    for (int row = 0; row < matrix.GetLength(0) && end > 0; row++)
    {
        int start = 0;
        if (BinaryChop(matrix, row, number, start, ref end))
            return true;
    }
    return false;
}

bool BinaryChop(int[,] matrix, int row, int number, int start, ref int end)
{
    // search the half-open range [start, end) of the given row
    int lo = start;
    int hi = end;
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2; // cannot overflow, unlike (lo + hi) / 2
        if (matrix[row, mid] == number) return true;
        if (matrix[row, mid] < number)
            lo = mid + 1;
        else
            hi = mid;
    }
    // Not found: every entry from column lo onward is greater than number.
    // Columns are sorted, so the same holds in all rows below; shrink end.
    end = lo;
    return false;
}

Saturday, November 1, 2014
