Cluster computing

Tuesday, December 27, 2016

Today we continue discussing the problem of finding largest sub matrix of 1 in a boolean matrix using graph based method. we were saying that the depth first search allows us to explore islands of ones one after the other. we used the islands to find the submatrix. We said we could stretch the submatrix rectangle within each island to cover most of the island. We showed a method to push the extremes in terms of minimum row and column and maximum row and column to determine the boundaries of the submatrix. We could find the top left corner for the submatrix but the same method also finds every other corner of the submatrix. Moreover, submatrix thus founds answers the question of finding the largest submatrix in the original matrix.
However, we are not limited to the largest such sub-matrix. The graph based depth first search allows us to explore the islands one after the other. Therefore we can rank the islands based on size. This helps us find the kth largest submatrix as well with the same ease as we did with the largest matrix. The smallest submatrix is usually of size one. where a single cell occurs as an island. This is also listed with the islands encountered and since we can sort them based on the size, we have the smallest submatrix appearing at the tail of the list.
This method finds the number of submatrices, their locations inside the matrix alongwith their sizes. Consequently we can tell which part of the matrix has the highest number of such submatrices included. None of the submatrices overlap otherwise they wouldn't be islands, consequently we can determine a bounding rectangle that contains more than a threshold number of submatrices we are interested in. This bounding rectangle gives the portion of the original matrix that is heavier in 1s and consequently determines how lop sided the matrix is. The fraction of the contained submatrices within a half of the original matrix to the total number of submatrices encountered gives us a quantitative number for how heavy or lop sided the original matrix is. Finding this bounding rectangle is a matter of arranging the submatrices by their top left corners and progressively increasing the bottom right of the bounding rectangle until a threshold number of rectangles are completely included within the bounding rectangle or half the original square is reached depending on the choice of our selection criteria.

Cluster computing

Tuesday, December 27, 2016

No comments:

Post a Comment