We were discussing the parallel Breadth-First-Search.
Graph-based computations are pervasive in modern day applications where important entities need to be identified and ranked, anomalous patterns need to be detected, sudden changes in networks need to be detected or tightly interconnected cluster of entities need to be found. In addition, they may be highly data intensive. For example, an undirected graph may have upwards of 4 billion vertices and 40 billion edges. A breadth first traversal that can traverse billions of edges / second is then required. This is possible only if we break away from the time blocking sequential access.
There are two highly tuned parallel approaches for BFS on large parallel systems:
1) a level synchronous strategy that relies on a simple vertex based graph partitioning and
2) a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead.
We discuss the second approach today. This involves 2D partitioning. Here the idea is that we each BFS iteartion is equivalent to a sparse matrix -sparse vector multiplication. The adjacencies can only be atmost N nodes so the vector can only have N dimensions and not all of them are used.
The exploration of level k in BFS is equivalent to transposing the adjacency matrix and performing a matrix vector multiplication on an element wise multiplication of a complement to dense parents array. Since the parents array is a boolean array, a complement operation inverts it elementwise. The algorithm always picks the parent with the vertex as the highest label.
The algorithm is defined as:
Procedure BFS_2D(A, s)
f(s) <- s
for all processors P(i,j) in parallel do
while f != null do
TransposeVector(fij)
fi <- all_gather_v( fij , P(:, j) )
ti = Aij matrix-multiply-with fi
tij = all_to_all(ti, P(i,:))
tij = tij element_wise_multiply pi(i, j) complemented
pi(i,j) = pi(i,j) + t(i,j)
fij <- tij
Full discussion : https://1drv.ms/w/s! Ashlm-Nw-wnWtkxOVeU-mbfydKxs
This section courtesy Buluc, Madduri et al.
Graph-based computations are pervasive in modern day applications where important entities need to be identified and ranked, anomalous patterns need to be detected, sudden changes in networks need to be detected or tightly interconnected cluster of entities need to be found. In addition, they may be highly data intensive. For example, an undirected graph may have upwards of 4 billion vertices and 40 billion edges. A breadth first traversal that can traverse billions of edges / second is then required. This is possible only if we break away from the time blocking sequential access.
There are two highly tuned parallel approaches for BFS on large parallel systems:
1) a level synchronous strategy that relies on a simple vertex based graph partitioning and
2) a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead.
We discuss the second approach today. This involves 2D partitioning. Here the idea is that we each BFS iteartion is equivalent to a sparse matrix -sparse vector multiplication. The adjacencies can only be atmost N nodes so the vector can only have N dimensions and not all of them are used.
The exploration of level k in BFS is equivalent to transposing the adjacency matrix and performing a matrix vector multiplication on an element wise multiplication of a complement to dense parents array. Since the parents array is a boolean array, a complement operation inverts it elementwise. The algorithm always picks the parent with the vertex as the highest label.
The algorithm is defined as:
Procedure BFS_2D(A, s)
f(s) <- s
for all processors P(i,j) in parallel do
while f != null do
TransposeVector(fij)
fi <- all_gather_v( fij , P(:, j) )
ti = Aij matrix-multiply-with fi
tij = all_to_all(ti, P(i,:))
tij = tij element_wise_multiply pi(i, j) complemented
pi(i,j) = pi(i,j) + t(i,j)
fij <- tij
Full discussion : https://1drv.ms/w/s!
This section courtesy Buluc, Madduri et al.
No comments:
Post a Comment