Saturday, May 19, 2018

We were discussing the parallel Breadth-First-Search using the 2D partitioning approach.
This is the latter case of two highly tuned parallel approaches for BFS on large parallel systems:  
1) a level synchronous strategy that relies on a simple vertex based graph partitioning and
2) a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead.
The first approach partitions the graph by vertex; the second focuses on mitigating communication overhead.
In the second approach, the idea is that each BFS iteration is equivalent to a sparse matrix-sparse vector multiplication (SpMSV). The matrix is the adjacency matrix, which is decomposed equally among the processors. The vector is the frontier, with integer variables. The algorithm does not store the previous frontiers explicitly as multiple sparse vectors. Instead, it computes the breadth-first spanning tree by returning a dense parents array. The parent is always chosen to be the vertex with the highest label. The parents array is used in an element-wise multiplication with the result of the SpMSV. The inner loop performs a single-level traversal.
All the vectors and the input matrix are distributed two-dimensionally. For example, f is an initially sparse vector; each n/p-sized piece of the vector corresponds to a contiguous section of the adjacency matrix. The vector operations are also parallelized using loop-level parallelism.
In the 2D processor assignment, the subvectors are accumulated at all the processors on the jth processor column, and each processor, after performing its SpMSV, scatters the corresponding piece of its intermediate vector to its owner along the ith processor row.
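The iteration described above can be sketched sequentially in a few lines (a minimal illustration of the SpMSV view, not the paper's distributed implementation; the adjacency-list layout and function name are my own):

```python
# Sequential sketch: BFS as repeated sparse matrix-sparse vector products.
# adj[v] lists the neighbors of v; the dense parents array doubles as the
# "visited" mask, and ties are broken toward the highest-labeled parent.
def bfs_spmsv(adj, source):
    n = len(adj)
    parents = [-1] * n           # -1 marks an undiscovered vertex
    parents[source] = source     # the root is its own parent
    frontier = {source}          # the sparse frontier vector f
    while frontier:
        # t = A^T f : gather candidate (vertex, parent) pairs
        t = {}
        for u in frontier:
            for v in adj[u]:
                if v not in t or u > t[v]:
                    t[v] = u     # keep the highest-labeled parent
        # element-wise multiply with the complement of the parents array:
        # only still-undiscovered vertices survive into the next frontier
        frontier = set()
        for v, p in t.items():
            if parents[v] == -1:
                parents[v] = p
                frontier.add(v)
    return parents
```

On the square with edges 0-1, 0-2, 1-3, 2-3, vertex 3 is reachable from both 1 and 2, and the tie is broken in favor of the higher label 2.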
#codingexercise
Write a function to remove the keys with zero or empty values from a JavaScript object:
var items = { "1" : 0, "2": 0, "3": 0, "4":0, "5": 55, "6": 0, "7":0, "8" : 0, "9": 0  }
Object.entries(items).forEach(([key, value]) => {
    if (value === "" || value === null || value === 0){
        delete items[key];
    }

});
document.write(JSON.stringify(items));
{"5":55}

Friday, May 18, 2018

We were discussing the parallel Breadth-First-Search using the 2D partitioning approach.
This is the latter case of two highly tuned parallel approaches for BFS on large parallel systems:  
1) a level synchronous strategy that relies on a simple vertex based graph partitioning and
2) a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead.
In the second approach, the idea is that each BFS iteration is equivalent to a sparse matrix-sparse vector multiplication. A vertex can have at most N adjacencies, so the vector needs only N dimensions, and not all of them are used at any given level.
The algorithm computes the breadth-first spanning tree by returning a dense parents array. The inner loop performs a single-level traversal.
The exploration of level k in BFS is equivalent to transposing the adjacency matrix and performing a matrix-vector multiplication on the frontier, element-wise multiplied with the complement of the dense parents array. Since the parents array acts as a boolean mask, the complement operation inverts it element-wise, filtering out already-discovered vertices. When several frontier vertices reach the same neighbor, the algorithm always picks the parent with the highest label.
All the vectors and the input matrix are distributed two-dimensionally. For example, f is an initially sparse vector; each n/p-sized piece of the vector corresponds to a contiguous section of the adjacency matrix. Notice that the algorithm is written in a gather-apply-scatter style. Since the matrix and the vectors are distributed, the algorithm proceeds in parallel.
There are two distinct communication phases in the BFS algorithm: an expand phase over the processor column and a fold phase over the processor row.
In any parallelization, we have a partitioning of tasks and a communication overhead. If we have non-overlapping partitions throughout the algorithm, we can have the least communication. This works for the batch-processing MapReduce method in Big Data. For the BFS algorithm, a tradeoff needs to be struck between partitioning and communication. The approach cited here makes the communication efficient, but there are other ways to tune a hybrid approach.
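The expand and fold phases can be mimicked with a toy simulation in which plain Python lists stand in for the collectives over a processor column and row (the function names and data layout are illustrative assumptions, not the paper's API):

```python
# Toy model of the two communication phases of 2D-partitioned BFS.
# column_pieces[i] is the frontier piece held by the ith processor of a column;
# row_partials[j] is the partial result held by the jth processor of a row,
# already split into pc slices, one per destination owner.
def expand(column_pieces):
    """Expand phase: allgather the frontier pieces along a processor
    column so that every processor holds the full subvector."""
    gathered = [x for piece in column_pieces for x in piece]
    return [gathered for _ in column_pieces]

def fold(row_partials, pc):
    """Fold phase: all-to-all along a processor row; owner k receives
    slice k from every processor (a real run would then reduce them)."""
    return [[partial[k] for partial in row_partials] for k in range(pc)]
```

In a real implementation these pieces would move over the network; here each processor's piece is just an element of a list.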

Thursday, May 17, 2018

We were discussing the parallel Breadth-First-Search.
Graph-based computations are pervasive in modern-day applications where important entities need to be identified and ranked, anomalous patterns need to be detected, sudden changes in networks need to be detected, or tightly interconnected clusters of entities need to be found. In addition, they may be highly data-intensive. For example, an undirected graph may have upwards of 4 billion vertices and 40 billion edges. A breadth-first traversal that can traverse billions of edges per second is then required. This is possible only if we break away from time-blocking sequential access.
There are two highly tuned parallel approaches for BFS on large parallel systems:  
1) a level synchronous strategy that relies on a simple vertex based graph partitioning and
2) a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead.
We discuss the second approach today. This involves 2D partitioning. Here the idea is that each BFS iteration is equivalent to a sparse matrix-sparse vector multiplication. A vertex can have at most N adjacencies, so the vector needs only N dimensions, and not all of them are used at any given level.
The exploration of level k in BFS is equivalent to transposing the adjacency matrix and performing a matrix-vector multiplication on the frontier, element-wise multiplied with the complement of the dense parents array. Since the parents array acts as a boolean mask, the complement operation inverts it element-wise, filtering out already-discovered vertices. When several frontier vertices reach the same neighbor, the algorithm always picks the parent with the highest label.
The algorithm is defined as:
Procedure BFS_2D(A, s)
      f(s) <- s
      for all processors P(i, j) in parallel do
           while f != empty do
                 TransposeVector(f_ij)
                 f_i <- ALLGATHERV(f_ij, P(:, j))
                 t_i <- A_ij matrix-multiply f_i
                 t_ij <- ALLTOALLV(t_i, P(i, :))
                 t_ij <- t_ij element-wise-multiply complement(pi_ij)
                 pi_ij <- pi_ij + t_ij
                 f_ij <- t_ij

Full discussion : https://1drv.ms/w/s!Ashlm-Nw-wnWtkxOVeU-mbfydKxs 
This section courtesy Buluc, Madduri et al.

Wednesday, May 16, 2018

We now proceed to the discussion on parallel Breadth-First-Search.
Graph-based computations are pervasive in modern-day applications where important entities need to be identified and ranked, anomalous patterns need to be detected, sudden changes in networks need to be detected, or tightly interconnected clusters of entities need to be found. In addition, they may be highly data-intensive. For example, an undirected graph may have upwards of 4 billion vertices and 40 billion edges. A breadth-first traversal that can traverse billions of edges per second is then required. This is possible only if we break away from time-blocking sequential access.
There are two highly tuned parallel approaches for BFS on large parallel systems:  
1) a level synchronous strategy that relies on a simple vertex based graph partitioning and
2) a two-dimensional sparse matrix partitioning-based approach that mitigates parallel communication overhead.
These approaches are also used to benchmark supercomputers. For example, Graph-500 is used to rank them on their performance with BFS.  Distributed memory systems are a hallmark of supercomputers.
BFS on a distributed-memory architecture involves communication between the processors and a distribution of the graph partitions among them. These graphs are generally sparse, with the number of edges usually within a constant factor of the number of vertices. The average path length in these graphs is a small value bounded by the logarithm of the number of vertices.
Approach 1 is straightforward: the iterations over each vertex and its adjacencies are parallelized.
The distance update and the enqueuing are atomic, and there is a barrier synchronization step for each level. The optimizations involved include:
1) ensuring that the edge visit steps are load-balanced.
2) the synchronization costs are reduced
3) the locality of memory references is improved.
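The level-synchronous strategy above can be sketched as follows (a sequential illustration with names of my choosing; the inner loop over the current level is the part that gets parallelized):

```python
# Level-synchronous BFS: expand one whole level, then synchronize.
def level_synchronous_bfs(adj, source):
    n = len(adj)
    dist = [-1] * n              # -1 marks an unvisited vertex
    dist[source] = 0
    current, level = [source], 0
    while current:
        next_level = []
        for u in current:        # parallelized across vertices in practice
            for v in adj[u]:
                if dist[v] == -1:      # atomic distance update + enqueue
                    dist[v] = level + 1
                    next_level.append(v)
        current = next_level     # barrier synchronization between levels
        level += 1
    return dist
```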

Full discussion : https://1drv.ms/w/s!Ashlm-Nw-wnWtkxOVeU-mbfydKxs 

Tuesday, May 15, 2018

We return to the discussion on incremental Breadth-First-Search.
In large social network graphs, there are updates of about 86,400 operations per second. The store-and-static-compute model worked because updates were batched and then graph processing was applied to static snapshots from different points in time. It worked so long as the graph modifications were less frequent than the static processing. With dynamic graph processing, we need a new framework. One such proposed framework is GraphIn, which introduces a new programming model called Incremental-Gather-Apply-Scatter.
Incremental-Gather-Apply-Scatter can be applied to sub-graphs. 

Full discussion : https://1drv.ms/w/s!Ashlm-Nw-wnWtkxOVeU-mbfydKxs 

#codingexercise

int GetLongestIncreasingSubsequenceCount(List<int> A)
{
    int n = A.Count;
    if (n == 0) return 0;
    // best[i] = length of the longest increasing subsequence ending at A[i]
    var best = new List<int>(Enumerable.Repeat(1, n));
    for (int i = 1; i < n; i++)
        for (int j = 0; j < i; j++)
        {
            if (A[i] > A[j] && best[j] + 1 > best[i])
            {
                best[i] = best[j] + 1;
            }
        }
    return best.Max();
}

Monday, May 14, 2018

Today I present the summary of a book:
Book summary:
Title: Behaving Badly
         - New Morality in Politics, Sex, and Business
Author: Eden Collinsworth
About the Author - Ms. Collinsworth has been helping Chinese businesspeople with Western deportment. She is also the author of "I Stand Corrected: How Teaching Western Manners in China Became Its Own Unforgettable Lesson".
She chose this topic because she seems sure that the culture in China resides comfortably in what the West considers moral ambiguity. At the same time, she feels that shame in the West has become less and less when it comes to money.
This book is a collection of true anecdotes that explore new and emerging trends in morality and their self-justification.
Part one:
"Wherein I Begin with the Definition of the Word": The author moved from China to England and sold her apartment in US to find a perfect place to start. She decided mailing people for their stories and interviewing them was better than researching in a library.
"According to a Convicted Murderer, It Has to Do with Character":  A convicted murdered talks about learning morality in prison and it has to do with character. He seems to say it comes between impulse and action.
"A Neuroscientist Explains the Evolutionary Origins of Morality":  This chapter tries explaining the role of natural selection where  there are certain genetic traits that win over others.She discusses the "selfish gene theory" and "reciprocal altruism" theory where the motivation to get along with others comes from selfish expectations.
"A Brief History of Mankind's Attempts to Rein in Bad Behavior":  The author makes the observation religions are more convinced of their own version of the moral truth and it becomes particularly noteworthy when they pronounce something incredible and ludicrous.
The next set of chapters describes a scorecard for morality.
"The Editor of the Financial Times Provides a Cost-Benefit Analysis of Principles": Business is rife with examples where cheating improves bottomline. Morality becomes beside the point in the institutional making, spending and the accumulation of money. Moral calculation even involves cost-benefit of the scenario where a perpetrator is caught and the fines are paid but the outlook improves.
"Instructions on How Not To Cheat": The author has heard that you can't impose a moral standard on people who don't wish to be moral. Nash's game thory is explained to her as where the participants will view any premise as a system with a framework of fixed points and variables that can be manipulated to achieve a specific result and they will try to maximize personal benefit and minimize risk.
"Pros and Cons of Doing the Right Thing":  This chapter focuses on whistleblowers and why they do it. She argues that morality sustains itself only as long as we are determined to do the right thing. In particular she brings up an example of a westerner who challenges the unconditional loyalty in Japan that cloaks gross embezzlement.
"The Law: Tools of Control, or Instruments of enlightenment": This section focuses on the difference between law and justice.  She brings up examples where we try to outwit the law and yet we maintain an inner higher standard where even if we don't break the law, we don't engage in it. Ethics on the other hand is what an organization decides to hold itself to.  The author asserts that ethics is local and regional.
"The Political Function of Ethics": This section focuses on why the politics have no relation to morals. She uses mass shootings as an example. She also brings examples of "Make America Great Again" and from international politics.
The next set of chapters discusses Sex as a moral provocateur
"Monogamy (Not So Much Anymore)": These are easy reading examples of where the notion of marriage and till death do us apart has been adapted to suit one's needs. She talks about research in marital infidelity.
"The Screen as a Siren":  Much of the examples the author mentions here go to show that technology has created a hyper-sexualized culture.  She brings up cases where free pornography has proliferated or where certain people were aware of the codes of conduct but could not live up to them.
"Immoral women: Or just those having a better time":  She brings up several stories of international divas and what they uphold or the impact they are having even in the face of the perception they are held to.
 The next section focuses on taking the bother out of morality:
"Celebrities as Standard-bearers":  This is probably the most riveting section of the book where she talks about the matriarch in the Kardashians family taking a sex tape and starting a multi-million dollar business that would have otherwise been forgotten
"Reality Redefined": This section talks about the role of artificial intelligence to go through an indulgence in sense gratification of all sorts. Some examples are in fact revealing in how we take new experiences as something not to be judged for.
"The Web Wonders What's So Great About The Truth":  She talks about extensive research in how new technology has been changing the way we think and process information. She talks about neurological tools that enable us to remain moral. Self-control is one. Empathy is another. The author admits to a guarded reservation in thinking of the internet as something to futher measured intelligence.
"Ethically Sanitized Warfare": This talks about the morality in drone attacks - in particular, the argument that some lives are not as important as others and where as well as how they are decided.
"Immorality's Black Sun": This talks about some of the survivors and perpetrators of the Holocaust. The author returns to New York to some of them.
The next section talks about "The Future, or Something Like It":
"The Moral Vagaries Of Making Babies": This chapter cites some examples of the "edgy variations" on an alternative lifestyle particularly with regard to creating and even raising a family.
"Mapping a post-gay culture": This chapter explores the church's unwillingness to recognize sex, gender, current events and social power leading it to retreat on several fronts.
"Is it Progress If We Barter With Ethics": This is a discussion on population and food. Some examples include FDA condoning of techniques to enable DNA to be altered and edited.
"Programming Morality in Robots": This section focuses on how robots and software can detect moods and how our responses alter and play with moods.
"So Who Exactly Gets To Set The New Rules": This section introduces us to singularity and cosmism. The former is the notion to further human abilities in a man-machine unified way. The latter is the theory of human evolution where concepts of truth, reality, freedom and happiness will be deeply revised.
"WhereIn I Conclude By Looking Forward": This section talks about the next generation's perspective and not just that of the author.
"Epilogue": The book concludes with a take on modern day news, views and events.


Sunday, May 13, 2018

The user-defined SQL queries comparable to the user-defined functions in GraphIn include:
1. meta_computation:
This computation assigns the vertex parent id and the vertex degree. The corresponding SQL query:
UPDATE n
SET n.Degree = t.Count
FROM Node n
INNER JOIN (SELECT e.Dest AS ID, COUNT(*) AS Count
            FROM Edge e
            GROUP BY e.Dest) AS t
ON n.ID = t.ID
WHERE n.ID = NodeID

The LevelID is assigned based on the previously visited element. If there are no previously visited elements, these are initialized to special values.
 [Aside: we could use recursive queries here for a levelwise tree]
WITH recursiveNode( level, id, parent_id, degree) AS
(SELECT 1, parent.id, parent.parent_id, parent.degree
 FROM Node parent
WHERE parent.parent_id is NULL
UNION ALL
SELECT parent.level+1, child.id, parent.id, child.degree
FROM recursiveNode parent, Node child
WHERE child.id in (SELECT e.Dest FROM Edge e WHERE e.Src = parent.id))
SELECT level, id,  parent_id, degree
FROM recursiveNode;

2. build_inconsistency_list:
This builds an inconsistency list that contains vertices with incorrect depth values with levels maintained in a minimum priority queue

3. check_property
The BFS depth property is checked in each case. This processing determines whether the current approach of windowed or incremental graph updates can be tolerated or whether the incremental approach should be abandoned in favor of a global update. When the frontiers are all downstream, a BFS can make progress across two snapshots – current and next – because the visited nodes are not going to be visited again. check_property in GraphIn decides based on a threshold, such as the number of nodes affected in the immediate adjacencies. This is a constant-time lookup versus a check of whether a node has been visited. Suppose we take two materialized views: one for nodes and another for edges. For every node visited, its parent_id is assigned, so a check for whether a node was visited merely involves checking whether the node is the root or has a parent_id.
Should the check_property fail, the views are dropped. If this check passes, the processing is incremental.
Since materialized view updates are costly due to disk access, it helps to make writes progressive as BFS unfolds. Optimistic frontier lookup may be cheaper than visited nodes lookup if we tolerate eventual consistency. The GraphIn approach does not state that the check_property evaluates the presence of adjacencies in the inconsistency list. Its goal is to determine if large portions of the graph are in an inconsistent state so as to avoid IGAS processing in favor of GAS processing. It allows the user to set a heuristic such as switch to static BFS if > 30% of inconsistent vertices have BFS depth < 2. This heuristic is based on the data as well as the type of updates being made to the evolving graph. But a heuristic does not guarantee consistency and idempotent operations.
With the help of materialized views and stream processing, we have the option of making this consistent as long as we can progress incrementally from start to finish without failing our check. If our check fails, we begin all over again. To check whether the graph evolved for the visited nodes, we can compare the timestamps of our visit with those of the modifications. Since the updates to our views are expected to be more recent than the original update to the node, we know if the upstream nodes have changed. During the check we also need to process notifications from the subscription to the inconsistency list that has evolved over the last duration. Making a clone of the entire graph's nodes in our views is a way of separating the evolution of the graph from its analysis.
The logic above can also work with a global snapshot of the graph as a starting point, receiving the update activity set in batches, each of which helps build the inconsistency list. This is different from receiving the list of nodes that have been updated by the update activity during the analysis activity. In the static computation case, when the update is handled directly after a static pass, the inconsistency list results in changes to the parent_id and level of the affected nodes relative to the result from the static pass. Each iteration may grow or shrink the inconsistency list. At any point, the inconsistency list may determine whether the update needs to be run statically on a new snapshot.
GAS works per vertex. On the other hand, all SQL queries can work on partitions when the user defined aggregates permit it. For example, we can partition vertices along with their outbound edges while synchronizing at the next level.
4. activate_frontier
The frontier is activated with the nodes having inconsistent depth
5. update_inconsistency_list
The frontier vertices are removed and inconsistent successors are added to inconsistency list
6. merge_state
All insertions and deletions to the graph are merged. We do not exclude either, because BFS requires both before the vertex level is recalculated.
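A simplified sketch of how steps 2 through 5 might fit together (my own illustration, not GraphIn's implementation; it assumes in-neighbor lists are available, keeps the inconsistency list in a min-priority queue keyed by level, and the names are hypothetical):

```python
import heapq

# Repair BFS depths after graph updates: pop the shallowest inconsistent
# vertex, recompute its depth from its in-neighbors, and re-enqueue any
# successors that may have become inconsistent in turn.
def repair_depths(adj_in, depth, inconsistent):
    pq = [(depth[v], v) for v in inconsistent]
    heapq.heapify(pq)
    while pq:
        _, v = heapq.heappop(pq)
        candidates = [depth[u] + 1 for u in adj_in[v] if depth[u] >= 0]
        new_depth = min(candidates) if candidates else -1  # -1: unreachable
        if new_depth != depth[v]:
            depth[v] = new_depth
            for w, preds in enumerate(adj_in):
                if v in preds:   # successors of v join the frontier
                    heapq.heappush(pq, (depth[w], w))
    return depth
```

For example, if the direct edge that gave a vertex depth 1 is deleted and its only remaining predecessor sits at depth 1, the repair settles that vertex's depth at 2.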