Friday, July 13, 2018

We were discussing the difference between structured and unstructured P2P networks.
In a structured topology, the P2P overlay is tightly controlled, usually with the help of a distributed hash table (DHT). The location of each data object is deterministic because peers are assigned identifiers corresponding to the data object's unique key. Content therefore goes to specified locations, which makes subsequent queries easier.
Since the query executes locally at the node, performance is greatly improved as compared to distributing the query over many peers.
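As a minimal sketch of this idea (a toy, not a full DHT protocol such as Chord; the node names and identifier space are assumptions for illustration), deterministic placement can be shown by hashing each key onto a ring of node identifiers and storing the object at the first node whose identifier is at or past the key's hash:

```python
import hashlib
from bisect import bisect_left

def ring_hash(value: str, space: int = 2**16) -> int:
    """Map a string onto a fixed identifier space."""
    return int(hashlib.sha1(value.encode()).hexdigest(), 16) % space

class ToyDHT:
    """Toy structured overlay: a key is owned by the first node whose
    identifier is >= hash(key), wrapping around the ring."""
    def __init__(self, node_names):
        # Nodes pick identifiers in the same space as the keys.
        self.nodes = sorted((ring_hash(n), n) for n in node_names)
        self.store = {name: {} for name in node_names}

    def _owner(self, key: str) -> str:
        ids = [nid for nid, _ in self.nodes]
        i = bisect_left(ids, ring_hash(key)) % len(self.nodes)  # wrap around
        return self.nodes[i][1]

    def put(self, key, value):
        self.store[self._owner(key)][key] = value

    def get(self, key):
        # Lookup is deterministic: go straight to the owning node.
        return self.store[self._owner(key)].get(key)

dht = ToyDHT(["nodeA", "nodeB", "nodeC"])
dht.put("video.mp4", "blob")
print(dht.get("video.mp4"))  # -> blob
```

Because hashing is deterministic, any peer can compute the owner of a key locally, which is why lookups avoid flooding the network.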
Unstructured P2P is composed of peers joining based on some rules and usually without any knowledge of the topology. In this case the query is broadcast, and peers that have matching content return the data to the originating peer. This works well for highly replicated items but is not appropriate for rare items. In this approach, peers become readily overloaded, and the system does not scale when there is a high rate of aggregate queries.
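A hypothetical sketch of flooded search in an unstructured overlay (the peer graph and content placement below are made up for illustration): the query spreads hop by hop with a time-to-live, and every peer holding matching content is collected as a responder. A replicated item is found within the TTL; a rare item may not be.

```python
from collections import deque

def flood_query(graph, content, origin, wanted, ttl=3):
    """Breadth-first flood: each hop decrements the TTL; peers holding
    the wanted item are collected as responders."""
    seen = {origin}
    hits = []
    frontier = deque([(origin, ttl)])
    while frontier:
        peer, t = frontier.popleft()
        if wanted in content.get(peer, set()):
            hits.append(peer)
        if t == 0:
            continue  # TTL exhausted; do not forward further
        for nbr in graph.get(peer, []):
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, t - 1))
    return hits

# Toy overlay: "song" is replicated, "rare" sits on a distant peer.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"], "E": []}
content = {"B": {"song"}, "D": {"song"}, "E": {"rare"}}
print(flood_query(graph, content, "A", "song"))         # -> ['B', 'D']
print(flood_query(graph, content, "A", "rare", ttl=2))  # -> []
```

The cost is visible even in the toy: every query touches many peers regardless of where the content lives, which is the scaling problem noted above.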
Sensor data is one domain that utilizes Acquisitional Query Processing (AQP). Based on query needs, AQP techniques intelligently determine which nodes to acquire data from. Sensor databases generally manifest four different types of querying:
First, the time-series analysis queries are primarily focused on detecting trends or anomalies in archived streams. They can specify sort order, anomalies, and non-contiguous change patterns. Incident detection falls under this category.
Second, the similarity search queries. In this class of queries, a user is interested in determining whether the data is similar to a given pattern. Again using archived data, the pattern matching helps determine events of interest. Surveillance data querying falls under this category.
Third, the classification queries are related to similarity search queries but these run classification algorithms that group and tag the event data. Determining grades of data is one example of this query processing.
Fourth, the signal processing queries. These are heavy computations performed on the data directly, such as the Fast Fourier Transform and filtering. These enable interpretations that are not possible via the grouping, sorting, searching, and ranking techniques mentioned earlier.
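As an illustration of the first category, here is a hypothetical time-series anomaly query over an archived sensor stream: flag readings that deviate from a trailing moving average by more than a threshold. The function name, window size, and threshold are assumptions made for this sketch, not part of any AQP system.

```python
def moving_average_anomalies(stream, window=3, threshold=3.0):
    """Flag indices whose reading deviates from the trailing
    `window`-point average by more than `threshold`."""
    anomalies = []
    for i in range(window, len(stream)):
        avg = sum(stream[i - window:i]) / window
        if abs(stream[i] - avg) > threshold:
            anomalies.append(i)
    return anomalies

# Archived temperature readings with a spike at index 4.
readings = [20.0, 20.5, 20.2, 20.4, 27.0, 20.3, 20.1]
print(moving_average_anomalies(readings))  # -> [4]
```

A real sensor database would push such predicates toward the nodes that hold the data, but the query shape, scanning a stream and reporting deviations, is the same.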

#codingexercise
void PrintPairsForGivenSum(List<int> sorted, int sum)
{
    validate(sorted);
    for (int i = 0; i < sorted.Count; i++) {
        // Binary-search from position i for the complement of the current element;
        // binarySearch is presumed to return -1 when the value is not found.
        int index = binarySearch(sorted, sum - sorted[i], i);
        if (index != -1 && index != i) {
            Console.WriteLine("{0} {1}", sorted[i], sorted[index]);
        }
    }
}
