Cluster computing

Friday, December 21, 2018

Today we continue discussing the best practice from storage engineering:

205) Social engineering applications also have a lot of load balancing requirements and therefore more number of servers may need to be provisioned to handle their load. Since a storage tier does not necessarily expose load balancing semantics, it could call out when an external load balancer must be used.

206) Networking dominates storage for distributed hash tables and message queues to scale to social engineering applications. Whatsapps’ Erlang and FreeBSD architecture has shown unmatched symmetric multiprocessing (SMP) scalability

207) Unstructured data generally becomes heterogenous because there is no structure to hold them consistent. This is a big challenge for both data ingestion as well as machine parsing of data. Moreover, the data remains incomplete as its form and heterogeneity increases.

208) Timeliness of executing large data sets is important to certain systems that cannot tolerate wide error margins. Since the elapsed and execution time differ depending on the rate at which tasks get processed, the overall squeeze meets rock hard limits unless there are ways to tradeoff between compute and storage.

209) Spatial proximity of data is also important to prevent potential congestion along routes. In the cloud this was overcome with dedicated cloud network and direct access for companies but this is not generally the norm for on-premise storage.

210) Location-based services are a key addition to many data stores simply because location gives more interpretations to the data tha solve major business use cases. In order to facilitate this the location may become part of the data and maintained routinely or there must be a service close to the storage that relates them

Cluster computing

Friday, December 21, 2018

No comments:

Post a Comment