Cluster computing

Sunday, September 16, 2018

Object storage is often perceived as backup or tertiary storage. This perception may come from the view that it is not suited to the read- and write-intensive data transfers that a file system or database usually handles. However, not all data needs to be written deep into the object storage at once. The requirements for object storage need not change for it to serve application reads and writes. A middle layer can act as a proxy, presenting a file system to the application while using the object storage for persistence. This removes the performance cost of reading and writing deep into the private cloud each time. That is how this cache layer positions itself. It gives the workload the kind of speedup that query plan caching gives a database, and although it may use its own intermediate storage, it works as a staging area so that the data has a chance to age before it persists in object storage.
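One way to picture the staging behavior is the minimal sketch below. It assumes an in-memory dictionary as the intermediate storage and a hypothetical `InMemoryObjectStore` standing in for a real object storage client; `StagingCache`, `max_age_seconds`, and `flush_aged` are illustrative names, not part of any product mentioned here.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class InMemoryObjectStore:
    """Hypothetical stand-in for an object storage bucket (not a real S3 client)."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def get(self, key: str) -> Optional[bytes]:
        return self.objects.get(key)


class StagingCache:
    """Serves reads and writes from staging; persists entries once they have aged."""

    def __init__(
        self,
        store: InMemoryObjectStore,
        max_age_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.store = store
        self.max_age = max_age_seconds
        self.clock = clock
        # key -> (data, time of last write)
        self.staged: Dict[str, Tuple[bytes, float]] = {}

    def write(self, key: str, data: bytes) -> None:
        # Writes stop at the staging area; nothing reaches the object store yet.
        self.staged[key] = (data, self.clock())

    def read(self, key: str) -> Optional[bytes]:
        # Prefer the staged copy; fall back to the object store on a miss.
        entry = self.staged.get(key)
        if entry is not None:
            return entry[0]
        return self.store.get(key)

    def flush_aged(self) -> int:
        """Move entries older than max_age into the object store; return the count."""
        now = self.clock()
        aged = [k for k, (_, written) in self.staged.items() if now - written >= self.max_age]
        for key in aged:
            data, _ = self.staged.pop(key)
            self.store.put(key, data)
        return len(aged)
```

In this sketch a background job would call `flush_aged` periodically, so frequently rewritten keys stay in staging and only settled data reaches persistence.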

Cache service has been a commercially viable offering. AppFabric is an example of a cache service that has shown substantial improvements to APIs. Since objects are accessed via S3 APIs, such a cache service works very well here. However, traditional cache services have usually replayed previous requests with the help of amortized results, and their cache writes have mostly been write-throughs that reach all the way to the disk. The cache layer described here can instead be viewed as a cloud service that not only maintains a proxy to the object storage but is also a smart and massive service that maintains its own storage as necessary.
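The traditional behavior described above, replaying cached reads while pushing every write to the backend, might look like the following minimal sketch. It is not AppFabric's API; the class name, the `load` and `save` callables, and the counters are hypothetical and exist only to make the cost of write-through visible.

```python
from typing import Callable, Dict


class WriteThroughCache:
    """Replays cached reads, but sends every write straight to the backend."""

    def __init__(
        self,
        load: Callable[[str], bytes],
        save: Callable[[str, bytes], None],
    ) -> None:
        self.load = load  # hypothetical backend read
        self.save = save  # hypothetical backend write
        self.entries: Dict[str, bytes] = {}
        self.backend_reads = 0
        self.backend_writes = 0

    def get(self, key: str) -> bytes:
        # Only the first request for a key reaches the backend.
        if key not in self.entries:
            self.backend_reads += 1
            self.entries[key] = self.load(key)
        return self.entries[key]

    def put(self, key: str, data: bytes) -> None:
        # Every write reaches the backend, however often the key changes.
        self.entries[key] = data
        self.backend_writes += 1
        self.save(key, data)
```

Repeated reads of one key cost a single backend read here, while repeated writes each cost a backend write, which is the overhead a staging layer tries to avoid.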

Cache services work closely with web proxies, and both have traditionally been long-standing products in the marketplace. Mashery is an HTTP proxy that studies web traffic to provide charts and dashboards for monitoring and statistics. This cache layer is well positioned for web application traffic as well as for applications that use S3 APIs directly. It need not even identify callers and clients by requiring API keys over S3 APIs. Moreover, it can leverage geographical replication of objects within the object storage by routing to, or reserving, dedicated virtual data center sites and zones for its storage. As long as this caching layer establishes a sync between, say, a distributed or cluster file system and object storage with duplicity-like logic, it can eventually roll all data over to persistence.
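The sync between a file system and object storage could be sketched as an incremental copy that uploads only new or changed files. This is a simplified assumption about what "duplicity-like" means here: it compares whole-file SHA-256 digests and does not attempt duplicity's rsync-style deltas or encryption. The `store` dictionary is a hypothetical stand-in for an object storage bucket, and `sync_directory` and `manifest` are illustrative names.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sync_directory(root: Path, store: Dict[str, bytes], manifest: Dict[str, str]) -> List[str]:
    """Copy new or changed files under root into store; return the keys uploaded.

    store stands in for an object storage bucket, and manifest maps each key
    to the SHA-256 digest of the content last uploaded for it.
    """
    uploaded: List[str] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Use the path relative to root as the object key.
        key = path.relative_to(root).as_posix()
        data = path.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        # Skip files whose content matches what was uploaded last time.
        if manifest.get(key) != digest:
            store[key] = data
            manifest[key] = digest
            uploaded.append(key)
    return uploaded
```

Running the function repeatedly against the same directory uploads nothing until a file changes, so each pass moves only the difference toward persistence.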

#codingexercise
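import math

def discounted_cumulative_gain(relevance, p):
    # relevance: list of graded scores in rank order; p: cutoff rank, 1 <= p <= len(relevance)
    total = relevance[0]  # rank 1 is not discounted
    for i in range(2, p + 1):
        total += relevance[i - 1] / math.log2(i)  # rank i is discounted by log2(i)
    return total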
