Cluster computing

Address Book:
Most address books are proprietary in nature or at least tied to a software stack. Active Directory for instance does not let you access the address book without tightly integrating it with their network, enabling access only with their protocol and query language and requiring their own access routines.
Contact Books from Phone Applications to membership directories are proliferating on the other hand and often taking up space and consuming resources on the other hand while remaining local and often out of date.
Somewhere between these two extreme examples of address books, we see the need for universally accessible cloud storage for a directory representing the address book and its natural querying over the World Wide Web from any device anywhere.
Since the address book is primarily an unstructured data on the wire, there is very little need to see it as one requiring the onus of a structured storage at least for the majority of the use cases. Besides these access requirements, object storage brings out the best in storage engineering to keep the data durable and available in the cloud.
An object store offers better features and cost management than most competitors such as relational databases, document data stores and private web services. The existence of data library product in the market on a variety of stacks and not just the Address Book software providers indicate that this is a lucrative and a commercially viable offering. MongoDB for instance competes well in this market. From database servers to an appliance, the data library product is viewed as a web accessible storage but it provided with form of constraints to its access. Object storage not only facilitates this web access but also provides no restraints on its access making it the most flexible platform for dynamic queries and analysis. It does not treat the physical data stores as a mere content while providing storage engineering best practice on all the data.
At the same time, a document stores provides all aspects of data storage for storing and querying an address book. It is not a traditional address book but it addresses most of the requirements. A traditional address book has been bound to a database for long and we have argued that it does not need to be so for non-transactional read and write. In json, these appear as nested fields and are pre-joined into objects. The frequently accessed objects can be in the cache as well. A search engine provides search over this catalog. Since the address book entries are independent just like in a document or catalog, we can refer its access to be similar to that for a catalog. Therefore, functional data access is provided by the Catalog API. The API and the engine separately cover all operations on the catalog. The API can then be used by downstream units. A search engine allow shell like search operators which can be built on Lucene/Solr Architecture. A Lucene index keeps track of terms and their occurrence locations but the index needs to be rebuilt each time the catalog changes. The Catalog API can retrieve results directly from the catalog or via the search engine. In both cases, the customer only issues a single query.
In a sense, this is very similar to how we want the object storage to store the catalog and make it web-accessible. The data is available for retrieval and iteration directly out of the store while hierarchy and dimensions may be maintained by a web service over the Object Storage.

Cluster computing

Monday, June 10, 2019

No comments:

Post a Comment