Monday, January 27, 2014

In this post, we continue our discussion of AWS, focusing on its storage options. As we have seen, AWS is a flexible, cost-effective, easy-to-use cloud computing platform. The storage choices available to an architect include memory-based storage, such as file caches, object caches, in-memory databases, and RAM, and message queues, which provide temporary durable storage for data sent asynchronously between computer systems or application components. Other options include storage area networks (SAN), where virtual disk LUNs often provide the highest level of disk performance and durability; direct-attached storage (DAS), where local hard disk drives or arrays residing in each server provide higher performance than a SAN; and network-attached storage (NAS), which provides a file-level interface that can be shared across multiple systems. Finally, there are traditional databases, where structured data resides, as well as NoSQL non-relational databases and data warehouses. Backup and archive storage includes non-disk media such as tapes or optical media.

These options differ in performance, durability, and cost, as well as in their interfaces. Architects weigh all of these factors when making a choice. Sometimes a combination of options forms a hierarchy of data tiers.

Amazon Simple Storage Service (Amazon S3) is storage for the Internet. It can store any amount of data, at any time, from within the compute cloud or from anywhere on the web. Objects of almost any size can be written, read, and deleted. It is also highly scalable, allowing concurrent read and write access. Amazon S3 is commonly used as a data store for large-scale analytics, such as analyzing financial transactions, clickstream analytics, and media transcoding.
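As a rough illustration of the write, read, and delete operations mentioned above, here is a minimal Python sketch. It assumes the `client` argument is an S3 client from the boto3 SDK (for example `boto3.client("s3")`), whose `put_object`, `get_object`, and `delete_object` calls it uses. The helper names and the bucket and key values are hypothetical, chosen only for this example.

```python
from typing import Any


def put_text(client: Any, bucket: str, key: str, text: str) -> None:
    """Write a text object. `client` is assumed to be a boto3 S3 client."""
    client.put_object(Bucket=bucket, Key=key, Body=text.encode("utf-8"))


def get_text(client: Any, bucket: str, key: str) -> str:
    """Read the object back and decode its body as UTF-8 text."""
    response = client.get_object(Bucket=bucket, Key=key)
    return response["Body"].read().decode("utf-8")


def delete_object(client: Any, bucket: str, key: str) -> None:
    """Delete the object stored under `key`."""
    client.delete_object(Bucket=bucket, Key=key)
```

With AWS credentials configured, usage would look something like `put_text(boto3.client("s3"), "my-example-bucket", "notes/hello.txt", "hello")`, where the bucket name is a placeholder for one you own.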

Access to Amazon S3 from Amazon EC2 instances in the same region is fast. S3 is also built to scale in storage, requests, and users. To speed access to relevant data, Amazon S3 is often paired with a database such as Amazon DynamoDB or Amazon RDS. Amazon S3 stores the actual information, while the database serves as a repository for metadata that can be easily indexed and queried. A query against the metadata then returns a reference to the object, which is used to fetch the object itself from S3.
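The pairing of an object store with a metadata database can be sketched locally. In the example below, a plain dictionary stands in for the S3 bucket and an in-memory SQLite database stands in for DynamoDB or Amazon RDS. Both are assumptions made so the sketch runs without an AWS account, and the table, column, and function names are hypothetical.

```python
import sqlite3

# Stand-in for an S3 bucket: maps object key -> payload bytes (assumption).
object_store: dict[str, bytes] = {}

# Stand-in for the metadata database (DynamoDB or Amazon RDS in the post).
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE object_metadata ("
    "s3_key TEXT PRIMARY KEY, owner TEXT, content_type TEXT, size_bytes INTEGER)"
)
# The index is what makes lookups by owner cheap.
db.execute("CREATE INDEX idx_owner ON object_metadata (owner)")


def store(key: str, data: bytes, owner: str, content_type: str) -> None:
    """Save the payload in the object store and its metadata in the database."""
    object_store[key] = data
    db.execute(
        "INSERT OR REPLACE INTO object_metadata VALUES (?, ?, ?, ?)",
        (key, owner, content_type, len(data)),
    )


def find_keys(owner: str) -> list[str]:
    """Query the metadata for object keys instead of scanning the store."""
    rows = db.execute(
        "SELECT s3_key FROM object_metadata WHERE owner = ? ORDER BY s3_key",
        (owner,),
    )
    return [row[0] for row in rows]


store("videos/intro.mp4", b"...", owner="alice", content_type="video/mp4")
store("logs/2014-01-27.txt", b"GET /index", owner="bob", content_type="text/plain")
store("videos/demo.mp4", b"......", owner="alice", content_type="video/mp4")

# Locate the references with a query, then fetch the objects themselves.
for key in find_keys("alice"):
    print(key, len(object_store[key]))
```

The point of the design is that the query touches only the small, indexed metadata table. The large payloads are read from the object store only after their keys are known.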

Durability comes from automatically and synchronously storing data across multiple devices and multiple facilities. For mission-critical data, availability is designed to be so high that downtime is minuscule or nonexistent. For non-critical data that can be easily reproduced, the Reduced Redundancy Storage (RRS) option in Amazon S3 can be used.
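One way to picture the choice between standard and reduced redundancy storage is as a per-object setting at write time. The sketch below builds the keyword arguments for a boto3 `put_object` call, using its `StorageClass` parameter with the values `STANDARD` and `REDUCED_REDUNDANCY`. The helper name and the `critical` flag are hypothetical, and the code only constructs the arguments without contacting AWS.

```python
def put_object_args(bucket: str, key: str, body: bytes, critical: bool) -> dict:
    """Build keyword arguments for an S3 put_object call.

    Critical data keeps the STANDARD class; non-critical data that can be
    reproduced opts into REDUCED_REDUNDANCY.
    """
    storage_class = "STANDARD" if critical else "REDUCED_REDUNDANCY"
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "StorageClass": storage_class,
    }


# Example: a thumbnail can be regenerated from the original image,
# so it is treated as non-critical here.
args = put_object_args("my-example-bucket", "thumbs/cat.jpg", b"...", critical=False)
print(args["StorageClass"])
```

The resulting dictionary could then be passed to a real client as `client.put_object(**args)`.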
