Friday, May 31, 2013

cloud storage

When we deploy applications and databases to the cloud, we have to pay attention to the size of the data stored. Whether the data lives in files or in the database, it can grow arbitrarily large. When it exceeds several gigabytes and grows at a considerable rate, the local storage in a VM no longer suffices. At that point, a dedicated storage area network (SAN) is usually the norm because it can provide capacity much larger than any local disks. A SAN is typically preferred for production database servers, but the same holds for file shares. There are storage appliances that provide large secondary and tertiary storage while exposing it as a mounted share visible to all Active Directory users. This is different from network-attached storage (NAS) that is spread out across VMs: data in the NAS case does not reside on a single virtual host, and there is management overhead in finding the VM that holds the requested data.
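As a rough illustration of that capacity question, here is a minimal Python sketch. The path, growth rate, planning horizon, and 20% headroom are all assumptions for the example; the idea is simply to check whether the VM's local volume can absorb projected growth before deciding to provision a SAN volume or mounted share.

import shutil

def local_storage_sufficient(path, daily_growth_bytes, horizon_days=90, headroom=0.2):
    """Return True if the volume behind `path` can absorb the projected
    growth over `horizon_days` while still keeping `headroom` of the disk free."""
    usage = shutil.disk_usage(path)          # named tuple: total, used, free (bytes)
    projected_growth = daily_growth_bytes * horizon_days
    reserve = usage.total * headroom         # e.g. keep 20% of the disk free at all times
    return projected_growth <= usage.free - reserve

# Example: a dataset growing roughly 1 GB/day on a hypothetical data volume.
if not local_storage_sufficient("/var/data", 1 * 2**30):
    print("Local VM disk will not suffice; plan for a SAN volume or file share.")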
That means we need to plan the rollout of our software with appropriate configurations. The storage media, the locality or remoteness of the storage, and the escalation path for incident reports all need to be decided before deployment. This matters not just for planning but for how the application software is written. For example, data traffic and network chattiness drop when the round trips and redundancies between the application and its input are reduced. Redundancy in operations often goes unnoticed while attention is focused on features. For example, data files are copied from one location to another and then to a third. If we eliminate that by either reducing the data to just the metadata we are interested in or copying the file only once to the final destination, as in the sketch below, we avoid the redundancy.
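A minimal sketch of that idea, with hypothetical paths and a metadata format chosen just for the example: instead of copying the file to an intermediate location and again to a third, we copy it once to its final destination and keep only the metadata we care about.

import json
import shutil
from pathlib import Path

def extract_metadata(path: Path) -> dict:
    """Reduce the file to just the fields we are interested in
    (size and modification time stand in for the real application's needs)."""
    stat = path.stat()
    return {"name": path.name, "size": stat.st_size, "mtime": stat.st_mtime}

def ship_once(source: Path, final_dest: Path, metadata_store: Path) -> None:
    """Copy the file once, directly to its final destination, instead of
    staging it through intermediate locations; record only metadata locally."""
    shutil.copy2(source, final_dest / source.name)   # single copy, no intermediate hop
    with metadata_store.open("a") as f:
        f.write(json.dumps(extract_metadata(source)) + "\n")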
