Wednesday, August 8, 2018

We were discussing a cloud-first strategy for newer workloads as well as migrating older workloads to Object Storage. We did not mention any facilitators of workload migration, but there are many tools that help. We use IO capture and playback tools to profile the workload, which we can do in a lab environment or, where permitted, in production. In addition, there are virtualizers that take a single instance of an application or service and enable it to be migrated without any concern for the underlying storage infrastructure.
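The capture-and-playback idea above can be sketched in a few lines. This is a minimal illustration, not any particular tool's interface: we record a trace of operations with their relative timings, then re-issue them against an executor to reproduce the workload profile in a lab.

```python
import time

# Illustrative sketch of IO capture and playback for workload profiling.
# The operation names and record layout here are hypothetical.

def capture(ops):
    """Record (offset_seconds, operation, size_bytes) tuples for a run of ops."""
    trace, start = [], time.monotonic()
    for op, size in ops:
        trace.append((time.monotonic() - start, op, size))
    return trace

def replay(trace, executor):
    """Re-issue the captured operations with their original relative timing."""
    start = time.monotonic()
    for offset, op, size in trace:
        delay = offset - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)
        executor(op, size)

# Capture a synthetic read/write mix, then replay it against a stand-in executor.
trace = capture([("read", 4096), ("write", 8192), ("read", 65536)])
issued = []
replay(trace, lambda op, size: issued.append((op, size)))
print(issued)  # the replayed sequence mirrors the captured one
```

In production the executor would issue real IO against the storage under test; in the lab it can simply accumulate statistics.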

It is these kinds of tools we make note of today. They provide what is termed "smart availability" by enabling dynamic movement of workloads between physical, virtual, and cloud infrastructure: an automation of all the tasks required to migrate a workload. Even the connection string can be retained when moving the workload, so long as the network name can be reassigned between servers. What this automation does not do is perform storage- and OS-level data replication, because the source and destination are something users may want to specify themselves, and replication is beyond what is needed to migrate the workloads. Containers and shared volumes come close to providing this kind of ease, but they do not automate all the tasks needed on the container for seamless migration regardless of the compute. These tools also make no distinction between Linux containers and Docker containers. They are often used for high availability and for separating read-only data access to be served from the cloud.
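The point about retaining the connection string can be shown with a small sketch. The registry and names below are hypothetical; the idea is that clients build their connection string from a stable network name, and migration only remaps that name to the destination server.

```python
# Illustrative sketch: clients address the workload by a stable network name;
# migration reassigns the name to a new server, so the connection string that
# clients use never changes. All names and addresses are made up.

registry = {"payroll-db": "10.0.0.5"}  # network name -> current host

def connection_string(name, port=5432):
    return f"host={name} port={port}"

def migrate(name, new_host):
    """Reassign the network name to the destination server."""
    registry[name] = new_host

before = connection_string("payroll-db")
migrate("payroll-db", "10.0.1.9")      # the workload moves to a new host
after = connection_string("payroll-db")

print(before == after)                  # clients see no change
print(registry["payroll-db"])           # resolution now points at the new host
```

In practice the "registry" is DNS or a virtual network name owned by the availability tooling, but the contract is the same: the indirection layer absorbs the move.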

It should be noted that application virtualization does not depend on the hypervisor layer. There are ways to use one, but it is not required. In fact, the host can be just about any compute, on-premises or in the cloud, as long as the migration is seamless. There is generally a one-to-one requirement for the app to have a host. Seamless one-application-to-many-hosts execution is excluded unless the application runs in serverless mode; even then, different functions may each execute one-on-one on a spun-up host. The host is not taken to be a cluster without some automation of which nodes execute the serverless functions. An application that is virtualized this way is agnostic of the host. This is therefore an extension of server virtualization, but with the added benefit of fine-grained control.
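The one-to-one mapping described above can be contrasted with the serverless case in a small sketch. Everything here is illustrative: a virtualized application binds to exactly one host, while each serverless function invocation gets its own freshly spun-up host.

```python
import itertools

# Hedged sketch of the placement rules described above; names are hypothetical.

host_ids = itertools.count(1)

def spin_up_host():
    """Provision a new host; the application is agnostic of which one it gets."""
    return f"host-{next(host_ids)}"

# One application, one host (the general one-to-one requirement).
app_placement = {"inventory-app": spin_up_host()}

# Serverless mode: each function executes one-on-one over its own spun-up host.
invocations = [(fn, spin_up_host()) for fn in ("resize", "index", "notify")]

print(app_placement)
print(invocations)
```

Treating the host pool as a cluster would require the extra automation the text mentions, i.e. something deciding which node executes which function rather than spinning one up per invocation.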

We noted that workload patterns can change over time. The peak load may occur only during certain seasons of the year. Planning for the day-to-day load as well as the peak load therefore becomes important. Workload profiling can be repeated year round so that both the average and the maximum are known for effective planning and estimation.
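Deriving the average and the maximum from year-round profiling is straightforward. The sample data below is synthetic, assuming a flat baseline with one seasonal spike, purely to show the calculation.

```python
from statistics import mean

# Illustrative sketch: hourly request rates for one year (synthetic data),
# with a short seasonal spike, e.g. an annual sale.

baseline = [100.0] * 8760                 # 365 days * 24 hourly samples
for h in range(8000, 8024):               # one 24-hour seasonal peak
    baseline[h] = 900.0

average_load = mean(baseline)             # informs day-to-day planning
peak_load = max(baseline)                 # informs capacity for the peak season

print(f"average={average_load:.1f} req/s, peak={peak_load:.1f} req/s")
```

The estimation point follows directly: provision for the peak, but cost and plan the routine operation around the average.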

Storage system planners know their workload profiles. While deployers view applications, services, and access control, storage planners see workload profiles and make their recommendations based exclusively on IO, cost, and performance. In the object storage world, we have the luxury of comparison with file systems. In a file system, several layers each contribute to the overall I/O of data. A bucket, on the other hand, is independent of the file system. As long as the bucket is filesystem-enabled, users get the convenience of a file system as well as object storage. Moreover, the user account accessing the bucket can also be set up. Only IT can determine the correct strategy for the workload, because only they can profile it.
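A minimal sketch can show what "filesystem-enabled" means for a bucket: the same objects are reachable by object key or through a directory-like view. This is purely illustrative; real file gateways over object storage are far more involved, and the class below is an in-memory stand-in.

```python
# Hypothetical sketch of a filesystem-enabled bucket: object-style access and
# file-system-style convenience over the same flat key space.

class Bucket:
    def __init__(self, name):
        self.name, self._objects = name, {}

    # Object-storage style access: flat keys, whole-object put/get.
    def put_object(self, key, body):
        self._objects[key] = body

    def get_object(self, key):
        return self._objects[key]

    # File-system style convenience derived from key prefixes.
    def listdir(self, prefix):
        prefix = prefix.rstrip("/") + "/"
        return sorted({k[len(prefix):].split("/")[0]
                       for k in self._objects if k.startswith(prefix)})

b = Bucket("analytics")
b.put_object("reports/2018/q2.csv", b"...")
b.put_object("reports/2018/q3.csv", b"...")

print(b.get_object("reports/2018/q2.csv"))
print(b.listdir("reports/2018"))  # directory-like view over object keys
```

Access control would sit alongside this, mapping the user account to permitted buckets and prefixes, which is the setup step the paragraph mentions.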
