Oracle Solaris ZFS demonstrated the use of reservations and quotas. With the evolution towards cluster-based computing, Network Attached Storage significantly widened the horizon for seemingly unlimited disk space, because capacity could now be added in the form of additional nodes and their disks. This represented virtualized storage, but there were no reservations. While ZFS demonstrated effective resource management with isolation for workloads, a cluster could not keep up the same practice without some division of roles, such as control nodes and data nodes. In addition, reservations need not be governed exclusively by the user. They can be decided by the system as quality-of-service levels. To achieve service levels on the same shared disks, we can create disk groups.
There will be times when node disk groups become overloaded by I/O requests.
At such times, it is difficult to identify where the I/O requests predominantly
originate, so that those accounts can be throttled while well-behaved accounts
remain unaffected. To make this possible, each node disk group keeps track of
the accounts that issue I/O requests. The system can then use a Sample-and-Hold
algorithm to track the request rate history of the top N busiest accounts. This
information can be used to determine whether an account is well-behaved: if its
traffic drops when the account is throttled, it is considered well-behaved. If
a node disk group is getting overloaded, it can use this information to
selectively limit incoming traffic, targeting the accounts that are causing the
issue. As an example of a metric serving this purpose, a node disk group can
compute a throttling probability for each account's incoming requests from that
account's request rate history. The higher the request rate, the higher the
probability of being throttled, and the lower the rate, the lower the
probability. As the metric builds up a history of measurements, it becomes
easier to tell whether an account is well-behaved.
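The tracking and throttling idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a description of any particular system: the class name SampleAndHold, the sample_prob parameter, and the throttle_probability formula (one minus the ratio of a hypothetical fair_rate to the observed rate) are all made up for this example. The sketch follows the usual sample-and-hold idea, where a request from an untracked account is sampled with a small probability, and once an account is tracked every later request from it is counted.

```python
import random


class SampleAndHold:
    """Approximate per-account request counts for the busiest accounts.

    Hypothetical sketch: names and parameters are assumptions for illustration.
    """

    def __init__(self, sample_prob, rng=None):
        if not 0.0 < sample_prob <= 1.0:
            raise ValueError("sample_prob must be in (0, 1]")
        self.sample_prob = sample_prob
        self.rng = rng or random.Random()
        self.counts = {}

    def record(self, account):
        """Record one I/O request issued by the given account."""
        if account in self.counts:
            # Already held: count every request from this account.
            self.counts[account] += 1
        elif self.rng.random() < self.sample_prob:
            # Not yet held: start tracking with probability sample_prob.
            self.counts[account] = 1

    def top(self, n):
        """Return the n tracked accounts with the highest counts."""
        ranked = sorted(self.counts.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:n]


def throttle_probability(observed_rate, fair_rate):
    """Assumed metric: 0 at or below fair_rate, approaching 1 as the rate grows."""
    if fair_rate <= 0:
        raise ValueError("fair_rate must be positive")
    if observed_rate <= fair_rate:
        return 0.0
    return 1.0 - fair_rate / observed_rate


def should_throttle(observed_rate, fair_rate, rng=random):
    """Decide for a single request whether to reject it."""
    return rng.random() < throttle_probability(observed_rate, fair_rate)
```

With this assumed formula, an account issuing requests at four times the fair rate would see about three out of four requests rejected, while an account at or below the fair rate is never throttled. Busy accounts are very likely to be sampled early, so their counts are close to exact, while rarely seen accounts mostly never occupy memory.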
Load balancing will continue to keep the servers loaded within an acceptable
limit. If the access patterns cannot be load balanced, the cause is probably
sustained high traffic from a few accounts. In such cases, those accounts will
be limited until they are well-behaved again.
A node may have more than one disk group, so it can form logical partitions in
its existing disk space and provide service levels from its disks without
requiring cluster-level changes. This is the intelligent node model.
Alternatively, if the cluster can perform more effectively by grouping data
nodes or their disks with the help of dedicated control nodes, then the data
nodes are freed up to focus on the data path and merely read from and write to
disk groups. In either case, the group id is merely a part of the metadata.
The word group is generally used with client-facing artifacts such as requests,
while system resources such as disks are referred to as pools. By that
definition, I apologize for using the word group; the argument above can be
read as applying to disk pools.