This article is a continuation of a series on hosting solutions and services on the Azure public cloud; the most recent discussion, on multitenancy, is here. This article discusses the architectural considerations in using Azure Cosmos DB for multitenant solutions.
A multitenant application can involve Azure Cosmos DB storage for several reasons: near-limitless storage, a document store, multi-region availability with separation of read-write and read-only access, high throughput, and universal web access. Features of Azure Cosmos DB that support multitenancy include: 1) partitioning, 2) managing request units, 3) customer-managed keys, 4) isolation models, and 5) hybrid approaches.
We begin with partitioning of Cosmos DB containers. We can create containers that are shared across multiple tenants, with the tenant identifier used as the partition key, or multiple partition key values can be used for a single tenant. A well-planned partitioning strategy implements the sharding pattern. A shard typically contains items that fall within a range determined by one or more attributes of the data. These attributes form the shard key, which should be static: it must not be based on data that might change. The shard key physically organizes the data, and all data access is directed to the appropriate shard. The sharding logic can be implemented as part of the data access code, or it can be implemented by the data storage system. Cosmos DB spreads tenants across multiple physical nodes to achieve a high degree of scale.
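The routing half of the sharding pattern can be sketched in a few lines: the tenant identifier serves as the shard key, and a stable hash of it selects the container that owns the tenant's data. This is a minimal illustration, not Cosmos DB's internal placement logic, and the container names are hypothetical.

```python
import hashlib

# Hypothetical shard containers shared across tenants.
CONTAINERS = ["tenants-00", "tenants-01", "tenants-02"]

def shard_for_tenant(tenant_id: str) -> str:
    """Route a tenant to a container using a stable hash of the shard key.

    Because the shard key (the tenant identifier) is static, the same
    tenant always maps to the same container.
    """
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(CONTAINERS)
    return CONTAINERS[index]

# Repeated lookups for the same tenant are deterministic.
assert shard_for_tenant("contoso") == shard_for_tenant("contoso")
```

The data access code would call a router like this before every read or write, so that no query ever fans out across shards unnecessarily.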
Cosmos DB’s pricing model is based on the number of request units per second that are provisioned or consumed. A request unit is a logical abstraction of the cost of a database operation or query. Throughput is the number of request units per second provisioned for a workload, and the throughput chosen determines both the performance and the price of Cosmos DB resources.
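A rough sizing exercise makes the pricing model concrete. The sketch below uses the commonly cited approximation that a 1 KB point read costs about 1 RU and a 1 KB write costs roughly 5 RUs; actual costs depend on item size, indexing policy, and consistency level, and the workload numbers here are invented.

```python
# Approximate RU costs for 1 KB items (actual costs vary by workload).
READ_RU = 1.0   # approx. RUs per 1 KB point read
WRITE_RU = 5.0  # approx. RUs per 1 KB write

def required_rus_per_second(reads_per_sec: float, writes_per_sec: float) -> float:
    """Estimate the RU/s to provision for a simple read/write workload."""
    return reads_per_sec * READ_RU + writes_per_sec * WRITE_RU

# e.g. a tenant doing 500 point reads/s and 100 writes/s of 1 KB items:
estimate = required_rus_per_second(500, 100)  # 500*1 + 100*5 = 1000 RU/s
```

An estimate like this, summed over tenants, is the starting point for deciding how many RUs to provision at the database or container level.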
When separate containers are provisioned for each tenant within a shared database, the request units can be provisioned for the entire database and shared by all tenants. This approach, however, is susceptible to the noisy-neighbor problem, because a single tenant’s container might overuse the shared provisioned request units. Once noisy tenants are identified, they can be moved to a dedicated container with their own request units. Cosmos DB also provides a serverless tier, which is suitable for workloads with intermittent or unpredictable traffic. These approaches can be mixed and matched.
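The identification step above can be sketched as a simple threshold check over per-tenant RU consumption: tenants consuming more than a given share of the database's provisioned RUs become candidates for a dedicated container. The threshold and the usage numbers are illustrative assumptions.

```python
def noisy_tenants(ru_by_tenant: dict, provisioned_rus: float,
                  threshold: float = 0.5) -> list:
    """Return tenants whose RU consumption exceeds threshold * provisioned RUs.

    Tenants returned here are candidates for migration to a dedicated
    container with its own provisioned throughput.
    """
    limit = provisioned_rus * threshold
    return [tenant for tenant, rus in ru_by_tenant.items() if rus > limit]

# Invented per-tenant consumption against 1000 shared RU/s:
usage = {"contoso": 900.0, "fabrikam": 50.0, "adventureworks": 30.0}
flagged = noisy_tenants(usage, provisioned_rus=1000.0)  # ["contoso"]
```

In practice the consumption figures would come from the service's metrics rather than a hard-coded dictionary, but the decision logic is the same.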
Some tenants might require their own encryption keys. Such tenants need to be deployed in dedicated Cosmos DB accounts.
There are several isolation models: shared containers with partition keys per tenant, containers with shared throughput per tenant, containers with dedicated throughput per tenant, and a database account per tenant. Each of these models has its own isolation options.
Shared containers with partition keys per tenant let throughput be shared across the tenants grouped in a container, which lowers cost at the risk of noisy neighbors; placing tenants in the same container also enables easy queries across them, since queries are scoped to a container.
A container per tenant with shared throughput allows throughput to be shared across tenants because they are grouped within a database, and it eases tenant management, since a container can simply be dropped when a tenant leaves.
A container per tenant with dedicated throughput offers independent throughput options, which eliminates noisy neighbors, and allows tenants to be grouped within database accounts based on their regional needs.
A database account per tenant provides independent geo-replication knobs, since replication is configured per account, and supports multiple throughput options, where dedicated throughput again eliminates noisy neighbors.
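The trade-offs across the four models can be summarized as a small decision sketch mapping tenant requirements to an isolation model. The rules below follow the descriptions above but are illustrative, not an official guideline.

```python
def isolation_model(needs_own_keys: bool, needs_own_regions: bool,
                    is_noisy: bool, needs_easy_offboarding: bool) -> str:
    """Pick an isolation model from tenant requirements (illustrative rules)."""
    if needs_own_keys or needs_own_regions:
        # Customer-managed keys and independent geo-replication both
        # require a dedicated Cosmos DB account.
        return "database account per tenant"
    if is_noisy:
        # Dedicated throughput eliminates the noisy-neighbor problem.
        return "container with dedicated throughput per tenant"
    if needs_easy_offboarding:
        # A per-tenant container can simply be dropped when the tenant leaves.
        return "container with shared throughput per tenant"
    # Default to the cheapest shared option.
    return "shared container with partition keys per tenant"
```

A real solution would mix these models per tenant tier, which is exactly the hybrid approach mentioned earlier.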
These constitute
some of the approaches for designing multitenant solutions with CosmosDB
storage.
Reference: Multitenancy: https://1drv.ms/w/s!Ashlm-Nw-wnWhLMfc6pdJbQZ6XiPWA?e=fBoKcN