Saturday, August 27, 2022

 

Databases as a Multitenant Provider

Persistence of data mandates a database, both for the guarantees of Atomicity, Consistency, Isolation and Durability (ACID) and for querying. Tenants can be served their own databases, in which case the database becomes a multitenant data platform and the technology stack for handling the data might vary from tenant to tenant. Groups of databases can be placed in a container and managed as a single unit. The benefits of a multitenant database include the following:

1. High consolidation density, where containers of databases make it easier to manage many databases than the conventional single-database-per-tenant approach.

2. Rapid provisioning and cloning using SQL, where a database can be created, populated and cloned with SQL scripts.

3. Rapid patching and upgrades, where an individual database inside a container can be patched by unplugging it and plugging it back in; patching the container itself is also fast.

4. Conversion and consolidation of existing databases into containers so that they can be managed as a unit.

5. Resource management between pluggable databases in a container.

Multitenancy brings standardization, which reduces operating expense. Cost often grows with variations in database size and traffic; customers benefit when databases can scale out and remain elastic. Both operational and capital expenses decrease.

It is also possible to host distinct application backends in the same database, an approach called schema-based consolidation. This reduces the number of databases and increases consolidation density. The approach has a few caveats: name collisions hamper consolidation, security is weaker, per-application-backend point-in-time recovery is prohibitively difficult, resource management between application backends is hard, a single application backend cannot be patched on its own, and a single application backend cannot be cloned.

The difference between the dedicated database architecture and this multitenant architecture is that the latter brings true within-database virtualization, implemented by partitioning the data dictionary horizontally. Within-database virtualization removes the drawbacks of schema-based consolidation. The physical separation between the container and the databases brings pluggability, and this unplug/plug action set brings a new paradigm for patching. The sharing model and the background processes are the same for conventional databases, pluggable databases and application backends. The container, or root, is the cornerstone of the multitenant architecture. Just as operating systems orchestrate provisioning tasks, SQL statements executed against the root implement them for pluggable databases.
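
To make the SQL-driven provisioning concrete, here is a minimal C# sketch that executes provisioning statements against the root. It assumes Oracle-style CREATE/ALTER PLUGGABLE DATABASE syntax and the Oracle.ManagedDataAccess ADO.NET provider; the connection string, PDB name and admin credentials are illustrative placeholders, not part of the original article.

using Oracle.ManagedDataAccess.Client;

class PdbProvisioner
{
    // Provisions a tenant by executing SQL against the root (container) database.
    // Connection string, PDB name and admin password are illustrative placeholders.
    static void ProvisionTenant(string rootConnectionString, string pdbName)
    {
        using var connection = new OracleConnection(rootConnectionString);
        connection.Open();

        using (var create = connection.CreateCommand())
        {
            create.CommandText = "CREATE PLUGGABLE DATABASE " + pdbName +
                                 " ADMIN USER pdb_admin IDENTIFIED BY ChangeMe1";
            create.ExecuteNonQuery();
        }

        using (var open = connection.CreateCommand())
        {
            open.CommandText = "ALTER PLUGGABLE DATABASE " + pdbName + " OPEN";
            open.ExecuteNonQuery();
        }
    }
}

A clone can be provisioned the same way with a statement such as CREATE PLUGGABLE DATABASE pdb2 FROM pdb1.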

In this case, multitenancy becomes a deployment choice with neither the application backend nor the client code required to change.

Reference: Multitenancy: https://1drv.ms/w/s!Ashlm-Nw-wnWhLMfc6pdJbQZ6XiPWA?e=fBoKcN     

 

Friday, August 26, 2022

 Databases as a Multitenant Provider 

 

The previous article described multitenancy in databases using schema-based consolidation and within-database virtualization by horizontal partitioning of the data dictionary. This article focuses on elastic pools, sharding patterns, row-level security and key management.

Elastic pools share compute resources between several databases on the same server, which helps achieve performance elasticity for each database. Sharing provisioned resources across databases reduces their unit costs, and there are built-in protections against noisy-neighbor problems.

Resource management in dense elastic pools is achieved by implementing resource governance.

Within a database there can be multiple resource pools and workload groups, with resource limits set at both the pool and group levels. User workloads and internal workloads are classified into separate resource pools and workload groups: user workloads on the primary and readable secondary replicas go into a designated pool, while other pools and workload groups are reserved for various internal workloads.

In addition, job objects may be used for process-level resource governance, and File Server Resource Manager for storage quota management.

Resource governance is hierarchical in nature. From top to bottom, limits are enforced at each level with level-appropriate mechanisms: first the operating system, then the resource pools and the workload groups. Data I/O governance limits both read and write physical I/O against the data files of a database, and IOPS limits are set for each service level to minimize the noisy-neighbor effect. This approach lets customers use dense elastic pools to achieve adequate performance and major cost savings. The one shortcoming is that high density gives rise to significant resource contention, which can impact internal processes. Customers can choose among three mitigations: first, tune the query workload to reduce resource consumption; second, reduce pool density by moving some databases to another pool; third, scale up the pool to get more resources.
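
To make the pool and workload-group mechanics concrete, the following hedged sketch issues Resource Governor DDL from C#, as one might on a standalone SQL Server instance; inside an Azure SQL elastic pool this governance is applied by the service itself, and the names and limits below are illustrative.

using System.Data.SqlClient;

class ResourceGovernanceSetup
{
    // Illustrative Resource Governor setup: one pool with CPU and IOPS caps and
    // a workload group mapped to it. Pool/group names and limits are placeholders.
    static void ConfigurePool(string connectionString)
    {
        const string ddl = @"
            CREATE RESOURCE POOL TenantPool
                WITH (MAX_CPU_PERCENT = 40, MAX_IOPS_PER_VOLUME = 500);
            CREATE WORKLOAD GROUP TenantGroup USING TenantPool;
            ALTER RESOURCE GOVERNOR RECONFIGURE;";

        using var connection = new SqlConnection(connectionString);
        connection.Open();
        using var command = new SqlCommand(ddl, connection);
        command.ExecuteNonQuery();
    }
}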

The sharding pattern enables workloads to scale across multiple databases. Tools provided by the databases support the management of shard maps, which track the tenants assigned to each shard. They also initiate and track queries and management operations on multiple shards by using elastic jobs.

These jobs can be executed periodically against one or many databases to run queries and perform maintenance tasks. The scripts must be defined, maintained, and persisted across a group of databases.
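
A minimal sketch of the sharding pattern's routing step, using a plain in-memory shard map rather than a full shard-map-manager library; the tenant IDs and connection strings are assumptions for illustration.

using System.Collections.Generic;
using System.Data.SqlClient;

class ShardRouter
{
    // Illustrative shard map: tenant id -> connection string of the shard
    // that holds the tenant's rows. A real deployment would persist this map.
    private readonly Dictionary<int, string> shardMap = new Dictionary<int, string>
    {
        [1] = "Server=shard0.example.com;Database=tenantdb;Integrated Security=true",
        [2] = "Server=shard1.example.com;Database=tenantdb;Integrated Security=true"
    };

    // Opens a connection to the shard assigned to the given tenant.
    public SqlConnection OpenForTenant(int tenantId)
    {
        var connection = new SqlConnection(shardMap[tenantId]);
        connection.Open();
        return connection;
    }
}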

Row-level security is useful for enforcing tenant-level isolation in sharded tables. Group memberships or execution contexts are used to control access to the rows in a database table. This simplifies the design and coding of security in the application and implements restrictions on row access. The access-restriction logic lives in the database tier rather than spanning the application tier, and the security system becomes more reliable and robust because its surface area is reduced.
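
The following hedged sketch shows what such a row-level security setup can look like in T-SQL, executed from C#. It assumes an rls schema and a dbo.Orders table with a TenantId column; all object names are illustrative.

using System.Data.SqlClient;

class RowLevelSecuritySetup
{
    // Creates a filter predicate that matches a row's TenantId against the
    // session context, then binds it to dbo.Orders via a security policy.
    // Assumes the rls schema and dbo.Orders(TenantId) already exist.
    static void EnableTenantIsolation(string connectionString)
    {
        const string predicate = @"
            CREATE FUNCTION rls.fn_tenantPredicate(@TenantId int)
                RETURNS TABLE WITH SCHEMABINDING
            AS RETURN
                SELECT 1 AS allowed
                WHERE @TenantId = CAST(SESSION_CONTEXT(N'TenantId') AS int);";

        const string policy = @"
            CREATE SECURITY POLICY rls.TenantFilter
                ADD FILTER PREDICATE rls.fn_tenantPredicate(TenantId) ON dbo.Orders
                WITH (STATE = ON);";

        using var connection = new SqlConnection(connectionString);
        connection.Open();
        using (var command = new SqlCommand(predicate, connection))
            command.ExecuteNonQuery();
        using (var command = new SqlCommand(policy, connection))
            command.ExecuteNonQuery();
    }
}

Each request would then run EXEC sp_set_session_context @key = N'TenantId', @value = <tenant id> before querying, so the filter predicate only exposes that tenant's rows.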

End-to-end encryption of data at rest and in transit is achieved through encryption keys, separation of databases per tenant, and enabling the Always Encrypted feature.
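
On the client side, Always Encrypted is opted into through the connection string; a minimal sketch follows, where the server and database names are placeholders and the driver is assumed to be System.Data.SqlClient (.NET Framework 4.6+) or Microsoft.Data.SqlClient.

using System.Data.SqlClient;

class EncryptedConnectionFactory
{
    // Opting into Always Encrypted on the client: the driver transparently
    // encrypts parameters and decrypts results for encrypted columns.
    // Server and database names are placeholders.
    static SqlConnection Open()
    {
        var builder = new SqlConnectionStringBuilder
        {
            DataSource = "tenant-sql.example.com",
            InitialCatalog = "tenantdb",
            IntegratedSecurity = true,
            ColumnEncryptionSetting = SqlConnectionColumnEncryptionSetting.Enabled
        };
        var connection = new SqlConnection(builder.ConnectionString);
        connection.Open();
        return connection;
    }
}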

Storage and data approaches for multitenancy must consider scale, performance predictability, data isolation, complexity of implementation, complexity of management and operations, costs, patterns and anti-patterns, and best practices.


Thursday, August 25, 2022

Sample program to add a claim to a token in the delegated auth use case:


using System.Collections.Generic;
using System.Threading;
using Microsoft.IdentityModel.Claims;
using IdentityClaim = Microsoft.IdentityModel.Claims.Claim;
using IdentityClaimTypes = Microsoft.IdentityModel.Claims.ClaimTypes;
using IdentityClaimsPrincipal = Microsoft.IdentityModel.Claims.ClaimsPrincipal;
using ClaimsIdentityCollection = Microsoft.IdentityModel.Claims.ClaimsIdentityCollection;

            // Copy the current identity so the extra claim does not mutate the original.
            IClaimsIdentity claimsIdentity = new ClaimsIdentity(Thread.CurrentPrincipal.Identity);

            // Role claim of the form claim://<role>@<resource-folder>.
            var claimValue = string.Format("claim://{0}@{1}", TargetResourceRole.PrivilegedDeploymentOperator, "sample-resource-folder-test");
            var identityClaim = new IdentityClaim(IdentityClaimTypes.Role, claimValue);
            claimsIdentity.Claims.Add(identityClaim);

            // Rebuild the principal from the augmented identity and install it on the thread.
            ClaimsIdentityCollection claimsIdentityCollection = new ClaimsIdentityCollection(new List<IClaimsIdentity>() { claimsIdentity });
            var newIcp = IdentityClaimsPrincipal.CreateFromIdentities(claimsIdentityCollection);
            Thread.CurrentPrincipal = newIcp;



The above example uses the Microsoft.IdentityModel namespace to elevate privilege for running some code by adding a role claim to the current thread's principal.


Now for the delegated auth use case:

           string homeSecurityTokenService = ConfigurationManager.GetSetting("HomeSecurityTokenService");
           string securityTokenServiceRealm = ConfigurationManager.GetSetting("SecurityTokenServiceRealm");
           string serviceName = ConfigurationManager.GetSetting("ServiceName");

           // Locate the home security token service for the target realm.
           var serverHomeSecurityTokenService = new ServerHomeSecurityTokenService(
                    new Uri(securityTokenServiceRealm),
                    homeSecurityTokenService,
                    null);

           var serviceIdentity = new ServiceIdentity(
                serviceDnsHostName: targetDnsName,
                serviceNames: new string[] { serviceName });

           // Authenticate the incoming authorization header against the target service.
           WebSecurityTokenAuthenticator authenticator = new WebSecurityTokenAuthenticator(serverHomeSecurityTokenService, serviceIdentity);
           ClaimsIdentityCollection collection = authenticator.Authenticate(authorizationHeader, resourceName);

           // Append the role claim for the target resource folder.
           var claimValue = string.Format("claim://{0}@{1}", TargetResourceRole.PrivilegedDeploymentOperator, payload.Properties.Folder);
           collection.Add(new ClaimsIdentity(new List<Claim>() { new Claim(ClaimTypes.Role, claimValue) }));

           var authContext = new Microsoft.IdentityModel.Clients.ActiveDirectory.AuthenticationContext(
                        tokenIssuanceUrl, true);

           // Flatten all claim values into a comma-separated list for the token request.
           StringBuilder sb = new StringBuilder();
           collection.ForEach(x => x.Claims.ForEach(c => sb.Append(c.Value + ",")));
           var claims = sb.ToString().Trim(',');

           // Acquire the delegated token (AcquireTokenAsync returns a Task; await assumes an async calling context).
           var authenticationResult = await authContext.AcquireTokenAsync(resourceName, clientCredential.ClientId, new Uri("https://DstsInternalNativeClient"), new PlatformParameters(PromptBehavior.Auto), userIdentifier, extraQueryParameters, claims, synchronizationContext);
           var newDelegatedToken = authenticationResult.AccessToken;


Wednesday, August 24, 2022

This is a continuation of a series of articles on hosting solutions and services on the Azure public cloud, with the most recent discussion on multitenancy here. The previous articles introduced virtual SAN and its operational efficiencies; this one follows up on the design points and the integrated systems.

HCI products can also be combined into an integrated offering, with the opportunity to provide a complete solution to customers. They are usually built on one or more proven building blocks for the software-defined data center. vSphere and vSAN provide two such building blocks, and the integrated appliance is preconfigured and pretested. It delivers features for resiliency, quality of service, and centralized management, enabling faster, better, and simpler management of consolidated workloads, virtual desktops, business-critical applications and remote-office infrastructure.

HCI capabilities can also be offered with a mix and match of storage offerings from different vendors for replication, backup and cloud tiering, at no additional cost. These appliances can also integrate with a cloud management platform and an end-user computing solution, and can be surfaced in the management plane. Integrated systems can also combine compute, storage and networking products with certified partner hardware into HCI appliances.

The advantages of a HyperConverged Infrastructure include the following:

-          It can be used to cut acquisition cost, maintenance cost and operational expenses. Both CapEx and OpEx can be reduced.

-          From acquisition to commission, the HCI requires fewer steps and fewer people than conventional techniques

-          It can enable dynamic responsiveness, especially to fluctuations in traffic. The heavier the workload, the more resources are required, and the changes are transparent and automatic thanks to storage policy-based management.

-          It improves precision and granularity because storage services are consumed on a pay-as-you-go basis. Performance, capacity and protection are provided in just the measure needed.

-          Consumption can also be made flexible through a partnership between admins. For example, storage admins can provide large datastores in which VMs are created, while virtual infrastructure admins set policies for individual VMs.

-          Both scale-up and scale-out are supported by the HCI system for capacity and performance on a grow-as-you-go basis.

-          Newer applications with dynamic resourcing requirements can be quickly accommodated. HCI keeps pace with the growing popularity of containers and cloud applications.

-          There is consistent performance for every application from business critical applications to cloud native applications.

-          There is high availability from these systems even when failures occur.

-          With its modular architecture, HCI provides a building block for the foundation of a private cloud.


 

Tuesday, August 23, 2022

 

This is a continuation of a series of articles on hosting solutions and services on Azure public cloud with the most recent discussion on Multitenancy here. The previous articles introduced virtual SAN followed by the operational efficiencies and this follows up on the design points and the use of Virtual SAN ready nodes.

Virtual SAN provides the ability to take operations down to the virtual machine level. This fine-grained control matters most from the application-delivery perspective of enterprise IT users, who are concerned about their applications rather than the infrastructure. HCI and vSAN recognize this requirement and are aligned with it. All storage services, and the VM hosts on which they reside, can be adjusted.

Storage systems have traditionally been defined to be application agnostic, providing multiple layers of abstraction over the raw bytes on the disk. The user-facing organizational units differ from the storage-facing ones, and there is usually a table or directory to map one to the other. When storage services are configured by application requirements rather than storage constraints, users get greater control.

More efficient operations are enabled with the use of storage policies to drive automation. The policies can be set for an application’s requirement on capacity, performance, availability, redundancy, and such others. The management control plane can automate the VM placement by finding the datastores that have capacity to provision them. Together the automation and policy-based management simplify storage management. It helps to quickly deliver value to the customers and those who use IT services.

This doesn’t mean the application admin and the storage admin can be one and the same. The former views storage as a service, without being slowed down by service-fulfillment bottlenecks, and might even expect pay-as-you-go billing. The latter is primarily interested in automation and operational efficiencies. Similarly, virtual infrastructure administrators and storage administrators participate in a symbiotic relationship. Although they might be involved in a tug-of-war, software-defined storage mostly eliminates the reasons for it with the help of storage policy-based management. The storage admin is responsible for the upfront setup of storage capacity and data services, which are published as the so-called virtual datastore. The virtual infrastructure admin uses the virtual datastore as a menu of storage items to serve the needs of the virtual machines.

No operational model is complete without unified management. IT operations have been dogged by swivel-chair operation, where the operator must jump from one interface to another to complete all aspects of a workflow. An HCI has the potential to provide single-pane-of-glass management instead of managing compute, storage and networking individually. Not all HCIs provide this, and some even resort to a mashup of underlying technology management, but a seasoned HCI will generally have unified management. System monitoring is critical to aid that management: although custom dashboards and historical trends might be available from metrics providers, a built-in monitoring system for an HCI goes hand in hand with its management portal. End-to-end monitoring and a whole picture of the software and its environment help not only with proactive monitoring but also with troubleshooting and reducing costs.

Monday, August 22, 2022

This is a continuation of a series of articles on hosting solutions and services on Azure public cloud with the most recent discussion on Multitenancy here. The previous articles introduced virtual SAN but this one delves into the operational efficiencies.

vSAN brings additional efficiencies in the form of deduplication, compression, and erasure coding. Processors and SSD provide the performance to run these data reduction technologies. These features improve the storage utilization rate which means less physical storage is required to store the same amount of data.

The delivery of maximum value and flexibility occurs only when the data center is completely software defined and it should not be tied to a specific hardware platform. This can be done in two ways:

First, allow the flexibility via choices in the commodity hardware used for the hypervisor and vSAN in terms of the x86 servers and their vendor.

Second, allow a fast-track of the HCI through turnkey appliances.

In addition to flexibility, a vSAN must be designed for nondisruptive and seamless scaling. Usually this is not a problem when server additions do not affect the initial lot, but it is impacted if the hypervisor and vSAN must be reconfigured over the adjusted base. Recent improvements in cluster technologies have provided easier scalability via the addition of nodes without disturbing the memberships of existing nodes in the ensemble. In such a case, the cluster provides the infrastructure while the datacenter becomes more like a cloud service provider. It must be stressed that a vSAN must be allowed to both scale up and scale out; otherwise, platforms like Kubernetes are poised to take on more of the infrastructure management routines from the storage technologies. Most businesses investing in vSAN are keen to pursue a “grow-as-you-go” model.

vSAN’s focus is on storage, not infrastructure. It is oriented more toward disks than toward the servers on which the storage is virtualized. It must be configurable as all-flash or hybrid storage. In hybrid mode, it pools HDDs and SSDs to create a distributed shared datastore, even though it might internally prefer flash as a read cache/write buffer to boost performance and HDDs for data persistence. While flash prices are declining rapidly, allowing more possibilities, organizations cannot retire their existing inventory; the hybrid approach is often a necessity even when workloads can be segregated to take advantage of new versus existing hardware.



Sunday, August 21, 2022

 

This is a continuation of a series of articles on hosting solutions and services on Azure public cloud with the most recent discussion on Multitenancy here. The previous articles introduced multitenancy via hyperconvergence specifically with examples of on-premises technologies. This article continues with a specific example of virtual SAN.

The storage industry and databases have a rich history and tradition of providing multitenancy. As hardware and software evolved and transformed the storage industry, hyperconvergence and multitenancy also changed. While there are many examples and products that have their own tales to tell, like exhibits in a museum, this article studies virtual SAN and the databases it powered.

Traditional databases, particularly the eCommerce catalogs of enterprises, required large Storage Area Networks (SANs) because the databases were designed to persist everything to disk, from database objects and prepared plans to materialized views. When disks of a few gigabytes did not suffice, storage area networking offered near-limitless storage from that perspective. vSAN took it a notch higher with its ability to virtualize SAN over several devices.

A market leader in vSAN, for instance, pools together server-attached storage (SSDs, HDDs and other flash devices). It creates a shared datastore with advanced data services designed for virtual environments. This datastore is highly resilient with no single point of failure and is optimized for the latest flash technologies. Spread across many hosts and their disks, the virtual SAN can expand to a large capacity while tolerating failures and still presenting a single shared datastore.

This vSAN provider integrates the virtual SAN by building it into the kernel of the hypervisor. Hyperconverged solutions use the hypervisor to support and deliver storage functions and storage networking in software, eliminating the need for dedicated storage hardware. Since it is embedded in the kernel, it can deliver the highest levels of performance without taxing the CPU with additional overhead. The in-kernel architecture also simplifies management and eliminates risk associated with extra components and points of integration. In this way, it differs from the many virtualized storage appliances that run separately on top of the hypervisor.

Since storage is a key contributor to performance and efficiency, the load passed by the hypervisor to virtual SAN storage must be handled adequately. In this regard, vSAN has matured over nearly a decade of flash storage. The software that used to be implemented on a disk array moves onto the hosts. A hyperconverged storage is built from the ground up to integrate and leverage all the functionality of the hypervisor without operational overhead or any reduction of core functionality. The virtualization layer provides features such as high availability and live migration of running virtual machines from one physical server to another with zero downtime, continuous service availability and complete transaction integrity. This creates a dynamic, automated, and self-optimizing data center.

 

#algorithms

Detect negative-weight cycles in a graph (Bellman-Ford): after |V| - 1 rounds of edge relaxation, any edge that can still be relaxed indicates a negative-weight cycle reachable from the source.


Bellman-Ford(G, w, s)
    Initialize-Single-Source(G, s)
    for i = 1 to |V| - 1
        for each edge (u, v) in G.E
            Relax(u, v, w)
    // a further improvement after |V| - 1 passes implies a negative-weight cycle
    for each edge (u, v) in G.E
        if v.d > u.d + w(u, v)
            return False
    return True
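
A runnable C# version of the same check is sketched below; the Edge record, graph construction and sample weights are illustrative assumptions, not part of the original pseudocode.

using System;
using System.Collections.Generic;

// Example: a 3-vertex cycle of total weight -1 is detected (prints False).
var edges = new List<Edge> { new(0, 1, 4), new(1, 2, -2), new(2, 0, -3) };
Console.WriteLine(BellmanFord.HasNoNegativeCycle(3, edges, 0));

record Edge(int U, int V, int W);

static class BellmanFord
{
    const long Inf = long.MaxValue / 2; // sentinel for "not yet reached"

    // Returns false if a negative-weight cycle is reachable from the source.
    public static bool HasNoNegativeCycle(int vertexCount, List<Edge> edges, int source)
    {
        var distance = new long[vertexCount];
        Array.Fill(distance, Inf);
        distance[source] = 0;

        // Relax every edge |V| - 1 times, skipping edges from unreached vertices.
        for (int pass = 0; pass < vertexCount - 1; pass++)
            foreach (var (u, v, w) in edges)
                if (distance[u] != Inf && distance[u] + w < distance[v])
                    distance[v] = distance[u] + w;

        // Any edge that can still be relaxed lies on, or hangs off, a negative cycle.
        foreach (var (u, v, w) in edges)
            if (distance[u] != Inf && distance[u] + w < distance[v])
                return false;
        return true;
    }
}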


Friends Pairing problem:

Given n friends, each one can remain single or be paired with another friend. Each friend can be paired only once, so ordering is irrelevant.

The total number of ways in which the friends can be paired is given by:

int GetPairs(int n)
{
    // f(n) = f(n - 1)              nth friend stays single
    //      + (n - 1) * f(n - 2)    nth friend pairs with any of the other n - 1
    if (n <= 2) return n;
    return GetPairs(n - 1) + GetPairs(n - 2) * (n - 1);
}
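
The naive recursion above recomputes the same subproblems exponentially many times. A bottom-up sketch of the same recurrence, assuming a long return type to delay overflow, runs in O(n) time and O(1) space:

using System;

class FriendsPairing
{
    // Bottom-up version of f(n) = f(n - 1) + (n - 1) * f(n - 2).
    static long CountPairings(int n)
    {
        if (n <= 2) return n;
        long prev2 = 1, prev1 = 2; // f(1) and f(2)
        for (int i = 3; i <= n; i++)
        {
            long current = prev1 + (long)(i - 1) * prev2;
            prev2 = prev1;
            prev1 = current;
        }
        return prev1;
    }
}

For example, CountPairings(3) returns 4: all three friends single, or one of the three possible pairs with the third friend single.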