Cluster computing

Thursday, November 10, 2022

Multitenant architecture patterns

Several patterns can help plan and build the data architecture for SaaS applications. A well-designed SaaS application demonstrates scalability, configurability, zero downtime, and multitenant efficiency. These qualities must not be treated as mutually exclusive. For example, optimizing for multitenant efficiency in a shared environment must not compromise the security that safeguards data access. A security pattern that resolves this conflict uses “virtual isolation” mechanisms such as permissions, SQL views, and encryption.

Trusted database connections:

Access to data stored in databases is secured using one of two methods: impersonation or a trusted subsystem account. With impersonation, the application accesses the database on behalf of the user, so different users can be granted access to different database objects. With a trusted subsystem account, the application connects to the database using its own process identity, and additional security must be implemented in the application itself. For multitenant applications where each tenant grants access to end-user accounts, a hybrid approach is justified.

Secure database tables:

This pattern grants privileges with a statement of the form GRANT SELECT, UPDATE, INSERT, DELETE ON [TableName] TO [UserName], which needs to be run only once, during the tenant provisioning process. It is appropriate for the separate-database and separate-schema approaches.
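As a minimal sketch of the provisioning step, the helper below builds one GRANT statement per table for a new tenant account. The function name, the bracket quoting, and the identifier check are assumptions made for illustration; a real provisioning script would follow the conventions of the target database.

```python
import re

# Only plain identifiers are accepted, because identifiers cannot be passed
# as bound parameters and must be embedded in the statement text.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def grant_statements(user_name, table_names,
                     privileges=("SELECT", "UPDATE", "INSERT", "DELETE")):
    """Build one GRANT statement per table for a newly provisioned tenant user."""
    for name in [user_name, *table_names]:
        if not _IDENTIFIER.fullmatch(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    privilege_list = ", ".join(privileges)
    return [
        f"GRANT {privilege_list} ON [{table}] TO [{user_name}];"
        for table in table_names
    ]


if __name__ == "__main__":
    # Hypothetical tenant account and tables.
    for statement in grant_statements("tenant42_user", ["Orders", "Invoices"]):
        print(statement)
```

The statements are generated once per tenant and run with administrative rights during provisioning; the application itself never needs to issue them again.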

Tenant View Filter:

SQL views can be used to grant individual tenants access to some of the rows in each table while preventing them from accessing other rows. A predicate is added to the view definition to filter the records returned by, say, a SELECT statement. This predicate can use a built-in function to determine the security identifier of the user account accessing the database and match it against the column values that identify a tenant. Unlike the secure database tables pattern, this pattern uses a shared schema with tenant qualification.
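The view filter can be sketched with Python's built-in sqlite3 module. SQLite has no notion of a login, so a one-row session_context table stands in here for the built-in function (for example SUSER_SID() in SQL Server) that identifies the caller. The table, view, and tenant names are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a shared-schema table plus a view that filters rows by tenant."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            tenant_id TEXT NOT NULL,
            item TEXT NOT NULL
        );
        -- Stand-in for the caller's identity; holds at most one row.
        CREATE TABLE session_context (tenant_id TEXT NOT NULL);
        -- The predicate keeps only rows that belong to the current tenant.
        CREATE VIEW tenant_orders AS
            SELECT id, item FROM orders
            WHERE tenant_id = (SELECT tenant_id FROM session_context);
        INSERT INTO orders (tenant_id, item) VALUES
            ('contoso', 'laptop'), ('contoso', 'monitor'), ('fabrikam', 'printer');
    """)
    return conn


def orders_for(conn, tenant_id):
    """Set the session tenant, then read through the view only."""
    conn.execute("DELETE FROM session_context")
    conn.execute("INSERT INTO session_context (tenant_id) VALUES (?)", (tenant_id,))
    rows = conn.execute("SELECT item FROM tenant_orders ORDER BY id")
    return [row[0] for row in rows]


if __name__ == "__main__":
    db = build_demo_db()
    print(orders_for(db, "contoso"))
    print(orders_for(db, "fabrikam"))
```

In a real deployment the tenants would be granted access to the view only, never to the underlying table, so the predicate cannot be bypassed.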

Tenant Data Encryption:

A way to further protect tenant data is to encrypt it within the database. Encryption can be done with either a symmetric or an asymmetric key. In symmetric cryptography, a single key is used to both encrypt and decrypt data. In asymmetric cryptography, two keys are used, namely the private key and the public key. Data is encrypted with the public key but decrypted with the private key. Public key cryptography requires significantly more computing power. A better approach might be to use a key wrapping system that combines the advantages of both: the data is encrypted with a symmetric key, and that symmetric key is in turn encrypted (wrapped) with the public key.

Extensibility patterns include custom columns and preallocated fields. Since different organizations have their own unique needs, some customizations are required. The preallocated fields technique simply includes a preset number of custom fields in every table. Different tenants use these additional fields differently.

Preallocated fields are limited in number. An alternative technique is to use tagging with name-value pairs. When metadata defines separate labels and data types for each tenant's custom fields, the data model can be extended arbitrarily. The main drawback is that it adds a level of complexity to database functions such as indexing, querying, and updating records.
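The name-value pair technique can be sketched as follows. Values are stored as strings, and per-tenant metadata supplies the label and data type of each custom field. The tenant names, field names, and the read_custom_fields helper are assumptions made for illustration.

```python
from datetime import date

# Hypothetical per-tenant metadata: field label -> function that parses the
# stored string into the declared data type.
FIELD_METADATA = {
    "contoso": {"region": str, "renewal_date": date.fromisoformat},
    "fabrikam": {"seat_count": int},
}


def read_custom_fields(tenant_id, stored_pairs):
    """Convert stored (name, value) string pairs into typed values."""
    parsers = FIELD_METADATA[tenant_id]
    typed = {}
    for name, raw_value in stored_pairs:
        if name not in parsers:
            raise KeyError(f"{name!r} is not defined for tenant {tenant_id!r}")
        typed[name] = parsers[name](raw_value)
    return typed


if __name__ == "__main__":
    pairs = [("region", "EMEA"), ("renewal_date", "2023-01-31")]
    print(read_custom_fields("contoso", pairs))
```

The extra lookup and parsing step on every read is a small instance of the added complexity mentioned above; indexing and querying by a custom field would need similar handling.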

Custom columns are those that can be added to the tenant’s tables directly. Custom rows can be added to a dedicated table without altering the data model for other tenants.

Data model extensions help only with the storage and not the operations. Any extension must be paired with a mechanism for integrating the additional fields into the application’s functionality.

Scalability patterns are useful for large-scale enterprise software. Scalability matters even more for SaaS because the system must support the data belonging to all the customers. ISVs that build on-premises software might be familiar with moving from the minor leagues to the majors, but the game also changes because the scope widens to supporting a vast user base. Databases can be scaled up or scaled out, and it is important to differentiate between scaling the application and scaling the data.

Wednesday, November 9, 2022

Three distinct multitenant data architectures:

Introduction:

The earlier articles focused on developing a rigorous definition and framework for evaluating and comparing different multitenant data architectures. A set of seventeen criteria was used to articulate the effects of the choices made in multitenant architectures. These choices manifest as combinations of application and database variations: (AD, DD), (AS, DD), (AI, DD), (AD, DS), (AS, DS), (AI, DS), (AD, DB), (AS, DB), (AI, DB), (AD, DC), (AS, DC), (AI, DC), where the notations are explained in the legend. This article covers best practices, expressed as patterns, for realizing the data architectures.

Description:

A number of patterns can help plan and build the data architecture for SaaS applications. A well-designed SaaS application demonstrates scalability, configurability, zero downtime, and multitenant efficiency. These qualities must not be treated as mutually exclusive. For example, optimizing for multitenant efficiency in a shared environment must not compromise the security that safeguards data access. A security pattern that resolves this conflict uses “virtual isolation” mechanisms such as permissions, SQL views, and encryption.

The table below lists some of these patterns:

Approach	Security patterns	Extensibility patterns	Scalability patterns
Separate databases	Trusted database connections; Secure database tables; Tenant data encryption	Custom columns	Single-tenant scale-out
Shared database, separate schemas	Trusted database connections; Secure database tables; Tenant data encryption	Custom columns	Tenant-based horizontal partitioning
Shared database, shared schema	Trusted database connections; Tenant view filter; Tenant data encryption	Preallocated fields; Name-value pairs	Tenant-based horizontal partitioning

Trusted database connections:

Secure database tables

Legend:

AD – a dedicated application server is running for each tenant, and therefore, each tenant receives a dedicated application instance.

AS – a single application server is running for multiple tenants and each tenant receives a dedicated application instance.

AI – a single application server is running for multiple tenants and a single application instance is running for multiple tenants.

DD – a dedicated database server is running for each tenant and therefore the database is also isolated.

DS – a single database server is running for multiple tenants and each tenant gets an isolated database.

DB – a single database server and a single database are used for multiple tenants.

DC – a single database server is running for multiple tenants and data from multiple tenants is stored in a single database and in a single set of tables with same database schema but separation based on records.

Tuesday, November 8, 2022

Challenges with multitenant architecture choices:

The challenges of multitenancy differ from those of single tenancy and are somewhat more complex. A few that are specific to multitenancy are listed below. The architectural choices in terms of application and databases are:

AD – a dedicated application server is running for each tenant, and therefore, each tenant receives a dedicated application instance.

AS – a single application server is running for multiple tenants and each tenant receives a dedicated application instance.

AI – a single application server is running for multiple tenants and a single application instance is running for multiple tenants.

DD – a dedicated database server is running for each tenant and therefore the database is also isolated.

DS – a single database server is running for multiple tenants and each tenant gets an isolated database.

DB – a single database server and a single database are used for multiple tenants.

DC – a single database server is running for multiple tenants and data from multiple tenants is stored in a single database and in a single set of tables with same database schema, but separation based on records.

1. Performance: Because resources are shared and hardware utilization is higher than average, performance may be compromised. All tenants must get a fair opportunity to consume resources. If one tenant clogs up resources, the performance of all other tenants may suffer. This problem is specific to multitenancy and does not arise with single tenancy. With virtualized instances, the problem is solved by assigning an equal amount of resources to each instance. That solution may lead to very inefficient utilization of resources and may not suit all multitenant systems.

2. Scalability: When tenants share the same application and database, scalability suffers. Single tenancy assumes that a tenant does not need more than one application or database, but no such limit holds when multiple tenants are placed on one server. Tenants from various geographies can use the same application, which affects its scalability. In addition, geographies impose constraints through legislation and regulations. For example, the EU mandates that invoices sent from within the EU must be stored within the EU. A tenant may bring a further constraint by requiring all of its data to be placed on the same server to speed up queries.

3. Security: When security is compromised, the risk of data theft is high. In a multitenant environment, a security breach can expose data to other, possibly competing, tenants. Data protection becomes an important challenge to tackle.

4. Zero-downtime: Introducing new tenants or adapting to the changing business requirements of existing tenants brings the need for constant growth and evolution of a multitenant system, and these changes must happen without taking the system offline for the other tenants.

Monday, November 7, 2022

Comparisons based on multitenant architecture

The level sets of application and database in a multitenant architecture are differentiated as follows:

AD – a dedicated application server is running for each tenant, and therefore, each tenant receives a dedicated application instance.

AS – a single application server is running for multiple tenants and each tenant receives a dedicated application instance.

AI – a single application server is running for multiple tenants and a single application instance is running for multiple tenants.

DD – a dedicated database server is running for each tenant and therefore the database is also isolated.

DS – a single database server is running for multiple tenants and each tenant gets an isolated database.

DB – a single database server and a single database are used for multiple tenants.

DC – a single database server is running for multiple tenants and data from multiple tenants is stored in a single database and in a single set of tables with the same database schema, but with separation based on records.

This part of my book now focuses on the comparisons and rules of thumb that can be drawn from the pairwise combinations of the above application and database variations. With these, the consequences of the choices and the selection of subsets of patterns become easier for software architects. At the highest level of sharing, (AI, DC), time behavior, recoverability, confidentiality, diverse SLA, and software complexity suffer the most; at the opposite end of the spectrum, (AD, DD), they improve the most. The converse holds for other criteria from the seventeen listed earlier: resource utilization, number of tenants, maintainability, deployment time, and monitoring improve significantly with (AI, DC) while they suffer with (AD, DD).

This kind of comparison matrix can be drawn out in detail, and it helps to formulate rules of thumb as follows:

1. Focus on the database dimension – The effect of the different architectures on the decision criteria is largest on the database dimension, so decide on the database model first and the application model second.

2. Sharing database tables enables serving many tenants but harms robustness – An architecture in which the database schema is shared is beneficial if there are many tenants and end-users, but variability suffers and unintentional sharing can occur.

3. Sharing application instances helps maintainability and performance but harms variability – Decision makers can choose to share the application instance between tenants with this trade-off in mind.

4. Ease of implementing variability differs greatly per architecture model – In fact, variability has the highest distinction factor because variability and sharing contradict each other.

5. Dedicated servers improve performance and variability but hamper scalability – Recoverability, variability, and confidentiality are better on dedicated infrastructure, while scalability suffers because of costs.

Sunday, November 6, 2022

Levels in multitenant architecture

Although Application and Database are the layers most frequently referred to for multitenancy, there are in fact several more. The right choices for a multitenant architecture depend on the considerations at all these levels. The viability of an architecture pattern for multitenancy can be evaluated against a comprehensive list of criteria drawn from all these levels.

This part of my book focuses on the levels of the multitenant architecture and the comparisons of the criteria. There were references to seventeen criteria earlier that included Time Behavior, Resource Utilization, Throughput, Number of tenants, Number of end-users, Availability, Recoverability, Confidentiality, Integrity, Authenticity, Maintainability, Portability, Deployment Time, Variability, Diverse SLA, Software complexity and Monitoring. All of these are quantifiable attributes that can help differentiate one model from another. These criteria can evaluate differences within these layers.

A stack of the different levels, from bottom to top, would include Hardware, Virtual Machine, Operating System, Database Server, Database, Database Schema, Middleware, Application Server, and Application Instance. When multitenancy is applied at a specific level, the levels below that level are shared among the tenants, and isolation occurs at the levels above it.
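The sharing rule can be sketched in a few lines, assuming the stack is ordered from the lowest level (Hardware) to the highest (Application Instance). The list name and the split_levels helper are hypothetical.

```python
# Levels ordered from the lowest (index 0) to the highest.
LEVELS_BOTTOM_UP = [
    "Hardware",
    "Virtual Machine",
    "Operating System",
    "Database Server",
    "Database",
    "Database Schema",
    "Middleware",
    "Application Server",
    "Application Instance",
]


def split_levels(multitenancy_level):
    """Return (shared, isolated) levels for multitenancy applied at one level.

    Levels below the chosen level are shared among tenants and levels above
    it are isolated per tenant. The chosen level itself appears in neither list.
    """
    index = LEVELS_BOTTOM_UP.index(multitenancy_level)
    shared = LEVELS_BOTTOM_UP[:index]
    isolated = LEVELS_BOTTOM_UP[index + 1:]
    return shared, isolated


if __name__ == "__main__":
    shared, isolated = split_levels("Database Server")
    print("shared:", shared)
    print("isolated:", isolated)
```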

Even when tenants are consolidated in a single database, each tenant can operate on an isolated set of tables with proper naming conventions. When the schema is shared, isolation occurs at the row level of each table. Conceptually, it is easy to divide the levels into an application-related layer set and a data-related layer set. Within each of these layer sets, it is possible to distinguish ascending levels of sharing. With the help of the following notations, it is easy to compare multitenant architectures:

AD – a dedicated application server is running for each tenant, and therefore, each tenant receives a dedicated application instance.

AS – a single application server is running for multiple tenants and each tenant receives a dedicated application instance.

AI – a single application server is running for multiple tenants and a single application instance is running for multiple tenants.

Similarly, for the data related layer set:

DD – a dedicated database server is running for each tenant and therefore the database is also isolated.

DS – a single database server is running for multiple tenants and each tenant gets an isolated database.

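DB – a single database server and a single database are used for multiple tenants.

DC – a single database server is running for multiple tenants and data from multiple tenants is stored in a single database and in a single set of tables with the same database schema, but with separation based on records.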

Some common occurrences of Multitenant architectures include the following pairs from the application and the data related layer sets: (AD, DD), (AS, DD), (AI, DD), (AD, DS), (AS, DS), (AI, DS), (AD, DB), (AS, DB), (AI, DB), (AD, DC), (AS, DC), (AI, DC)
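The twelve pairs are simply the product of the three application variations and the four database variations. A short sketch that enumerates them in the order listed above (the constant and function names are illustrative):

```python
APPLICATION_MODELS = ("AD", "AS", "AI")      # dedicated server, shared server, shared instance
DATABASE_MODELS = ("DD", "DS", "DB", "DC")   # dedicated server ... shared tables


def architecture_pairs():
    """Enumerate every (application, database) combination.

    The database model varies in the outer loop so that the result matches
    the order used in the text: (AD, DD), (AS, DD), (AI, DD), (AD, DS), ...
    """
    return [(app, db) for db in DATABASE_MODELS for app in APPLICATION_MODELS]


if __name__ == "__main__":
    for pair in architecture_pairs():
        print(pair)
```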

Saturday, November 5, 2022

Criteria for evaluating multitenant architecture:

The right choices for a multitenant architecture are difficult even for software architects. Bad choices result in poor performance, low scalability, and limited flexibility, and they obstruct software evolution. Architecture patterns are compared to support decision making, but the comparison of the models lacks rigor because a comprehensive list of criteria is often overlooked.

This part of my book focuses on the criteria to evaluate multitenant architectural patterns. There are seventeen of them. They are quantifiable attributes that can help differentiate one model from another. Although Application and Database are the most frequently referred to layers for multitenancy, there are in fact several more. These criteria can evaluate differences within these layers.

These criteria include:

1. Time Behavior – This criterion is highlighted when the tenant perceives a difference in response time between a single application server running for multiple tenants and a dedicated application server for each tenant.

2. Resource utilization – The solution provider must pass on the bill to the tenant, and there is little justification for underutilized resources. Higher tenant density can improve resource utilization.

3. Throughput – This criterion measures the traffic served, which can differ between architectures even if their resource utilization is similar.

4. Number of tenants – An architecture that supports a higher number of tenants is better prepared for cost efficient scalability than others.

5. Number of end-users – Tenants may vary in the number of users they support so a criterion must separately measure the number of end-users.

6. Availability – High availability is not just a design criterion for the SLA given to tenants but also the infrastructure concern for maintaining it in the long run.

7. Recoverability – Certain faults do occur from time to time. Recoverability is a measure of whether the mean time to recovery is low. Given a Boolean up or down value over a timeline, the interval from each transition to down until the next transition to up is averaged, and an architectural model exhibits recoverability when this average is small.

8. Confidentiality – Data isolation and protection cannot be covered by any one criterion, but this one helps to ensure that a system rated high on confidentiality can guarantee them for its tenants.

9. Integrity – If the system or the data can be compromised by virtue of sharing the resources, then architectural models differ on this criterion.

10. Authenticity – In a dedicated model, there is little or no intervention in or handling of the data by other parties, so the exposure to tampering differs between models.

11. Maintainability – A dedicated system can be considered simple, but it is much more costly to maintain than those that involve sharing.

12. Portability – A dedicated system might have high portability, but it also depends on its technology stack.

13. Deployment time – This has a business value for the tenants and can vary between systems that require rigorous repeated creation versus instantiation.

14. Variability – An architecture where the variations are huge between tenants indicates that the infrastructure is not able to address some of the customer requirements.

15. Diverse SLA – An SLA is not just about availability and duration; it involves many factors. A dedicated system might be best equipped to provide a demanding SLA.

16. Software complexity – The lighter the software, the heavier the resources, and vice versa. This criterion measures complexity.

17. Monitoring – There are many components that can be monitored, but monitoring is a criterion by which different architectures can be differentiated from one another.
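As a small worked example of the Recoverability criterion (item 7), the sketch below averages the length of each outage on an up/down timeline. The sample format, a time-ordered list of (timestamp in seconds, is_up) pairs, and the function name are assumptions.

```python
def mean_time_to_recovery(samples):
    """Average outage length, in the same unit as the timestamps.

    `samples` is a time-ordered list of (timestamp, is_up) observations.
    An outage runs from the first 'down' observation to the next 'up' one.
    An outage still open at the end of the timeline is ignored.
    """
    outages = []
    down_since = None
    for timestamp, is_up in samples:
        if not is_up and down_since is None:
            down_since = timestamp            # outage begins
        elif is_up and down_since is not None:
            outages.append(timestamp - down_since)  # outage ends
            down_since = None
    if not outages:
        return 0.0
    return sum(outages) / len(outages)


if __name__ == "__main__":
    timeline = [(0, True), (10, False), (15, False), (40, True),
                (100, False), (110, True)]
    # Two outages: 10 -> 40 (30 s) and 100 -> 110 (10 s).
    print(mean_time_to_recovery(timeline))
```

Running the same calculation on timelines collected from two architectures gives a number that can be compared directly.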

#codingexercise

https://ideone.com/mIKj1i

Friday, November 4, 2022

Re-engineering of multitenant applications

This part of the book focuses on the challenges of reengineering single-tenant applications into multi-tenant SaaS applications.

SaaS is a business model in which companies can do away with their proprietary infrastructure. Subscribing to the solution provider requires only internet access to use the services. This architectural pattern requires a single instance of the software to serve different tenants, not merely in a multi-user mode but in one where tenants can bring their own customizations.

Multitenancy is about sharing provisioned resources for customers. It is often explained as:

virtualization + resource sharing = multi-tenancy

Tenancy is about customers, not users. Multiple users from a single organization can form a single tenant. Examples of multi-tenant applications include Business-to-Business solutions, Business-to-Consumer solutions, and Enterprise-wide platform solutions.

A set of requirements must be drawn before a single-tenant application can be converted into a multi-tenant application. These can be enumerated on a component-by-component basis.

First, the authentication layer must support membership directories for the tenants. Take as an example an authentication layer comprising a ticket generation mechanism and an authentication module. The Kerberos ticket-issuing web service puts enough information in the tickets to allow tenant identification throughout the application without requiring continuous calls to the service. It is also possible to add extra information to the ticket when necessary. The authentication module, when implemented as an ASP.NET HTTP module, verifies that every request comes from a valid tenant; otherwise, a login screen is displayed.
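The idea of a ticket that carries the tenant identity can be sketched as follows. This is not Kerberos: an HMAC-signed token from the Python standard library stands in for the ticket so that the example stays self-contained. The secret, field names, and function names are all hypothetical.

```python
import base64
import hashlib
import hmac
import json

SECRET_KEY = b"demo-only-secret"  # hypothetical; a real service would protect its key


def issue_ticket(user, tenant_id, extra=None):
    """Issue a signed ticket that embeds the tenant and optional extra data."""
    payload = {"user": user, "tenant": tenant_id, "extra": extra or {}}
    body = base64.urlsafe_b64encode(json.dumps(payload, sort_keys=True).encode("utf-8"))
    signature = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return body.decode("ascii") + "." + signature


def tenant_from_ticket(ticket):
    """Return the tenant id if the signature is valid, otherwise None.

    No call to the issuing service is needed; a None result is where the
    authentication module would redirect to the login screen.
    """
    body, _, signature = ticket.rpartition(".")
    expected = hmac.new(SECRET_KEY, body.encode("ascii"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    payload = json.loads(base64.urlsafe_b64decode(body))
    return payload["tenant"]


if __name__ == "__main__":
    ticket = issue_ticket("alice", "contoso", {"theme": "dark"})
    print(tenant_from_ticket(ticket))
```

Because the tenant is recovered from the ticket itself, every layer of the application can identify the tenant without a round trip to the ticket service.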

Second, the configuration library could consist of a layout component, a general configuration component, and a file I/O component. In this case, the layout component loads a tenant-specific master page. The general configuration is a library that must be integrated with the portal source code. The file I/O is a cross-cutting concern and is tedious to adapt because the source code might use various constructs for file I/O.

Third, the database layer could consist of a store or a custom library that is integrated into the portal code. The query adapter can add the tenant predicate to the queries.
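One simple way a query adapter might add the tenant predicate is to wrap the original query as a subquery and filter the result. The sketch below uses sqlite3 and assumes that every query passed in selects a tenant_id column and has no trailing semicolon. The table, column, and function names are hypothetical.

```python
import sqlite3


def with_tenant_predicate(query, tenant_id, params=()):
    """Wrap `query` so that only the given tenant's rows are returned.

    Positional placeholders in the original query come first, so the tenant
    parameter is appended after the caller's own parameters.
    """
    scoped = f"SELECT * FROM ({query}) AS scoped WHERE scoped.tenant_id = ?"
    return scoped, tuple(params) + (tenant_id,)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE pages (tenant_id TEXT, title TEXT, published INTEGER);
        INSERT INTO pages VALUES
            ('contoso', 'Home', 1), ('contoso', 'Draft', 0), ('fabrikam', 'Welcome', 1);
    """)
    # The single-tenant code issues its query unchanged; the adapter scopes it.
    sql, args = with_tenant_predicate(
        "SELECT tenant_id, title FROM pages WHERE published = ?", "contoso", (1,)
    )
    return conn.execute(sql, args).fetchall()


if __name__ == "__main__":
    print(demo())
```

Wrapping avoids parsing the original SQL, at the cost of requiring the tenant column in every result set; an adapter that rewrites the WHERE clause directly would lift that restriction but needs a SQL parser.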

From these requirements, it becomes clear that a conceptual reengineering approach is required to support the process. This approach must consider scalability, configurability, version support, completeness, support for different applications, and threats to validity.

Scalability is facilitated by introducing little or no computational overhead, even from cross-cutting concerns; alternatively, it could require the implementation of a standalone service. Although databases are designed to be extremely efficient, the way they are used makes them performance bottlenecks in several cases. The use of a load balancer and a database pool mitigates this. This still does not take into account the scalability of the single-tenant application's business logic.

Configuration requirements depend heavily on the type and implementation of the application. Configurability is, however, a key aspect of most applications and therefore one that must be part of the approach.

Version support cannot be an afterthought following the first release. It must be an essential feature in the design of the multi-tenant application.

Completeness can be facilitated with extensions to single-tenancy and it also involves validation and testing that must be specific.

Finally, even if the multi-tenant application is written from scratch, these concepts must be factored into the design prior to its release.