Thursday, November 3, 2022

A new security attack vector on multitenant solutions

Multitenancy is about sharing provisioned resources among customers. It is often explained as:

virtualization + resource sharing = multi-tenancy 


Tenancy is about customers, not users. Multiple users from a single organization can form a single tenant. Examples of multi-tenant applications include Business-to-Business solutions, Business-to-Consumer solutions, and enterprise-wide platform solutions.

One of the ways to manage resources is resource governance. Resource governance is hierarchical in nature. From top to bottom, limits can be enforced at various levels using level-appropriate mechanisms, starting with the operating system, then resource pools, and finally workload groups. Data I/O governance limits both read and write physical I/O against persisted data. IOPS limits are set for each service level to minimize the noisy neighbor effect.
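As an illustration of how a per-service-level IOPS limit might be enforced, here is a minimal token-bucket sketch in Java. The class and its interface are hypothetical and not taken from any specific resource governor.

```java
// Hypothetical token-bucket limiter: a service level grants a fixed
// number of I/O operations per second; requests beyond the budget are
// rejected until elapsed time refills the bucket.
class IopsLimiter {
    private final long capacity;     // maximum IOPS for this service level
    private double tokens;           // operations currently available
    private long lastRefillNanos;

    IopsLimiter(long iopsLimit) {
        this.capacity = iopsLimit;
        this.tokens = iopsLimit;
        this.lastRefillNanos = System.nanoTime();
    }

    // Returns true if one I/O may proceed, false if it must be throttled.
    synchronized boolean tryAcquire() {
        long now = System.nanoTime();
        double elapsedSeconds = (now - lastRefillNanos) / 1_000_000_000.0;
        tokens = Math.min(capacity, tokens + elapsedSeconds * capacity);
        lastRefillNanos = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }
}
```

A fresh limiter for a 2-IOPS service level admits two immediate operations and throttles the third until the bucket refills.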

While noisy neighbors are a common scenario, they do not by themselves constitute an attack vector. Attack vectors are determined from a threat model. A common methodology for studying threats and their mitigation is STRIDE analysis. It is an acronym for the following:

Spoofing identity – the threat that a user can impersonate another user.

Tampering with data – the threat that a user can access protected resources or modify the contents of security artifacts.

Repudiation – the threat that a user can perform an illegal action that the system cannot trace back to that user.

Information disclosure – the threat that, say, a guest user can access resources as if the guest were the owner.

Denial of service – the threat that, say, a component crucial to operations is overwhelmed by requests so that other users experience an outage.

Elevation of privilege – the threat that a user gains access to components within the trust boundary, so that the system is compromised.

One of the unique threats posed to multitenant solutions is that the attacker and the victim can share the same server, a setup that traditional security measures cannot mitigate. When the attacker and the victim use the same provider but are located on separate servers, the attack vector is limited to penetrating the virtualization layer. When they are co-located, the attack vector extends to penetrating the resource sharing itself. Traditional network security fails to provide adequate protection in this regard.

Consider the case where an attacker begins with network probing and follows it up with a brute-force campaign to take advantage of multitenancy by getting the attacker's virtual machine allocated beside the victim's virtual machine. Once this is achieved, a side-channel attack can exploit shared system characteristics to extract the victim's data. Hypervisors and operating systems do not mitigate this adequately.

Potential mitigations of this attack vector include resource allocation techniques that place resources at randomly chosen locations. Another mitigation, offered to select tenants, could be to restrict the number of times those resources can be allocated. In both cases, the attacker's cost, effort, and time increase dramatically.
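The random-placement mitigation can be sketched as below. The host pool and method names are invented for illustration; a real scheduler would also weigh capacity, affinity, and fault domains.

```java
import java.security.SecureRandom;
import java.util.List;

// Hypothetical placement routine: instead of packing a new VM onto the
// most predictable host (which an attacker can probe for and target),
// pick uniformly at random among hosts with spare capacity, so that
// co-locating with a chosen victim takes many costly attempts.
class RandomPlacement {
    private static final SecureRandom RNG = new SecureRandom();

    static String pickHost(List<String> hostsWithCapacity) {
        if (hostsWithCapacity.isEmpty()) {
            throw new IllegalStateException("no capacity available");
        }
        return hostsWithCapacity.get(RNG.nextInt(hostsWithCapacity.size()));
    }
}
```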

Not all multitenant solutions need to focus on such advanced mitigations. With proper network security at the higher levels, these attacks become rare enough that they are insignificant to mitigate further. East-West network security is often planned as part of the security compliance efforts of the multitenant solution provider.


#codingexercise

https://ideone.com/TTu3xF

Wednesday, November 2, 2022

The multi-tenant Frontend case study with ASP.Net stack

 


This part of my book delves into an example of writing an ASP.Net application with multi-tenancy. One very common ask is for an application to be made available with a different appearance for another business in the parent organization’s portfolio. Occasionally, this business might have its own Active Directory domain with no federation to the parent’s.

The choice of technology stack only affects the development of the software, since multi-tenancy is an architectural decision. If there were different applications for each business, each could use an independent technology stack regardless of whether the data-provider services and store are common to the businesses. When a user interface is involved, the technology stack leans towards the write-once-run-anywhere paradigm. ASP.Net fits this bill well because it cleanly separates client and server code. The client code can be developed to suit all platforms: mobile, web, native mobile, and native desktop. Web workers and server-side rendering make it fast and bring security and performance benefits to the server. Finally, ASP.Net is developer-friendly in that it has simple constructs for model-view-controller, and templates can be used in views. This makes it easy to get functionality right out of the box and spend less time making code work.

The switch from single tenancy to multi-tenancy is handled by a few components within an ASP.Net application. First, there is a request interceptor that sets the appropriate tenant in the context. Second, this must be communicated to the server part of the application. Finally, the tenant-based customizations must be enabled: automatic injection of services specific to the tenant domain, reconfiguration of routes and redirects per the domain of the tenant, and a few other features.

The URLs for the tenant-specific domain-name-hosted services usually follow a pattern that can be made well-known across the application code base so that the same code works for every tenant. If the pattern and the whitelisting must be avoided, the tenant information can instead be persisted in and looked up from a tenant registry. ASP.Net defines an ApplicationDBContext and a service, so these are added to the service collection.

Then the service provides methods for each of the following: translating the hostname or domain name to the tenant, adding a tenant-identifying header to the set of headers on the incoming request, and defining enumerations of the tenants if the resolution is based on pattern and literal matching.
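Framework specifics aside, the hostname-to-tenant translation can be sketched as a registry lookup with a well-known pattern as fallback. The sketch below is in Java with invented names; it is not the ASP.Net API.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical tenant resolver: exact host names come from a persisted
// tenant registry; otherwise a well-known subdomain pattern such as
// "<tenant>.example.com" is applied as the fallback.
class TenantResolver {
    private static final Pattern SUBDOMAIN =
            Pattern.compile("^([a-z0-9-]+)\\.example\\.com$");
    private final Map<String, String> registry;

    TenantResolver(Map<String, String> registry) {
        this.registry = registry;
    }

    String resolve(String hostName) {
        String host = hostName.toLowerCase();
        String tenant = registry.get(host); // the registry wins over the pattern
        if (tenant != null) {
            return tenant;
        }
        Matcher m = SUBDOMAIN.matcher(host);
        if (m.matches()) {
            return m.group(1);
        }
        throw new IllegalArgumentException("unknown tenant host: " + hostName);
    }
}
```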

The server side of the application will resolve multi-tenancy and inject custom services based on different database contexts. Extensions serve this purpose. They must resolve the connection string based on the domain name of the tenant. The connection string is passed to the ApplicationDBContext, and a custom service for that domain is then added to the service collection. A dedicated service, say one for logins, will be created for the tenant to target Active Directory, and the domains can be maintained independently for each tenant within the same Active Directory instance.

The login service cannot typically be injected directly like the custom service for the custom DB context in the server startup code. Instead, the AddDefaultIdentity method is invoked on the service collection and supplied with the default user interface and Entity Framework store. Similarly, the service collection must be initialized with a pattern for the controller and view resolutions, such as with MVC.

Tuesday, November 1, 2022

The multi-tenant Frontend case study

This part of my upcoming book delves into an example of writing an Angular application with multi-tenancy. One very common ask is for an application to be made available with a different appearance for another business in the parent organization’s portfolio. Occasionally, this business might have its own Active Directory domain with no federation to the parent’s.

The choice of technology stack only affects the development of the software, since multi-tenancy is an architectural decision. If there were different applications for each business, each could use an independent technology stack regardless of whether the data-provider services and store are common to the businesses. When a user interface is involved, the technology stack leans towards the write-once-run-anywhere paradigm. Angular fits this bill well. It can be developed across all platforms: mobile, web, native mobile, and native desktop. Web workers and server-side rendering make it fast and offload the data model building to push-model frameworks like Immutable.js. Finally, Angular is developer-friendly in that it has simple declarative templates, and the language for templates can be extended. This makes it easy to get functionality right out of the box and spend less time making code work. Google uses Angular for some of its largest applications.

The switch from single tenancy to multi-tenancy is handled by a few components within an Angular application. First, there is a request interceptor that sets the appropriate tenant in the context. Second, this must be communicated to the server part of the application. Finally, the tenant-based customizations must be enabled: automatic injection of services specific to the tenant domain, reconfiguration of routes and redirects per the domain of the tenant, and a few other features.

The URLs for the tenant-specific domain-name-hosted services usually follow a pattern that can be made well-known across the application code base so that the same code works for every tenant. If the pattern and the whitelisting must be avoided, the tenant information can instead be persisted in and looked up from a tenant registry. Angular defines a module and a service, so these are added to the application.

Then the service provides methods for each of the following: translating the hostname or domain name to the tenant, adding a tenant-identifying header to the set of headers on the incoming request, and defining enumerations of the tenants if the resolution is based on pattern and literal matching.

The server side of the application will resolve multi-tenancy and inject different services based on the resolution. The interceptor serves this purpose. It must be registered with the module. The service itself, say one dedicated to logins, will have separate instances per tenant so that they can target different membership providers such as Active Directory. Some efficiency can be gained with a base class from which the independent service implementations derive. Angular applications require that this be defined in the module, and within the module, when the provider is configured, a factory can be configured that gets the correct instance of the service.
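The factory idea, one base contract with tenant-specific implementations chosen at resolution time, is language-agnostic. The Java sketch below stands in for the Angular provider factory; the class names and authorities are invented for illustration.

```java
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical login-service factory: a shared base interface with
// per-tenant implementations, selected by the resolved tenant name.
interface LoginService {
    String authority(); // e.g. the Active Directory domain to target
}

class TenantALogin implements LoginService {
    public String authority() { return "ad.tenant-a.example"; }
}

class TenantBLogin implements LoginService {
    public String authority() { return "ad.tenant-b.example"; }
}

class LoginServiceFactory {
    private final Map<String, Supplier<LoginService>> providers = Map.of(
            "tenant-a", TenantALogin::new,
            "tenant-b", TenantBLogin::new);

    LoginService forTenant(String tenant) {
        Supplier<LoginService> supplier = providers.get(tenant);
        if (supplier == null) {
            throw new IllegalArgumentException("unknown tenant: " + tenant);
        }
        return supplier.get();
    }
}
```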

The login service cannot typically be injected directly as an @Injectable, but the use of a factory allows us to refer to the correct instance of the service. The module definition must be updated with the login module. Lastly, there must be switching between routes based on the tenant service in the login module.

Monday, October 31, 2022

Datacenter operations

 


As part of building a public cloud from the ground up, I have always been interested in datacenter operations. The following is a summary of some of the routines performed in this regard.

IT organizations building a private cloud have a lot in common with the datacenter operations for a public cloud. The focus used to be primarily on agile and flexible infrastructure, which became challenging with the distributed nature of the applications deployed by enterprises. Their operations evolved with the tools that transform how IT operates, but these organizations continued to be measured by the speed, simplicity, and security with which they support their business objectives.

Speed is a key competitive differentiator for the customers of the infrastructure. Leveraging datacenter locations, as well as a service-centric cloud operations model, has become critical. Fueled by the transformation in the habits of a workforce that works from anywhere at any time, business resiliency and agility now depend on a connective-fabric network.

The network connects the on-premises, cloud, and edge applications to the workforce, and it is a multi-disciplinary effort among NetOps, SecOps, CloudOps, and DevOps teams. Each one has a perspective on building the infrastructure, such as the tools that manage where the workloads run, the service-level objectives defining the user experience, and the implementation of zero-trust security to protect vital business assets.

Enabling these teams requires real-time insights, usually delivered with an automation platform. Both cloud and datacenter operations can be adapted to the new normal of shifting workloads and distributed workforces. Delivering a consistent, simplified experience to the teams with such a platform empowers them to align and collaborate more efficiently than before.

Some datacenter automations can be fabric-agnostic, but they all must have some common characteristics. These include providing a unified view into proactive operations with continuous assurance and actionable insights, an orchestrator to coordinate activities, and seamless access to network controllers and third-party tools or services. The orchestrator can also enforce policies across multiple network sites and enable end-to-end automation across datacenters and networks. A dashboard offers the ability to view all aspects of management through a single pane of glass. It must also define multiple personas to provide role-based access to specific teams.

Some gaps do exist between, say, NetOps and DevOps, which can be bridged with a collaborative focal point that delves into integration with ticketing frameworks for incident management, mapping of compute, storage, and network contexts for monitoring, identification of bottlenecks affecting workloads, and consequent fine-tuning.

Automation also has the potential to describe infrastructure as code, infrastructure as a resource, or infrastructure as a policy. Flexible deployment operations are required throughout. Complexity is the enemy of efficiency, and tools and processes must be friendly to the operators. Automation together with analytics can enable them to respond quickly and make incremental progress towards their goals.

Sunday, October 30, 2022

QoS and billing for cloud resources

 


Abstract:

Many cloud solutions written by customers of public cloud services rely on billing and costing of their resources from the service provider, but the customers have no way to differentiate among their usages. The pay-as-you-go billing model inherently depends on monitoring of the underlying cloud resources and of the logic deployed by the customer, but the cloud infrastructure cannot differentiate the usages unless the end-user application specifies them. Even if it did, there is no inherent mechanism to color usages across cloud resources to provide enhanced billing. This article provides a glimpse into a service that could honor customer differentiation of usages, paving the way for assigning quality of service to public cloud resource consumption.

Description:

Central to this proposal is the notion of classification and quota management for end-usages, where classification is performed not only by the cloud resource provider based on connection attributes but is also helpfully tagged by the customer applications using those resources. Customers can augment any resource usage with custom web request headers that introduce tags in the values with which to classify. They also set the user-defined rules with which to classify.

There is a clear separation between the classification rules and the resource plans. The classification rules are dynamic in nature and could change how connections are assigned to different groups. Groups of connections share the same pool of resources. The cloud only needs to keep track of the resource plans. These plans are determined by the customer, and the cloud actively looks them up when assigning resources to a workload. To the public cloud, the individual requests do not matter, only the group they belong to; the resources do matter, since the cloud must account for all resources and divide them between groups. A group is a label for a set of connections and an identifier denoting how many resources are guaranteed to it. The default guarantee is all-inclusive and permissive, with the customer-provided tags used for accounting.

The rules apply to connections, and connections are transient. In comparison, the resource plans are more stable and cloud-defined. Moreover, connections can have different characteristics, and a classification based on connection properties could change with the next connection. The classifier is a simple user-defined function, with system-defined rules included, that assigns an incoming connection to a group. The classifier has visibility into the tags provided. This function evaluates connections in program order, in the form of a decision tree.
The classifier function can be modified and updated independently of the resource plans. By its nature, the classifier is code while the resource plan is data. Furthermore, the resource plan data is constantly read when assigning resources and requires cloud reconfiguration at the resource-provider level after each change, since it affects resource throttling, monitoring, and billing for incoming connections. However, the cloud does not need to know anything about the connections or persist any connection properties, since these have already been evaluated to a group. The group is the label the cloud uses when assigning incoming connections, and the policy defined by the resource plans is applied to it. The groups can be hierarchical, while the resource plans are discrete and flat. The resource plans must also tally up to the full cloud capacity; therefore, they are owned and enforced by the cloud.

The user connections, on the other hand, are mapped only once to the different pools. The classifier is written like any other user-defined function. Although in practice it is changed by the administrator and requires cloud reconfiguration, since the cloud needs to know that memberships to groups have changed, the classifier is connection-facing and hence qualifies as just another user-defined function. The decision to reconfigure the cloud at the resource-provider level after every classifier change is an important one. It is not merely sufficient to change the classifier to affect the next incoming connection; the cloud must know that the memberships to groups are being redefined. Connections that previously arrived at one group might now be classified into another, whose plan may deny resources or switch billing categories for the new connection. The cloud treats the classifier and the plan definitions together as constituting the resource management policy.
So, if either of them changes, the cloud's resource management behavior changes. This is, in effect, a way to tell the cloud that the policy has changed, and it is intended as a control for the administrator. Lastly, the policies and the plans are different because checks are placed on the plans, whereas the policies are arbitrary and have no relevance to the cloud. The checks on the plan, however, determine whether the next billing cycle changes. The calculations by the resource provider depend on the plan information, and this state is persisted so that the cloud can automatically pick it up between restarts. Thus, the resource policies and plans are treated differently. This feature differs from the Azure Resource Manager in that ARM has multiple resources, role-based access control, custom tagging, and self-service templates affecting create, update, and delete of resources, while resource usage management is the focus of this feasibility study.
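A classifier of the kind described, user-defined and evaluated in program order like a decision tree over connection attributes and customer tags, might look like the sketch below. The attribute names and group labels are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical connection classifier: ordered rules map connection
// attributes (including customer-supplied tags) to a workload group;
// the first matching rule wins, mirroring program-order evaluation.
class ConnectionClassifier {
    private static class Rule {
        final Predicate<Map<String, String>> matches;
        final String group;
        Rule(Predicate<Map<String, String>> matches, String group) {
            this.matches = matches;
            this.group = group;
        }
    }

    private final List<Rule> rules = List.of(
            new Rule(c -> "batch".equals(c.get("x-usage-tag")), "lowPriorityGroup"),
            new Rule(c -> "admin".equals(c.get("user")), "adminGroup"),
            new Rule(c -> true, "defaultGroup")); // catch-all fallback

    public String classify(Map<String, String> connectionAttributes) {
        for (Rule r : rules) {
            if (r.matches.test(connectionAttributes)) {
                return r.group;
            }
        }
        return "defaultGroup"; // unreachable: the catch-all always matches
    }
}
```

Because the rules are code, they can be redeployed independently of the resource plans, which remain data owned by the cloud.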

Saturday, October 29, 2022

 

There are N points (numbered from 0 to N−1) on a plane. Each point is colored either red ('R') or green ('G'). The K-th point is located at coordinates (X[K], Y[K]) and its color is colors[K]. No point lies on coordinates (0, 0).

We want to draw a circle centered on coordinates (0, 0), such that the number of red points and green points inside the circle is equal. What is the maximum number of points that can lie inside such a circle? Note that it is always possible to draw a circle with no points inside.

Write a function that, given two arrays of integers X, Y and a string colors, returns an integer specifying the maximum number of points inside a circle containing an equal number of red points and green points.

Examples:

1. Given X = [4, 0, 2, −2], Y = [4, 1, 2, −3] and colors = "RGRR", your function should return 2. The circle contains points (0, 1) and (2, 2), but not points (−2, −3) and (4, 4).

class Solution {
    public int solution(int[] X, int[] Y, String colors) {
        int n = X.length;
        // Pair each point's squared distance from the origin with its color.
        long[][] points = new long[n][2];
        for (int i = 0; i < n; i++) {
            points[i][0] = (long) X[i] * X[i] + (long) Y[i] * Y[i];
            points[i][1] = colors.charAt(i) == 'R' ? 1 : -1;
        }
        java.util.Arrays.sort(points, (a, b) -> Long.compare(a[0], b[0]));
        int answer = 0;
        int balance = 0; // +1 per red point, -1 per green point inside the circle
        int i = 0;
        while (i < n) {
            int j = i;
            // Points at the same distance can only enter the circle together.
            while (j < n && points[j][0] == points[i][0]) {
                balance += (int) points[j][1];
                j++;
            }
            if (balance == 0) {
                answer = j; // equal reds and greens among the j closest points
            }
            i = j;
        }
        return answer;
    }
}

 

Compilation successful.

Example test: ([4, 0, 2, -2], [4, 1, 2, -3], 'RGRR') OK
Example test: ([1, 1, -1, -1], [1, -1, 1, -1], 'RGRG') OK
Example test: ([1, 0, 0], [0, 1, -1], 'GGR') OK
Example test: ([5, -5, 5], [1, -1, -3], 'GRG') OK
Example test: ([3000, -3000, 4100, -4100, -3000], [5000, -5000, 4100, -4100, 5000], 'RRGRG') OK

Friday, October 28, 2022

The heights of the bars of a bar chart are provided. Find the maximum area of a contiguous rectangle bounded by the bar chart, in a streaming manner where the bars appear one at a time to the right along the x-axis.
class Solution {
    private final java.util.Deque<int[]> stack = new java.util.ArrayDeque<>(); // {start index, height}
    private int index = 0;
    private int best = 0;

    // Consumes one bar as it streams in. A monotonic stack keeps bars of
    // increasing height; when a shorter bar arrives, each taller bar on the
    // stack closes the rectangle it bounds, and its area is recorded.
    public void getMaxRectangleByStream(int height) {
        int start = index;
        while (!stack.isEmpty() && stack.peek()[1] >= height) {
            int[] top = stack.pop();
            best = Math.max(best, top[1] * (index - top[0]));
            start = top[0]; // the new bar extends back over the popped span
        }
        stack.push(new int[]{start, height});
        index++;
    }

    public int solution(java.util.List<Integer> A) {
        for (int h : A) {
            getMaxRectangleByStream(h);
        }
        // Flush the bars still on the stack; they extend to the end of the chart.
        while (!stack.isEmpty()) {
            int[] top = stack.pop();
            best = Math.max(best, top[1] * (index - top[0]));
        }
        return best;
    }
}

A: 4, 6, 2, 4, 12, 7, 4, 2, 2, 2

Maximum area: 20 (height 2 spanning all ten bars)