Saturday, February 18, 2023

Extending datacenters to the public cloud:

A common pattern for hybrid computing involves extending datacenters to the public cloud. Many companies have significant investments in datacenters that cannot simply be moved, and while they can create a private cloud, such as a VMware cloud, within the public cloud, they might find it costly to maintain both an on-premises cloud and one in the public cloud. A reasonable middle ground between these choices is to extend the existing datacenters to the public cloud. This article explores that pattern.

 

Although a technical discussion of an architectural pattern would normally avoid referring to products by brand name, doing so simplifies this narrative by grounding it in a specific technology. Since many technological innovations are patented, it is hard to discuss them without using product names. In this case, we use the example of a private cloud built on VMware and refer to its products for manageability. VMware vCenter is a centralized management utility that can manage virtual machines, hosts, and dependent components. VMware vSphere is VMware's virtualization platform, which transforms datacenters into aggregated computing infrastructures that include CPU, storage, and networking resources.

The pattern for extending the datacenter to VMware Cloud on AWS uses Hybrid Linked Mode. Inventories in both places can then be managed through a single VMware vSphere Client interface, which ensures consistent operations and simplified administration. The pattern relies on a VMware Cloud Gateway Appliance, which can be used to manage both applications and virtual machines that remain on-premises.

There are two mutually exclusive configuration options. The first installs the Cloud Gateway Appliance and uses it to link from the on-premises vCenter Server to the cloud SDDC. The second configures Hybrid Linked Mode from the cloud SDDC. Hybrid Linked Mode can connect only one on-premises vCenter Server Enhanced Linked Mode domain, and it requires the on-premises vCenter Server to run a sufficiently recent version. When a Cloud Gateway Appliance provides the Hybrid Linked Mode connection, multiple vCenter Server instances can be connected through the appliance, but when the cloud SDDC is connected directly, only one vCenter Server is supported.

Workloads can be migrated using either a cold migration or a live migration with VMware vSphere vMotion. Factors to consider when choosing the migration method include the virtual switch type and version, the connection type to the cloud SDDC, and the virtual hardware version.

A cold migration is appropriate for virtual machines that can tolerate downtime. These virtual machines are shut down, migrated, and then powered back on. The migration time is shorter because there is no need to copy active memory, and the same holds true for applications. A live migration, on the other hand, uses vMotion to perform a rolling migration without downtime and is advisable for mission-critical applications. The idea behind vMotion is that a destination instance is prepared and made ready, and the switch from source to destination happens near-instantaneously.
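A cold migration can even be scripted. What follows is a heavily hedged pyVmomi (Python vSphere SDK) sketch rather than a VMware-prescribed procedure: the hostname, credentials, and VM name are placeholders, the empty RelocateSpec would need the destination host, datastore, and resource pool filled in, and a migration into a cloud SDDC involves additional setup.

import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVim.task import WaitForTask
from pyVmomi import vim

# Connect to vCenter (placeholder host and credentials)
si = SmartConnect(host="vcenter.example.com",
                  user="administrator@vsphere.local", pwd="secret",
                  sslContext=ssl._create_unverified_context())
content = si.RetrieveContent()

# Locate the VM by walking the inventory (the VM name is a placeholder)
view = content.viewManager.CreateContainerView(
    content.rootFolder, [vim.VirtualMachine], True)
vm = next(v for v in view.view if v.name == "app-vm-01")

# Cold migration: power off, relocate, power back on
WaitForTask(vm.PowerOffVM_Task())
spec = vim.vm.RelocateSpec()  # destination host/datastore/pool go here
WaitForTask(vm.RelocateVM_Task(spec))
WaitForTask(vm.PowerOnVM_Task())
Disconnect(si)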

This pattern promotes the visibility of existing infrastructure in the cloud.

IT organizations building a presence in the cloud have a lot in common with the datacenter operations behind a private cloud. The focus used to be primarily on agile and flexible infrastructure, which became challenging with the distributed nature of the applications deployed by various teams within the company. The operation of these application stacks evolved along with the tools that transformed how IT works, but these organizations continue to be measured by the speed, simplicity, and security with which they support their business objectives.

 

Speed is a key competitive differentiator for the customers of any infrastructure, whether on-premises or in the cloud. Leveraging datacenter locations together with a service-centric cloud operations model has become mission critical. Fueled by a workforce that now works from anywhere at any time, business resiliency and agility depend on a connective network fabric.

 

The network connects on-premises, cloud, and edge applications to the workforce, and building it is a multi-disciplinary effort among NetOps, SecOps, CloudOps, and DevOps teams. Each brings a perspective to the infrastructure and the tools that manage where workloads run, the service-level objectives defining the user experience, and the implementation of zero-trust security to protect vital business assets.

 

Enabling these teams requires real-time insights, usually delivered with an automation platform. Both cloud and datacenter operations can be adapted to the new normal of shifting workloads and distributed workforces. Delivering a consistent, simplified experience to the teams through such a platform empowers them to align and collaborate more efficiently than before. Architectural patterns and manageability interfaces that unify and simplify these administrative routines are welcome given the scale of the inventory.

 

Some datacenter automations can be fabric agnostic, but they all must share some common characteristics. These include a unified view into proactive operations with continuous assurance and actionable insights, an orchestrator to coordinate activities, and seamless access to network controllers and third-party tools or services. The orchestrator can also enforce policies across multiple network sites and enable end-to-end automation across datacenters and networks. A dashboard offers the ability to view all aspects of management through a single pane of glass. It must also define multiple personas to provide role-based access to specific teams.

 

Some gaps do exist, say between NetOps and DevOps, which can be bridged with a collaborative focal point that delves into integration: ticketing frameworks for incident management, mapping compute, storage, and network contexts for monitoring, identifying bottlenecks affecting workloads, and fine-tuning accordingly.

 

Automation also has the potential to describe infrastructure as code, infrastructure as a resource, or infrastructure as a policy. Flexible deployment operations are required throughout. Complexity is the enemy of efficiency, and tools and processes must be friendly to the operators. Automation, together with analytics, can enable them to respond quickly and make incremental progress toward their goals.
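As a minimal sketch of the first of these, infrastructure as code, the following AWS CDK (Python, v2) program declares a single versioned S3 bucket; the stack and construct names are illustrative assumptions, not a reference implementation.

from aws_cdk import App, Stack, RemovalPolicy
from aws_cdk import aws_s3 as s3
from constructs import Construct

class StaticAssetsStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Declaring the resource as code makes it reviewable and repeatable
        s3.Bucket(self, "AssetsBucket",
                  versioned=True,
                  removal_policy=RemovalPolicy.RETAIN)

app = App()
StaticAssetsStack(app, "StaticAssetsStack")
app.synth()

Synthesizing the app emits a plain CloudFormation template, so the same declaration can feed review and deployment pipelines.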


Friday, February 17, 2023

 

As enterprises and organizations survey the applications and assets to be moved to the cloud, one of the often-overlooked processes involves the entrenched, almost boutique build systems they have invested in over the years. The public clouds advocate cloud-native DevOps pipelines and automations that work well for new repositories and small projects, but when it comes to billion-dollar-plus revenue-generating source code assets, the transition of build and deployment to the cloud becomes surprisingly challenging.

New code projects and businesses can start out with a code repository in GitHub or GitHub Enterprise, with files conforming to the 100 MB limit and repository sizes conforming to the 5 GB limit. When starting clean on cloud-based DevOps, managing the inventory to retain only text in source control and move the binaries to object storage is easy. When enterprises have accrued massive repositories over time, even a copy operation becomes difficult to automate. What used to be a robocopy of large payloads on Windows must now become a transfer over S3.
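As a rough illustration, the Python sketch below walks a repository clone and copies oversized binaries to S3 with boto3; the bucket name, directory, and threshold are assumptions for the example rather than part of any product.

import os
import boto3

s3 = boto3.client("s3")
BUCKET = "build-artifacts-archive"  # hypothetical destination bucket

def archive_binaries(root_dir, size_threshold=100 * 1024 * 1024):
    """Upload files at or above GitHub's 100 MB file limit to S3."""
    for dirpath, _, filenames in os.walk(root_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getsize(path) >= size_threshold:
                key = os.path.relpath(path, root_dir).replace(os.sep, "/")
                # upload_file performs a managed (multipart) transfer
                s3.upload_file(path, BUCKET, key)
                print("archived", path, "->", "s3://" + BUCKET + "/" + key)

archive_binaries("./legacy-repo")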

One of the first challenges in moving build and capacity-planning infrastructure to the cloud is preparing the migration. External dependencies and redundancies can cause these repositories to become very large, not to mention branches and versions. Using a package manager or its equivalent to separate the dependencies into their own packages helps with reusability; Bundler (for Ruby), npm (Node's package manager), and Maven (for Java) are testament to this effect. Object storage, Artifactory, or their equivalents can store binary data and executables. Backup and restore can easily be added through cloud services when the stores themselves do not provide them.

Another challenge is properly mapping infrastructure to handle the large processes involved in continuous integration and continuous deployment. GitHub Enterprise can provide up to 32 cores and 50,000 minutes per month for public repositories of sizes up to 50 GB. The cloud, on the other hand, offers virtually limitless compute, storage, and networking, all with the convenience of pay-as-you-go billing. With an effective transformation of the DevOps automations, both the required infrastructure and the automations it supports become easier to host in the cloud. As with the first challenge, taking stock of the inventory of infrastructure resources and automation logic can be daunting. Consequently, some form of organization and nomenclature to divide the inventory into sizeable chunks can help with the transformation and even its parallelization.

A third challenge involves environment provisioning and manual testing. Subscriptions, resource groups, regions, and configurations proliferate in the cloud when such DevOps processes are transformed and migrated. This infrastructure and its state become assets to guard just as carefully as the source that is delivered through DevOps. Importing and exporting these infrastructure-as-code templates as well as their states, and forming blueprints that can include policies and reconcile state, become a necessity. A proper organization and naming convention are needed for these as well.
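As one small, hedged example of treating deployed infrastructure as an asset, the boto3 call below exports a stack's template so it can be versioned alongside the source; the stack name is an assumption.

import json
import boto3

cfn = boto3.client("cloudformation")
body = cfn.get_template(StackName="legacy-build-env")["TemplateBody"]
# YAML templates come back as strings; JSON templates as dictionaries
with open("legacy-build-env.template", "w") as f:
    f.write(body if isinstance(body, str) else json.dumps(body, indent=2))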

Other miscellaneous challenges include, but are not limited to, forming best practices and centers of excellence, creating test data, providing manual deployments and overrides, qualifying suppliers, determining governance, integrating an architecture for tools, say in the form of runbooks, handling manual releases, determining telemetry, organizing teams and managing access, supporting regulatory compliance, providing service virtualization, and providing education for special skill sets. In addition, managing size and inconsistencies, maintaining the sanctity of a production-grade system, providing an escalation path for feedback, and garnering collaboration across a landscape of organizations and teams must all be dealt with.

Finally, people, process, and technology must come together in a planned and streamlined manner to make this happen. These challenges provide a glimpse of the roadmap toward migrating builds and deployments to the cloud.

Thursday, February 16, 2023

 

One of the architectural patterns for application migration concerns managing AWS Service Catalog products in multiple AWS accounts and AWS Regions. AWS Service Catalog is used to create, share, organize, and govern curated IaC templates; it simplifies and accelerates the governance and distribution of infrastructure. AWS uses CloudFormation templates to define collections of AWS resources, known as stacks, required for a solution or a product. StackSets extend this functionality by enabling us to create, update, or delete stacks across multiple accounts and AWS Regions with a single operation.
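To make that single operation concrete, the boto3 sketch below creates a stack set and fans stack instances out to an organizational unit across two Regions. The names, OU ID, and template URL are illustrative assumptions, and the service-managed permission model presumes AWS Organizations trusted access is already enabled.

import boto3

cfn = boto3.client("cloudformation", region_name="us-east-1")

cfn.create_stack_set(
    StackSetName="shared-baseline",
    TemplateURL="https://s3.amazonaws.com/my-bucket/baseline.yaml",
    PermissionModel="SERVICE_MANAGED",
    AutoDeployment={"Enabled": True, "RetainStacksOnAccountRemoval": False},
)

# One operation deploys stacks to every account in the OU, in both Regions
cfn.create_stack_instances(
    StackSetName="shared-baseline",
    DeploymentTargets={"OrganizationalUnitIds": ["ou-examplerootid-exampleouid"]},
    Regions=["us-east-1", "us-west-2"],
)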

If a CloudFormation template must be made available to other AWS accounts or organizational units, then the portfolio is typically shared. A portfolio is a container that includes one or more products. 

This architectural pattern, on the other hand, is an alternative approach based on AWS CloudFormation StackSets. Instead of sharing the portfolio, we use StackSet constraints to set the AWS Regions and accounts where the resources can be deployed and used. This approach provisions the Service Catalog products in multiple accounts, OUs, and AWS Regions, and manages them from a central location, which meets governance requirements.

The benefits of this approach are the following:

1. The product is provisioned and managed from a primary account and is not shared with other accounts.

2. It provides a consolidated view of all provisioned products (stacks) that are based on a specific set of templates.

3. The use of a primary account makes configuration with the AWS Service Management Connector easier.

4. It is easier to query and use products from the AWS Service Catalog.

The architecture involves an AWS management account and a target organizational unit (OU) account. The CloudFormation template and the Service Catalog product live in the management account; the CloudFormation stack and its resources live in the target OU account. The user creates an AWS CloudFormation template, in JSON or YAML format, to provision AWS resources. The template defines a product in AWS Service Catalog, which is added to a portfolio. The user then creates a provisioned product, which creates CloudFormation stacks in the target accounts. Each stack provisions the resources specified in the CloudFormation template.

The steps to provision products across accounts are: 1. create a portfolio, say with the AWS Command Line Interface; 2. create the template that describes the resources; 3. create a product with a version title and description; 4. apply constraints to the portfolio to configure product deployment options such as multiple AWS accounts, Regions, and permissions; and 5. provide permissions to users so that they can launch the products in the portfolio. A minimal sketch of these steps follows.
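The boto3 sketch below walks through these five steps under stated assumptions: every name, account ID, ARN, and URL is illustrative, and a real STACKSET constraint may also need AdminRole and ExecutionRole entries.

import boto3

sc = boto3.client("servicecatalog", region_name="us-east-1")

# 1. Create a portfolio
portfolio = sc.create_portfolio(DisplayName="WebAppPortfolio",
                                ProviderName="PlatformTeam")
portfolio_id = portfolio["PortfolioDetail"]["Id"]

# 2-3. Create a product from a CloudFormation template (already in S3)
#      with a version title and description
product = sc.create_product(
    Name="WebAppStack",
    Owner="PlatformTeam",
    ProductType="CLOUD_FORMATION_TEMPLATE",
    ProvisioningArtifactParameters={
        "Name": "v1.0",
        "Description": "Initial version",
        "Info": {"LoadTemplateFromURL":
                 "https://s3.amazonaws.com/my-bucket/template.yaml"},
        "Type": "CLOUD_FORMATION_TEMPLATE",
    },
)
product_id = product["ProductViewDetail"]["ProductViewSummary"]["ProductId"]
sc.associate_product_with_portfolio(ProductId=product_id,
                                    PortfolioId=portfolio_id)

# 4. Constrain the accounts and Regions where the product may be deployed
sc.create_constraint(
    PortfolioId=portfolio_id,
    ProductId=product_id,
    Type="STACKSET",
    Parameters='{"AccountList": ["111111111111"],'
               ' "RegionList": ["us-east-1", "us-west-2"]}',
)

# 5. Let an IAM principal launch products from the portfolio
sc.associate_principal_with_portfolio(
    PortfolioId=portfolio_id,
    PrincipalARN="arn:aws:iam::111111111111:role/EndUserRole",
    PrincipalType="IAM",
)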

Wednesday, February 15, 2023

 

The Application Migration scenario for serving static content.

Single-page applications and static websites are quite popular for hosting content. By their nature, they are extremely portable and can run from filesystems as well as internet open directories. Yet security and performance are not always properly considered in their deployments. This article delves into an architectural pattern for hosting static website content in the public cloud.

We choose AWS as the public cloud for this scenario, but the pattern is universal and applies across most major public clouds.

When static content is hosted on AWS, the recommended approach is to use an S3 bucket as the origin and CloudFront to distribute the content geographically. There are two primary benefits to this solution. The first is the convenience of caching static content at edge locations. The second is the ability to define web access control lists (web ACLs) for the CloudFront distribution, which helps secure requests to the content with minimal configuration and administrative overhead.
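The security piece of this standard approach can be sketched with boto3: the policy below allows only a CloudFront origin access identity (OAI) to read the bucket. The bucket name and OAI ID are illustrative assumptions, and newer deployments may prefer origin access control over an OAI.

import json
import boto3

s3 = boto3.client("s3")
bucket = "my-static-site-bucket"   # hypothetical bucket
oai_id = "E2EXAMPLEOAIID"          # hypothetical origin access identity

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"AWS": "arn:aws:iam::cloudfront:user/"
                             "CloudFront Origin Access Identity " + oai_id},
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::" + bucket + "/*",
    }],
}
s3.put_bucket_policy(Bucket=bucket, Policy=json.dumps(policy))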

The main limitation of this standard approach is that, in some cases, virtual firewall appliances deployed in a virtual private cloud may need to inspect all content, and the standard approach does not route traffic through the virtual private cloud. An alternative is therefore needed that still uses a CloudFront distribution to serve static content from an S3 bucket but routes the traffic through the VPC by using an Application Load Balancer. An AWS Lambda function then retrieves and returns the content from the S3 bucket.

The resources in this pattern must be in a single AWS Region, but they can be provisioned in different AWS accounts. Lambda's limits apply to the maximum request and response sizes that the function can receive and send, respectively.

There must be a good balance among performance, scalability, security, and cost-effectiveness when using this approach. While Lambda can scale for high availability, the number of concurrent executions must not exceed the maximum quota; otherwise, requests will be throttled.
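One way to keep executions within quota is to reserve concurrency for the function so that a burst cannot exhaust the account-level limit. The boto3 call below is a small sketch; the function name and the value of 500 are illustrative assumptions.

import boto3

lam = boto3.client("lambda")
lam.put_function_concurrency(
    FunctionName="serve-static-content",   # hypothetical function name
    ReservedConcurrentExecutions=500,
)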

The architecture places CloudFront facing the client, communicating with a firewall and two load balancers, one in each Availability Zone of the Region hosting these resources. The load balancers are created in the public subnets of a virtual private cloud, and both communicate with a Lambda function that serves content from a private S3 bucket. When the client requests a URL, the CloudFront distribution forwards the request to a firewall, which filters it using the web ACLs applied to the distribution. If the request cannot be served from CloudFront's internal cache, it is forwarded to the load balancer, which has a listener associated with a target group based on the Lambda function. When the Lambda function is invoked, it performs a GetObject operation on the S3 bucket and returns the content as the response.

The deployment of the static content can be updated using a continuous integration / continuous deployment (CI/CD) pipeline.

The Lambda function introduced in this pattern can scale to meet the load from the load balancers. Security can be tightened by restricting the function's access to the S3 bucket origin and, similarly, by configuring the distribution with the load balancer as its origin. DNS names, rather than IP addresses, should be used to refer to these resources.

The following code demonstrates the Lambda function:

var AWS = require('aws-sdk');

exports.handler = function(event, context, callback) {
    var bucket = process.env.S3_BUCKET;
    var key = event.path.replace('/', '');
    if (key === '') { key = 'index.html'; }

    // Fetch the requested object from S3
    var s3 = new AWS.S3();
    return s3.getObject({Bucket: bucket, Key: key}, function(err, data) {
        if (err) { return callback(err); }

        // Binary content such as images must be base64-encoded in the
        // load balancer response; text content is returned as UTF-8
        var isBase64Encoded = false;
        var encoding = 'utf8';
        if (data.ContentType.indexOf('image/') > -1) {
            isBase64Encoded = true;
            encoding = 'base64';
        }

        var resp = {
            statusCode: 200,
            headers: {
                'Content-Type': data.ContentType
            },
            body: Buffer.from(data.Body).toString(encoding),
            isBase64Encoded: isBase64Encoded
        };
        callback(null, resp);
    });
};

 

Monday, February 13, 2023

 

How to draw a graph image?

A simple way is to use prebuilt libraries that implement layout algorithms such as Kamada-Kawai and Fruchterman-Reingold.

Sample implementation:

from igraph import Graph, plot

g = Graph()
vertex_labels = ['A', 'B', 'C', 'D', 'E']
g.add_vertices(5)
g.vs["label"] = vertex_labels
g.add_edges([(0, 1), (1, 2), (2, 3), (3, 4),
             (4, 0), (2, 4), (0, 3), (4, 1)])

layout = g.layout("kamada_kawai")
plot(g, layout=layout)

If we want to control the spacing ourselves and spread the edges with little or no crossing lines, we can use simulated annealing to minimize a cost that penalizes crossings and overlaps.

# The following program lays out a graph with little or no crossing lines.
from PIL import Image, ImageDraw
import math
import random

vertex = ['A', 'B', 'C', 'D', 'E']
links = [('A', 'B'), ('B', 'C'), ('C', 'D'), ('D', 'E'),
         ('E', 'A'), ('C', 'E'), ('A', 'D'), ('E', 'B')]
# One (x, y) coordinate pair per vertex, each constrained to [10, 370]
domain = [(10, 370)] * (len(vertex) * 2)

def randomoptimize(domain, costf):
    best = 999999999
    bestr = None
    for i in range(1000):
        # Create a random solution
        r = [random.randint(domain[j][0], domain[j][1])
             for j in range(len(domain))]
        # Get the cost
        cost = costf(r)
        # Compare it to the best one so far
        if cost < best:
            best = cost
            bestr = r
    return bestr

def annealingoptimize(domain, costf, T=10000.0, cool=0.95, step=1):
    # Initialize the values randomly
    vec = [float(random.randint(domain[i][0], domain[i][1]))
           for i in range(len(domain))]

    while T > 0.1:
        # Choose one of the indices
        i = random.randint(0, len(domain) - 1)
        # Choose a direction to change it
        dir = random.randint(-step, step)
        # Create a new list with one of the values changed
        vecb = vec[:]
        vecb[i] += dir
        if vecb[i] < domain[i][0]: vecb[i] = domain[i][0]
        elif vecb[i] > domain[i][1]: vecb[i] = domain[i][1]

        # Calculate the current cost and the new cost
        ea = costf(vec)
        eb = costf(vecb)
        # Acceptance probability for a worse solution (Metropolis criterion)
        p = pow(math.e, -(eb - ea) / T)
        # Is it better, or does it make the probability cutoff?
        if (eb < ea or random.random() < p):
            vec = vecb

        # Decrease the temperature
        T = T * cool
    return vec

def crosscount(v):
    # Convert the number list into a dictionary of vertex:(x,y)
    loc = dict([(vertex[i], (v[i*2], v[i*2+1])) for i in range(len(vertex))])
    total = 0

    # Loop through every pair of links
    for i in range(len(links)):
        for j in range(i + 1, len(links)):
            # Get the locations
            (x1, y1), (x2, y2) = loc[links[i][0]], loc[links[i][1]]
            (x3, y3), (x4, y4) = loc[links[j][0]], loc[links[j][1]]

            den = (y4 - y3) * (x2 - x1) - (x4 - x3) * (y2 - y1)
            # den==0 if the lines are parallel
            if den == 0: continue

            # Otherwise ua and ub are the fractions along each line
            # at which they cross
            ua = ((x4 - x3) * (y1 - y3) - (y4 - y3) * (x1 - x3)) / den
            ub = ((x2 - x1) * (y1 - y3) - (y2 - y1) * (x1 - x3)) / den

            # If the fraction is between 0 and 1 for both lines,
            # then they cross each other
            if ua > 0 and ua < 1 and ub > 0 and ub < 1:
                total += 1

    for i in range(len(vertex)):
        for j in range(i + 1, len(vertex)):
            # Get the locations of the two nodes
            (x1, y1), (x2, y2) = loc[vertex[i]], loc[vertex[j]]
            # Find the distance between them
            dist = math.sqrt(math.pow(x1 - x2, 2) + math.pow(y1 - y2, 2))
            # Penalize any nodes closer than 50 pixels
            if dist < 50:
                total += (1.0 - (dist / 50.0))
    return total

def drawnetwork(loc):
    # Create the image
    img = Image.new('RGB', (400, 400), (255, 255, 255))
    draw = ImageDraw.Draw(img)
    # Create the position dict
    pos = dict([(vertex[i], (loc[i*2], loc[i*2+1]))
                for i in range(len(vertex))])
    # Draw links
    for (a, b) in links:
        draw.line((pos[a], pos[b]), fill=(255, 0, 0))
    # Draw vertices
    for (n, p) in pos.items():
        draw.text(p, n, (0, 0, 0))
    img.save('graph.jpg', 'JPEG')
    img.show()

sol = randomoptimize(domain, crosscount)
print(crosscount(sol))
sol = annealingoptimize(domain, crosscount, step=50, cool=0.99)
print(crosscount(sol))
drawnetwork(sol)

 

Sunday, February 12, 2023

Previous article continued

Transformers changed that and, in fact, were developed for the purpose of translation. Unlike RNNs, they can be parallelized, which means transformers can be trained on large data sets. GPT-3, which writes poetry, code, and conversations, was trained on almost 45 terabytes of text data, including much of the world wide web. Transformers scale very well with huge data sets.

Transformers work very well because of three components: 1. positional encoding, 2. attention, and 3. self-attention. Positional encoding enriches the input data with positional information rather than encoding order in the structure of the network; as the network trains on lots of text, it learns to interpret those encodings. This makes transformers easier to train than RNNs. Attention refers to a concept that originated in the paper aptly titled "Attention Is All You Need". It is a structure that allows a text model to look at every single word in the original sentence when deciding how to translate a word in the output, and a heat map of attention weights helps with understanding how words align grammatically. While attention addresses the alignment of words, self-attention is about understanding the underlying meaning of a word so as to disambiguate it from other usages; this often involves an internal representation of the word, also referred to as its state. When attention is directed toward the input text, the model can distinguish between, say, "server, can I have the check" and "I crashed the server", interpreting the reference as a human in one case and a machine in the other. The context of the surrounding words informs this state.
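To make the attention mechanism concrete, here is a minimal NumPy sketch of scaled dot-product attention as described in that paper; the dimensions and random inputs are purely illustrative.

import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (sequence_length, d_k) query, key, and value matrices
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)     # pairwise similarity between tokens
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # weighted mix of the value vectors

# Example: 4 tokens, each with an 8-dimensional representation
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)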

BERT, an NLP model, makes use of attention and can be used for a variety of purposes such as text summarization, question answering, classification, and finding similar sentences. BERT also powers parts of Google Search and Google Cloud AutoML Natural Language. Google has made BERT available for download via TensorFlow, while Hugging Face provides transformer models through its Python Transformers library.
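As a small usage sketch of the Transformers library mentioned above, the snippet below loads the published bert-base-uncased checkpoint for masked-word prediction; the example sentence echoes the server disambiguation discussed earlier.

from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for guess in fill("The server brought the [MASK] to our table."):
    print(guess["token_str"], round(guess["score"], 3))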