Wednesday, June 28, 2023

The following IaC shows GitHub integration with Azure AD security groups and GitHub teams.


data "azuread_client_config" "current" {}

variable "namespace" {

   description = "The namespace for end-user deployment"

   type        = string

   # Terraform variable defaults cannot reference other variables or call
   # functions such as uuid(); compute "${var.name}-${uuid()}" in a locals
   # block instead if a generated default is needed.

}

resource "azuread_group" "contributor_group" {

  display_name     = "${var.namespace} contributor group"

  owners           = [data.azuread_client_config.current.object_id]

  security_enabled = true

  onpremises_group_type = "UniversalSecurityGroup"

  writeback_enabled     = true # onpremises_sync_enabled is a read-only attribute and cannot be set here

}

resource "azuread_group" "operator_group" {

  display_name     = "${var.namespace} operator group"

  owners           = [data.azuread_client_config.current.object_id]

  security_enabled = true

  onpremises_group_type = "UniversalSecurityGroup"

  writeback_enabled     = true # onpremises_sync_enabled is a read-only attribute and cannot be set here

}

resource "github_team" "deployment_contributors" {

  name        = "${var.namespace} contributor-team"

  description = "Has read-write access"

  privacy     = "closed"

}

resource "github_team" "deployment_operators" {

  name        = "${var.namespace} operator-team"

  description = "Has read-only access"

  privacy     = "closed"

}

resource "github_repository" "pipelines" {

  name        = "${var.namespace}-pipelines"

  description = "${var.namespace} pipeline artifacts"

  visibility = "private" # the deprecated `private` argument conflicts with `visibility`

  auto_init = true

  template {

    owner                = "MyOrganization"

    repository           = "pipeline-template"

    include_all_branches = true

  }

}

resource "github_branch" "contributors-branch" {

  repository = github_repository.pipelines.name

  branch     = "contributors-branch"

}

resource "github_branch" "operators-branch" {

  repository = github_repository.pipelines.name

  branch     = "operators-branch"

}

resource "github_branch_protection" "contributors_branch_protection" {

  repository_id = github_repository.pipelines.name

  pattern          = github_branch.contributors-branch.branch

  enforce_admins   = true

  allows_deletions = false

  push_restrictions = [

    github_team.deployment_contributors.node_id,

  ]

}

resource "github_branch_protection" "operators_branch_protection" {

  repository_id = github_repository.pipelines.name

  pattern          = github_branch.operators-branch.branch

  enforce_admins   = true

  allows_deletions = false

  push_restrictions = [

    github_team.deployment_operators.node_id,

  ]

}

 

Tuesday, June 27, 2023

 

Firewall Rules:

This article follows up on a previous one regarding firewall rules. A web application firewall (WAF) serves to deter attacks against web applications. WAFs are also referred to as Web Application Shields or Web Application Security Filters. This section of the article is aimed at technical decision makers as well as application owners, so that they are better prepared with the concepts behind the best practices for setting up a web application firewall.

Access to a web application measures the extent to which required changes to the application source code can be carried out in-house, on time, or by third parties. Between the extremes of no access and full access, a WAF is useful to consolidate access control and provide safety measures such as encryption. In between these extremes, the benefit of a WAF is smaller when the application is mostly developed in-house with few buy-ins, and greater when the application has a high percentage of modifications and more buy-ins.

Unlike securing the transport of data between clients and servers, the firewall does not come with an option to offload to an external device and is designed as a software plug-in. Prioritizing which web applications to secure behind a firewall depends on access to personal data, access to confidential information, whether the application is essential to critical business processes, and its relevance for attaining critical certifications. When access is denied by a firewall, risks and costs apply, such as interruption of business processes and damage compensation claims. The maintenance contract for an application and short error-replication times play as significant a role in how a firewall is perceived as the features that are used when it is configured correctly.
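The prioritization criteria above can be sketched as a simple weighted score. The weights and field names below are illustrative assumptions, not a standard:

```python
# Hypothetical weights for the prioritization criteria named in the text.
CRITERIA_WEIGHTS = {
    "personal_data": 3,      # access to personal data
    "confidential_info": 3,  # access to confidential information
    "critical_process": 2,   # essential to critical business processes
    "certification": 1,      # relevant to critical certifications
}

def waf_priority(app: dict) -> int:
    """Sum the weights of the criteria an application meets."""
    return sum(w for c, w in CRITERIA_WEIGHTS.items() if app.get(c))

apps = [
    {"name": "billing", "personal_data": True, "critical_process": True},
    {"name": "wiki", "certification": True},
]
# Highest score gets secured behind the WAF first.
ranked = sorted(apps, key=waf_priority, reverse=True)
```

Applications tied on score would still need a judgment call; the sketch only orders the obvious cases.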

A WAF can help with cookie protection through its support for signed and encrypted cookies. It can prevent information leakage with a cloaking or cleaning filter. It tackles session riding with URL encryption/tokens. It can check for viruses on file upload. It can deter parameter tampering and forced browsing. It provides protection against path traversal and performs link validation. It provides logging for specific or permitted parts of requests. It can force SSL, prevent cross-site tracing, command injection, and SQL injection, and provide just-in-time patching. It provides protection against HTTP request smuggling.
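Cookie signing, one of the protections listed above, can be illustrated with an HMAC. This is a minimal sketch of the idea, not a WAF's actual implementation, and the secret-key handling is out of scope:

```python
import base64
import hashlib
import hmac

SECRET = b"rotate-me"  # assumption: real deployments manage this key securely

def sign_cookie(value: str) -> str:
    """Append an HMAC so tampering on the client side can be detected."""
    mac = hmac.new(SECRET, value.encode(), hashlib.sha256).digest()
    return value + "." + base64.urlsafe_b64encode(mac).decode()

def verify_cookie(cookie: str) -> bool:
    """Recompute the signature over the value and compare in constant time."""
    value, _, _mac = cookie.rpartition(".")
    expected = sign_cookie(value)
    return hmac.compare_digest(cookie, expected)
```

A tampered cookie fails verification because the recomputed signature no longer matches the one attached to it.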

Central or decentralized infrastructure, performance criteria, conformance to existing security policies, iterative implementation from basic security to full protection, role distribution, and prioritizing applications for full protection are some of the areas of best practice.

Monday, June 26, 2023

 

Firewall rules:

Access control can be role-based: the identity of a caller is determined based on what the caller has or what the caller knows, the caller is authenticated and authorized, and permissions are then determined from the associated role. While this can be an elaborate mechanism, some simple criteria can suffice for admission control. The difference between the two becomes clear when the system does not want to provide any further information on call failures, including whether a second chance or a retry is permitted. This behavior is on par with rate limits on calls to a resource, so it is expected for API calls. The two mechanisms can even be complementary to each other.
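The contrast between the two mechanisms, and how they complement each other, can be sketched as follows. The roles, permissions, and limit are illustrative, not from any real system:

```python
# Illustrative role-to-permission mapping.
ROLE_PERMISSIONS = {
    "contributor": {"read", "write"},
    "operator": {"read"},
}

def authorize(role: str, permission: str) -> bool:
    """Role-based access control: permissions derive from the caller's role."""
    return permission in ROLE_PERMISSIONS.get(role, set())

def admit(calls_made: int, limit: int = 100) -> bool:
    """Admission control: a simple criterion that, like a rate limit,
    reveals nothing about why a call was refused or whether a retry helps."""
    return calls_made < limit

def handle(role: str, permission: str, calls_made: int) -> bool:
    """The two checks are complementary: a caller must be admitted and authorized."""
    return admit(calls_made) and authorize(role, permission)
```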

An allowlist allows access to those in the list but does not specify any behavior for those that are not in the list. That is why another list is required to articulate who must be denied. The addition of a single entity to an allowlist implicitly adds all others to the deny list, so some systems go the extra length of automatically adding a deny-all whenever an allowlist entry is added. These complementary lists must be made mutually exclusive, with priority given to allow and a catch-all of deny.
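The mutual-exclusivity and precedence rules above can be sketched in a few lines; the names are illustrative:

```python
def reconcile(allowlist, denylist):
    """Keep the two lists mutually exclusive, resolving overlap in favor of allow."""
    allow = set(allowlist)
    deny = set(denylist) - allow
    return allow, deny

def decide(caller, allow, deny):
    if caller in allow:
        return "allow"  # allow entries take priority
    if caller in deny:
        return "deny"   # explicit deny
    return "deny"       # catch-all: anything unlisted is denied
```

With overlap resolved toward allow and the unlisted falling through to deny, adding a single allow entry effectively denies everyone else, as the text notes.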

An allowlist is maintained across various types of clients to the resource. Everyone must opt into the allowlist, or the system must enforce it; otherwise the allowlist simply means nothing. It is also harder to enforce when the system does not actively monitor or make use of the allowlist and it is understood and enforced outside the system by third parties.

Inclusion of an allowlist or a denylist is sometimes considered an antipattern, especially when the system does not need to provide admission control and can scale arbitrarily to any load without affecting other callers. When the calls of a few specific clients or bots could deny service to others, these lists can be justified. If there are many criteria by which allow or deny must be decided, then the lists don’t suffice, and a fully developed classifier that encapsulates the decision-making and interprets the properties of the callers might be justified. The convention for evaluating rules in a classifier is to evaluate them one by one, in program order. The processing can stop at any rule or fall through the rest of the rules. The rules are evaluated as if there were an ‘or’ condition between the previous and the current. The logic inside a single rule, on the other hand, can be complex, involving both the ‘or’ and ‘and’ logical operators.

Often allowlists and denylists become part of a rule, and the system stores and processes these rules. The expressions in a rule could be evaluated as a tree if they were not restricted to a flat sequence of first-level predicates. The evaluation of an expression tree is recursive and might require the plan to be compiled and saved so that it is not repeatedly prepared for matching against the criteria of a caller. The way regular expressions are compiled before matching speaks to how the classifier should match the incoming criteria.
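A first-match classifier with patterns compiled once up front, as described above, might look like this sketch. The rule patterns and caller names are illustrative:

```python
import re

# Rules are compiled once (like regular expressions before matching) and
# then evaluated in program order, stopping at the first rule that matches.
RULES = [
    ("deny",  re.compile(r"^bot-")),                  # deny known bots
    ("allow", re.compile(r"^svc-(billing|search)$")), # allow specific services
    ("deny",  re.compile(r".*")),                     # catch-all deny
]

def classify(caller: str) -> str:
    for action, pattern in RULES:
        if pattern.match(caller):  # stop at the first matching rule
            return action
    return "deny"  # defensive default if no rule matched
```

Within a single rule, the compiled pattern can be arbitrarily complex (alternation, conjunction via lookaheads), while the rule-to-rule relationship stays a simple first-match ‘or’.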

Sunday, June 25, 2023

 

#Requires -Version 2.0

<#

Synopsis: The following PowerShell script serves as the complementary

example for the backup and restore of an AKS cluster introduced with the backup script.

The concept behind this form of BCDR solution is described here:

https://learn.microsoft.com/en-us/azure/backup/azure-kubernetes-service-cluster-backup-concept

#>

param (

    [Parameter(Mandatory=$true)][string]$resourceGroupName,

    [Parameter(Mandatory=$true)][string]$accountName,

    [Parameter(Mandatory=$true)][string]$subscriptionId,

    [Parameter(Mandatory=$true)][string]$aksClusterName,

    [Parameter(Mandatory=$true)][string]$aksClusterRG,

    [string]$backupVaultRG = "testBkpVaultRG",

    [string]$backupVaultName = "TestBkpVault",

    [string]$location = "westus",

    [string]$containerName = "backupc",

    [string]$storageAccountName = "sabackup",

    [string]$storageAccountRG = "rgbackup",

    [string]$environment = "AzureCloud"

)

 

Connect-AzAccount -Environment "$environment"

Set-AzContext -SubscriptionId "$subscriptionId"

Write-Host "Before we start, test the backup vault"

$TestBkpVault = Get-AzDataProtectionBackupVault -ResourceGroupName $backupVaultRG -VaultName $backupVaultName -ErrorAction SilentlyContinue

if ($null -eq $TestBkpVault) {

    Write-Host "This script should not be executed if the vault cannot be found."

    exit 1

}

 

$policyDefn = Get-AzDataProtectionPolicyTemplate -DatasourceType AzureKubernetesService

$policyDefn.PolicyRule[0].Trigger | fl

 

# Sample output:
# ObjectType: ScheduleBasedTriggerContext
# ScheduleRepeatingTimeInterval: {R/2023-04-05T13:00:00+00:00/PT4H}
# TaggingCriterion: {Default}

 

$policyDefn.PolicyRule[1].Lifecycle | fl

 

# Sample output:
# DeleteAfterDuration: P7D
# DeleteAfterObjectType: AbsoluteDeleteOption
# SourceDataStoreObjectType: DataStoreInfoBase
# SourceDataStoreType: OperationalStore
# TargetDataStoreCopySetting:

 

 

$aksBkpPol = Get-AzDataProtectionBackupPolicy -ResourceGroupName $backupVaultRG -VaultName $TestBkpVault.Name -Name "aksBkpPolicy"

 

if ($null -eq $aksBkpPol) {

   Write-Host "This script should not be executed if there was no backup policy."

   exit 1

}

 

Write-Host "Tracking all the backup jobs"

$job = Search-AzDataProtectionJobInAzGraph -Subscription $subscriptionId -ResourceGroupName $backupVaultRG -Vault $TestBkpVault.Name -DatasourceType AzureKubernetesService  -Operation OnDemandBackup

 

Write-Host "Fetch the relevant recovery point"

$AllInstances = Get-AzDataProtectionBackupInstance -ResourceGroupName $backupVaultRG -VaultName $TestBkpVault.Name

 

Write-Host "Searching across multiple vaults and subscriptions"

$AllInstances = Search-AzDataProtectionBackupInstanceInAzGraph -Subscription $subscriptionId -ResourceGroup $backupVaultRG -Vault $TestBkpVault.Name -DatasourceType AzureKubernetesService -ProtectionStatus ProtectionConfigured

if ($null -eq $AllInstances) {

   Write-Host "This script should not be executed if there was no backup instance."

   exit 1

}

Write-Host "Once the instance is identified, fetch the relevant recovery point"

$rp = Get-AzDataProtectionRecoveryPoint -ResourceGroupName $backupVaultRG -VaultName $TestBkpVault.Name -BackupInstanceName $AllInstances[2].BackupInstanceName

 

Write-Host "Prepare the restore request"

$aksClusterId = "/subscriptions/$subscriptionId/resourceGroups/$aksClusterRG/providers/Microsoft.ContainerService/managedClusters/$aksClusterName"

$aksRestoreCriteria = New-AzDataProtectionRestoreConfigurationClientObject -DatasourceType AzureKubernetesService  -PersistentVolumeRestoreMode RestoreWithVolumeData  -IncludeClusterScopeResource $true -NamespaceMapping  @{"sourceNamespace"="targetNamespace"}

$backupInstance = $AllInstances[2]

$aksRestoreRequest = Initialize-AzDataProtectionRestoreRequest -DatasourceType AzureKubernetesService -SourceDataStore OperationalStore -RestoreLocation $location -RestoreType OriginalLocation -RecoveryPoint $rp[0].Property.RecoveryPointId -RestoreConfiguration $aksRestoreCriteria -BackupInstance $backupInstance

 

Write-Host "Trigger the restore"

$validateRestore = Test-AzDataProtectionBackupInstanceRestore -SubscriptionId $subscriptionId -ResourceGroupName $aksClusterRG -VaultName $backupVaultName -RestoreRequest $aksRestoreRequest -Name $backupInstance.BackupInstanceName

$restoreJob = Start-AzDataProtectionBackupInstanceRestore -SubscriptionId $subscriptionId -ResourceGroupName $aksClusterRG -VaultName $backupVaultName -BackupInstanceName $backupInstance.BackupInstanceName -Parameter $aksRestoreRequest

 

 

Write-Host "Track all the restore jobs"

$job = Search-AzDataProtectionJobInAzGraph -Subscription $subscriptionId -ResourceGroupName $backupVaultRG -Vault $TestBkpVault.Name -DatasourceType AzureKubernetesService -Operation Restore

Saturday, June 24, 2023

 

#Requires -Version 2.0

<#

Synopsis: The following PowerShell script serves as a partial example

towards backup and restore of an AKS cluster.

The concept behind this form of BCDR solution is described here:

https://learn.microsoft.com/en-us/azure/backup/azure-kubernetes-service-cluster-backup-concept

#>

param (

    [Parameter(Mandatory=$true)][string]$resourceGroupName,

    [Parameter(Mandatory=$true)][string]$accountName,

    [Parameter(Mandatory=$true)][string]$subscriptionId,

    [Parameter(Mandatory=$true)][string]$aksClusterName,

    [Parameter(Mandatory=$true)][string]$aksClusterRG,

    [string]$backupVaultRG = "testBkpVaultRG",

    [string]$backupVaultName = "TestBkpVault",

    [string]$location = "westus",

    [string]$containerName = "backupc",

    [string]$storageAccountName = "sabackup",

    [string]$storageAccountRG = "rgbackup",

    [string]$environment = "AzureCloud"

)

 

Connect-AzAccount -Environment "$environment"

Set-AzContext -SubscriptionId "$subscriptionId"

$storageSetting = New-AzDataProtectionBackupVaultStorageSettingObject -Type LocallyRedundant -DataStoreType OperationalStore

New-AzDataProtectionBackupVault -ResourceGroupName $backupVaultRG -VaultName $backupVaultName -Location $location -StorageSetting $storageSetting

$TestBkpVault = Get-AzDataProtectionBackupVault -VaultName $backupVaultName

$policyDefn = Get-AzDataProtectionPolicyTemplate -DatasourceType AzureKubernetesService

$policyDefn.PolicyRule[0].Trigger | fl

 

# Sample output:
# ObjectType: ScheduleBasedTriggerContext
# ScheduleRepeatingTimeInterval: {R/2023-04-05T13:00:00+00:00/PT4H}
# TaggingCriterion: {Default}

 

$policyDefn.PolicyRule[1].Lifecycle | fl

 

# Sample output:
# DeleteAfterDuration: P7D
# DeleteAfterObjectType: AbsoluteDeleteOption
# SourceDataStoreObjectType: DataStoreInfoBase
# SourceDataStoreType: OperationalStore
# TargetDataStoreCopySetting:

 

New-AzDataProtectionBackupPolicy -ResourceGroupName $backupVaultRG -VaultName $TestBkpVault.Name -Name aksBkpPolicy -Policy $policyDefn

 

$aksBkpPol = Get-AzDataProtectionBackupPolicy -ResourceGroupName $backupVaultRG -VaultName $TestBkpVault.Name -Name "aksBkpPolicy"

 

Write-Host "Installing the AKS backup extension with the Azure CLI"

az k8s-extension create --name azure-aks-backup --extension-type microsoft.dataprotection.kubernetes --scope cluster --cluster-type managedClusters --cluster-name $aksClusterName --resource-group $aksClusterRG --release-train stable --configuration-settings blobContainer=$containerName storageAccount=$storageAccountName storageAccountResourceGroup=$storageAccountRG storageAccountSubscriptionId=$subscriptionId

 

az k8s-extension show --name azure-aks-backup --cluster-type managedClusters --cluster-name $aksClusterName --resource-group $aksClusterRG

 

az k8s-extension update --name azure-aks-backup --cluster-type managedClusters --cluster-name $aksClusterName --resource-group $aksClusterRG --release-train stable --config-settings blobContainer=$containerName storageAccount=$storageAccountName storageAccountResourceGroup=$storageAccountRG storageAccountSubscriptionId=$subscriptionId # [cpuLimit=1] [memoryLimit=1Gi]

 

az role assignment create --assignee-object-id $(az k8s-extension show --name azure-aks-backup --cluster-name $aksClusterName --resource-group $aksClusterRG --cluster-type managedClusters --query identity.principalId --output tsv) --role 'Storage Account Contributor' --scope /subscriptions/$subscriptionId/resourceGroups/$storageAccountRG/providers/Microsoft.Storage/storageAccounts/$storageAccountName

 

az aks trustedaccess rolebinding create `
  -g $aksClusterRG `
  --cluster-name $aksClusterName `
  -n randomRoleBindingName `
  --source-resource-id $TestBkpVault.Id `
  --roles Microsoft.DataProtection/backupVaults/backup-operator

 

Write-Host "This section is a detailed overview of Trusted Access"

az extension add --name aks-preview

az extension update --name aks-preview

az feature register --namespace "Microsoft.ContainerService" --name "TrustedAccessPreview"

az feature show --namespace "Microsoft.ContainerService" --name "TrustedAccessPreview"

az provider register --namespace Microsoft.ContainerService

# Create a Trusted Access RoleBinding in an AKS cluster

 

az aks trustedaccess rolebinding create --resource-group $aksClusterRG --cluster-name $aksClusterName -n randomRoleBindingName -s $connectedServiceResourceId --roles backup-operator,backup-contributor #,Microsoft.Compute/virtualMachineScaleSets/test-node-reader,Microsoft.Compute/virtualMachineScaleSets/test-admin

 

 

Write-Host "Update an existing Trusted Access Role Binding with new roles"

# Update RoleBinding command

 

az aks trustedaccess rolebinding update --resource-group $aksClusterRG --cluster-name $aksClusterName -n randomRoleBindingName  --roles backup-operator,backup-contributor

 

 

Write-Host "Configure Backup"

$sourceClusterId = "/subscriptions/$subscriptionId/resourcegroups/$aksClusterRG/providers/Microsoft.ContainerService/managedClusters/$aksClusterName"

 

Write-Host "Snapshot resource group"

$snapshotRG = "/subscriptions/$subscriptionId/resourcegroups/snapshotrg"

 

Write-Host "The configuration of backup is performed in two steps"

$backupConfig = New-AzDataProtectionBackupConfigurationClientObject -SnapshotVolume $true -IncludeClusterScopeResource $true -DatasourceType AzureKubernetesService -LabelSelector "env=$environment"

$backupInstance = Initialize-AzDataProtectionBackupInstance -DatasourceType AzureKubernetesService -DatasourceLocation $location -PolicyId $aksBkpPol.Id -DatasourceId $sourceClusterId -SnapshotResourceGroupId $snapshotRG -FriendlyName "Backup of AKS Cluster $aksClusterName" -BackupConfiguration $backupConfig

 

Write-Host "Assign required permissions and validate"

$aksCluster = $(Get-AzAksCluster -Id $sourceClusterId)

Set-AzDataProtectionMSIPermission -BackupInstance $backupInstance -VaultResourceGroup $backupVaultRG -VaultName $backupVaultName -PermissionsScope "ResourceGroup"

Test-AzDataProtectionBackupInstanceReadiness -ResourceGroupName $resourceGroupName -VaultName $backupVaultName -BackupInstance $backupInstance.Property

 

Write-Host "Protect the AKS cluster"

New-AzDataProtectionBackupInstance -ResourceGroupName $backupVaultRG -VaultName $TestBkpVault.Name -BackupInstance $backupInstance.Property

 

Write-Host "Run on-demand backup"

$instance = Get-AzDataProtectionBackupInstance -SubscriptionId $subscriptionId -ResourceGroupName $backupVaultRG -VaultName $TestBkpVault.Name -Name $aksClusterName

 

Write-Host "Specify Retention Rule"

$policyDefn.PolicyRule | fl

# Sample output:
# BackupParameter: Microsoft.Azure.PowerShell.Cmdlets.DataProtection.Models.Api20210201Preview.AzureBackupParams
# BackupParameterObjectType: AzureBackupParams
# DataStoreObjectType: DataStoreInfoBase
# DataStoreType: OperationalStore
# Name: BackupHourly
# ObjectType: AzureBackupRule
# Trigger: Microsoft.Azure.PowerShell.Cmdlets.DataProtection.Models.Api20210201Preview.ScheduleBasedTriggerContext
# TriggerObjectType: ScheduleBasedTriggerContext
#
# IsDefault: True
# Lifecycle: {Microsoft.Azure.PowerShell.Cmdlets.DataProtection.Models.Api20210201Preview.SourceLifeCycle}
# Name: Default
# ObjectType: AzureRetentionRule

 

Write-Host "Trigger on-demand backup"

$AllInstances = Get-AzDataProtectionBackupInstance -ResourceGroupName $backupVaultRG -VaultName $TestBkpVault.Name

 

Backup-AzDataProtectionBackupInstanceAdhoc -BackupInstanceName $AllInstances[0].Name -ResourceGroupName $backupVaultRG -VaultName $TestBkpVault.Name -BackupRuleOptionRuleName "Default"

 

Write-Host "Tracking all the backup jobs"

$job = Search-AzDataProtectionJobInAzGraph -Subscription $subscriptionId -ResourceGroupName $backupVaultRG -Vault $TestBkpVault.Name -DatasourceType AzureKubernetesService -Operation OnDemandBackup

 

Friday, June 23, 2023

 

How to address IaC shortcomings – Part 6b? 

A previous article discussed a resolution to IaC shortcomings when declaring resources with configuration not yet supported by an IaC repository. This article discusses irreversible changes and manual intervention for certain IaC deployments.

IaC is an agreement between the IaC provider and the resource provider. An attribute of a resource can only be applied when the IaC-provider applies it in the way the resource provider expects and the resource-provider provisions in the way that the IaC provider expects. In many cases, this is honored but some attributes can get out of sync resulting in unsuccessful deployments of what might seem to be correct declarations. 

One of the limitations arises when one resource is created as part of the configuration of another resource and an association is formed between the two. It should be possible to reverse the rollout by disassociating the resources before deleting the one that was created. However, sometimes the associations cannot be broken by IaC or by actions on the management portal for the resources, and other forms of management, such as the Azure CLI, must be used. In such cases, manual intervention or logic introduced into the pipeline is required to break the impasse. Only by applying the mitigation and running the IaC twice, once to detect the conflict for the existing resources and a second time to reset the configuration, will the IaC start succeeding in subsequent deployments.

The destroy of an existing resource and the creation of a new resource may also be required to keep the state in sync with the IaC. If a resource is missing from the state, it might be interpreted as a resource that was not in the IaC to begin with, requiring a destroy before the IaC-recognized creation occurs.

It is possible to make use of the best of both worlds with a folder structure that separates the Terraform templates into a folder called ‘module’ and the resource provider templates into another folder at the same level, named something like ‘subscription-deployments’, which includes native blueprints and templates. The GitHub workflow definitions can then handle either location and trigger the workflow on any changes to either of them.

The native support for extensibility depends on naming and logic.  

Naming is facilitated with canned prefixes/suffixes and dynamic random string to make each rollout independent of the previous. Some examples include: 

resource "random_string" "unique" { 

  count   = var.enable_static_website && var.enable_cdn_profile ? 1 : 0 

  length  = 8 

  special = false 

  upper   = false 

} 

  

Logic can be written in PowerShell, the de facto automation language for the Azure public cloud. A pseudo resource can then be added using this logic as follows:

resource "null_resource" "add_custom_domain" { 

  count = var.custom_domain_name != null ? 1 : 0 

  triggers = { always_run = timestamp() } 

  depends_on = [ 

    azurerm_app_service.web-app 

  ] 

  

  provisioner "local-exec" { 

    command = "pwsh ${path.module}/Setup-AzCdnCustomDomain.ps1" 

    environment = { 

      CUSTOM_DOMAIN      = var.custom_domain_name 

      RG_NAME            = var.resource_group_name 

      FRIENDLY_NAME      = var.friendly_name 

      STATIC_CDN_PROFILE = var.cdn_profile_name 

    } 

  } 

} 

  

PowerShell scripts can help with both the deployment and the pipeline automation. There are a few caveats with scripts: the general preference is for declarative and idempotent IaC rather than scripts, so extensibility must be given the same due consideration as customization.

All scripts can be stored in folders with names ending with ‘scripts’. 
These are sufficient to address the above-mentioned shortcomings in the Infrastructure-as-Code. 

 

 

 

This section returns to Terraform and discusses the order and repetition involved in IaC deployments.


For instance, some attributes of a resource can be specified via the IaC provider but go completely ignored by the resource provider. If there are two attributes that can be specified, the resource-provider reserves the right to prioritize one over the other. Even when a resource attribute is correctly specified, the resource provider could mandate the destroy of existing resource and the creation of a new resource.  A more common case is one when where the IaC wants to add a new property for all resources of a specific resource type but there are already existing resources that do not have that property initialized. In such a case, the applying of the IaC change to add a new property will fail for existing instances but succeed for the new instances. Only by running the IaC twice, once to detect the missing property for the existing resources and initialize and second to correctly report the new property, will the IaC start succeeding in subsequent deployments. 


 

 

 

 

Thursday, June 22, 2023

 
