Cluster computing

Monday, January 17, 2022

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing that included the most recent networking discussion on Azure Private Link, which is a full-fledged general availability service. A follow-up article discussed troubleshooting the network connectivity issues encountered with Azure Private Endpoints.

This article discusses the troubleshooting of network connectivity problems as encountered with Azure Private Link Service from operations:

1) When users connect to their Platform-as-a-Service (PaaS) services such as Azure Storage, Azure Cosmos DB, and Azure SQL Database using Azure Private Link, they are connecting to Azure private endpoints on their virtual networks. Traffic between the virtual network and the service goes over the Microsoft backbone network, which eliminates exposure to the public internet. Private Link also helps service providers deliver their own services privately to their customers: a service can be enabled for Azure Private Link access by placing it behind a Standard Load Balancer. In this case as well, the customers create a private endpoint inside their own virtual network and map it privately to the service, rather than requiring the endpoint to be deployed with the service. This solution effectively brings those services to the consumers over private access. By definition, a virtual network and the private link must be in the same region. Connecting a virtual network across multiple regions is not feasible without regionally or globally peered virtual networks. Customers on-premises can also reach these private links over VPN or Azure ExpressRoute circuits, but care must be taken to ensure that the workloads are not impacted.

2) When a connectivity problem is encountered, the setup configuration must be checked. The Private Link Center provides an option to do so. The private link is selected for diagnosis, and the virtual network and DNS information are verified to be correct. The connection state must be Approved. The resource must have connectivity to the virtual network that hosts the private link. The FQDN and the private IP address must be assigned. If these are correct, the setup is fine, but data transfer might still not be occurring. Azure Monitor displays the Bytes In and Bytes Out metrics for an Azure Private Link. After a connection attempt to the private link, the metrics should show data flowing within ten minutes. If this is not the case, the connection from the resource must be checked. The outbound connections page for the resource has a connection troubleshooting option in the form of a Test Connection feature. This can be used with Network Watcher for troubleshooting the connection. It is preferable to test by the Fully Qualified Domain Name (FQDN) of the resource.

3) When a connection problem lies outside the private link and its configuration, name resolution, Network Security Groups (NSGs), and effective routes must be investigated. Name resolution errors might occur due to DNS settings rather than IP connectivity. The steps to check the DNS settings depend on whether a private zone or a custom DNS is used. If a private zone is used, the corresponding DNS zone record must exist. If it does not exist, it must be created. Delegation must be set up properly for the domain DNS to resolve the child records. If a custom DNS is used, the settings must resolve to the private IP address of the private link. Existing Azure services might already have a DNS configuration that is used when connecting to a public IP address. This configuration must be overridden to connect to the private link.
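The name-resolution check in step 3 can be sketched in Python. This is a minimal sketch rather than an Azure tool: it resolves an FQDN with the standard library and reports whether each returned address is private. The function names are hypothetical, and the resolver is injectable so the logic can be tried without network access.

```python
import ipaddress
import socket


def is_private_ip(address: str) -> bool:
    """Return True when the address is in a private (non-public) range."""
    return ipaddress.ip_address(address).is_private


def resolve(fqdn: str) -> list[str]:
    """Resolve an FQDN to its unique IP addresses using the system resolver."""
    infos = socket.getaddrinfo(fqdn, None)
    return sorted({info[4][0] for info in infos})


def check_private_resolution(fqdn: str, resolver=resolve) -> dict[str, bool]:
    """Map each resolved address to whether it is private.

    `resolver` is a hypothetical hook: pass a stub to test offline.
    """
    return {ip: is_private_ip(ip) for ip in resolver(fqdn)}
```

Run from a machine inside the virtual network, a result in which every address maps to True suggests that DNS is returning the private IP address; any False entry suggests that the public DNS configuration is still in effect.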

These are some of the ways in which private connectivity can be made to work and used effectively as a replacement for public connectivity.

Sunday, January 16, 2022

This article discusses the troubleshooting of network connectivity problems as encountered with Azure Private Endpoints from operations:

1) When users convert their storage accounts or key vaults to use Azure private endpoints, they are taking them off the publicly accessible internet and exposing them internally to their virtual networks. If they have existing usages of their resources that span more than one region, they will not be able to keep up those usages once those resources are no longer reachable at publicly vulnerable addresses. This does not impact usages that are all within the virtual network, because an Azure private endpoint is a network interface that connects the other resources privately and securely to a private link service. The workloads running on other resources are given an option for private network connectivity in place of the publicly vulnerable connectivity. This solution effectively brings those services to the virtual network.

By definition, a virtual network and the private endpoint must be in the same region. Connecting a virtual network across multiple regions is not feasible without regionally or globally peered virtual networks. Customers on-premises can also reach these private endpoints over VPN or Azure ExpressRoute circuits, but care must be taken to ensure that the workloads are not impacted.

2) When a connectivity problem is encountered, the setup configuration must be checked. The Private Link Center provides an option to do so. The private endpoint is selected for diagnosis, and the virtual network and DNS information are verified to be correct. The connection state must be Approved. The resource must have connectivity to the virtual network that hosts the private endpoint. The FQDN and the private IP address must be assigned. If these are correct, the setup is fine, but data transfer might still not be occurring. Azure Monitor displays the Bytes In and Bytes Out metrics for an Azure private endpoint. After a connection attempt to the private endpoint, the metrics should show data flowing within ten minutes. If this is not the case, the connection from the resource must be checked. The outbound connections page for the resource has a connection troubleshooting option in the form of a Test Connection feature. This can be used with Network Watcher for troubleshooting the connection. It is preferable to test by the Fully Qualified Domain Name (FQDN) of the resource.

3) When a connection problem lies outside the private endpoint and its configuration, name resolution, Network Security Groups (NSGs), and effective routes must be investigated. Name resolution errors might occur due to DNS settings rather than IP connectivity. The steps to check the DNS settings depend on whether a private zone or a custom DNS is used. If a private zone is used, the corresponding DNS zone record must exist. If it does not exist, it must be created. Delegation must be set up properly for the domain DNS to resolve the child records. If a custom DNS is used, the settings must resolve to the private IP address of the private endpoint. Existing Azure services might already have a DNS configuration that is used when connecting to a public endpoint. This configuration must be overridden to connect to the private endpoint.
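The configuration checks in step 2 amount to a short checklist, which can be sketched in Python. The `EndpointConfig` fields below are hypothetical names chosen for illustration; they do not correspond to an Azure SDK type, and the values would have to be read from the portal or another source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EndpointConfig:
    # Hypothetical snapshot of what the Private Link Center and Azure Monitor show.
    connection_state: str
    fqdn: Optional[str]
    private_ip: Optional[str]
    bytes_in: int = 0
    bytes_out: int = 0


def diagnose(cfg: EndpointConfig) -> list[str]:
    """Return findings in the order in which step 2 checks them."""
    findings = []
    if cfg.connection_state != "Approved":
        findings.append("connection state is not Approved")
    if not cfg.fqdn:
        findings.append("FQDN is not assigned")
    if not cfg.private_ip:
        findings.append("private IP address is not assigned")
    # Only when the setup is correct does the absence of traffic become the finding.
    if not findings and cfg.bytes_in == 0 and cfg.bytes_out == 0:
        findings.append("setup is correct but no data is flowing; test the connection from the resource")
    return findings
```

An empty list means the setup checks pass and data is flowing; otherwise the findings point to the next thing to inspect.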

These are some of the ways in which private connectivity can be made to work and used effectively as a replacement for public connectivity.

Saturday, January 15, 2022

This is a continuation of the sample queries written for Azure Public Cloud for diagnostic purposes. The topic was introduced in this article earlier.

Sample Kusto queries:

1) When log entries do not have function names, scopes or duration of calls:

source

| where description contains "<string-before-scope-of-execution>"

| project SessionId, StartTime=timestamp

| join (source

| where description contains "<string-after-scope-of-execution>"

| project StopTime=timestamp, SessionId)

on SessionId

| project SessionId, StartTime, StopTime, duration = StopTime - StartTime

| summarize count() by duration=bin(duration/1s, 10)

| sort by duration asc

| render barchart

2) Since the duration column is also relevant to other queries later:

source | extend duration = endTime - sourceTime

3) When the log entries do not have an exact match for a literal:

source

| where EventText contains "NotifyPerformanceCounters"

| extend Tenant = extract("tenantName=([^,]+),", 1, EventText)

4) If we wanted to extract several typed fields from EventText with a parse pattern:

source

| parse EventText with * "resourceName=" resourceName ", totalSlices=" totalSlices:long * "releaseTime=" releaseTime:date ")" *

| where valid in~ ("true", "false")

5) If we wanted to read sign-in logs:

source

| evaluate bag_unpack(LocationDetails)

| where RiskLevelDuringSignIn == 'none'

and TimeGenerated >= ago(7d)

| summarize Count = count() by city

| sort by Count desc

| take 5
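For readers less familiar with Kusto, the logic of the first query above can be mirrored in plain Python: pair the start and stop records by SessionId, compute each duration, and count the sessions in 10-second buckets. This is only a sketch of the idea. The function name and the tuple layout are invented for illustration, and it uses a plain inner join, whereas the default join flavor in Kusto (innerunique) deduplicates the left-side keys.

```python
from collections import Counter
from datetime import datetime, timedelta


def duration_histogram(rows, start_marker, stop_marker, bucket_seconds=10):
    """Count sessions per duration bucket.

    rows: iterable of (session_id, timestamp, description) tuples.
    Matching is case-insensitive, like the Kusto `contains` operator.
    """
    rows = list(rows)
    starts = [(s, t) for s, t, d in rows if start_marker.lower() in d.lower()]
    stops = [(s, t) for s, t, d in rows if stop_marker.lower() in d.lower()]
    counts = Counter()
    for start_id, start_time in starts:
        for stop_id, stop_time in stops:
            if start_id == stop_id:  # join on SessionId
                seconds = (stop_time - start_time).total_seconds()
                # Equivalent of bin(duration/1s, 10): round down to the bucket start.
                bucket = int(seconds // bucket_seconds) * bucket_seconds
                counts[bucket] += 1
    return sorted(counts.items())  # ascending by bucket, like `sort by duration asc`
```

Each returned pair is a bucket start in seconds and the number of sessions whose duration falls in that bucket, which is what the bar chart in the query plots.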

Friday, January 14, 2022

Virtual Network gateways in availability zones:

VPN and ExpressRoute gateways can be deployed to Azure Availability Zones. Previously, they were deployed to regions, but now we can deploy them to the zones within a region. On one hand, this improves the resiliency, scalability, and availability of virtual network gateways; on the other hand, it opens more opportunities for the use of the gateways, particularly with Azure Traffic Manager. Deploying gateways in Azure Availability Zones physically and logically separates gateways within a region, while protecting the on-premises network connectivity to Azure from zone-level failures. By deploying zonal gateways to each of the three zones and spanning a Traffic Manager over the gateways, we can now route traffic with zone isolation. This helps with simulations of an availability zone going down. A use of Traffic Manager to divert traffic was described earlier in this article: https://1drv.ms/w/s!Ashlm-Nw-wnWzhVd4TIY70gOs48M?e=ma9y5q

When we deploy gateways across availability zones, we can use zone-redundant virtual network gateways. This adds zone resilience to mission-critical, scalable services. Zone-redundant and zonal gateways both rely on the Standard SKU of the Azure public IP resource. When the public IP address is created using the Standard public IP SKU, the behavior depends on whether the gateway is a VPN gateway or an ExpressRoute gateway. For a VPN gateway, the two gateway instances are deployed in any two of the three availability zones to provide zone redundancy. An ExpressRoute gateway can span all three zones.

This can be compared to a zonal gateway, where all the gateway instances are deployed in the same zone, which is specified by the user. The zones are identified by the numerals 1, 2, or 3, and there can be up to three zones within a region. The public IP address must be created using the Standard public IP SKU.

When a regional gateway is deployed with a Basic public IP SKU, the gateway does not have zone redundancy built into it. Instead, when the gateways are deployed with zone redundancy across availability zones, each availability zone is a different fault and update domain. This makes the gateway more reliable, available and resilient to zone failures.
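The difference between a zonal and a zone-redundant placement can be illustrated with a few lines of Python. This is a toy model and not an Azure API: the function names are hypothetical, and the two-instance placements are assumptions made for the example.

```python
def surviving_instances(instance_zones, failed_zone):
    """Return the zones of the gateway instances still running after one zone fails."""
    return [zone for zone in instance_zones if zone != failed_zone]


def is_available(instance_zones, failed_zone):
    """A gateway stays available while at least one instance survives."""
    return len(surviving_instances(instance_zones, failed_zone)) > 0


# Assumed placements for illustration only.
zonal = [1, 1]           # both instances pinned to zone 1
zone_redundant = [1, 2]  # instances spread over two of the three zones

for failed in (1, 2, 3):
    print(failed, is_available(zonal, failed), is_available(zone_redundant, failed))
```

In this model the zonal gateway is lost when its own zone fails, while the zone-redundant gateway keeps at least one instance whichever single zone goes down.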

The Azure portal can be used to deploy the SKUs but the SKUs will be seen only in those regions that have availability zones. These gateways must be created new. They cannot be changed, migrated or upgraded from existing gateways to zone-redundant or zonal gateways. Co-existence of both VPN and ExpressRoute gateways in the same virtual network is supported but a /27 IP address range must be reserved for the gateway subnet.

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing that included the most recent discussion on Azure VPN Gateway, which is a full-fledged general availability service.

Thursday, January 13, 2022

Symmetric Keys part 2

This is a continuation of the discussion on symmetric keys and their use in encrypting data. The algorithm and its output were brought up in the previous article but not the operational engineering considerations. This article completes that discussion.

The result of generating a key is a 16-byte initialization vector (IV) and a 32-byte key. When encrypted data is written to a file, the initialization vector must be included. Symmetric key encryption is fast and efficient, and it can work on blocks and streams for large data. Asymmetric encryption can be used for small-sized data such as a password or a symmetric key.
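A minimal Python sketch of these sizes and of storing the vector (IV) with the data follows. It assumes a 32-byte key and a 16-byte IV, and it assumes the common convention of writing the IV as a prefix of the ciphertext, which the text above does not prescribe. The encryption step itself is left out because the Python standard library has no AES implementation; the function names are hypothetical.

```python
import secrets

IV_SIZE = 16   # bytes
KEY_SIZE = 32  # bytes


def generate_key_and_iv():
    """Return a random (key, iv) pair from a cryptographically secure source."""
    return secrets.token_bytes(KEY_SIZE), secrets.token_bytes(IV_SIZE)


def pack(iv: bytes, ciphertext: bytes) -> bytes:
    """Prefix the ciphertext with its IV so the stored blob carries both."""
    if len(iv) != IV_SIZE:
        raise ValueError("IV must be 16 bytes")
    return iv + ciphertext


def unpack(blob: bytes):
    """Split a stored blob back into (iv, ciphertext)."""
    if len(blob) < IV_SIZE:
        raise ValueError("blob is too short to contain an IV")
    return blob[:IV_SIZE], blob[IV_SIZE:]
```

Only the key needs to stay secret; the IV travels with the data so that whoever holds the key can decrypt it later.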

A key is useful only to the point that it is not compromised. One of the most difficult things to do is guard a secret indefinitely. That is why secrets are often changed routinely. A symmetric key is just like a password but instead of allowing the system to carry it forward, it must be overwritten with a new one. A history can be maintained for all the keys generated. Data once encrypted with a key must be decrypted with that key before it can be re-encrypted with another key. This poses a few questions.

What happens when a new key must be generated? Will any part of the old key be useful? The answer is yes: we need to know the source and identity of the key, without which a key cannot be copied or scripted. When we have this information, we can create a new key and re-encrypt locally. Then we can move the encrypted data and recreate the key in the new location. We have all the important seed data available for that new key at the destination. This is how symmetric-key-encrypted data is moved from a source to a destination, independent of whether the data resides in a filesystem or a SQL database.

This was about restore. One of the myths about symmetric keys is about backup. It goes something like this: you do not need to back up your symmetric key if it was created from a certificate, because you can just recreate it. This is not only incorrect, it is also risky for the data that is being encrypted. The assumption behind this myth is that two symmetric keys created with the ENCRYPTION BY CERTIFICATE clause using the same certificate will be identical, and that since the certificate can be backed up, the keys do not need a backup. Here the cause-and-effect argument is incorrect.

When symmetric keys are generated in SQL Server with the same certificate, they will have different key_guids. This is the first indicator but not a conclusive one. The certificate_id from the certificate and the crypt_property attribute of the symmetric key will also be different, which is the conclusive proof that the symmetric keys are not the same.

An asymmetric key cannot be backed up either. Two symmetric keys created with the same asymmetric key will also be different.

With those myths debunked, it should become clear from the approach in restore that backup is not necessary.

There is, however, a solution used in many commercial ventures that involves the use of an external key management system that eliminates the maintenance overhead of using symmetric keys.

Wednesday, January 12, 2022

Symmetric Keys

Introduction: Encryption is critical to protecting data such as personally identifiable information. Symmetric key encryption uses the same key for both encryption and decryption. Compare this to public key-private key encryption, which is more ubiquitous and involves encryption with the public key and decryption with the private key. The difference between them is that the symmetric key needs to exist at both the source and the destination, while the private key is needed only by the party that decrypts. Since the transfer of a secret key is avoided, public key-private key encryption is more popular for exchanges between parties, while symmetric keys are used for faster and more lightweight encryption.

Once the symmetric keys are created, they can be treated as passwords or ad hoc secrets. Key vaults and secret management stores are helpful for allowing multiple parties to access them safely. The use of symmetric keys goes hand in hand with key vaults in many production systems.

Symmetric encryption algorithms are of two types:

1. Block algorithms: A fixed number of bits is encrypted at a time, in blocks, with the use of a specific secret key. The data is retained in memory while the system waits for complete blocks to encrypt.

2. Stream algorithms: These do away with the retention and encrypt the data continuously as it streams.

Examples include AES, DES, IDEA, Blowfish, RC4, RC5, and RC6.

The keys can be generated in code as simply as the following example in C#:

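using System.Security.Cryptography;

// Aes.Create() returns the default AES implementation for the platform.
using Aes aes = Aes.Create();

aes.GenerateIV();   // new random initialization vector, available as aes.IV

aes.GenerateKey();  // new random key, available as aes.Key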

Or in SQL as follows:

CREATE SYMMETRIC KEY SampleKey01

WITH ALGORITHM = AES_256

ENCRYPTION BY CERTIFICATE Certificate01;

A sample usage of symmetric key is cited as

Encrypt(UserID + ClientID) = Token

where UserID is a large integer and ClientID is a regular integer. The original text can be 16 and 8 characters in length, which gives us 24 characters. We use a fixed length for both UserID and ClientID and pad on the left. If we want to keep the size of the encrypted text the same as that of the original string, we could choose AES in a stream mode. With a block mode that pads the input, the size grows to the next multiple of the block size. And when we encode the result as text, hex doubles the size while base64 adds about a third.
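The size arithmetic can be checked with a short Python sketch. It assumes zero as the left-padding character, a 16-byte AES block, and PKCS#7-style padding for the block-mode case; none of these choices is stated in the text above, and the function names are hypothetical. No encryption is performed, since only the lengths matter here.

```python
import base64

BLOCK = 16  # AES block size in bytes


def plaintext(user_id: int, client_id: int) -> str:
    """Left-pad both IDs with zeros to fixed widths of 16 and 8 characters."""
    return f"{user_id:016d}{client_id:08d}"


def ciphertext_len(n: int, padded_block_mode: bool) -> int:
    """Ciphertext length in bytes for an n-byte plaintext (IV not counted).

    A stream mode keeps the length. A padded block mode rounds up to the
    next multiple of 16 and always adds at least one byte of padding.
    """
    if not padded_block_mode:
        return n
    return (n // BLOCK + 1) * BLOCK


def encoded_len(n: int, encoding: str) -> int:
    """Length in characters of n raw bytes after hex or base64 encoding."""
    raw = bytes(n)
    if encoding == "hex":
        return len(raw.hex())
    return len(base64.b64encode(raw))
```

For the 24-character input, a stream mode yields 24 bytes, which become 48 characters in hex or 32 in base64; a padded block mode yields 32 bytes, which become 64 characters in hex or 44 in base64.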

Tuesday, January 11, 2022

VPN Gateway

This is a continuation of a series of articles on operational engineering aspects of Azure public cloud computing that included the most recent discussion on Azure DNS which is a full-fledged general availability service that provides similar Service Level Agreements as expected from others in the category. In this document, we discuss VPN Gateway.

A VPN gateway is a specific type of virtual network gateway that is used to send traffic between an Azure virtual network and an on-premises location over the public internet. The source and the destination can also be any two virtual networks if there is internet connectivity between them, even across geographical regions. The VPN adds an outer IP header over the existing IP header, so that the packet travels across the internet with one IP address and is peeled at the remote end to reveal the other IP address, which only the remote network knows about. That is why it is called a tunnel. When we create multiple connections to the same VPN gateway, all the VPN tunnels share the available gateway bandwidth. A gateway is composed of two or more VMs that are automatically configured and deployed to a specific subnet, and these contain routing tables and specific gateway services.
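The header-over-header idea can be modeled in a few lines of Python. This is a conceptual sketch only: the `Packet` type and the addresses are invented for illustration, and a real VPN tunnel also encrypts the inner packet, which the sketch omits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: object


def encapsulate(inner: Packet, gateway_src: str, gateway_dst: str) -> Packet:
    """Wrap the inner packet in an outer header addressed gateway to gateway."""
    return Packet(gateway_src, gateway_dst, inner)


def decapsulate(outer: Packet) -> Packet:
    """Peel the outer header at the remote gateway to recover the inner packet."""
    if not isinstance(outer.payload, Packet):
        raise ValueError("not an encapsulated packet")
    return outer.payload


# Private addresses inside the two networks; public addresses on the gateways.
inner = Packet("10.1.0.4", "192.168.1.20", b"hello")
outer = encapsulate(inner, "203.0.113.10", "198.51.100.20")
print(outer.src, outer.dst)    # what the internet routes on
print(decapsulate(outer).dst)  # what only the remote network resolves
```

Routers on the internet see only the outer addresses; the inner destination becomes meaningful once the remote gateway removes the outer header.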

The gateway configuration includes the gateway type, which determines how the gateway will be used and the actions it can take. At the time of creation, we can specify whether an IPsec/IKE tunnel or a VNet-to-VNet tunnel is used, but one of the most common usages is Point-to-Site VPN connectivity. Cloud sites and virtual machines leverage this so that the resource itself does not need to have a public IP address assigned but the service is accessible over the VPN. Even DNS servers can be used in the VNets if they can resolve the domain names needed for Azure. Point-to-Site connectivity occurs over the Secure Socket Tunneling Protocol or IKEv2. It lets us connect from a single computer to any resource within a virtual network. A certificate and a VPN client configuration package are required to set it up. Gateways can be policy-based or route-based. Even custom policies or traffic selectors can be specified.

When an Azure VM is set up for Point-to-Site connectivity, it needs neither a public IP address nor an RDP/SSH firewall rule. By adding a virtual network gateway and a root and client certificate, downloading a VPN client, and then running the setup, we get a working, network-reachable VM that is part of the remote network, such as the workplace, and accessible from a computer over the VPN. We can verify the VPN connection by connecting over RDP to the private IP address of the VM rather than the public IP address.

The networking does not affect the authentication. If the Azure Active Directory account can log in to the Virtual Machine, it can continue to do so over the VPN connection.