Thursday, January 13, 2022

 

Symmetric Keys part 2

This is a continuation of the discussion on symmetric keys and their use in encrypting data. The previous article covered the algorithm and its output but not the operational engineering considerations. This article completes that discussion.

The result of generating a key is a 16-character initialization vector and a 32-character key. When encrypted data is written to a file, the initialization vector must be included. Symmetric key encryption is fast and efficient, and it works on blocks and streams of large data. Asymmetric encryption is better suited to small data such as a password or a symmetric key.

A key is useful only to the point that it is not compromised. One of the most difficult things to do is guard a secret indefinitely, which is why secrets are often rotated routinely. A symmetric key is just like a password, but instead of allowing the system to carry it forward, it must be replaced with a new one. A history can be maintained for all the keys generated. Data encrypted with a key must be decrypted with that key before it can be re-encrypted with another. This poses a few questions.

What happens when a new key must be generated? Will any part of the old key remain useful? The answer is yes: we need to know the source and identity of the key (in SQL Server, the KEY_SOURCE and IDENTITY_VALUE supplied at creation), without which a key cannot be copied or scripted. With this information we can create a new key and re-encrypt locally, then move the encrypted data and recreate the key in the new location, because all the important seed data for that key is available at the destination. This is how symmetric key encrypted data is moved from a source to a destination, independent of whether the data resides in a filesystem or a SQL database.
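In SQL Server, this seed data can be supplied explicitly when the key is created, which is what makes the key reproducible at the destination. A minimal sketch, with illustrative pass phrases:

CREATE SYMMETRIC KEY SampleKey01
WITH KEY_SOURCE = 'pass phrase that seeds the key material',
     IDENTITY_VALUE = 'phrase that derives the key guid',
     ALGORITHM = AES_256
ENCRYPTION BY CERTIFICATE Certificate01;
GO

Running the same statement with the same KEY_SOURCE, IDENTITY_VALUE, and algorithm at the destination recreates an identical key.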

That was about restore. One of the persistent myths about symmetric keys concerns backup. It goes something like this: you do not need to back up your symmetric key if it was created from a certificate, because you can just recreate it. This is not only incorrect, it is also risky for the data being encrypted. The assumption behind the myth is that two symmetric keys created with the ENCRYPTION BY CERTIFICATE clause using the same certificate will be identical; since the certificate can be backed up, the keys supposedly need no backup of their own. The cause-and-effect argument here is incorrect.

When symmetric keys are generated in SQL Server with the same certificate, they will have different key_guids. This is the first indicator but not the conclusive one. The crypt_property attribute recorded for each key encryption will also be different, which is the conclusive proof that the symmetric keys are not the same.
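This can be checked from the catalog views; a sketch of the comparison, assuming two keys were created from the same certificate without explicit seed values:

SELECT k.name, k.key_guid, e.crypt_property
FROM sys.symmetric_keys AS k
JOIN sys.key_encryptions AS e
    ON k.symmetric_key_id = e.key_id;
GO

The rows for the two keys will show different key_guid and crypt_property values even though the encrypting certificate is the same.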

An asymmetric key also cannot be backed up directly. Two symmetric keys created with the same asymmetric key will likewise be different.

With those myths debunked, it should be clear from the restore approach above that a separate key backup is not necessary: a key created with explicit seed values can always be recreated from them.

There is, however, a solution used in many commercial offerings: an external key management system, which eliminates the maintenance overhead of using symmetric keys directly.

 

Wednesday, January 12, 2022

Symmetric Keys

Introduction: Encryption is critical to protect data such as personally identifiable information. Symmetric key encryption uses the same key for both encryption and decryption. Compare this to public-key/private-key encryption, which is more ubiquitous and involves encryption with the public key and decryption with the private key. The difference between them is that the symmetric key needs to exist at both source and destination, while the private key for decryption is needed only by the party that decrypts. Since the transfer of the key is avoided, public-key encryption is the more popular choice, while symmetric keys are used for faster, lightweight encryption.

Once symmetric keys are created, they can be treated like passwords or ad hoc secrets. KeyVaults and secret management stores can be helpful in allowing multiple parties to access them safely. The use of symmetric keys goes hand in hand with KeyVaults in many production systems.

Symmetric encryption algorithms are of two types:

1. Block algorithms: A set number of bits is encrypted at a time in blocks of electronic data with the use of a specific secret key. The data is retained in memory while the system encrypts and waits for complete blocks.

2. Stream algorithms: These do away with that buffering and continuously encrypt the data as it streams.

Examples include AES, DES, IDEA, Blowfish, RC4, RC5, and RC6.

The keys can be generated in code as simply as the following example in C#:

using System.Security.Cryptography;

AesCryptoServiceProvider aes = new AesCryptoServiceProvider();
aes.GenerateIV();   // random 16-byte initialization vector
aes.GenerateKey();  // random 32-byte key (AES-256 by default)

Or in SQL as follows:

CREATE SYMMETRIC KEY SampleKey01
WITH ALGORITHM = AES_256
ENCRYPTION BY CERTIFICATE Certificate01;
GO

A sample usage of a symmetric key is tokenization:

Encrypt(UserID + ClientID) = Token

where UserID is a large integer and ClientID is a regular integer. We use fixed lengths for both, padded left, so the original text is 16 + 8 = 24 characters. If we want the encrypted text to stay the same size as the original string, we could choose a stream mode of AES. If we were to use stronger algorithms, the size would bloat. And when we hex or Base64 encode, the text grows further: hex doubles the length and Base64 adds about a third.
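A minimal sketch of this tokenization in C#, assuming AES in its default CBC mode with Base64-encoded output; the method name and zero-padding scheme are illustrative:

using System;
using System.Security.Cryptography;
using System.Text;

static string CreateToken(long userId, int clientId, byte[] key, byte[] iv)
{
    // Fixed-width, left-padded plaintext: 16 + 8 = 24 characters.
    string plaintext = userId.ToString().PadLeft(16, '0') + clientId.ToString().PadLeft(8, '0');
    using Aes aes = Aes.Create();
    aes.Key = key;  // 32 bytes for AES-256
    aes.IV = iv;    // 16 bytes
    using ICryptoTransform encryptor = aes.CreateEncryptor();
    byte[] input = Encoding.UTF8.GetBytes(plaintext);
    byte[] cipher = encryptor.TransformFinalBlock(input, 0, input.Length);
    return Convert.ToBase64String(cipher);  // Base64 grows the output by about a third
}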

 

Tuesday, January 11, 2022

 

VPN Gateway

This is a continuation of a series of articles on the operational engineering aspects of Azure public cloud computing, which included the most recent discussion on Azure DNS, a full-fledged general-availability service that provides Service Level Agreements comparable to others in its category. In this document, we discuss VPN Gateway.

A VPN gateway is a specific type of virtual network gateway used to send traffic between an Azure virtual network and an on-premises location over the public internet. The source and the destination can be any two virtual networks as long as there is internet connectivity between them; they can even be in different geographical regions. The VPN adds an IP header over the existing IP header so that the packet travels across the internet with one IP address and is unwrapped at the far end to reveal the inner address that only the remote network knows about. That is why it is called a tunnel. When we create multiple connections to the same VPN gateway, all the VPN tunnels share the available gateway bandwidth. A gateway is composed of two or more VMs that are automatically configured and deployed to a specific subnet, and these contain routing tables and specific gateway services.

The gateway configuration includes the gateway type, which determines how the gateway will be used and the actions it can take. At the time of creation, we can specify whether an IPsec/IKE tunnel or a VNet-to-VNet tunnel is used, but one of the most common usages is Point-to-Site VPN connectivity. Cloud sites and virtual machines leverage this so that the resource itself does not need a public IP assigned, yet the service is accessible over the VPN. Even DNS servers can be used in the VNets if they can resolve the domain names needed for Azure. Point-to-Site connectivity occurs over the Secure Socket Tunneling Protocol or IKEv2. It lets us connect from a single computer to any resource within a virtual network. A certificate and a VPN client configuration package are required to set it up. Gateways can be policy-based or route-based, and even custom policies or traffic selectors can be specified.

When an Azure VM is set up for Point-to-Site connectivity, it needs neither a public IP address nor an RDP/SSH firewall rule. By adding a virtual network gateway, a root and a client certificate, downloading a VPN client, and then running the setup, we can have a network-reachable working VM that is part of the remote network, such as the workplace, and accessible from a computer over the VPN. We can verify the VPN connection by using RDP to connect, targeting the private IP of the VM rather than the public IP address.

The networking does not affect the authentication. If the Azure Active Directory account can log in to the Virtual Machine, it can continue to do so over the VPN connection.

 

 

 

Monday, January 10, 2022

Fault Injection Testing

Stability and resiliency of software are critical for the smooth running of an application. Fault injection testing is the deliberate introduction of errors and faults into a system to observe its behavior. The goal is for the software to work correctly despite errors encountered in calls made to dependencies such as other APIs, system calls, and so on. By introducing intermittent failure conditions over time, the application is exercised as realistically as in production, where hardware and software faults can occur randomly but the services must remain available and business continuity must be maintained.

A system needs to be resilient to the conditions that cause production disruptions. The dependencies might include infrastructure, platform, network, third-party software, or APIs. The risk of impact from a dependency failure may be direct or cascading. Fault injection methods are a way to increase coverage and validate software robustness and error handling, either at build time or at run time, with the intention of embracing failure as part of the development lifecycle. These methods assist service teams in designing and continuously validating for failure, accounting for known and unknown failure conditions, architecting for redundancy, and employing retry and back-off mechanisms. Together with the introduction of intermittent failures and continuous monitoring in the stage environment of service deployments, these methods promote near-total coverage of the known and unknown faults that can impact the service in production. The purpose of the monitoring aspect during these experiments is the observation of the fault and its recovery time, an overview of symptoms in related components, and the determination of the thresholds and values with which alerts can be set.
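A minimal sketch of the idea in C#, not tied to any particular fault injection framework; the fault rate, exception type, and retry counts are illustrative:

using System;
using System.Threading.Tasks;

class FaultInjectingClient
{
    private static readonly Random Rng = new Random();
    private readonly double _faultRate;
    public FaultInjectingClient(double faultRate) => _faultRate = faultRate;

    // Wraps a dependency call and injects an intermittent, random fault.
    public Task<string> GetAsync(string resource)
    {
        if (Rng.NextDouble() < _faultRate)
            throw new TimeoutException($"Injected fault calling {resource}");
        return Task.FromResult("ok");
    }
}

class Caller
{
    // The code under test must survive the injected faults via retry and back-off.
    public static async Task<string> GetWithRetryAsync(FaultInjectingClient client, string resource)
    {
        for (int attempt = 0; ; attempt++)
        {
            try { return await client.GetAsync(resource); }
            catch (TimeoutException) when (attempt < 4)
            {
                // Exponential back-off: 100 ms, 200 ms, 400 ms, 800 ms.
                await Task.Delay(TimeSpan.FromMilliseconds(100 * Math.Pow(2, attempt)));
            }
        }
    }
}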

Fault engineering is equally applicable to software, protocol, and infrastructure. Software faults include error-handling code paths and in-process memory management, for which edge-case unit tests, integration tests, and stress and soak load tests are written. Protocol faults include vulnerabilities in communication interfaces such as command-line parameters or APIs. An example of a test that mitigates these is fuzzing, which provides invalid, unexpected, or random data as input so that we can assess the protocol stability of a component. Infrastructure faults include outages, networking failures, and hardware failures. The tests that mitigate these cause faults in the underlying infrastructure, such as shutting down virtual machines, crashing processes, expiring certificates, and others.

One of the challenges with these methods is the signal-to-noise ratio of the errors. A fault is a hypothesis of an error. An error is a failure in the system and can lead to other errors. Since they occur in a cycle, the fault-error-failure cycle can produce many errors, from which the ones that must be fixed to improve system resilience and reliability need to be discerned. When these experiments are run for short durations, the number of errors to investigate is usually low. Leveraging automation to continuously validate what matters during the experiment allows the detection of even those errors that are hard to find manually.

Such automation can even be introduced into the software release pipeline. This promotes a shift-left approach, where the testing occurs as early in the development and project timeline as when the code is written. It follows the test-early-and-often principle, and the benefit is that issues encountered can be troubleshot via debugging.

The outcomes of fault injection testing are the measurement and definition of a steady, healthy state for the system's interoperability, the finding of the difference between the baseline state and the anomalous state, and the documentation of the processes and observations needed to identify and act on the result.

Sunday, January 9, 2022

 

This is a continuation of a series of articles on the operational engineering aspects of Azure public cloud computing, which included the most recent discussion on Azure Data Lake, a full-fledged general-availability service that provides Service Level Agreements comparable to others in its category.

Monitoring is a critical aspect of any service in the cloud, both internal and customer-facing. Metrics and alerts are part of the monitoring dashboard.

Each resource provides metrics to monitor specific aspects of its operations. These metrics can be viewed with the Azure Monitor service or explored and plotted with the Azure Monitor Metrics Explorer. The metrics include QueryVolume, RecordSetCount, and RecordSetCapacityUtilization; the last is a percentage while the first two are counts. QueryVolume is the sum of all queries received over a period. It can be viewed by browsing the metrics explorer, scoping down to the resource, and selecting the metric with sum as the aggregation. RecordSetCount shows the number of record sets in Azure DNS for the DNS zone; all the record sets are counted, and the aggregation is the maximum. RecordSetCapacityUtilization shows the percentage used of the record set capacity of a DNS zone. Each zone has a record set limit that defines the maximum number of record sets allowed for the zone. The aggregation type is maximum.
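A sketch of reading such a metric programmatically, assuming the Azure.Monitor.Query and Azure.Identity client libraries; the resource ID is a placeholder for the DNS zone being monitored:

using System;
using Azure.Identity;
using Azure.Monitor.Query;
using Azure.Monitor.Query.Models;

var client = new MetricsQueryClient(new DefaultAzureCredential());

// Placeholder resource ID for the DNS zone.
string resourceId = "/subscriptions/<sub>/resourceGroups/<rg>/providers/Microsoft.Network/dnszones/<zone>";

MetricsQueryResult result = client.QueryResource(
    resourceId,
    new[] { "QueryVolume" },
    new MetricsQueryOptions
    {
        TimeRange = new QueryTimeRange(TimeSpan.FromDays(1)),
        Aggregations = { MetricAggregationType.Total }  // sum of queries over the period
    }).Value;

foreach (MetricResult metric in result.Metrics)
    foreach (MetricTimeSeriesElement series in metric.TimeSeries)
        foreach (MetricValue point in series.Values)
            Console.WriteLine($"{point.TimeStamp}: {point.Total}");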

Resource metrics can be used to raise alerts. They can be configured from the Monitor page in the Azure portal and must be scoped to a resource, which is the DNS zone in this case. The signal logic is configured by selecting a signal and setting the threshold and frequency of evaluation for the metric.

Continuous monitoring of an API is also possible via synthetic monitoring, which provides proactive visibility into API issues before customers find them. This is automated probing to validate the build-out of deployments, to monitor a service or a mission-critical scenario independent of the service deployment cycle, and to test the availability of dependencies. It ensures end-to-end coverage of specific scenarios and can even validate the response body, not just the status code and headers. By utilizing all the properties of making a web request and checking its response, as well as sequences of requests, the monitoring logic begins to articulate the business concerns that must remain available. Synthetic monitoring is not just active monitoring of a service; it is a set of business assets that take on the onus of business continuity assurance.
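A minimal sketch of such a probe in C#; the endpoint, the header check, and the expected body content are illustrative:

using System;
using System.Net.Http;
using System.Threading.Tasks;

class SyntheticProbe
{
    private static readonly HttpClient Http = new HttpClient();

    public static async Task<bool> ProbeAsync()
    {
        HttpResponseMessage response = await Http.GetAsync("https://api.contoso.example/health");
        if (!response.IsSuccessStatusCode) return false;                  // status code check
        if (response.Content.Headers.ContentType?.MediaType != "application/json")
            return false;                                                 // header check
        string body = await response.Content.ReadAsStringAsync();
        return body.Contains("\"status\":\"healthy\"");                   // body validation
    }
}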

The steps to set up synthetic monitoring include onboarding, provisioning, and deployment. Onboarding is required to isolate all the data structures and definitions specific to the customer, referred to by an account ID. Provisioning is the setup of all the Azure resources necessary to execute the logic. Deployment of the logic involves both the code and the configuration: the code is a .NET assembly and the configuration is a predefined JSON. It can specify more than one region to deploy to, and the regions can be changed from deployment to deployment of the same logic.

The use of active and passive monitoring completes the overall probes needed to ensure the smooth running of the services.

 

Saturday, January 8, 2022

This is a continuation of a series of articles on the operational engineering aspects of Azure public cloud computing, which included the most recent discussion on Azure DNS, a full-fledged general-availability service that provides Service Level Agreements comparable to others in its category.

DNS servers used with Active Directory can be primary or secondary: the primary stores all the records, while the secondary gets its contents from the primary. The contents of a zone file are stored hierarchically, and this structure can be replicated among all the domain controllers (DCs). It is updated via LDAP operations or DDNS (Dynamic DNS, which must have AD integration).

A common misconfiguration is the "island" issue, which occurs when the IP address of a DNS server changes and is updated only locally. To make a global update instead, the servers must point to a root server other than themselves. Delegation options are granted to DNS servers or DCs; the simple case is when DNS namespaces are delegated to DCs and a DC hosts a DNS zone. The records in a DNS server, as opposed to a DC, are autonomously managed. DNS servers need to allow DDNS by the DC, and the DC performs DDNS to prevent stray updates to the DNS records in the server; support and maintenance are minimal with DDNS.

A standalone AD is used to create test or lab networks: a forest is created, a DC is assigned, the DNS service is installed, a DNS zone is added, and unresolved requests are forwarded to an existing corporate server. The primary DNS for all clients points to the DC. Background loading of DNS zones makes it even easier to load zones while keeping them available for DNS updates and queries.
Active Directory has a feature whereby one or more IP addresses can be specified to which name resolutions not handled by the local DNS server are forwarded. The conditional forwarder definitions are also replicated via Active Directory. Together with the forward and reverse lookup zones in the Active Directory, these can be set via the DNS MMC management console. The DNS servers are usually primary or secondary in nature: the primary stores all the records of the zone, and the secondary gets the contents of its zone from the primary. Each update can flow from the primary to the secondary, or the secondary may pull updates periodically or on demand; all updates must be made to the primary. Each type of server can resolve name queries that come from hosts for its zones. The contents of the zone file can also be stored in the Active Directory in a hierarchical structure. The DNS structure can be replicated among all DCs of the domain, with each DC holding a writeable copy of the DNS data. The DNS objects stored in the Active Directory can be updated on any DC via LDAP operations, or through DDNS against DCs that act as DNS servers when the DNS is integrated with the Active Directory.
The DNS "island" issue sometimes occurs due to improper configuration. AD requires proper DNS resolution to replicate changes and when using integrated DNS, the DC replicates DNS changes through AD replication.  This is the classic chicken and egg problem. If the DC configured as name server points to itself and its IP address changes, the DNS records will successfully be updated locally but other DCs cannot resolve this DC's IP address unless they point to it. This causes replication fail and effectively renders the DC with the changed IP address an island to itself. This can be avoided when the forest root domain controllers that are the name servers are configured to point at root servers other than themselves.
Application partitions are user-defined partitions that have a custom replication scope. Domain controllers can be configured to host any application partition irrespective of their domains, so long as they are in the same forest. This decouples the DNS data and its replication from the domain context: AD can be configured to replicate only the DNS data between the domain controllers running the DNS service within a domain or forest.
The other partitions are DomainDnsZones and ForestDnsZones. The System folder is the root-level folder used to store DNS data. The default partitions for the domain and the forest are created automatically.
Aging and scavenging: as DNS records build up, some of the entries become stale when the clients have changed their names or have moved. These are difficult to maintain as the number of hosts increases. Therefore, a process called scavenging was introduced in the Microsoft DNS server; it scans all the records in a zone and removes those that have not been refreshed within a certain period. When clients register themselves with dynamic DNS, their registrations are set to be renewed every 24 hours by default. Windows DNS stores this timestamp as an attribute of the DNS record, and it is used with scavenging. Manual record entries have timestamps set to zero, so they are excluded from scavenging.
"A "no-refresh interval" for the scavenging configuration option is used to limit the amount of unnecessary replication because it defines how often the DNS sever will accept the DNS registration refresh and update the DNS record.
This is how often the DNS server will propagate a timestamp refresh from the client to the directory or filesystem. Another option called the refresh interval specifies how long the DNS server must wait to follow a refresh for a record to be eligible for scavenging and this is typically seven days.
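The eligibility rule can be expressed compactly; a minimal sketch in C#, with the two intervals as inputs:

using System;

static bool IsEligibleForScavenging(DateTime recordTimestamp, DateTime now,
    TimeSpan noRefreshInterval, TimeSpan refreshInterval)
{
    // Static (manually created) records carry a zero timestamp and are never scavenged.
    if (recordTimestamp == default) return false;
    // A record must age through the no-refresh interval and then the refresh
    // interval without a renewal before it can be removed.
    return now > recordTimestamp + noRefreshInterval + refreshInterval;
}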

 

Friday, January 7, 2022

This is a continuation of a series of articles on the operational engineering aspects of Azure public cloud computing, which included the most recent discussion on Azure DNS, a full-fledged general-availability service that provides Service Level Agreements comparable to others in its category. In this article, we discuss delegation.

Azure DNS allows hosting a DNS zone and managing the DNS records for a domain in Azure. The domain must be delegated to Azure DNS from the parent domain so that DNS queries for that domain can reach Azure DNS. Since Azure DNS isn't the domain registrar, delegation must be configured properly. A domain registrar is a company that provides internet domain names; an internet domain is purchased for legal ownership. The registrar must be configured to delegate the domain to Azure DNS.

The domain name system is a hierarchy of domains. It starts from the root domain, written as '.', followed by the top-level domains such as 'com', 'net', and 'org'. Second-level domains are 'org.uk', 'co.jp', and so on. The domains in the DNS hierarchy are hosted using separate DNS zones, where a DNS zone hosts the DNS records for a particular domain.

There are two types of DNS servers: 1) an authoritative DNS server, which hosts DNS zones and answers DNS queries for records in those zones only, and 2) a recursive DNS server, which doesn't host DNS zones but queries the authoritative servers for answers. Azure DNS is an authoritative DNS service.

DNS clients in PCs or mobile devices call a recursive DNS server for the DNS queries their applications need. When a recursive DNS server receives a query for a DNS record, it finds the nameserver for the named domain by starting at the root nameserver and then walking down the hierarchy by following NS referrals. DNS maintains a special type of record called an NS record, which lets a parent zone point to the nameservers for a child zone. Setting up the NS records for the child zone in a parent zone is called delegating the domain. Each delegation has two copies of the NS records: one in the parent zone pointing to the child, and another in the child zone itself. The records in the child zone are called authoritative NS records, and they sit at the apex of the child zone.

DNS records help with the name resolution of services and resources. Azure DNS can manage DNS records for external services as well, and it supports private DNS domains, which allow us to use custom domain names within private virtual networks.

It supports record sets, where an alias record can be set to refer to an Azure resource. If the IP address of the underlying resource changes, the alias record set updates itself during DNS resolution.

The DNS protocol prevents the assignment of a CNAME record at the zone apex. This restriction presents a problem when there are load-balanced applications behind a Traffic Manager whose profile requires the creation of a CNAME record. It can be mitigated with alias records, which can be created at the zone apex.

Azure DNS alias records are qualifications on a DNS record set. They can reference other Azure resources from within the DNS zone. For example, an alias record set can point to a public IP address resource instead of holding a literal A record. This pointing is dynamic: when the IP address changes, the record set updates during name resolution. An alias record set can exist for the A, AAAA, and CNAME record types. A record set, also known as a resource record set, is the collection of DNS records in the zone that have the same name and the same type; an AAAA record holds an IPv6 address. The SOA and CNAME record types are exceptions: the DNS standard does not permit multiple records with the same name for these types, so these record sets can only contain a single record.

Azure DNS supports wildcard records, which are returned in response to any query with a matching name.

CAA records allow domain owners to specify which certificate authorities are authorized to issue certificates for their domain, which helps avoid the issuance of incorrect certificates in some cases. CNAME record sets can't coexist with other record sets of the same name. Also, CNAME record sets can't be created at the zone apex (name = '@'), which always contains the NS and SOA record sets created with the zone. The NS record set at the zone apex is bound to the creation and deletion of the zone and contains the names of the Azure DNS name servers assigned to the zone; records can be added to or removed from this NS record set only, to support cohosting domains with more than one DNS provider.

Some of the validations that can be performed on these records include:

1. Parent Zone with conflicting child records fails

2. Delegation with no conflicts passes

3. Delegation with already configured zone passes

4. Delegation with different configured zone fails

5. Delegations with trailing dot in record set pass

6. Zone with intermediate delegation fails

7. Zone with root wild card fails

8. Zone with root Txt wild card fails

9. Zone with intermediate wild card fails

10. A record can be created in the zone

11. A record can be created in an empty zone

12. A record can be created with the same data

13. A record can be created with compatible existing records

14. A record with conflicting another A record fails

15. A record with conflicting CNAME record fails

16. A record with conflicting wild card record fails

17. A CNAME record can be created

18. A CNAME record can be created in an empty zone

19. A CNAME record with same data succeeds

20. A CNAME record with conflicting record fails

21. A CNAME record with conflicting wild card fails