Friday, May 31, 2024

This is a continuation of the series on IaC shortcomings and resolutions. In this section, we focus on the deployment of Azure Machine Learning workspaces with virtual network peerings. When a peering is established, traffic from any source in one virtual network can flow to any destination in the other. This is very helpful when egress must go through a single virtual network. Any number of virtual networks can be peered in a hub-and-spoke model or chained as transit networks, and each arrangement has its advantages and drawbacks. The impact this has on the infrastructure for Azure Machine Learning deployments is usually not called out, and it can cause quite a few surprises in the normal functioning of the workspace. Some of the previous articles explained these from the workspace side; in this section, we describe the network side in more detail, specifically the configuration options for peering.

When a local virtual network is peered with a remote virtual network, the user is presented with four options, of which only the first is selected by default. Unfortunately, the default settings are not appropriate for every situation and deserve special attention. These four options are:

1. Allow local network to access remote network

2. Allow local network to receive forwarded traffic from remote network

3. Allow gateway or route server in local network to forward traffic to remote network

4. Allow local network to use remote network’s gateway or route server
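For readers who manage peerings as code rather than through the portal, the four options can be sketched as a small Python structure. This is a minimal illustration, not a deployment script: the class name `PeeringOptions` and the helper `to_arm_properties` are hypothetical, the property names are assumed to match the Azure Resource Manager peering schema, and the defaults mirror the behavior described above (only the first option selected).

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class PeeringOptions:
    """One direction of a peering; names are illustrative, not an official API."""

    # 1. Allow local network to access remote network
    allow_virtual_network_access: bool = True
    # 2. Allow local network to receive forwarded traffic from remote network
    allow_forwarded_traffic: bool = False
    # 3. Allow gateway or route server in local network to forward traffic
    allow_gateway_transit: bool = False
    # 4. Allow local network to use remote network's gateway or route server
    use_remote_gateways: bool = False

    def to_arm_properties(self) -> Dict[str, bool]:
        """Map the four options to the assumed ARM-style property names."""
        return {
            "allowVirtualNetworkAccess": self.allow_virtual_network_access,
            "allowForwardedTraffic": self.allow_forwarded_traffic,
            "allowGatewayTransit": self.allow_gateway_transit,
            "useRemoteGateways": self.use_remote_gateways,
        }


# Defaults: only the first option is selected.
defaults = PeeringOptions()

# A hypothetical spoke that accepts forwarded traffic and uses the hub's gateway.
spoke_to_hub = PeeringOptions(allow_forwarded_traffic=True, use_remote_gateways=True)
```

Keeping each direction as its own object makes it explicit when the two sides of a peering are configured differently.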

Local and remote are interchangeable, and these options are repeated for the opposite direction, so two sections of four choices each appear on the ‘Add Peering’ page. This gives complete control over each direction, so the two networks can be treated asymmetrically rather than with a symmetrical, bidirectionally equal configuration.

Now, let’s revisit the options themselves, assuming we have picked one of the networks as local. If the first option is not selected, the peering is effectively inert because no traffic flows from the local network to the remote one. This option is therefore selected by default in both sections. A user with the Network Contributor role can override it selectively, but this is seldom done.

The second option is necessary for traffic from Microsoft hosts such as login.microsoftonline.com (Microsoft Entra ID) and management.azure.com (Azure Resource Manager, which the Azure Portal also relies on) to reach the local network when that traffic is forwarded through the remote network. If this option is not selected, authentication handshakes can remain incomplete when users begin to use resources in the local network.

The third and fourth options route egress traffic through a gateway or route server. Often, a designated third virtual network is chained behind the remote and local networks to host a firewall. When the firewall is enabled, configuring the gateway or route server helps ensure that all resources use it as their next hop. Setting these options allows a single gateway or route server to serve all the chained virtual networks. The only difference between the third and fourth options is whether the gateway or route server resides in the local or the remote network. Both can also be selected, in which case the local appliance is preferred over the remote one because the third option comes before the fourth.
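Because the gateway options span both directions of a peering, it can help to check the two sides against each other before deploying. The sketch below is a hypothetical helper (`gateway_pairing_issues` is not part of any Azure SDK) that takes the two directions as plain dictionaries keyed by the assumed ARM-style property names, and flags the case where one side asks to use the other side's gateway while that side does not allow gateway transit.

```python
from typing import Dict, List


def gateway_pairing_issues(
    local: Dict[str, bool], remote: Dict[str, bool]
) -> List[str]:
    """Return human-readable problems with the gateway options of a peering pair.

    Both arguments describe one direction of the peering. Missing keys are
    treated as unselected options.
    """
    issues: List[str] = []

    # Option 4 on the local side needs option 3 on the remote side.
    if local.get("useRemoteGateways", False) and not remote.get(
        "allowGatewayTransit", False
    ):
        issues.append(
            "local uses the remote gateway, but remote does not allow gateway transit"
        )

    # The same rule applies in the opposite direction.
    if remote.get("useRemoteGateways", False) and not local.get(
        "allowGatewayTransit", False
    ):
        issues.append(
            "remote uses the local gateway, but local does not allow gateway transit"
        )

    return issues


# Hypothetical hub-and-spoke pair: the spoke (local) uses the hub's gateway.
spoke = {"allowVirtualNetworkAccess": True, "useRemoteGateways": True}
hub_ok = {"allowVirtualNetworkAccess": True, "allowGatewayTransit": True}
hub_missing_transit = {"allowVirtualNetworkAccess": True}
```

Calling `gateway_pairing_issues(spoke, hub_ok)` returns an empty list, while passing `hub_missing_transit` returns one issue describing the mismatch.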

In this way, the peering configuration gives complete control over the traffic between the participating networks. Traffic can optionally be observed with Network Watcher. This completes the discussion of the network-side and workspace-side configuration options for ensuring full connectivity to the compute and successful code execution on those hosts.

