Cluster computing

Monday, May 27, 2024

This is a continuation of a series of articles on IaC shortcomings and resolutions. In this section too, we focus on the deployment of Azure Machine Learning workspaces with virtual network peering and on securing them with proper connectivity. When peerings are established, traffic from any source in one virtual network can flow to any destination in the other. This is very helpful when egress must be from a single virtual network. Any number of virtual networks can be peered in a hub-and-spoke model or as transit, and each arrangement has its advantages and drawbacks. The impact this has on the infrastructure for Azure ML deployments is usually not called out, and there can be quite a few surprises in the normal functioning of the workspace. The previous article focused on DNS name resolution and the appropriate names and IP addresses to use with A records. This article focuses on private and service endpoints, firewalls, NSGs, and user-defined routing.

The workspace and the compute can have public and private IP addresses, and when a virtual network is used, it is intended to isolate and secure the connectivity. This can be done in one of two ways: with a managed virtual network or with a customer-specified virtual network for the compute instances and clusters. Either way, the workspace can retain public IP connectivity while the compute instances and clusters are assigned public and private connectivity independently. The compute instances and clusters can be provisioned with public IP connectivity disabled, using only private IP addresses from a subnet in the virtual network. It is important to say that the workspace’s IP connectivity can be independent of that of the compute and clusters because this affects the end-users’ experience. The workspace can retain both a public and a private IP address simultaneously, but if it were made entirely private, then a jump server and a bastion would be needed to interact with the workspace, including its notebooks, datastores and compute. With just the compute and the clusters having private IP connectivity to the subnet, the outbound IP connectivity can be established through the workspace in an unrestricted setting or with a firewall in a conditional egress setting. The subnet that the compute and clusters are provisioned from must have connectivity to the subnet that hosts the storage account, key vault and Azure Container Registry that are internal to the workspace. A subnet can even have its own NAT gateway so that all outbound access gets the same IP address prefix, which makes it easy to secure incoming traffic at the destination with an IP rule for that prefix. The storage account and key vault can be reached from the compute and cluster’s private IP addresses through their service endpoints, while the container registry must have a private endpoint for private-plane connectivity to the compute. A dedicated image-build compute can be created for designated image building activities.
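To make the NAT gateway point concrete, here is a minimal sketch of what an IP rule at the destination effectively checks: whether the caller's source address falls inside the allowed prefix. It uses only Python's standard `ipaddress` module and does not call any Azure API. The prefix `203.0.113.0/28` is a documentation range standing in for a hypothetical NAT gateway public IP prefix, and the names `NAT_GATEWAY_PREFIX` and `is_allowed_by_ip_rule` are made up for illustration.

```python
import ipaddress

# Hypothetical public IP prefix attached to the subnet's NAT gateway.
# 203.0.113.0/24 is reserved for documentation, so this is not a real Azure range.
NAT_GATEWAY_PREFIX = ipaddress.ip_network("203.0.113.0/28")


def is_allowed_by_ip_rule(source_ip, allowed_prefixes):
    """Return True when source_ip falls inside any allowed prefix.

    This mirrors, in a simplified way, how an IP rule on a destination
    resource admits traffic based on the caller's source address.
    """
    addr = ipaddress.ip_address(source_ip)
    return any(addr in prefix for prefix in allowed_prefixes)


if __name__ == "__main__":
    rules = [NAT_GATEWAY_PREFIX]
    # An address the NAT gateway could hand out from its prefix.
    print(is_allowed_by_ip_rule("203.0.113.5", rules))
    # An address outside the prefix, which the rule would not admit.
    print(is_allowed_by_ip_rule("198.51.100.7", rules))
```

Because every outbound flow from the subnet leaves with an address from the same prefix, a single rule at the destination covers all compute instances and clusters in that subnet.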
On the other hand, if the compute and cluster were assigned public IP connectivity, the Azure Batch service would need to be involved, and it would reach the compute and cluster’s IP address via a load balancer. If they are created without a public IP, we get a private link service to accept the inbound access from the Azure Batch service and the Azure Machine Learning service without a public IP address. A local hosts file entry with the private IP address of the compute and a name like ‘mycomputeinstance.eastus.instances.azureml.ms’ is an option when connecting to the virtual network with the workspace in it. It is also important to set user-defined routing when a firewall is used, and the default route must have ‘0.0.0.0/0’ as its address prefix to direct all outbound internet traffic to the private IP address of the firewall as the next hop. This allows the firewall to inspect all outbound traffic, and security policies can kick in to allow or deny traffic selectively.
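The effect of the ‘0.0.0.0/0’ default route can be illustrated with a small, simplified model of route selection by longest prefix match. This is only a sketch in plain Python, not the Azure SDK: the `Route` class, the `select_route` helper, the address space `10.0.0.0/16` and the firewall address `10.0.1.4` are all assumptions chosen for illustration. The next hop type names follow the values Azure uses for routes (`VnetLocal`, `VirtualAppliance`).

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    name: str
    address_prefix: str
    next_hop_type: str
    next_hop_ip: Optional[str] = None


# Hypothetical private IP address of the firewall in the hub network.
FIREWALL_PRIVATE_IP = "10.0.1.4"

# Hypothetical route table for the compute subnet.
ROUTES = [
    # Traffic inside the virtual network stays local.
    Route("local-vnet", "10.0.0.0/16", "VnetLocal"),
    # Everything else is sent to the firewall for inspection.
    Route("default-via-firewall", "0.0.0.0/0", "VirtualAppliance", FIREWALL_PRIVATE_IP),
]


def select_route(destination_ip, routes):
    """Pick the matching route with the longest prefix, or None if nothing matches."""
    dest = ipaddress.ip_address(destination_ip)
    matches = [r for r in routes if dest in ipaddress.ip_network(r.address_prefix)]
    return max(
        matches,
        key=lambda r: ipaddress.ip_network(r.address_prefix).prefixlen,
        default=None,
    )


if __name__ == "__main__":
    for destination in ("10.0.2.5", "52.1.2.3"):
        route = select_route(destination, ROUTES)
        print(destination, "->", route.name, route.next_hop_type, route.next_hop_ip)
```

In this model, a destination inside the virtual network matches the more specific `10.0.0.0/16` route, while any internet destination only matches the default route and is therefore handed to the firewall's private IP address.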

Previous article: IaCResolutionsPart126.docx
