One of our customers runs on Azure Service Fabric (SF), which is backed by a virtual machine scale set (VMSS). We had a connectivity problem recently, and one of the developers enabled remote debugging on the SF cluster to see what went wrong. Little did he know that, among other things, this opens a large number of additional TCP ports on the cluster load balancers so that debuggers can attach. In the portal, it looks like this:
This is an undesirable situation because:
- the attack surface of the SF cluster has increased due to all these open ports (security perspective) and
- the ARM template we use to deploy our SF cluster no longer works (maintenance perspective).
The reason behind the latter is that Azure does not allow the removal of Inbound NAT pools and NAT rules while they are in use by a VMSS. So if you deploy an ARM template that does not contain all the Inbound NAT rules that you see in the Azure portal, you get an error message:
Cannot remove inbound nat pool DebuggerListenerNatPool-8ypmdj7pp8 from load balancer since it is in use by virtual machine scale set /subscriptions/<subscription id>/resourceGroups/ClusterResourceGroupDEV/providers/Microsoft.Compute/virtualMachineScaleSets/Backend
If you try to remove an Inbound NAT rule via the portal you get an even nicer message:
Failed to delete inbound NAT rule 'DebuggerListenerNatPool-2zqjmhjv3q.0'. Error: Adding or updating NAT Rules when NAT pool is present on loadbalancer /subscriptions/<subscription id>/resourceGroups/ClusterResourceGroupDEV/providers/Microsoft.Network/loadBalancers/LB-sfdev-Backend is not supported. To modify the load balancer, pass in all NAT rules unchanged or remove the LoadBalancerInboundNatRules property from your PUT request
And the portal actually warns you that this is not yet supported, so we could have known beforehand:
So what if you actually do want to remove these Inbound NAT rules? Or what if you want to remove the default NAT rules that allow RDP access to your SF cluster VMs? Googling around, I couldn’t really find a solution, only people with the same question, so I thought: let’s find a way to do this.
The error messages provide a valuable clue: you cannot modify NAT rules because they are in use by the VMSS. So let’s check Azure Resource Explorer to see if we can find a link between the VMSS and these NAT pools. The link exists, and here it is:
I selected our Frontend VMSS and scrolled down to the network profile. There we have four Inbound NAT pools that you can just remove using Azure Resource Explorer, so it should look like this:
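If you would rather not hand-edit the JSON, the same change can be sketched in Python. This is only an illustration that works on the VMSS resource as a plain dict (for example, the JSON copied out of Azure Resource Explorer). The function name `strip_nat_pool_refs` and the `name_prefix` parameter are my own, and I am assuming the references live under `loadBalancerInboundNatPools` in each IP configuration of the network profile. Sending the result back to Azure is still up to you.

```python
import copy


def strip_nat_pool_refs(vmss, name_prefix):
    """Return a copy of a VMSS resource dict without the NAT pool references
    whose pool name starts with name_prefix (hypothetical helper)."""
    result = copy.deepcopy(vmss)  # leave the caller's dict untouched
    nics = (result["properties"]["virtualMachineProfile"]
            ["networkProfile"]["networkInterfaceConfigurations"])
    for nic in nics:
        for ip_config in nic["properties"]["ipConfigurations"]:
            props = ip_config["properties"]
            pools = props.get("loadBalancerInboundNatPools", [])
            # The pool name is the last segment of the referenced resource id.
            props["loadBalancerInboundNatPools"] = [
                ref for ref in pools
                if not ref["id"].rsplit("/", 1)[-1].startswith(name_prefix)
            ]
    return result
```

Calling `strip_nat_pool_refs(vmss, "DebuggerListenerNatPool-")` would drop only the debugger pools and keep every other reference as it was.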
So now the link between the VMSS and the NAT pools no longer exists. We can now navigate to the load balancer and remove the NAT pools there as well:
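The load balancer side can be sketched the same way. Again this is a minimal illustration on the resource JSON as a dict, not a call into any Azure SDK. The helper name `drop_nat_pools` is hypothetical, and I am assuming the pools are listed under `properties.inboundNatPools` with a `name` field, which is how they show up in Azure Resource Explorer.

```python
import copy


def drop_nat_pools(load_balancer, name_prefix):
    """Return a copy of a load balancer resource dict without the inbound
    NAT pools whose name starts with name_prefix (hypothetical helper)."""
    result = copy.deepcopy(load_balancer)  # do not mutate the input
    props = result["properties"]
    props["inboundNatPools"] = [
        pool for pool in props.get("inboundNatPools", [])
        if not pool["name"].startswith(name_prefix)
    ]
    return result
```

This only makes sense after the VMSS no longer references the pools; otherwise Azure rejects the update with the "in use by virtual machine scale set" error shown earlier.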
We should now be in a situation where none of the unwanted NAT pools remain. This means we can redeploy our SF cluster ARM template and everything is back to normal.
Note that a similar approach can be used for adding/updating/deleting NAT rules. The only thing you have to do is remove the link between the VMSS and the corresponding NAT pool, make your changes and reapply the link.
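The remove-and-reapply idea could look roughly like this in Python, once more working on the VMSS JSON as a dict. Both `detach_pool` and `reattach_pool` are hypothetical helpers of mine. The sketch assumes the same `loadBalancerInboundNatPools` layout as in Azure Resource Explorer, and it remembers which NIC and IP configuration each reference came from so it can be put back in the same place.

```python
import copy


def _ip_configs(vmss):
    """Yield (nic name, ip configuration dict) for every IP configuration."""
    nics = (vmss["properties"]["virtualMachineProfile"]
            ["networkProfile"]["networkInterfaceConfigurations"])
    for nic in nics:
        for ip_config in nic["properties"]["ipConfigurations"]:
            yield nic["name"], ip_config


def detach_pool(vmss, pool_name):
    """Return (copy of vmss without references to pool_name, removed refs).

    Each removed ref is recorded as (nic name, ip config name, reference)."""
    result = copy.deepcopy(vmss)
    removed = []
    for nic_name, ip_config in _ip_configs(result):
        props = ip_config["properties"]
        kept = []
        for ref in props.get("loadBalancerInboundNatPools", []):
            if ref["id"].rsplit("/", 1)[-1] == pool_name:
                removed.append((nic_name, ip_config["name"], ref))
            else:
                kept.append(ref)
        props["loadBalancerInboundNatPools"] = kept
    return result, removed


def reattach_pool(vmss, removed):
    """Return a copy of vmss with the previously removed refs appended again."""
    result = copy.deepcopy(vmss)
    for nic_name, ip_config in _ip_configs(result):
        for old_nic, old_ip, ref in removed:
            if (old_nic, old_ip) == (nic_name, ip_config["name"]):
                ip_config["properties"].setdefault(
                    "loadBalancerInboundNatPools", []).append(copy.deepcopy(ref))
    return result
```

The order would be: apply the detached VMSS, change the NAT pool or rule on the load balancer, then apply the reattached VMSS.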