Deploy a fully network-isolated AKS cluster that you can manage securely through Cloud Shell—no VPN or jumpbox required.
```bash
cd default
cp terraform.tfvars.example terraform.tfvars
# Edit terraform.tfvars with your values (subscription_id, unique names for storage/acr/relay)
terraform init && terraform apply
# Then configure Cloud Shell VNet integration in Azure Portal (see step 3 below)
```

- Complete Network Isolation - No public endpoints, no outbound internet access
- Cloud Shell Access - Manage your cluster from anywhere via Azure Portal
- Built-in Monitoring - Container Insights with Azure Monitor Agent
- Entra ID Auth - No shared credentials, Azure RBAC for access control (local accounts disabled)
- Workload Identity - Secure pod-to-Azure service authentication
- Private Container Registry - ACR Premium with cached MCR images for bootstrapping
- Auto-Upgrades - Kubernetes patches and node images update automatically
- Azure Policy - Governance and compliance enabled by default
- Local accounts - Disabled when admin groups are configured; enabled otherwise for development convenience
- Deployer gets admin access - The user running `terraform apply` is automatically granted the Azure Kubernetes Service RBAC Cluster Admin role.
- Admin groups are optional - Set `aks_admin_group_object_ids` to grant additional Entra ID groups cluster admin access and disable local accounts. If you need to create a group:

```bash
az ad group create --display-name "AKS Admins" --mail-nickname "aks-admins" --query id -o tsv
```
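If you later need to grant the same role outside Terraform (for example, to one more group after deployment), a manual assignment looks roughly like this, using the same placeholder names as the rest of this guide:

```bash
# Sketch: grant an Entra ID group cluster admin on the deployed cluster.
# <group-object-id>, <resource_group_name>, and <cluster_name> are placeholders.
az role assignment create \
  --assignee <group-object-id> \
  --role "Azure Kubernetes Service RBAC Cluster Admin" \
  --scope $(az aks show -g <resource_group_name> -n <cluster_name> --query id -o tsv)
```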
The cluster is configured with automatic upgrades:
| Upgrade Type | Schedule | Window |
|---|---|---|
| Kubernetes patches | Weekly (Sunday) | 02:00-06:00 UTC |
| Node OS images | Weekly (Sunday) | 06:00-10:00 UTC |
To change these windows, modify the `maintenance_window_*` blocks in `aks.tf`.
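To see what AKS actually registered, you can list the cluster's maintenance configurations after deployment (a quick sanity check, using the placeholder names from this guide):

```bash
# Expect entries such as aksManagedAutoUpgradeSchedule / aksManagedNodeOSUpgradeSchedule
az aks maintenanceconfiguration list \
  --resource-group <resource_group_name> \
  --cluster-name <cluster_name> -o table
```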
- ACR Premium (~$50/month) - Required for private endpoints
- AKS Standard tier (default) - Includes Uptime SLA; use `aks_sku_tier = "Free"` for dev/test
- Log Analytics - Pay-per-GB ingestion; set `log_analytics_retention_days` to control costs
- Azure Relay - Standard tier for Cloud Shell connectivity
Before you start, make sure you have:
- Azure CLI >= 2.61.0
- Terraform >= 1.9
- An Azure subscription where you can:
- Create AKS clusters and managed identities
- Create private endpoints and DNS zones
- Register resource providers
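A quick way to confirm both tools meet the minimums:

```bash
az version --query '"azure-cli"' -o tsv   # expect 2.61.0 or newer
terraform version                         # expect Terraform v1.9.x or newer
```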
```bash
cd default
cp terraform.tfvars.example terraform.tfvars
```

Edit terraform.tfvars and update these key values:

```hcl
# Required - your Azure subscription
subscription_id = "your-subscription-id"

# Base names - a unique suffix is auto-generated from your subscription ID
# Example: acrpvtaks → acrpvtaksabc123
acr_name             = "acrpvtaks"
storage_account_name = "stpvtakscs"
relay_namespace_name = "arn-pvtaks"

# Optional - Entra ID group for admin access (recommended)
# Create a group: az ad group create --display-name "AKS Admins" --mail-nickname "aks-admins" --query id -o tsv
aks_admin_group_object_ids = ["your-group-object-id"]

# Optional - override the auto-generated suffix
# name_suffix = "prod01"
```

The suffix ensures globally unique names. See `terraform.tfvars.example` for all options.
```bash
terraform init
terraform plan
terraform apply
```

This creates about 40 resources and takes ~10 minutes.
This is the magic that lets you access your private cluster from anywhere.
Tip: Run `terraform output` to see the actual resource names with suffixes.
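For example, to print just the two values you'll need for the Cloud Shell setup below:

```bash
terraform output -raw cloudshell_storage_account_name
terraform output -raw cloudshell_relay_namespace_name
```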
- Open Azure Portal → Click the Cloud Shell icon (top right)
- Click Settings (gear icon) → Reset user settings if you already have Cloud Shell configured
- Select Bash → Click Show advanced settings
- Fill in:
  - Subscription: Your subscription
  - Region: Same as your `location` in tfvars
  - Resource Group: Your `resource_group_name` from tfvars
  - Storage Account: Use `cloudshell_storage_account_name` from terraform output (select "Use existing")
  - File Share: `acsshare` (select "Use existing")
- Check Show VNET isolation settings
- Fill in the VNet settings:
  - Virtual Network: `vnet-<cluster_name>` (based on your `cluster_name` in tfvars)
  - Network Profile: `np-cloudshell-<location>` (based on your `location` in tfvars)
  - Relay Namespace: Use `cloudshell_relay_namespace_name` from terraform output
- Click Create storage
After a few minutes, you'll have a Cloud Shell instance running inside your VNet accessible from the Azure Portal!
Network Note: Cloud Shell runs in a separate subnet with internet access (required for Azure CLI, package updates, etc.). The AKS subnet remains fully isolated—Cloud Shell can reach the cluster's API server via private endpoint, but cannot reach pods directly over the internet.
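If you want to double-check the subnet layout before (or after) wiring up Cloud Shell, a listing like this should match the diagram at the end of this README; `vnet-<cluster_name>` follows the naming convention described above:

```bash
# List subnets and their address prefixes in the cluster VNet
az network vnet subnet list \
  --resource-group <resource_group_name> \
  --vnet-name vnet-<cluster_name> \
  --query "[].{name:name, prefix:addressPrefix}" -o table
```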
Run these from Cloud Shell (with VNet integration configured).
```bash
# Get credentials and test connection (use your values from tfvars)
az aks get-credentials --resource-group <resource_group_name> --name <cluster_name>
kubectl get nodes -o wide
kubectl get pods -n kube-system | grep -E "ama|workload-identity"
```

```bash
# Verify private DNS resolution (should show 10.x.x.x addresses)
nslookup <acr_name>.azurecr.io

# Test that outbound internet is blocked (should time out after ~30 seconds)
kubectl run nettest --image=busybox --restart=Never --rm -i --tty=false -- \
  sh -c "wget -T 5 -q -O- http://1.1.1.1 2>&1 || echo 'BLOCKED: No internet access'"
# Expected: "timed out waiting for the condition" = internet is blocked
```

```bash
# Check ACR cache rule and cached images (use your acr_name from tfvars)
az acr cache list --registry <acr_name> --output table
az acr repository list --name <acr_name> --output table
```

```bash
# Deploy a test pod that generates logs
kubectl run logtest --image=busybox --restart=Never -- \
  sh -c "while true; do echo \"Test log at \$(date)\"; sleep 10; done"
kubectl get pod logtest -w
```

Wait 5-10 minutes for logs to reach Log Analytics, then:
```bash
# Query logs (use your resource_group_name and cluster_name from tfvars)
export WORKSPACE_ID=$(az monitor log-analytics workspace show \
  -g <resource_group_name> -n log-<cluster_name> --query customerId -o tsv)
az monitor log-analytics query -w $WORKSPACE_ID \
  --analytics-query "ContainerLogV2 | where PodName == 'logtest' | project TimeGenerated, LogMessage | take 5"

# Clean up
kubectl delete pod logtest
```

After deployment, these outputs are available:
| Output | What It Is |
|---|---|
| `resource_group_name` | Resource group containing everything |
| `cluster_name` | Your AKS cluster name |
| `acr_name` | Container registry name |
| `acr_login_server` | ACR URL for pushing images |
| `oidc_issuer_url` | OIDC issuer for workload identity federation |
| `cloudshell_storage_account_name` | Storage account for Cloud Shell |
| `cloudshell_relay_namespace_name` | Azure Relay for Cloud Shell connectivity |
| `log_analytics_workspace_name` | Where your monitoring data lives |
| `kubeconfig` | Cluster credentials (marked sensitive) |
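The `oidc_issuer_url` output is what you federate a managed identity against for workload identity. A minimal sketch, assuming a user-assigned identity named `app-identity` and a service account `default/app-sa` (both hypothetical):

```bash
# Federate a user-assigned managed identity with a Kubernetes service account
export OIDC_ISSUER=$(terraform output -raw oidc_issuer_url)
az identity federated-credential create \
  --name app-federation \
  --identity-name app-identity \
  --resource-group <resource_group_name> \
  --issuer "$OIDC_ISSUER" \
  --subject "system:serviceaccount:default:app-sa" \
  --audiences "api://AzureADTokenExchange"
```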
```text
default/
├── versions.tf           # Terraform and provider versions
├── variables.tf          # Input variables with validations
├── data.tf               # Data sources
├── resource_group.tf     # Resource group
├── network.tf            # VNet, subnets, NSGs, private DNS zones
├── acr.tf                # ACR, cache rules, private endpoint
├── aks.tf                # AKS cluster, identities, role assignments
├── monitoring.tf         # Log Analytics, Container Insights
├── cloudshell.tf         # Network profile, Relay, Storage, endpoints
├── outputs.tf            # Output values
├── terraform.tfvars      # Your variable values (create from example)
└── terraform.tfvars.example
```
When you're done, clean up to avoid charges:
```bash
cd default
terraform destroy -auto-approve
```

The VNet layout:

```text
┌─────────────────────────────────────────────────────────────────────────────┐
│ VNet (10.1.0.0/16) │
├─────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌─────────────────┐ ┌──────────────────┐ ┌─────────────────┐ │
│ │ aks-subnet │ │ api-server-subnet│ │ acr-subnet │ │
│ │ 10.1.1.0/24 │ │ 10.1.2.0/24 │ │ 10.1.3.0/24 │ │
│ │ │ │ │ │ │ │
│ │ ┌───────────┐ │ │ ┌───────────┐ │ │ ┌───────────┐ │ │
│ │ │ AKS Nodes │ │ │ │ API Server│ │ │ │ ACR PE │ │ │
│ │ └───────────┘ │ │ │(delegated)│ │ │ └───────────┘ │ │
│ │ [NSG] │ │ └───────────┘ │ └─────────────────┘ │
│ └─────────────────┘ └──────────────────┘ │
│ │
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
│ │cloudshellsubnet │ │ relaysubnet │ │storage-pe-subnet│ │
│ │ 10.1.4.0/24 │ │ 10.1.5.0/24 │ │ 10.1.6.0/24 │ │
│ │ │ │ │ │ │ │
│ │ ┌───────────┐ │ │ ┌───────────┐ │ │ ┌───────────┐ │ │
│ │ │Cloud Shell│ │ │ │ Relay PE │ │ │ │Storage PE │ │ │
│ │ │(delegated)│ │ │ └───────────┘ │ │ │(blob+file)│ │ │
│ │ └───────────┘ │ └─────────────────┘ │ └───────────┘ │ │
│ └─────────────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
```
This cluster is fully network isolated with blocked outbound internet:
| Setting | Value | What It Does |
|---|---|---|
| `outbound_type` | `none` | AKS doesn't provision egress infrastructure (no LB, no NAT Gateway) |
| NSG deny rule | `DenyInternetOutbound` | Blocks all outbound traffic to the Internet service tag |
| `bootstrap_profile` | `Cache` | System images pulled from private ACR, not MCR |
| `private_cluster_enabled` | `true` | API server only accessible within the VNet |
| `private_dns_zone_id` | `System` | AKS creates and manages the private DNS zone for the API server |
| `local_account_disabled` | `true` | No local admin kubeconfig; Azure RBAC only |
| `public_network_access_enabled` | `false` | ACR and Storage not exposed publicly |
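You can spot-check several of these on the live cluster (run from Cloud Shell; property paths per `az aks show`):

```bash
az aks show -g <resource_group_name> -n <cluster_name> --query \
  "{outbound:networkProfile.outboundType, privateCluster:apiServerAccessProfile.enablePrivateCluster, localAccountsDisabled:disableLocalAccounts}" -o jsonc
```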
With just `outbound_type = none`, AKS doesn't provision outbound infrastructure, but Azure's fabric routing still allows pods to reach the internet via SNAT. To truly block egress, we add an NSG rule on the AKS subnet:
resource "azurerm_network_security_rule" "aks_deny_internet_outbound" {
name = "DenyInternetOutbound"
priority = 4000
direction = "Outbound"
access = "Deny"
protocol = "*"
destination_address_prefix = "Internet" # Azure service tag - excludes private IPs
}This blocks internet access while still allowing:
- VNet traffic (private endpoints, peered VNets)
- Azure services via private endpoints (ACR, Storage, Relay)
- Pod-to-pod and pod-to-service communication
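To confirm the rule landed after deployment (the NSG name is whatever network.tf assigns; list first if unsure):

```bash
az network nsg list -g <resource_group_name> --query "[].name" -o tsv
az network nsg rule show -g <resource_group_name> \
  --nsg-name <aks_nsg_name> --name DenyInternetOutbound -o table
```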
For the cluster to bootstrap and upgrade, AKS needs to pull images from Microsoft Container Registry. Since we blocked internet access, we set up an ACR cache rule that mirrors MCR:
resource "azurerm_container_registry_cache_rule" "aks_managed" {
name = "aks-managed-mcr" # Exact name required
source_repo = "mcr.microsoft.com/*" # Exact source required
target_repo = "aks-managed-repository/*" # Exact target required
}This cache rule must match exactly - it's documented in Microsoft's network isolated cluster guide. Changing these values will break cluster creation and upgrades.
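After deployment you can inspect the rule directly (this complements the `az acr cache list` check in the verification section):

```bash
az acr cache show --registry <acr_name> --name aks-managed-mcr -o jsonc
```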
- Network isolated AKS clusters - The concepts behind this pattern
- Network isolated AKS with BYO ACR - Step-by-step guide
- AKS private clusters - Private cluster options
- API Server VNet Integration - How the control plane stays private
- Azure CNI Overlay - Why we use overlay mode for pod networking
- Private endpoints - How services stay private
- NSG service tags - How we block internet egress
- Workload Identity - Secure pod-to-Azure authentication
- Azure RBAC for Kubernetes - Authorization with Entra ID
- Azure Policy for AKS - Governance and compliance
- Cloud Shell in a VNet - VNet integration setup
- Container Insights - Monitoring your cluster
- Log Analytics - Where your logs live
- AKS pricing tiers - Free vs Standard vs Premium