CloudRunbook | Practical Cloud Engineering
Private Endpoints + DNS baseline: stop outages before they happen
A practical baseline for Azure Private Endpoints and DNS: ownership, zone design, resolver routing, and onboarding patterns that prevent midnight outages.
Pick a private DNS ownership model, keep zones in one place, automate zone linking, and bake the pattern into subscription vending. Anything else ends up as manual fixes when a workload flips on Private Link.
Why you should care
Private Endpoints without governance are outage factories. The moment someone links a zone to the wrong vNet, deletes the record, or forgets reverse lookups, apps stop resolving. A baseline keeps ownership, automation, and rollback in the platform team instead of relying on every workload engineer remembering the DNS handshake.
Baseline decision points
- Zone ownership
Platform-owned subscription vs workload-owned zones. Centralised ownership simplifies automation but requires a service boundary.
- vNet linking model
Hub-only links with resolver forwarding vs direct workload links. Hub links + Azure DNS Private Resolver keep churn low.
- Naming and alignment
One zone per service (recommended) vs consolidated zones. Per-service zones make lifecycle management easier.
- Automation tooling
Bicep/Terraform/pipeline templates vs portal. The baseline assumes IaC.
- Subscription vending hook
Decide where Private Endpoint onboarding sits in the vending pipeline.
- Resolver strategy
Use the Azure DNS Private Resolver in hub networks to avoid custom VMs.
- Monitoring and drift
How you detect missing or stale zone links.
Runbook: Private Endpoints + DNS baseline
- Define ownership and landing zones for DNS resources
Put private DNS zones, the Azure DNS Private Resolver, and automation identities in a platform subscription. Document the RBAC split: platform controls records; workloads request via automation or service management.
- Catalogue the services you allow via Private Link
List the Azure services (e.g. Key Vault, Storage, SQL, Container Registry) that are permitted. This drives which private zones you deploy. Keep the list under version control so policy and automation stay aligned.
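One way to keep the catalogue under version control is to express it as a parameter that the zone deployment loops over. The sketch below assumes four example services; the parameter name `allowedPrivateLinkZones` and the tag values are illustrative, and the list should be adjusted to whatever your catalogue actually permits.

```bicep
// Illustrative catalogue: one private DNS zone name per permitted service.
// The parameter name and its defaults are assumptions, not a prescribed list.
param allowedPrivateLinkZones array = [
  'privatelink.vaultcore.azure.net'   // Key Vault
  'privatelink.blob.core.windows.net' // Storage (blob)
  'privatelink.database.windows.net'  // Azure SQL Database
  'privatelink.azurecr.io'            // Container Registry
]

// One zone per catalogue entry, deployed to the platform DNS resource group.
resource zones 'Microsoft.Network/privateDnsZones@2020-06-01' = [for zoneName in allowedPrivateLinkZones: {
  name: zoneName
  location: 'global'
  tags: {
    owner: 'platform-dns'
    purpose: 'PrivateLink'
  }
}]
```

With this shape, permitting a new service is a one-line change to the list, reviewed like any other pull request.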
- Deploy standard private DNS zones with IaC
Create the per-service zones ahead of time. Use consistent naming (privatelink.database.windows.net, etc.) and tag them with owners and lifecycle info.

    resource kvZone 'Microsoft.Network/privateDnsZones@2020-06-01' = {
      name: 'privatelink.vaultcore.azure.net'
      location: 'global'
      tags: {
        owner: 'platform-dns'
        purpose: 'PrivateLink'
      }
    }

- Provision Azure DNS Private Resolver in the hub
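Deploy inbound and outbound endpoints in the hub vNet, each in its own dedicated subnet. Link the private zones to the hub vNet so the inbound endpoint can answer for them, and use forwarding rulesets on the outbound endpoint for names that must be resolved elsewhere (for example, on-premises domains). This keeps spoke networks lightweight and avoids custom DNS servers.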
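A minimal sketch of the resolver and its two endpoints follows. The resource names and parameters are illustrative, and it assumes both subnets already exist in the hub vNet and are delegated to Microsoft.Network/dnsResolvers. Forwarding rulesets are left out to keep the example short.

```bicep
// Sketch only: names and parameters are assumptions for illustration.
// Both subnets must be dedicated to the resolver and delegated to
// Microsoft.Network/dnsResolvers before this deploys.
param location string = resourceGroup().location
param hubVnetId string
param inboundSubnetId string
param outboundSubnetId string

resource resolver 'Microsoft.Network/dnsResolvers@2022-07-01' = {
  name: 'hub-dns-resolver'
  location: location
  properties: {
    virtualNetwork: {
      id: hubVnetId
    }
  }
}

// Inbound endpoint: the IP that spokes and on-premises resolvers query.
resource inbound 'Microsoft.Network/dnsResolvers/inboundEndpoints@2022-07-01' = {
  parent: resolver
  name: 'inbound'
  location: location
  properties: {
    ipConfigurations: [
      {
        privateIpAllocationMethod: 'Dynamic'
        subnet: {
          id: inboundSubnetId
        }
      }
    ]
  }
}

// Outbound endpoint: used by forwarding rulesets for names resolved elsewhere.
resource outbound 'Microsoft.Network/dnsResolvers/outboundEndpoints@2022-07-01' = {
  parent: resolver
  name: 'outbound'
  location: location
  properties: {
    subnet: {
      id: outboundSubnetId
    }
  }
}
```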
- Automate vNet linking and zone registration
Script or pipeline the linking of workload vNets to the relevant zones when a subscription onboards. Avoid manual linking; it inevitably drifts. Store metadata so you can generate reports of which vNets are linked to which zones.
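The linking step could be captured as a small template that the onboarding pipeline runs against the platform resource group. This is a sketch under assumptions: the parameter names, the `linkedBy` tag, and the output shape are illustrative, and it expects the zones to already exist in the resource group it is deployed to.

```bicep
// Deploy into the platform resource group that holds the zones.
// Parameter names and tag values are assumptions for illustration.
param zoneNames array
param workloadVnetId string
param linkName string

// Reference the zones that were deployed ahead of time.
resource zones 'Microsoft.Network/privateDnsZones@2020-06-01' existing = [for zoneName in zoneNames: {
  name: zoneName
}]

// One link per zone. Auto-registration stays off: Private Link zones should
// only hold records created for private endpoints.
resource links 'Microsoft.Network/privateDnsZones/virtualNetworkLinks@2020-06-01' = [for (zoneName, i) in zoneNames: {
  parent: zones[i]
  name: linkName
  location: 'global'
  tags: {
    linkedBy: 'subscription-onboarding'
  }
  properties: {
    registrationEnabled: false
    virtualNetwork: {
      id: workloadVnetId
    }
  }
}]

// Emit the zone-to-link mapping so the pipeline can store it for reporting.
output linkedZones array = [for (zoneName, i) in zoneNames: {
  zone: zoneName
  linkId: links[i].id
}]
```

The output gives the pipeline a declared state to persist, which is what a later drift check would compare against.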
- Create a Private Endpoint request workflow
Provide a template (Terraform module, ARM/Bicep snippet, or portal checklist) that workloads must follow. Include parameters for approval, target subnet, zone linking, and rollback. Make the workflow part of your change process.
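As an illustration of what such a template might contain, the sketch below creates a Private Endpoint and attaches it to a centrally owned zone through a DNS zone group, so the A record is created and removed together with the endpoint. All parameter names are assumptions; `groupId` depends on the target service (for example `vault` for Key Vault), and the deploying identity needs sufficient permissions on the central zone.

```bicep
// Sketch only: parameter names are illustrative, not a prescribed interface.
param location string = resourceGroup().location
param endpointName string
param subnetId string            // approved subnet for private endpoints
param targetResourceId string    // resource ID of the service being exposed
param groupId string             // sub-resource, e.g. 'vault' for Key Vault
param privateDnsZoneId string    // resource ID of the platform-owned zone

resource endpoint 'Microsoft.Network/privateEndpoints@2023-05-01' = {
  name: endpointName
  location: location
  properties: {
    subnet: {
      id: subnetId
    }
    privateLinkServiceConnections: [
      {
        name: endpointName
        properties: {
          privateLinkServiceId: targetResourceId
          groupIds: [
            groupId
          ]
        }
      }
    ]
  }
}

// The zone group ties the endpoint's DNS record to the central zone,
// so nobody has to create or clean up the A record by hand.
resource zoneGroup 'Microsoft.Network/privateEndpoints/privateDnsZoneGroups@2023-05-01' = {
  parent: endpoint
  name: 'default'
  properties: {
    privateDnsZoneConfigs: [
      {
        name: 'central-zone'
        properties: {
          privateDnsZoneId: privateDnsZoneId
        }
      }
    ]
  }
}
```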
- Enforce policy guardrails
Use Azure Policy to deny private endpoints in unapproved subnets, enforce Private DNS zone registration, and audit any endpoint without an RBAC-approved owner. This keeps requests visible.
- Add monitoring and drift detection
Run scheduled scripts (Logic App/Azure Automation) to compare declared zone links against actual ones. Alert when zones are unlinked or when records are missing. Combine with activity log alerts for zone deletions.
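The activity log part can itself be deployed as code. The sketch below covers only the zone-deletion alert, not the scheduled comparison of declared and actual links. It assumes an existing action group whose resource ID is passed in as `actionGroupId`, and that the template is deployed to the resource group holding the zones; the alert name is illustrative.

```bicep
// Sketch only: alert name and parameter are assumptions for illustration.
param actionGroupId string

// Fires when a private DNS zone in this resource group is deleted.
resource zoneDeletedAlert 'Microsoft.Insights/activityLogAlerts@2020-10-01' = {
  name: 'alert-private-dns-zone-deleted'
  location: 'global'
  properties: {
    enabled: true
    scopes: [
      resourceGroup().id
    ]
    condition: {
      allOf: [
        {
          field: 'category'
          equals: 'Administrative'
        }
        {
          field: 'operationName'
          equals: 'Microsoft.Network/privateDnsZones/delete'
        }
      ]
    }
    actions: {
      actionGroups: [
        {
          actionGroupId: actionGroupId
        }
      ]
    }
  }
}
```

A second alert on `Microsoft.Network/privateDnsZones/virtualNetworkLinks/delete` would cover removed links in the same way.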
- Feed the pattern into subscription vending
When a new subscription is created, ensure the standard networking template includes the DNS forwarding settings and zone link automation. Workloads should inherit the ability to resolve private endpoints without extra steps.
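A vending template fragment might look like the sketch below: the spoke vNet is created with its DNS servers pointing at the hub resolver's inbound endpoint, and a module deployed into the platform subscription links the zones. The module path `./modules/zone-links.bicep`, its parameters, and every name here are hypothetical. In a hub-only linking model the module call can be dropped, since the spoke resolves through the hub.

```bicep
// Sketch only: all names, parameters, and the module path are hypothetical.
param location string = resourceGroup().location
param spokeVnetName string
param spokeAddressPrefix string
param resolverInboundIp string      // IP of the hub resolver's inbound endpoint
param platformSubscriptionId string
param dnsResourceGroupName string
param zoneNames array

// Spoke vNet that sends DNS queries to the hub resolver from day zero.
resource spokeVnet 'Microsoft.Network/virtualNetworks@2023-05-01' = {
  name: spokeVnetName
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [
        spokeAddressPrefix
      ]
    }
    dhcpOptions: {
      dnsServers: [
        resolverInboundIp
      ]
    }
  }
}

// Zone links are child resources of the zones, so they are deployed into
// the platform subscription rather than the workload subscription.
module zoneLinks './modules/zone-links.bicep' = {
  name: 'zone-links-${spokeVnetName}'
  scope: resourceGroup(platformSubscriptionId, dnsResourceGroupName)
  params: {
    zoneNames: zoneNames
    workloadVnetId: spokeVnet.id
    linkName: 'link-${spokeVnetName}'
  }
}
```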
Common pitfalls
- If each team creates zones and endpoints in the portal, you will never regain control. Use IaC or service catalog items.
- Putting some zones in workload subs and others in platform subs creates a support nightmare. Pick one model and document it.
- Logs and diagnostics often need reverse DNS. Ignoring it makes investigations painful.
- If the resolver isn’t in place when workloads arrive, they bake in custom DNS workarounds that are hard to remove later.
Validation checklist
- ✓ All approved private DNS zones exist in the platform subscription with correct tags.
- ✓ Azure DNS Private Resolver endpoints are deployed, healthy, and documented.
- ✓ Every workload vNet is linked to the required zones via automation (no manual leftovers).
- ✓ Private Endpoint requests use the standard workflow and land in approved subnets.
- ✓ Azure Policy assignments audit or deny unsupported Private Link usage.
- ✓ Monitoring alerts fire if a zone link is removed or a zone is deleted.
- ✓ Subscription vending pipelines automatically grant workloads DNS resolution from day zero.
- ✓ Rollback procedures for each service (Key Vault, Storage, SQL, etc.) are written and tested.
Rollback / back-out plan
- To remove a Private Endpoint: delete the endpoint, clear the DNS A record, and run the automation that restores public connectivity if needed. Expect short outages.
- To revert resolver changes: disable forwarding rules, but note that cached results may linger until TTLs expire. Communicate clearly with workloads before flipping.
- To undo zone links: remove the link and flush DNS on affected hosts. Document why the link was removed and ensure alternatives (public endpoints or other regions) exist.
Rollback is never “instant”: cached DNS answers linger until their TTLs expire, and workloads that depend on Private Link may need a maintenance window. Keep playbooks per service.