Azure AKS: Attach Volume Failed, already attached

After upgrading an Azure AKS cluster to version 1.15.7, I ran into an issue: the deployment was not able to re-attach some disks to the cluster nodes, and the error message "Attach volume failed, …, already attached to node" appeared in the cluster logs. This article describes a quick fix for this error.

First of all, I want to thank kwikwag for identifying the root cause and summarizing it on GitHub. Since I did not follow those steps in every detail, I will summarize my own solution here.

Volume attach issue in Kubernetes

After performing an AKS cluster upgrade, some pods were stuck in the status "ContainerCreating". In the log I was able to identify the following error messages:

Warning  FailedMount         41m (x54 over 162m)    kubelet, aks-agentpool-11111111-vmss000002  Unable to mount volumes for pod "mypod-86cd4f75d9-xxxxx_my-namespace(00000000-73d8-494a-9aa0-0000000000000)": timeout expired waiting for volumes to attach or mount for pod "my-namespace"/"my-pod-86cd7c75d9-xxxxx". list of unmounted volumes=[my-pod]. list of unattached volumes=[my-pod default-token-xxxxx]

Warning  FailedAttachVolume  21m (x693 over 161m)   attachdetach-controller                     AttachVolume.Attach failed for volume "pvc-000000000-1652-11ea-a59c-7eb26180ec6b" : compute.DisksClient#Get: Failure responding to request:StatusCode=429 -- Original Error: autorest/azure: error response cannot be parsed: "" error: EOF

Warning  FailedMount         2m3s (x17 over 38m)    kubelet, aks-agentpool-11111111-vmss000002  Unable to mount volumes for pod "my-pod-86cd7c75d9-xxxxx_my-namespace(0000000000000)": timeout expired waiting for volumes to attach or mount for pod "my-namespace"/"my-pod-86cd7c75d9-xxxxx". list of unmounted volumes=[my-pod]. list of unattached volumes=[my-pod default-token-xxxxx]

Warning  FailedAttachVolume  92s (x9774 over 144m)  attachdetach-controller                     AttachVolume.Attach failed for volume "pvc-000000000-1652-11ea-a59c-7eb26180ec6b" : disk(/subscriptions/0000-0000/resourceGroups/MC_RG_XXXXXXX_westeurope/providers/xxx/disks/kubernetes-dynamic-pvc-000000000-1652-11ea-a59c-7eb26180ec6b) already attached to node(/xxx/MC_RG_XXXXXXX_westeurope/xxxx/aks-agentpool-11111111-vmss/x/aks-agentpool-11111111-vmss_1), could not be attached to node(aks-agentpool-11111111-vmss000002)
  

The relevant part of this error message is:

Attach failed for volume "pvc-000000000-1652-11ea-a59c-7eb26180ec6b" already attached to node (aks-agentpool-11111111-vmss_1) could not be attached to node(aks-agentpool-11111111-vmss000002)

So what does that mean? The disk is still attached to the first node (aks-agentpool-11111111-vmss_1). When the pod is scheduled onto the second node (aks-agentpool-11111111-vmss000002), Kubernetes tries to attach the disk there, and this fails because the disk was never detached from the first node.
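The two things you need for the fix, the disk name and the node the disk is stuck on, can be pulled straight out of the event text. A minimal sketch in plain shell, assuming the event wording matches the log above (the `EVENT` variable simply holds a copy of the message):

```shell
EVENT='disk(/subscriptions/0000-0000/resourceGroups/MC_RG_XXXXXXX_westeurope/providers/xxx/disks/kubernetes-dynamic-pvc-000000000-1652-11ea-a59c-7eb26180ec6b) already attached to node(/xxx/MC_RG_XXXXXXX_westeurope/xxxx/aks-agentpool-11111111-vmss/x/aks-agentpool-11111111-vmss_1), could not be attached to node(aks-agentpool-11111111-vmss000002)'

# Disk name: the last path segment inside disk(...)
DISK=$(echo "$EVENT" | sed -n 's|.*disks/\([^)]*\)).*|\1|p')

# Node the disk is stuck on: the last path segment of the first node(...)
STUCK_NODE=$(echo "$EVENT" | sed -n 's|.*already attached to node([^)]*/\([^)/]*\)).*|\1|p')

echo "$DISK"        # kubernetes-dynamic-pvc-000000000-1652-11ea-a59c-7eb26180ec6b
echo "$STUCK_NODE"  # aks-agentpool-11111111-vmss_1
```

These two values are exactly what the detach steps below need.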

In general, if you have issues with Azure AKS you should have a look at the Microsoft Documentation page.

Detach PVC manually from AKS Node

First of all, we identify the PVC disk that is not detaching from the first node:

kubernetes-dynamic-pvc-000000000-1652-11ea-a59c-7eb26180ec6b

It might also make sense to have a look at the instances in your Virtual Machine Scale Set in Azure to identify the nodes.

Azure AKS, volume attach failed: VMSS Instances
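If you prefer the CLI over the portal, the instances and their instance IDs can also be listed directly. A sketch using the resource-group and scale-set names from this example; the fallback message is only there so the snippet degrades gracefully when the Azure CLI is not available or not logged in:

```shell
# List VMSS instances with their instance IDs, to map node names to an
# --instance-id value for the commands below.
INSTANCES=$(az vmss list-instances \
  -g MC_RG_XXXXXXX_westeurope \
  -n aks-agentpool-11111111-vmss \
  --query "[].{id: instanceId, name: name}" -o table 2>/dev/null \
  || echo "az CLI not available or not logged in")
echo "$INSTANCES"
```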

After that, we have to take a look at node vmss_1 to identify the LUN to which the PVC disk is attached. You can do this with the following command; "--instance-id 0" selects the respective instance of the Virtual Machine Scale Set.

az vmss show -g MC_RG_XXXXXXX_westeurope -n "aks-agentpool-11111111-vmss" --instance-id 0 --query "storageProfile.dataDisks[].{name: name, lun: lun}"

The output of this command may be longer, depending on the number of attached volumes, but you will be able to identify the disk you are looking for.

{
    "lun": 4,
    "name": "kubernetes-dynamic-pvc-000000000-1652-11ea-a59c-7eb26180ec6b"
},
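Instead of scanning the full list by eye, you can also pull the LUN out of the JSON directly. A sketch, assuming the command output above was saved to a file named disks.json (the file contents here are just the sample from above):

```shell
DISK="kubernetes-dynamic-pvc-000000000-1652-11ea-a59c-7eb26180ec6b"

# Sample of the az output, saved for illustration:
cat > disks.json <<'EOF'
[
  {
    "lun": 4,
    "name": "kubernetes-dynamic-pvc-000000000-1652-11ea-a59c-7eb26180ec6b"
  }
]
EOF

# In this layout the "lun" line sits directly above the matching "name" line:
LUN=$(grep -B1 "\"$DISK\"" disks.json | sed -n 's/.*"lun": \([0-9]*\).*/\1/p')
echo "$LUN"  # 4
```

Alternatively, the Azure CLI can filter on the server side with a JMESPath query such as `--query "storageProfile.dataDisks[?name=='$DISK'].lun" -o tsv`, which prints only the LUN.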

Once you have identified the LUN (in this case, LUN 4), you can proceed with manually detaching the PVC disk from the node.

az vmss disk detach -g MC_RG_XXXXXXX_westeurope -n aks-agentpool-11111111-vmss --instance-id 0 --lun 4

The command detaches the PVC disk from the VMSS instance and updates its configuration.

Redeploy the pod after detaching

Finally, redeploy your pod manually by scaling the deployment down to 0 replicas and back up to 1. This can be performed with the following commands.

$ kubectl scale deployment my-pod --replicas=0 -n my-namespace
$ kubectl scale deployment my-pod --replicas=1 -n my-namespace
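As an alternative to scaling down and up, kubectl 1.15 and later offer `kubectl rollout restart deployment/my-pod -n my-namespace` as a one-step restart. Either way, a small poll loop is handy for waiting until the pod is running again. A sketch, with the actual kubectl check shown only as a comment so the helper stays generic:

```shell
# Retry "$@" up to $1 times, one second apart, until it succeeds.
wait_until() {
  tries=$1; shift
  while [ "$tries" -gt 0 ]; do
    "$@" && return 0
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# In practice the check would be something like:
#   wait_until 30 sh -c 'kubectl get pods -n my-namespace | grep -q Running'
# Illustrated here with a stub that succeeds immediately:
wait_until 3 true && echo "pod is Running"
```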

Now your pod will be healthy and back up and running. More tips & tricks are highly welcome 🙂

If you work with websites on Kubernetes, you might want to check out my post about securing Azure AKS with Let’s Encrypt certificates.

Author: Patrick Riedl

I am Patrick Riedl, and as you can see, I am a total Microsoft enthusiast. Through my work as a Cloud Architect and my background in IT & information security, I always try to stay ahead of the times. With this blog & podcast I hope to give some knowledge and learnings back to the online community. I am always looking forward to feedback.
