Hello,
I have a vSphere 5.0 cluster with several hosts. Every few hours, an automated operation detaches a set of RDM LUNs from some of the VMs in the cluster and then reattaches them. I've followed the "how to unpresent a LUN" procedure to the letter, but I still see the "Lost connectivity to storage device" error message in the logs. The VMkernel logs have even uglier error messages, indicating that paths to storage were lost unexpectedly. I believe this activity causes my hosts to become unresponsive from time to time.
Here is the procedure I use to dismount an RDM LUN:
1) Set the disk offline in the guest OS
2) Remove the disk from the VM and delete the mapping file
3) Detach the RDM LUN from ALL ESXi hosts
4) Unmap the LUN on the storage array so the hosts no longer see it
5) Rescan all hosts
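To make the ordering of steps 3 to 5 across hosts explicit, here is a rough sketch. It is only an illustration, not my actual automation: the esxcli commands are the ones I understand ESXi 5.x to provide for detaching a device and rescanning adapters, the `build_plan` helper and the host names are made up for the example, and the array-side unmap is a placeholder because it depends on the array.

```python
# Sketch only: builds the ordered list of actions for steps 3-5.
# Nothing is executed against a real host or array.

def build_plan(hosts, naa_id):
    """Return (target, action) tuples in the order the steps should run."""
    plan = []
    # Step 3: detach the device on EVERY host before touching the array.
    for host in hosts:
        plan.append((host, f"esxcli storage core device set --state=off -d {naa_id}"))
    # Step 4: array-specific unmap (placeholder, not a real command).
    plan.append(("array", f"unmap {naa_id} from all host initiator groups"))
    # Step 5: rescan every host only after the LUN is no longer presented.
    for host in hosts:
        plan.append((host, "esxcli storage core adapter rescan --all"))
    return plan


if __name__ == "__main__":
    # Hypothetical host names; the device ID is the one from my logs.
    hosts = ["host1.mydomain.com", "host2.mydomain.com"]
    for target, action in build_plan(hosts, "naa.60a980003753314d733f434b2f2f4e4e"):
        print(f"{target}: {action}")
```

The point of the sketch is that the detach has to complete on all hosts before the unmap, and the rescan comes last.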
These are the relevant log messages I see:
Device naa.60a980003753314d733f434b2f2f4e4e, has been turned off administratively.
XXX esx.problem.scsi.device.state.off.category not found XXX
3/8/2013 1:34:33 PM
host2.mydomain.com
Task: Detach SCSI LUN
info
3/8/2013 1:34:34 PM
Detach SCSI LUN
host2.mydomain.com
Lost connectivity to storage device naa.60a980003753314d733f434b2f2f4e4e. Path vmhba3:C0:T0:L22 is down. Affected datastores: Unknown.
error
3/8/2013 1:35:56 PM
host2.mydomain.com
Lost connectivity to storage device naa.60a980003753314d733f434b2f2f4e4e. Path vmhba2:C0:T0:L22 is down. Affected datastores: Unknown.
error
3/8/2013 1:35:56 PM
host2.mydomain.com
Lost connectivity to storage device naa.60a980003753314d733f434b2f2f4e4e. Path vmhba3:C0:T1:L22 is down. Affected datastores: Unknown.
error
3/8/2013 1:35:56 PM
host2.mydomain.com
Task: Rescan all HBAs
info
3/8/2013 1:35:59 PM
Rescan all HBAs
host2.mydomain.com
Task: Rescan VMFS
info
3/8/2013 1:36:02 PM
Rescan VMFS
host2.mydomain.com
My specific questions are:
- Is this normal behavior? It looks to me like something is going wrong, but I could be mistaken.
- If not, is there anything wrong with my procedure above?
Thanks for the assistance!