Monday, February 10, 2020

Host cannot communicate with one or more other nodes in the vSAN enabled cluster

Ping between the nodes was working, so it was not a physical network issue. This is a lab environment, so all services (mgmt, vMotion, vSAN) are enabled on a single VMkernel NIC (vmknic0).

So what's the problem?

I did some Google searching and found that some people were experiencing problems with vSAN unicast agents.

Here is the command to list the unicast agents on a vSAN node:

esxcli vsan cluster unicastagent list

Grrrr. The list is empty on all ESXi hosts in my 3-node vSAN cluster!

Let's try to configure it manually.

Each vSAN node should have a connection to agents on other vSAN nodes in the cluster.

For example, a node in a 4-node vSAN cluster should have 3 connections:

 [root@n-esx04:~] esxcli vsan cluster unicastagent list
 NodeUuid                             IsWitness Supports Unicast IP Address     Port  Iface Name Cert Thumbprint
 ------------------------------------ --------- ---------------- -------------- ----- ---------- -----------------------------------------------------------
 5e3ec640-c033-7c7d-888f-00505692f54d         0             true 192.168.11.105 12321            18:F3:B7:9F:66:C4:C4:3E:0F:7D:69:BB:55:92:BC:A3:AC:E4:DD:5F
 5df792b0-f49f-6d76-45af-005056a89963         0             true 192.168.11.107 12321            20:4C:C1:48:F5:2D:04:16:55:F1:D3:F1:4C:26:B5:C4:23:E5:B4:12
 5e3e467a-1c1b-f803-3d0f-00505692ddc7         0             true 192.168.11.106 12321            53:99:00:B8:9D:1A:97:42:C0:10:C0:AF:8C:AD:91:59:22:8E:C9:79
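
To check the "N-1 connections" rule quickly on each host, the entries in that listing can be counted. This is only a minimal sketch: the function name count_unicast_agents is hypothetical, and it assumes every data row starts with a node UUID in the usual five-group hex format, as in the output above.

```shell
# count_unicast_agents (hypothetical helper): reads the output of
# `esxcli vsan cluster unicastagent list` on stdin and prints the
# number of agent rows. Header, separator and prompt lines are skipped
# because their first field is not a five-group hex UUID.
count_unicast_agents() {
  awk '$1 ~ /^[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+-[0-9a-f]+$/ { n++ }
       END { print n + 0 }'
}

# Intended use on an ESXi host (not executed here):
#   esxcli vsan cluster unicastagent list | count_unicast_agents
```

On a healthy 4-node cluster this should print 3 on every node; in my broken cluster it would have printed 0.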

We need to get the local UUID of the cluster node.

 [root@n-esx08:~] esxcli vsan cluster get  
 Cluster Information  
   Enabled: true  
   Current Local Time: 2020-02-11T08:32:55Z  
   Local Node UUID: 5df792b0-f49f-6d76-45af-005056a89963  
   Local Node Type: NORMAL  
   Local Node State: MASTER  
   Local Node Health State: HEALTHY  
   Sub-Cluster Master UUID: 5df792b0-f49f-6d76-45af-005056a89963  
   Sub-Cluster Backup UUID:  
   Sub-Cluster UUID: 52c99c6b-6b7a-3e67-4430-4c0aeb96f3f4  
   Sub-Cluster Membership Entry Revision: 0  
   Sub-Cluster Member Count: 1  
   Sub-Cluster Member UUIDs: 5df792b0-f49f-6d76-45af-005056a89963  
   Sub-Cluster Member HostNames: n-esx08.home.uw.cz  
   Sub-Cluster Membership UUID: f8d4415e-aca5-a597-636d-005056997c1d  
   Unicast Mode Enabled: true  
   Maintenance Mode State: ON  
   Config Generation: 7ef88f9d-a402-48e3-8d3f-2c33f951fce1 6 2020-02-10T21:58:16.349  
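
If only the UUID is needed, it can be pulled out of that output with awk. A small sketch, assuming the field label stays "Local Node UUID:" as shown above; the function name local_node_uuid is hypothetical.

```shell
# local_node_uuid (hypothetical helper): reads the output of
# `esxcli vsan cluster get` on stdin and prints the value of the
# "Local Node UUID:" field (the last whitespace-separated token on that line).
local_node_uuid() {
  awk '/Local Node UUID:/ { print $NF }'
}

# Intended use on an ESXi host (not executed here):
#   esxcli vsan cluster get | local_node_uuid
```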

So here are my nodes (hostname - IP address - vSAN node UUID):
n-esx08 - 192.168.11.108 - 5df792b0-f49f-6d76-45af-005056a89963
n-esx09 - 192.168.11.109 - 5df792b0-f49f-6d76-45af-005056a89963
n-esx10 - 192.168.11.110 - 5df792b0-f49f-6d76-45af-005056a89963

And the problem is clear. All vSAN nodes have the same UUID.
Why? Let's check the ESXi system UUID on each ESXi host.

 [root@n-esx08:~] esxcli system uuid get  
 5df792b0-f49f-6d76-45af-005056a89963  
 [root@n-esx08:~]  

 [root@n-esx09:~] esxcli system uuid get  
 5df792b0-f49f-6d76-45af-005056a89963  
 [root@n-esx09:~]  

 [root@n-esx10:~] esxcli system uuid get  
 5df792b0-f49f-6d76-45af-005056a89963  
 [root@n-esx10:~]  
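
With more than a handful of hosts, comparing UUIDs by eye gets tedious. The sketch below flags any UUID reported by more than one host. It assumes you have already collected "hostname uuid" pairs, one per line; the function name find_duplicate_uuids and the SSH loop in the comment are illustrative, not part of any VMware tooling.

```shell
# find_duplicate_uuids (hypothetical helper): reads "hostname uuid" pairs
# on stdin and prints every UUID that appears more than once, followed by
# the comma-separated list of hosts that report it.
find_duplicate_uuids() {
  awk '{
         count[$2]++
         hosts[$2] = hosts[$2] (hosts[$2] != "" ? "," : "") $1
       }
       END {
         for (u in count)
           if (count[u] > 1) print u, hosts[u]
       }'
}

# One possible way to collect the pairs, assuming SSH is enabled on the hosts
# (not executed here):
#   for h in n-esx08 n-esx09 n-esx10; do
#     echo "$h $(ssh root@$h esxcli system uuid get)"
#   done | find_duplicate_uuids
```

No output means every host has its own UUID.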

So the root cause is obvious. I run nested ESXi hosts to test vSAN, and I forgot to regenerate the system UUID after cloning them. The solution is easy: delete the /system/uuid entry from /etc/vmware/esx.conf and restart the ESXi hosts.

(Screenshot: ESXi system UUID in /etc/vmware/esx.conf)

You can do it from the command line as well:

sed -i '/^\/system\/uuid/d' /etc/vmware/esx.conf
reboot
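
Before touching the real file, it may be worth rehearsing the edit on a copy. A minimal sketch, assuming the UUID is stored on a single line that begins with /system/uuid; the function name remove_system_uuid is hypothetical, and this variant deletes the whole line and keeps a backup next to the file.

```shell
# remove_system_uuid FILE (hypothetical helper): saves FILE as FILE.bak,
# then rewrites FILE without any line that starts with /system/uuid.
remove_system_uuid() {
  cp "$1" "$1.bak" &&
    sed '/^\/system\/uuid/d' "$1.bak" > "$1"
}

# Rehearsal on a copy (not executed here):
#   cp /etc/vmware/esx.conf /tmp/esx.conf.test
#   remove_system_uuid /tmp/esx.conf.test
#   grep uuid /tmp/esx.conf.test    # should print nothing for /system/uuid
```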

So we have identified the problem and we are done. After the ESXi hosts restart, the vSAN cluster node UUIDs are changed automatically and the vSAN unicast agents are automatically configured on the vSAN nodes as well.

However, if you are interested in how to manually add a connection to a unicast agent on a particular node, you would execute the following command

esxcli vsan cluster unicastagent add -a <IP address of the unicast agent> -U <supports unicast> -u <Local Node UUID of that node> -t <type>
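
To illustrate how the placeholders would be filled in, the sketch below prints (but does not run) one such command per peer. It assumes "ip uuid" pairs for the other nodes on stdin, that -U takes true as in the "Supports Unicast" column of the list output, and that node is a valid value for -t on a regular data node; the last point is my assumption, so check esxcli vsan cluster unicastagent add --help on your build. The function name print_unicast_add_cmds is hypothetical.

```shell
# print_unicast_add_cmds (hypothetical helper): reads "ip node-uuid" pairs
# describing the OTHER nodes on stdin and prints one add command per peer.
# Nothing is executed; review the output before pasting it into a host shell.
# Assumption: "-U true" and "-t node" are appropriate for regular data nodes.
print_unicast_add_cmds() {
  while read -r ip uuid; do
    [ -n "$ip" ] || continue   # skip blank lines
    printf 'esxcli vsan cluster unicastagent add -a %s -U true -u %s -t node\n' \
      "$ip" "$uuid"
  done
}

# Example with one peer from the 4-node listing earlier in this post:
#   echo "192.168.11.105 5e3ec640-c033-7c7d-888f-00505692f54d" | print_unicast_add_cmds
```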

Anyway, such a manual configuration should not be necessary and you should do it only when instructed by VMware support.

Hope this helps someone else in the VMware community.
