NOTE: The troubleshooting items below apply only when Nutanix support is enabled, and should be used together with VMware Troubleshooting.
If the connection to your Nutanix Cluster or Nutanix Controller Virtual Machines (CVMs) is lost, the following log messages and exceptions may appear in the error.log:
"Failed to start connection; nested exception is:"
"net.schmizz.sshj.transport.TransportException: Broken transport; encountered EOF"
"java.net.ConnectException: Connection timed out: connect"
This error may occur if the Cluster is down when PowerChute attempts to stop the Cluster. This could also happen if your Nutanix Cluster credentials have changed after configuring PowerChute. Ensure the correct Cluster/Controller VM credentials are provided via the PowerChute Setup wizard, or the Nutanix Settings screen.
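As a quick way to separate network problems from credential problems, the sketch below checks whether a TCP connection to a Controller VM can be opened at all. It is not part of PowerChute: the function name `can_reach` is hypothetical, and the default port 22 is an assumption based on the SSH transport exceptions quoted above.

```python
import socket


def can_reach(host: str, port: int = 22, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Port 22 is an assumed default (SSH); adjust it for your environment.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run from the machine where PowerChute is installed, a False result for a Controller VM address would point to a network or availability problem rather than changed credentials. A True result only shows that the port is reachable; it does not validate the credentials.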
The errors below are written to the PowerChute Event Log:
"Could not connect to the Nutanix Cluster. Cannot stop Nutanix Cluster services. Please find more information here."
"Authentication error occurred when connecting to the Nutanix Cluster. Please verify that the correct credentials have been provided. Please find more information here."
"A connection error occurred when connecting to the Nutanix Cluster. Please verify that the connection details are correct and that the cluster is accessible from the network that PowerChute is installed in. Please find more information here."
"The Nutanix cluster is not running. Cluster operations cannot be performed. Please find more information here."
The Cluster may fail to shut down gracefully after an unsuccessful attempt to stop AFS or to power off all User VMs. To resolve this issue, configure durations that are long enough to stop AFS and power off all the User VMs.
The errors below are written to the PowerChute Event Log:
"Nutanix Cluster cannot be gracefully shut down. Please find more information here."
"The cluster is still running after the cluster stop command was sent. Please find more information here."
"Unable to connect to any CVM for the cluster stop operation. Authentication errors occurred with at least one CVM. Please verify that the correct credentials have been provided. Please find more information here."
"Unable to connect to any CVM for the cluster stop operation. The CVMs may be stopped, the connection details may be incorrect, or the CVMs may be unreachable from the network PowerChute is installed in. Please find more information here."
The Cluster may not start if the clocks on the Controller VMs have drifted apart. To check whether the time on the individual Controller VMs is correctly synchronized, run the following command:
allssh date
For information on configuring time synchronization for your Nutanix Cluster, see the following recommendations from Nutanix.
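Once a timestamp has been collected from each Controller VM, the spread between the earliest and latest clock can be computed as in the sketch below. This is an illustration only: the function name `max_time_spread` is hypothetical, and it assumes you have already converted each Controller VM's reported time to epoch seconds. How you collect and convert the values is left open here.

```python
from typing import Mapping


def max_time_spread(epoch_by_cvm: Mapping[str, float]) -> float:
    """Return the difference in seconds between the latest and earliest clock.

    `epoch_by_cvm` maps a Controller VM name or address to the time it
    reported, expressed as seconds since the Unix epoch (an assumed input format).
    """
    if not epoch_by_cvm:
        raise ValueError("at least one timestamp is required")
    values = list(epoch_by_cvm.values())
    return max(values) - min(values)
```

Because the timestamps are usually gathered one Controller VM at a time, a spread of a second or so may reflect collection delay rather than real drift.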
This could also happen if your Nutanix Cluster credentials have changed after configuring PowerChute. Ensure the correct Cluster/Controller VM credentials are provided via the PowerChute Setup wizard, or the Nutanix Settings screen.
This could also occur if insufficient time has been configured for cluster start.
The errors below are written to the PowerChute Event Log:
"The start cluster operation did not complete successfully. Please find more information here."
"The cluster has not started after the cluster start duration has elapsed. Please find more information here."
If AFS cannot be successfully stopped, the Cluster cannot be stopped and Controller VMs cannot be gracefully shut down. To resolve this issue, increase the AFS stop duration. It is recommended that you manually test the time needed to stop AFS on your Cluster and specify that as the duration in the PowerChute UI.
To do this, connect to any Controller VM while AFS is running, and use the following command:
afs infra.stop <fileservername>
Note the duration it takes for the AFS service to fully stop and use this as the AFS Shutdown Duration in the Virtualization Settings screen in the PowerChute UI.
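The manual timing step above can be sketched as a small wrapper that runs a command and reports how long it took. The helper name `time_command` is hypothetical, and the sketch assumes the command is available on the PATH of the user running the script.

```python
import subprocess
import time
from typing import Sequence, Tuple


def time_command(argv: Sequence[str]) -> Tuple[float, int]:
    """Run `argv` and return (elapsed_seconds, exit_code)."""
    start = time.monotonic()  # monotonic clock is unaffected by system time changes
    completed = subprocess.run(argv, check=False)
    elapsed = time.monotonic() - start
    return elapsed, completed.returncode
```

On a Controller VM this could be called as `time_command(["afs", "infra.stop", "<fileservername>"])`, with the placeholder replaced by your file server name. The measured value covers only the time until the command returns, so confirm separately that the AFS service has fully stopped before using it as the AFS Shutdown Duration.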
The error below is written to the PowerChute Event Log:
"Nutanix AFS cannot be gracefully shut down. Please find more information here."
The error below is written to the PowerChute Event Log when insufficient time is configured for AFS stop:
"The AFS stop operation timed out. Please ensure that sufficient time has been configured. Please find more information here."
The error below is written to the PowerChute error.log:
"Failed to stop Nutanix AFS: [error received]"
If the AFS stop command fails to execute properly, the following error is written to the PowerChute Event Log, and further details may be available in the error and debug logs. It is recommended that you manually verify that AFS stops correctly by connecting to a Controller VM and running the afs infra.stop command, as described above.
"The AFS stop operation did not complete successfully. Please find more information here"
For information on why the Protection Domain replications did not abort, refer to the error message generated in the error.log.
The error below is written to the PowerChute Event Log:
"Nutanix Protection Domain replications cannot be gracefully aborted. Please find more information here."
If some active replications failed to abort, the following error is written to the event log. In this case, check the corresponding message in the error.log for more information:
"Some active replications were not successfully aborted. Please find more information here."
The errors below are written to the PowerChute error.log:
"Failed to retrieve Protection Domains replication status: [error received]"
"Failed to abort ongoing Protection Domain replications: [error received]"
Disabling Metro Availability may fail if Metro Availability was not correctly set up on the Cluster.
For more information, refer to the error message generated in the error.log.
The error below is written to the PowerChute Event Log:
"Nutanix Metro Availability cannot be disabled. Please find more information here."
The error below is written to the PowerChute error.log:
"Failed to disable Protection Domains Metro Availability: [error received]"
Re-enabling Metro Availability may fail if Metro Availability was not correctly set up on the Cluster.
For more information, refer to the error message generated in the error.log.
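When the error.log is long, the "Failed to ..." messages quoted in this section can be located with a simple scan such as the sketch below. The message prefixes are taken from this section; the area labels, the function name `classify`, and the plain substring matching are assumptions, since the exact layout of an error.log line is not described here.

```python
from typing import Iterable, List, Tuple

# Message prefixes quoted in this section, mapped to illustrative area labels.
SIGNATURES = {
    "Failed to start connection": "Cluster/CVM connection",
    "Failed to stop Nutanix AFS": "AFS shutdown",
    "Failed to retrieve Protection Domains replication status": "Replication status",
    "Failed to abort ongoing Protection Domain replications": "Replication abort",
    "Failed to disable Protection Domains Metro Availability": "Metro Availability disable",
    "Failed to enable Protection Domains Metro Availability": "Metro Availability enable",
}


def classify(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (area, line) pairs for every line containing a known prefix."""
    hits = []
    for line in lines:
        for prefix, area in SIGNATURES.items():
            if prefix in line:
                hits.append((area, line.rstrip("\n")))
                break  # one label per line is enough
    return hits
```

For example, opening the error.log with `open(path, encoding="utf-8", errors="replace")` and passing the file object to `classify` returns the matching lines in file order, which can then be read alongside the corresponding Event Log entries.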
The error below is written to the PowerChute Event Log:
"Nutanix Metro Availability cannot be enabled. Please find more information here"
The error below is written to the PowerChute error.log:
"Failed to enable Protection Domains Metro Availability: [error received]"