Recovering NSX-T Manager from File System Corruption

One of our labs had a temporary storage issue that left two NSX-T Managers (separate NSX-T installations) in a corrupted state. Here are some steps you can take to try to bring a corrupted NSX-T Manager appliance back to life. BTW, these steps might work for Edge Nodes as well.

The issue starts with the appliance having its file system in read-only mode. After a reboot you will see the message:
UNEXPECTED INCONSISTENCY: RUN fsck Manually

The first step is to get into the appliance GRUB menu, which appears briefly after startup. Press the e key, enter the root/VMware1 GRUB credentials (these are different from the regular appliance credentials), then edit the line starting with linux: replace ro with rw and delete the rest of the line.
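To make the edit concrete, here is a small sketch that applies the same change to a line of text. It is only an illustration: the real edit is done by hand in the GRUB editor, the function name edit_kernel_line is made up for this example, and the sample kernel line is hypothetical (the one on your appliance will look different).

```shell
# Illustration only: mimics the manual GRUB edit on a line of text.
edit_kernel_line() {
    # Replace the standalone "ro" option with "rw" and drop everything after it.
    # "root=..." is not touched because "ro" has to be a whole word to match.
    printf '%s\n' "$1" | sed -E 's/(^|[[:space:]])ro([[:space:]].*)?$/\1rw/'
}

# Hypothetical kernel line; the real one on the appliance will differ.
sample='linux /vmlinuz root=/dev/sda2 ro quiet splash'
edit_kernel_line "$sample"
```

For the sample line above, everything from ro onward is removed and the line ends in rw.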

Continue the boot process by pressing Ctrl+x. Hopefully you are now able to get into the BusyBox shell and run fsck /dev/sda2 (or similar) to fix the corrupted partition. Reboot.

What can happen next is that the appliance boots but then finds LVM corruption and drops into emergency mode, where you see repeated Login incorrect messages.

Repeat the GRUB edit. This time you will be asked to enter the root password to go into maintenance mode.

Type the root password and, following this KB article, run the fsck /dev/mapper/nsx-tmp command. Reboot again.

Hopefully now the appliance starts properly.
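Since fsck should not be run on a mounted file system, it can help to check the mount state before each fsck run. The following is a rough sketch, not a tested procedure for the appliance: the helper names is_mounted and check_fs are made up for this example, and it assumes that /proc/mounts, awk and umount are available in the shell you land in.

```shell
# Return 0 if the given device appears in the mounts table, 1 otherwise.
# The second argument defaults to /proc/mounts and exists mainly for testing.
is_mounted() {
    awk -v d="$1" '$1 == d { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Unmount the device if needed, then check it.
# fsck exit codes: 0 = no errors, 1 = errors corrected, 2 = reboot advised,
# 4 = errors left uncorrected.
check_fs() {
    dev=$1
    if is_mounted "$dev"; then
        umount "$dev" || return 1
    fi
    fsck "$dev"
}

# Usage (run manually from the recovery shell):
#   check_fs /dev/sda2
#   check_fs /dev/mapper/nsx-tmp
```

You will be rebooting afterwards anyway, so there is no need to mount the volume again yourself.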

It can also happen that your root password has expired, in which case you will not be able to enter maintenance mode. Although the official documentation describes a process for resetting it, that process will not work in this case. The workaround is to edit the linux line in the GRUB menu again: replace ro with rw, but this time append init=/bin/bash. You should then get to a shell where you can reset the password with the passwd command.
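Once you are in that shell, the reset itself is short. The sketch below adds one guard of my own: it only runs passwd if the kernel command line really contains init=/bin/bash. The helper names booted_with and reset_root_password are made up for this example, and it assumes /proc is mounted so that /proc/cmdline is readable.

```shell
# Return 0 if the kernel command line contains the given option as a whole word.
# The second argument defaults to /proc/cmdline and exists mainly for testing.
booted_with() {
    tr -s ' ' '\n' < "${2:-/proc/cmdline}" | grep -qxF -- "$1"
}

# Reset the root password, but only from the init=/bin/bash rescue shell.
reset_root_password() {
    if ! booted_with "init=/bin/bash"; then
        echo "not booted with init=/bin/bash, refusing to continue" >&2
        return 1
    fi
    passwd root || return 1
    sync    # flush the change to disk before rebooting
}

# Usage (run manually from the rescue shell):
#   reset_root_password
```

Because no init system is running in this mode, a forced reboot such as reboot -f is usually needed afterwards.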

Good luck with the recovery, and do not forget to set up backups and disable password expiration.

7 thoughts on “Recovering NSX-T Manager from File System Corruption”

  1. You can’t run fsck on a mounted volume. Unmount it, then run fsck again. You’ll be rebooting afterwards, so you don’t need to re-mount it yourself.
