Wednesday, February 20, 2013

vCenter Service Dies, SQL To Blame

Working with VMware occasionally takes me into MSSQL. Most recently I experienced an issue where vCenter 5.1 build 880146 stopped working.  The service was just stopped and restarting the service fixed the problem right away. I was curious as to the root cause so I opened a ticket using the vCenter Support Assistant (best tool ever, go get it).

A quick look at the log showed the following:
2013-02-13T00:24:46.226-05:00 [05212 error 'Default' opID=SWI-106a635e] [VdbStatement::Fetch] SQLError was thrown: "ODBC error: (40001) - [Microsoft][SQL Server Native Client 10.0][SQL Server]Transaction (Process ID 263) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction." is returned when executing SQL statement "SELECT HOST_ID            FROM VPX_VM, VPX_VDEVICE_FILE_BACKING            WHERE VPX_VM.ID = VPX_VDEVICE_FILE_BACKING.VM_ID            AND VPX_VDEVICE_FILE_BACKING.HARD_DEVICE_BACKING_FILE_NAME = ?"

I sent VMware the logs for analysis and the vCenter Support Assistant made this very easy. The VMware support engineer told me this deadlock issue was resolved in 5.1 however there could have been a problem setting the policy on our database during the upgrade. I verified this by looking at the database properties and noting they were set to off:

ALTER DATABASE [<VCDB>] SET ALLOW_SNAPSHOT_ISOLATION OFF
GO
ALTER DATABASE [<VCDB>] SET READ_COMMITTED_SNAPSHOT OFF
GO

VMware recommended running the following query to set these options to ON.  Note that the option READ_COMMITTED_SNAPSHOT cannot be set while there are active connections to the database.  Therefore we put the database into single user mode temporarily, set the option and then switch it back.
ALTER DATABASE <VCDB> SET ALLOW_SNAPSHOT_ISOLATION ON;
GO
ALTER DATABASE <VCDB> SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
ALTER DATABASE <VCDB> SET READ_COMMITTED_SNAPSHOT ON;
ALTER DATABASE <VCDB> SET MULTI_USER;