You know what’s scary as hell? When one node of an important cluster loses its trust relationship with the domain and you see the error “the trust relationship between this workstation and the primary domain failed”. That happened to me late last year with one of my SQL Server 2008 R2 nodes. The scary part was that I just didn’t know what to expect. The fix could be simple, or it could require a node rebuild.
Generally, I address this issue not by removing servers from the domain, but by using PowerShell v3 by executing the following with an admin prompt:
In this case, however, I did not have access to v3, so at an Admin prompt, I executed
netdom resetpwd /s:dc.ad.local /ud:ad\adminaccount /pd:*
This successfully reset the password and I was able to login again with a domain account. I then started the cluster service, and failed my SQL Server over with no issues. I then failed back and rebooted the server for good measure (and to ensure the trust still existed). I tested this as well on Windows 2012 R2 with a SQL 2014 cluster with success, though Reset-ComputerMachinePassword was easier to remember and worked just as well.
What caused the loss of trust? I haven’t figured it out yet, but I’m assuming that node probably cheated with its ex.