netnerds.net

14Jan/111

Adding a Node to a SQL 2008 Failover Cluster

Installing a SQL Server 2008 Failover cluster is actually fairly easy, once you have the Windows Cluster set up. I experienced an issue adding the failover node, which is now a separate process compared to how it was done in SQL Server 2005. I'm not going to cover the entire process in this post, but I wanted to share a piece of information I came across when I had issues adding the failover node.

If you fail the cluster checks when attemping to add the second node to a 2 node SQL Server 2008 cluster, run the setup from command line and tell it to skip the cluster checks. If you installed the primary SQL node and the cluster checks passed, then you are most likely fine...

1. From a command prompt on the node that is going to be added, locate the SQL 2008 installation files directory.
2. Enter the following command, specifying the instance name that you are adding
setup /SkipRules=Cluster_VerifyForErrors /Action=AddNode /INSTANCENAME="myInstance"
3. Follow through the setup screens as normal.

That's it. Enjoy your new cluster once the setup process has completed.

Posted by: Brandon   Filed under: SQL Server 1 Comment
18Nov/100

SQL Browser Routing Errors and Infrastructure Errors in Event Log – syspolicy_purge_history

Recently, I noticed errors occuring daily at 2:00AM on my SQL Cluster. The errors showed Windows logins being routed to the wrong named instances, along with errors from SQL Browser not being able to service client requests. I noticed that all of the instances being affected were SQL Server 2008, but never any of my 2005 instances.

My first guess was maybe the server was being inundated with connections by some rogue job and causing SQL Browser to get confused. One day, SQL Browser even suffered a fatal error and stopped servicing requests altogether. This meant all of my instances on the server were unavailable until I noticed that morning and restarted the SQL Browser service.

After some investigation, I noticed that the syspolicy_purge_history job that every SQL 2008 install gets was running on all of my 2008 instances at 2:00AM. So, I disabled all of the jobs and the next day all of the nightly errors were gone.

I found this documented by Microsoft here:
http://support.microsoft.com/kb/955726

I had seen references to this before in regards to updating the job when using Virtual SQL Host names, especially in clustered configurations. The default setup uses the local node name, but if you fail the instance over to another node, the PowerShell script that gets called is called with the wrong host name... which may lead the job to fail.

I still don't know exactly why this job causes SQL Browser to freak out, but it does. I'd rather not even use the job, since I'm not currently using Policy Management, but I haven't yet investigated far enough to see if I need the step that cleans up the health records. I will just update all of the jobs to use the virtual SQL host name and check for errors every morning until I'm comfortable.

So, moral of the story - If you have SQL 2008 instances running in a clustered environment, make sure you update the node names in Step 3. Staggering the jobs wouldn't be a bad idea either.

Posted by: Brandon   Filed under: PowerShell, SQL Server No Comments
7Sep/100

SQL Server 2008: Dumb New Default Setting in SSMS 2008

In an effort to protect DBAs and developers from themselves, Microsoft has introduced a new default setting in SQL Server Management Studio. This setting prevents you from saving almost any change to db tables and throws out the following error:

Saving changes is not permitted. The changes that you have made require the following tables to be dropped and re-created. You have either made changes to a table that can't be re-created or enabled the option Prevent saving changes that require the table to be re-created.

I don't know about you, but I stopped reading at the second sentence because most Microsoft error messages do not contain super helpful information. In this case, however, they provide the solution, which is to uncheck "Prevent saving changes that require the table to be re-created." This setting can be found in Tools -> Options -> Designers -> Table and Database Designers.

Don't make this change if you work with db metadata.

Posted by: Chrissy   Filed under: SQL Server No Comments
24Aug/100

SharePoint 2010 Deployment: I wish I had this when I started…

SharePoint Geoff has a SharePoint 2010 Deployment Microsoft Project file (MPP) available for download that lists nearly every step imaginable for deploying SharePoint 2010. From identifying team leaders to installing software to monitoring web performance, it's got a ton of things covered. It's not a tailored solution, especially for migrations, but it's a great place to start.

Here are other random links that you may find useful:
SharePoint 2010 Technical Diagrams at Microsoft.com
SharePoint Upgrade Action Plan
Upgrade and Migration for SharePoint Server 2010 on Microsoft.com

Good luck!

Posted by: Chrissy   Filed under: SQL Server, SharePoint No Comments
28Jul/101

Saving Time with Differential Backups

Recently I've been migrating a lot of databases from stand-alone physical servers into a PolyServe environment. Since this obviously requires a downtime, we have to schedule the moves with limited amounts of time. To add to the issues at night, the NetApp filers are busy so copies are slower.

In order to speed up the process and reduce the application downtime, I decided to use differential backups - which I will admit to not using that often.

If my downtime starts at 11PM, I will take a Full database backup earlier in the evening with SSMS. I backed the database up to the remote server that was to become the new SQL Server (\\server\path\db.bak) in order to avoid having to copy the backup file later.

After the full backup was completed, I restored it to the new server and made sure all my logins were moved over with sp_help_revlogin. Make sure when you restore the full backup that you choose the RESTORE WITH NO RECOVERY option, or you will not be able to apply the differential backup. Also, verify that the directory structure you are restoring the files to is correct. It will try to restore to the directory structure used by the old server.

Once the downtime started, I made sure all user connections to the database were gone and kicked off my differential backup. My full backup was nearly 20GB, but my differential was less than 200MB, which I also backed up to the new database server.

Once the differential backup was completed, I restored this database (using the Restore Database functionality in SSMS) as well. This time select the option to Restore with Recovery if you do not have any transaction logs to restore. If you shut down database access and took a differential like I did, there will be no transaction logs to worry about. Once you Restore With Recovery, the database will be available for use immediately after the restore is complete.

At first I wasn't sure if I should Restore Log or Restore Database, but Restore Database seemed like the best option and it was the right one. For some reason I had to fix the paths to my databases again, because the backup was using the directory structure from the old server here as well. I hoped it would be smart enough to know what to do on its own, but wasn't.

I did not specify OVERWRITE. The restore initially was quick but then hung on 90% for a while. It finally finished up and my database was ready to go. My downtime was 15 minutes - the time it took to take the differential and restore it on the other server. For a savings of at least 30 to 40 minutes.

I'd rather put the time in up front and reduce downtime with an added bonus of not having to stay up as late waiting on it. I used the time I saved to write this article as a quick reference for anyone else who is looking for the same information.

Posted by: Brandon   Filed under: SQL Server 1 Comment
1May/100

SQL Backup Process Hanging and Stuck in Kill/Rollback – HyperBac

One of the most important tasks a DBA should perform daily is verifying database backup jobs are successful. Using Database Mail to e-mail upon the failure of a job is something everyone should be doing. But in some cases, that is not enough. You can query the MSDB database in order to find out the last time a database was backed up. If you have a large amount of servers, consider using PowerShell's SMO capabilities to query a list of servers and compile the information into a database table. Then you can query that table for backup dates older than a day or two, depending on your backup schedule. That's the topic of another article.

Using a similar approach, I recently discovered a database that had missed being backed up for 2 days over the weekend. I had not received any email regarding a failure. Using SSMS, I pulled up the list of jobs in Object Explorer Details and noticed that the job was stuck executing since the previous Friday night. The database backup file was not actively being written to, and since backups are serialized commands, any subsequent attempt to backup the database while that process is running will be unsuccessful. Using SSMS, I stopped the executing job.

A quick query of the sys.sysprocesses table (select * from sys.sysprocesses) showed that the process was still running. I killed the process since it was being so difficult. From the sysprocesses table results, find the spid for the process. Execute the statement Kill spid where spid is the actual ID that is shown. If you prefer to use SSMS, launch Activity Monitor, find the process in the list and kill it from there.

Once I killed the process, it went into a rollback state, which is expected. I'm not 100% sure what a backup transaction is rolling back, but it sure will try. Take note of the "waittime" column in the sys.sysprocesses table. The value may increase initially, but at some point it should max out and start decreasing. It is possible to track the progress of the rollback by executing the following command:

kill spid with statusonly

The result stated that the rollback was 16% completed. I decided to wait and see how long it would take. My waittime value continued to increase. I went on with my business and checked back later. After many hours I returned to find that the rollback was still at 16% and the waittime was STILL increasing. I knew something was wrong. My first quick-fix idea was to restart the SQL Service. This actually would have fixed my problem, but being that the system is in production, I wanted to find a better way. Also, a restart would not help me figure out why the problem was occurring in the first place.

Eventually, I realized that this server was using HyperBac, which is a really cool compression tool that intercepts a native backup and zips it up. I decided to stop the HyperBac service to see if it would help. After attempting to stop the service, the service got stuck in the "Stopping" state. I checked the log file in the HyperBac (Program Files) folder and saw a line stating that it was attempting to close out the very backup that had been stuck for 3 days.

I gave the service a large amount of time to stop, but it never did. I eventually had to End Task a few times on the HyperBac process. Once the process ended, my SQL backup process disappeared.

I cannot definitively say if HyperBac caused the SQL backup process to hang, or if the SQL process caused HyperBac to hang. Of the hundreds of backups that run every night, I have run into this issue 3 times. All 3 times were on large databases (~70GB) using HyperBac to back up to a network drive. Perhaps a network blink during the backup process is to blame for initiating the issue. Whatever the reason, I opened a ticket with HyperBac and was told that version 4.2.x fixes this situation. I should also comment on how quickly HyperBac responded to my problem with an answer, so thumbs up on customer support. I plan on upgrading to 4.2.x soon, but server reboots are required. If the issue happens again before then, it is easy enough to fix.

Posted by: Brandon   Filed under: SQL Server No Comments
1May/102

SQL 2008 Installation Failure on Reporting Services

Recently, SQL Server 2008 was uninstalled from a Windows 2008 Server machine due to some configuration issues with Reporting Services. Attempts to re-install SQL Server with the Reporting Services option would fail. I was tasked with the re-install and as always, I welcome this type of work because there is something to learn nearly every time. Before the installation began, I went to "Programs and Features" and uninstalled anything referencing SQL Server. The Native Client installation was still listed, so I removed that. There was also a reference to SQL Server 2008 itself, but attempts to remove it failed instantly saying it was already uninstalled. Red flag number 1.

An installation summary was provided from a previous failed attempt and it provided an exception link with the following information:

Product: SQL Server
ID: 50000
Source: setup.rll
Version: 10.0
Component: SQL Server Native Client
Message: A network error occurred while attempting to read from the file '%.*ls'.

The page went on to explain that an attempt was made to install (or update) SQL Server Native Client on a computer where SQL Server Native Client is already installed, and where the existing installation was from an MSI file that was not named sqlncli.msi.

Having just removed the SQL Server Native Client, I was confident that my next installation attempt would be successful. To make a long story short, my installation failed again when trying to install Reporting Services with the same exact error. Ok, so we must be dealing with a registry key issue, so I decided to run regclean to clean up the system. I would caution to do this at your own risk. In my case, there was nearly nothing else installed on the server and if anything drastic happened, I would have been able to get a file system restore or having the system rebuilt without losing anything major. I reviewed the undo reg file that was created by regclean and found references to Reporting Services.

Attempt 2! I was confident yet again that it would work and to make another long story short - it failed. At the same point, with the same error! Ok, now it's time to go through the error log a little bit closer. And for those of you who don't care about all the steps and only the solution, you should have skipped down to this part.

After the installation failure, the setup directed me to a summary log file. In the log file, there was a path to another log containing the failure information:

C:\Program Files\Microsoft SQL Server\100\Setup Bootstrap\Log\20100428_094920\sql_rs_Cpu64_1.log

I opened this file and located the following line:

MSI (s) (3C:80) [09:57:00:939]: Specified instance {2453DBC8-ACC4-4711-BD03-0C15353AA3D8} via transform :InstID01.mst;:InstName01.mst is already installed. MSINEWINSTANCE requires a new instance that is not installed.

Fire up RegEdit and search for the instance ID specified in the log. In my case it was {2453DBC8-ACC4-4711-BD03-0C15353AA3D8}.

Delete the registry keys that match up with that ID. You will notice that some of the values contained in that key folder reference Reporting Services and the initial installation path. In my case, I knew it was the right place to be because the previous install was done from a different location than where I was installing from.

After deleting the registry keys, I launched my Setup and added Reporting Services successfully. I was unable to choose the default configurations, however. My only option was to install without configuring. Once the installation was completed, I had to launch Reporting Services Configuration Manager and set up the new ReportServer databases. You have the option to create new databases or use existing ones.

This methodology would probably come in handy in any type of SQL Server 2008 installation failure where you suspect bad registry entries.

Posted by: Brandon   Filed under: SQL Server, Windows 2 Comments
29Mar/101

SQL Server: Error: 3041, Severity: 16, State: 1

There are probably a good number of reasons this error shows up in your SQL Server log, along with the following message:

Source: Backup
Message: BACKUP failed to complete the command BACKUP LOG DATABASE_NAME. Check the backup application log for detailed messages.

In my case, the recovery model was set to SIMPLE and the DB maintenance plan was set to backup all user databases, which included all databases set to SIMPLE. I have my suspicions as to why those production databases were set to SIMPLE but FULL is best in most cases anyway, so I set all recovery models to FULL, and manually backed up all databases. To see a list of all of your databases and their recovery models, run the following query:

select [name],databasepropertyex([name],'Recovery') as [Recovery]
from master.dbo.sysdatabases
order by [name]

Note that the following system databases are set to SIMPLE by default: master, tempdb and msdb.

Posted by: Chrissy   Filed under: SQL Server 1 Comment
20Jan/102

Asynchronously mirroring a SharePoint 2007 Content Database between Two Farms

Recently, I triumphantly setup an asynchronous content mirror between two separate SharePoint farms.

My SharePoint architecture consists of two separate farms in two different cities (San Diego and Las Vegas), both accessible over a corporate intranet.  Each farm is unique and the configuration and administration databases contain unique information that cannot be mirrored from one site to another. The technique outlined below works, however, because the content databases are identical between the two sites and that is the only information that must be mirrored. One farm is a fail over for the other farm, so each farm contains the same sites.  

After searching the Internet for information on how to mirror SharePoint data, I only found one MSDN article which stated that it should be done either with SQL Server clustering or with synchronous mirroring requiring a witness server.  Eventually I found a picture in the MSDN showing a way to mirror only the SharePoint content databases between two farms asynchronously.  

This picture outlined what I wanted, but I knew SharePoint well enough to know it was going to cause problems. I decided to try my hand at a solution, nevertheless. I started on the primary farm in Las Vegas, where I set up mirroring on the Sharepoint_Content database and mirrored it to the Sharepoint_Content database in San Diego, which was designated as the backup farm.  By design, the San Diego content database was inaccessible while the mirroring was in place, so I had little chance of the backup farm actually loading up any pages. This was acceptable, however, because the backup farm did not need to work unless I failed over to it.  I made some changes to my primary farm by adding a few announcements to a web part on a site and attached a picture to it.  After some time, I decided to remove mirroring from the primary database.  The reason I removed mirroring instead of doing a fail over, is because I could not have downtime on the primary farm.  Nor did I want to take any chances breaking the database by reversing the mirroring if my technique did not work.

After breaking the mirror, the mirrored database showed as "Restoring.." until I ran the command:  RESTORE DATABASE [SharePoint_Content] WITH RECOVERY.

This method is used to make the mirror available again as a stand-alone database.  Once the database was available, I loaded up my SharePoint site on the fail over farm with my fingers crossed.  As the page loaded, I saw a dreaded "Error 400 - Bad Request."  I figured this would happen, as things are rarely straightforward with SharePoint. I didn't have time to address the issue so I decided to just restore from my most recent backup and put my experiment on hold.  After issuing a restore command using stsadm, I saw a peculiar error which stated that it could not restore my site because no content databases were associated and suggested that I add a content database before proceeding.

Perhaps this could work after all! I loaded up the administration panel, found at http://servername:8080, and navigated to Content Databases under Application Management.  The Sharepoint_Content database was listed as it should have been but I figured since it was not working as desired anyway, I could remove the SharePoint_Content database from the configurations and re-add it.  First, I removed the content database and re-added a new content database with the same name and selected my SQL Server from the drop-down menu.  I clicked OK and SharePoint updated its configuration.  Soon the new content database was listed.

I hurried back to my site, refreshed the browser and to my delight, the page loaded and the announcements I had added in Las Vegas were displayed on the site running in San Diego. It was at this moment when I realized that this instance of making the impossible possible would be perfect fodder for my first blog post on netnerds.net.

Posted by: Brandon   Filed under: SQL Server, SharePoint 2 Comments
14Jan/101

Free SQL Server eBooks available

Brad McGehee just published what's sure to be another incredibly informative ebook on SQL Server Maintenance Plans. A list of free eBooks, including the SQL Server Tackle Box, can be found on Brad's blog. Check it, one time.

Posted by: Chrissy   Filed under: SQL Server 1 Comment