Sunday, 15 August 2010

SQL Server administration best practices

SQL Server administration best practices

This article explains best practices for system administration in a Microsoft SQL Server 7.0 / 2000 environment, including regular maintenance tasks.

No industry today can do away without engaging in a working and efficient data protection plan. Data being the life and blood of any enterprise, protecting it becomes an inevitable task. All it needs for corporate data to be safe and secure is - a sound and wise investment in a backup and restore strategy and its implementation. If an organization considers data important, then it must focus on data protection and be willing to bear the costs associated with it. The elements of cost for such a strategy include:
  • Time invested in Planning
  • Trained personnel
  • Backup and restore hardware/media
  • Backup and restore software
  • Scheduled testing and validation of recovery plans
SQL Server utilizes a structure called a backup device to manage backups. These are logical names that point to physical files on the local hard disk or a network share. The backup devices allowed by SQL Server are tape, disk, and pipe (Note: Backups can also be written to and restored from physical files directly, without creating backup devices).

After all the introduction of why and how of backups, let's get to the core basics. We all know that backups are a must and they are the crux of an enterprise that cares about their data. Let me quote from an article I once read:

"
If a DBA maintains proper backups and can guarantee recovery of data up to the point required by the business process, they have done the job they were hired for. A solid backup plan is the first thing a DBA is required to do. If a DBA does absolutely nothing else in your company, he/she has earned their money by providing a solid backup plan and protecting your data. Every other activity is a simple bonus on top of this."

The planning and implementation of backup and recovery plans, the steps involved and guidelines are discussed under the following sections:
  • Creating a solid backup plan
  • Determine where to store backups
  • Things to keep in mind
  • Determine when to backup databases
  • Restoring databases
  • Creating a solid disaster recovery plan
  • Tips to save your life as a DBA
Creating a solid backup plan 
Nothing strikes fear into the hearts of managers, users and administrators than a server crashing and data becoming corrupted. A few short years ago, this would have been relatively important. But nowadays, it is absolutely vital and is one of the most serious things that can happen to a company. This is because, the data contained in the databases represents the competitive advantage of a company and its entire lifeblood. Losing key data can be catastrophic to a company.

Therefore, it is absolutely imperative that the DBA constructs a solid and reliable backup strategy so that in the event of a disaster the data can be recovered. The main questions that need to be answered while coming up with a backup strategy are:
1. How frequently does the data change in our system?

2. What is the downtime allowed for the production server in case of failure?

3. For determining the maintenance window, what is the time of the day, week and month when the database server is likely to have minimum activity - i.e. minimum updates?

4. How much is the enterprise willing to invest in data backup strategies? This is directly answered by addressing the question - how crucial is the data? Is it mission critical?

5. How much data loss is acceptable to the company?

6. How do we plan to recover the data in case of a failure?

Determine where to store backups 
As we learnt before, SQL Server can back up to hard disk, tapes or named pipe devices. This question above pertains to making a decision as to what media does a company want to use for an efficient backup and restore strategy. While browsing over that question, consider the following facts:
Disk File Backups: Disk backups will store the data on a physical disk i.e. simply a hard drive or a disk array. Disk backups can be performed locally or over the network. They are the most common and easiest medium for storing backups. Once a database has been backed up to a disk file, it can in turn be backed up to tape as a part of the regular enterprise file system backup (on tape). Thus, in case of corrupt or lost data, the disk files will have to be restored from tape and then restore the database into SQL Server from the disk files.

Tape Backup: When you back up databases directly to a tape, the tape drive must be attached locally to the SQL Server. Backing up to remote tape devices is not supported. However, you can have multiple tape devices attached to a single SQL Server. These tapes can then be moved to another location for off-site storage.

Named Pipe backups: SQL Server provides the ability to backup data to a named pipe to allow users to take advantage of the backup and restore features of third-party software packages.
Things to keep in mind 
A fact to always remember is that a database can be backed up while it is online and active. That means, it is actively being utilized by clients while it is being backed up. So the database server doesn't have to be down while performing backups. However, it should surely be in a state where it is minimally utilized at that time. This is because, the following operations cannot take place during the backup process:
Creating or deleting database files.
Creating indexes.
Performing non-logged operations.
Shrinking a database.
If you attempt to start a backup operation when one of these operations is in progress, the backup operation aborts. If a backup operation is already in progress and one of these operations is attempted, the operation fails and the backup operation continues.

While online backups are supported, you can't do online restores. During a restore operation, the database must not be in use.
Determine when to backup databases 
Your decision as to when and how often you back up your database depends on your particular business environment and the degree of importance of the application. There are also times when you may need to perform unscheduled backups.
Backing up system databases:
The system databases need to be backed up just as user databases are backed up. This allows the system to be rebuilt in the event of system or database failure, for example, if a hard disk fails. Please note that it is always a good practice to create a separate maintenance plan for backing up the system databases and not mix it with the user databases. Within that too, backup the master separately from the other system databases. This is because only full backups of master database are allowed.

The system databases in SQL Server are: master, msdb, model, tempdb, distribution. It is important to have regular backups of these system databases, however, it is not necessary to back up the tempdb system database because it is rebuilt each time SQL Server is started. When SQL Server is shut down, any data in tempdb is deleted permanently. For this reason, do not store any application specific data in the tempdb database. Leave it exclusively for use by SQL Server.

Again, model database needs to be backed up only if it is customized. Similarly distribution database comes into picture only if the server is configured as a Replication Distributor.

Master: The master database contains system information and high-level information about all databases on an SQL Server. If the master database becomes damaged, SQL Server may fail to start and user databases may become unavailable. There are many operations which change the content of the master database - like creating and altering databases, adding and modifying logins, creating linked servers etc. But since one cannot keep backing up master after every such operation, schedule the master database to be backed up on a regular basis (for example: once every night, or once every week depending on the frequency of such changes to the database). This will backup the changes made to the user databases and SQL Server, which can then be recovered in case of a master database corruption.

Note: Only full database backups of master can be performed. Transactional log, differential or filegroup backups of master are not allowed. Thus if you create a Database Maintenance Plan for all the system databases or if you select the master database and you select the Back up the transaction log as part of the maintenance plan option, the backup transaction log step for the master database will fail with this error message:
Backup can not be performed on this database. This sub task is ignored.
Model: The model database is a template, used by Microsoft SQL Server when creating other databases, such as tempdb or user databases. When a new database is created, the entire contents of the model database are copied to the new database. Back up the model database if you modify it, to include the default configuration for all new user databases. If the master or msdb databases are rebuilt, the model database is also rebuilt and therefore changes are lost.

Msdb: The msdb database is used by SQL Server, SQL Server Enterprise Manager, and SQL Server Agent to store data, including scheduled job information, backup and restore history information, DTS packages.

Note: You will notice that, by default, the trunc. log on chkpt database option is set to true, for the msdb database. This helps ensure that the transaction log of the database does not fill up, and prevents problems that may occur due to inadequate disk space. Because the msdb database generally remains rather small, full database backups provide a fast alternative to transaction log backups for this database.

Distribution: The distribution database is used by the replication components of SQL Server, to store data including transactions, snapshot jobs, synchronization status, and replication history information. A server configured to participate either as a remote distribution server or as a combined Publisher/Distributor has a distribution database.

Consult SQL Server Books Online regarding the backup/restore strategies of distribution database in different kinds of replication scenarios.

Backing up user databases:
User databases should be backed up on a regular basis. Also, it needs to be performed after a new database or index is created and when certain non-logged operations are executed.

There are four overall backup and restore strategies, each with its own strengths and weaknesses. A DBA needs to weigh each aspect of the database system and reach a decision, which is the best possible one for the system, users and administrators of the application. The database size and frequency of data modification determine the time and resources involved in implementing a database backup strategy. The four types of backups supported by SQL Server are:
  • Backing up only the database
  • Backing up the database and the transaction logs
  • Differential database backups
  • File or Filegroup Backups
Backing up only the database: With this strategy, the entire database is backed up regularly. In case of failure, all the committed transactions that occurred after the most recent database backup, are lost. The primary advantage of using only complete database backups is simplicity. Backing up is a single operation, normally scheduled at regular intervals. And should a restore be necessary, it can be accomplished easily in one step.

Use full database backups if:
  • The database is small. The amount of time required to backup a small database is reasonable.
  • The database has few data modifications.
  • You are willing to accept the loss of changed data if the database fails between backups and must be restored to its previous state.
Backing up the database and the transaction logs: With this strategy, the entire database is backed up less frequently; the transaction log is backed up frequently between database backups. In case of failure, you will be able to recover all backed-up transactions and possibly even committed (complete) transactions that occurred since the last transaction log backup. Only uncommitted (incomplete) transactions will be lost.

Use database and transaction log backups if:
  • The database is considerably large or predicted to grow large in the near future.
  • There are substantial updates/data modifications taking place on the database.
  • In case of a disaster, the need is to recover the database to as recent a state as possible - thus not to lose any transactions taken place on it.
  • You cannot afford to lose changes since the most recent database backup.
Differential database backups: This strategy is used to augment either the database backup strategy or a database and transaction log backup strategy. Differential backups consist only of the portions of the database that have changed since the last database backup. The first stage in this strategy is always to take a complete database backup. Then you can schedule the transaction logs as usual. But the interesting part is that now onwards, instead of scheduling a complete database backup, we would now schedule a differential backup after the day's transaction log backups. The differential backup strategy combined with the transaction log backup strategy, reduces the number of transaction log backups that need to be restored while rebuilding/recreating a database.

Use differential backup strategy if:
  • The amount of time spent in recovering the database by applying all the transaction logs is not acceptable.
File or filegroup backups: File or filegroup backups are a specialized form of database backup in which only certain individual files or filegroups from a database are backed up. This is usually done when there is not enough time to perform a full database backup. To make use of file and filegroup backups, transaction log backups must be created as well.

Use file or filegroup backup strategy if:
  • The databases to be backed up are Very Large Databases (VLDB) which are partitioned among multiple files.
  • For example, if you have only one hour to perform a database backup that would normally take four hours, you could create the database using four data files, backup only one file each night, and still ensure data consistency. Transaction log backups could be performed at short intervals during the day.
Restoring databases 
The model, msdb, or distribution database may need to be restored from a backup when:
  • The master database has been rebuilt using the Rebuild master command prompt utility.
  • The model, msdb, or distribution database has been damaged, for example, due to media failure.
If model has been modified, it is necessary to restore model from a backup when you rebuild master because the Rebuild Master utility deletes and re-creates model database.

If msdb contains scheduling or other data used by the system, it is necessary to restore msdb from a backup when you rebuild master because the utility deletes and re-creates msdb, which results in a loss of all scheduling information, alerts, DTS packages etc. If msdb is not restored, and is not accessible, SQL Server Agent cannot access or initiate any previously scheduled tasks. For example, if database backup operations are scheduled to run automatically using SQL Server Agent, a damaged msdb will prevent those backup operations from occurring.

The distribution database is not rebuilt automatically when the Rebuild Master utility is used to rebuild master; therefore it is not necessary to restore distribution after rebuilding master. If the distribution database is still intact, distribution can be re-created automatically by attaching the database to Microsoft SQL Server. Alternatively, a backup of distribution can be restored instead.

However, if distribution is not re-created by restoring a backup or attaching the database, the SQL Server replication utilities will not run, preventing data replication. If the distribution database is used for replication by many Publishers, this can affect many systems.

You cannot restore a database that is being accessed by users. Therefore, when restoring msdb or distribution databases, SQL Server Agent should be stopped. If SQL Server Agent is running, it may access msdb or distribution databases.


Creating a solid disaster recovery plan (DRP) 
Disaster recovery is the process by which information systems are recovered in the event of a catastrophe: a natural disaster such as a fire, or technical disaster such as a two-disk failure in a RAID-5 array. Disaster recovery planning is the work devoted to preparing all the actions that will occur in response to a catastrophic event. Disaster recovery assessment is the simulation of a catastrophic event and/or the evaluation of the disaster recovery plan's capability to deliver the specified recovery needs.

Questions that need to be addressed while creating a Disaster Recovery Plan:
1. Are you certain you can recover, in case of a catastrophe wiping out your 24/7 data center?

2. How long will it take you to recover and have your system available for normal functionality?

3. How much data loss can your organization tolerate?

4. How much are we ready to spend in order to achieve the required level of recovery?
Ideally, the disaster recovery plan should state how long the recovery should take, and the final database state the users can expect. It is typically important that management be kept clearly informed of these specifications. Disaster recovery assessment should be able to substantiate the specification.

A disaster recovery plan can be structured in many different ways and can contain many types of information (how and where to get the required hardware, the configuration of the servers, service pack information, who is to communicate what, who are the people to be contacted in the event of a disaster, how are they to be contacted, who owns the administration of the plan, and so on).


The Disaster Recovery Plan for each of the backup scenarios presented above is given below:

DRP for Backing up only the Database Strategy: If the backup strategy is to make complete backups of the databases, then recovery will be performed up to the point when the last full backup was taken.
To recover the database in case of a disaster, simply rebuild the server, restore the last complete backup taken, overwriting the corrupted version of the database.

DRP for Backing up the database and the transaction logs Strategy: Restoring a database that has been backed up using a database and transaction log strategy involves two steps. First rebuild the server, restore the most recent complete database backup. Then apply all of the transaction log backups that were created since the most recent complete database backup.

DRP for The Differential Backup Strategy: Recovery using this strategy requires that you restore the most recent complete database backup and the most recent differential backup. If transaction log backups are also made, only those created since the most latest differential backup need to be applied to fully recover the database.

DRP for The Filegroup Backup Strategy: Recovering using this strategy requires you to first rebuild the server, restore all file and filegroup backups, followed by the restoration of all the transaction log backups taken between the earliest file or filegroup backup and the end of the latest file or filegroup backup.
Tips to save your life as a DB
Here are a few checklists of activities that an efficient DBA should perform on a regular basis to make life easier and to ensure the reliability of data:

Main Checklist for things to do during the initial setup:
1.     Automate all possible jobs and maintenance plans on the server for things like database backups, integrity checks, automatic shrinking, transaction log backups, etc. You could do this by creating Maintenance Plans in SQL Server, which would automatically generate and schedule the required jobs. (For more details on Maintenance Plans, refer to Appendix A).
2.     Install SQL Mail on all your production servers and set it up to send you notifications on your email (or cell phone or pager - whatever is convenient).
3.     Through SQL Mail, set up email notifications for all the jobs and maintenance plans on the production server - for every database. Also set up email notifications for you to be notified in case of a severity alert - like file growing large and thus reducing disk space, etc.
4.     Always keep a script of the functional database schema in a secure location on the network. This comes in handy if you need to know the structure of the database in production or you need to recover a database, which does not have any backup left.
5.      Set the MSSQL and SQL Server Agent services to Auto-start when the server starts.
Monthly checklist:
1.      Make a list of all the sa passwords for each server and save it in a secure place.
2.      Make a list of all the passwords for each login created on the production boxes.
3.      Save the SQL Servers’ and Windows' configuration information in a secure place. This information is needed to rebuild an NT & SQL Server box in case of a disaster.
4.      Perform a test restore of a database backup. This is done in order to prepare for unforeseen situations.
5.      Save information about any changes made to a server - hardware or software.
6.      Maintain system logs in a secure fashion. Keep records of all service packs installed for both Microsoft Windows NT Server and Microsoft SQL Server. Keep records of network libraries used, the security mode, sa passwords and service accounts.
7.      Assess the steps in recovering from a disaster ahead of time on another server, and amend the steps in your Disaster Recovery Document, as necessary to suit your environment.
8.      Audit Database Access: You should periodically perform a review of who has access to your production databases and what type of rights they possess. Doing so can prevent unauthorized access to production data.
Daily Checklist:
  1. Check the connectivity of each server over the network. You could do this by pinging the SQL servers twice a day or by clicking the server’s name in your Enterprise Manager and seeing if it is able to connect.
  2. Check whether the services are running. For each server, go to its SQL Service Manager and check whether the SQL Server Agent and MSSQL Server services are running (showing a green light). If not, start those services. (You could also check these from the Control Panel or Enterprise Manager).
  3. Check whether the scheduled tasks on the production servers are running normally. You could check this from the Enterprise Manager of each server or your email (if you have set up SQL Mail to notify you).
  4. Check the hard disk space available on the SQL Servers. If system drives run low on space, they crash.
  5. Check all the database and transaction log space on each server. If the database or transaction log space runs out, the transactions will fail.
  6. Check NT event Logs for any error messages. SQL Server writes to the NT application log in case of application errors or SQL errors and also warns you before a problem becomes critical.
  7. Check SQL Error Logs for any errors occurring within SQL Server. SQL Server warns you through these logs before the problem becomes critical.
As needed Checklist:
  1. Run disk defragmentation utilities: You should periodically run disk defragmentation utilities on your server's hard disks. A high degree of hard disk fragmentation can lead to decreased hard disk performance. 
Other Useful Tips:
  1. While backing up or restoring databases manually from Query Analyzer using BACKUP or RESTORE commands, use the WITH STATS option. This option serves as a progress bar and displays the percentage of work done continuously.
  2. Spread your backups across multiple backup devices residing on different hard disk drives. This lets SQL Server take advantage of parallel IO, and improves the backup and restore performance.
  3. SQL Server 2000 lets you specify passwords for your backups. Use this feature effectively to prevent unauthorized access to backup files.
  4. Consider implementing a combination of transaction log and differential database backups to reduce the time it takes to recover from a failure. This approach reduces the amount of transaction log that must be applied while restoring a database.
APPENDIX A
Creating Database Maintenance Plans
The Database Maintenance Plan Wizard can be used to set up the core maintenance tasks that are necessary to ensure that the database performs well, is regularly backed up in case of system failure, and is checked for inconsistencies. The Database Maintenance Plan Wizard creates SQL Server jobs that perform these maintenance tasks automatically at scheduled intervals.
The maintenance tasks that can be scheduled to run automatically are:
  • Reorganizing the data on the data and index pages.
  • Compressing data files by removing empty database pages.
  • Updating index statistics to ensure the query optimizer has up-to-date information regarding the distribution of data values in the tables.
  • Performing internal consistency checks of the data and data pages within the database to ensure that a system or software problem has not damaged data.
  • Backing up the database and transaction log files. Database and log backups can be retained for a specified period. This allows you to create a history of backups to be used in the event that you need to restore the database to a time earlier than the last database backup.
The results generated by the maintenance tasks can be written as a report to a text file, HTML file, or the sysdbmaintplan_history tables in the msdb database. The report can also be e-mailed to an operator.
Useful Tips on Maintenance Plans
  • It is always recommended to create a separate maintenance plan for the system databases and a separate one for the user databases. This is so that when the backup policies are reviewed depending on the usage of databases, the system databases are not affected due to the changes and their backups are performed uninterrupted.
  • Create an operator (with a valid MAPI account) in SQL Server, and make him/her receive all the reports sent by maintenance plans via email. This way he/she would know if a job fails or succeeds each time it is run.
  • In the Integrity tab of the maintenance plan wizard, if you check the ‘Attempt to repair any minor problems’, remember - this would bring the database in single user mode. And if there are many users using it at that time, this operation would fail.

No comments:

Post a Comment