MS SQL, or Microsoft SQL Server, is a relational database management system (RDBMS) developed by Microsoft. As with any database system, SQL Server databases can become damaged or corrupted over time, often due to hardware failures, power outages, or program bugs. When this happens, it’s important to know how to properly repair the database to restore full functionality.
In this 5000 word guide, we will cover everything you need to know to repair and restore a corrupted SQL Server database, including:
- Detecting and diagnosing database corruption
- Repairing minor corruption using DBCC CHECKDB
- More serious repairs using emergency mode
- Restoring from backups
- Piecemeal restores
- Page and file restores
- Point-in-time restores
- Automated repair options
- When to rebuild vs. repair
- Best practices for backup and recovery
Properly repairing a damaged SQL Server database requires an understanding of the internal architecture and recovery capabilities of SQL Server. Following proper troubleshooting methods and using the right recovery techniques can mean the difference between restoring your database promptly or losing your data permanently.
Detecting and Diagnosing SQL Server Database Corruption
The first signs of database corruption often come in the form of errors being returned to client programs or within SQL Server error logs. Some common error messages related to corruption include:
- Error 824 – Indicates damage to database pages, indexes, or other structural components
- Error 833 – Points to corruption in catalog metadata tables
- Error 834 – General database corruption error
- Error 2570 – Transaction log corruption
However, a database can still be corrupted without any errors if only a portion of the data is damaged. That’s why it is important to be proactive and periodically check your databases for consistency issues.
SQL Server provides the DBCC CHECKDB command for checking database consistency. DBCC CHECKDB performs several checks on the structural integrity of the database, including:
- DBCC CHECKALLOC – Checks extent allocation consistency
- DBCC CHECKTABLE – Verifies integrity of all tables and indexes
- DBCC CHECKCATALOG – Checks consistency of catalog metadata
When corruption is found, DBCC CHECKDB returns error messages indicating the type and location of the corruption. This helps you determine the best repair method.
Some examples of important DBCC CHECKDB error outputs include:
- Msg 823 – Indicates a problem with a database page level integrity
- Msg 8928 – Corruption in catalog metadata
- Msg 8939 – Index corruption
To properly diagnose corruption issues, it is important to review the complete DBCC CHECKDB output, including severity levels and error state codes. The pages, objects, and ranges of damage indicate how widespread the corruption is.
You can also use built-in functions like DBCC PAGE and DBCC IND to view detailed page and index node information when corruption is isolated to specific database objects.
Having a sound methodology for detecting and diagnosing database corruption issues will help guide the repair process.
Repairing Minor Corruption Issues with DBCC CHECKDB
For minor corruption issues where isolated pages or database objects are damaged, running DBCC CHECKDB with the REPAIR_ALLOW_DATA_LOSS option can often fix the problems automatically.
For example:
DBCC CHECKDB ('mydb') WITH REPAIR_ALLOW_DATA_LOSS;
This tells SQL Server to repair any found errors without prompting. USE WITH CAUTION, however, because as the name implies, this option can result in data loss if used incorrectly.
Some key points about using REPAIR_ALLOW_DATA_LOSS:
- Always have a valid database backup before attempting repairs
- Test first in a non-production environment if possible
- Only use for minor corruption issues
- You will not be able to undo any changes, so data may be lost
When corruption is limited to a few isolated pages or database objects, REPAIR_ALLOW_DATA_LOSS can automate fixing the errors without needing more advanced and time consuming repair methods. Just be sure to have backups ready in case anything goes wrong.
Advanced Repairs Using Emergency Mode
For more serious corruption issues, SQL Server provides emergency mode recovery options. Emergency mode allows you to start an instance of SQL Server in a special single-user mode that gives full access to perform repairs using DBCC and other commands.
To start in emergency mode, specify the -f parameter on startup:
NET START MSSQLSERVER /f
This will start SQL Server without other user connections and with the database in emergency mode.
Once in emergency mode, you have a few repair options including:
DBCC CHECKDB Rebuilds
You can perform DBCC CHECKDB repairs by specifying REPAIR_REBUILD instead of REPAIR_ALLOW_DATA_LOSS. This rebuilds damaged pages and objects while preserving data.
For example:
DBCC CHECKDB ('mydb') WITH REPAIR_REBUILD;
DBCC PAGE and INDEX Rebuilds
Use DBCC CHECKDB first to locate specific corruption locations. Then target those pages or indexes for rebuild:
ALTER DATABASE mydb PAGE=5 WITH RECOVERY
DBCC INDEXDEFRAG (myindex)
Reassigning Damaged Pages
Mark bad pages for reassignment to fresh pages with DBCC MARK_CORRUPT_PAGE. SQL Server will rebuild data as it’s accessed:
DBCC CHECKDB ('mydb') WITH TABLERESULTS
DBCC MARK_CORRUPT_PAGE ('mydb', PageID, REASSIGN);
Start Up Database in Emergency Mode
Set specific database options like single-user mode and emergency mode on startup, then perform repairs:
ALTER DATABASE mydb SET SINGLE_USER WITH ROLLBACK IMMEDIATE
ALTER DATABASE mydb SET EMERGENCY
Restoring from Backups
If database corruption is widespread, restoring the full database from a backup is often the safest recovery option. Make sure backups are validated and readily available before attempting repairs.
Some options for restoring backups include:
Restore Database to Most Recent Full and Log Backups
Use the full database backup file, followed by incremental log backups to roll forward to the current point in time. This gets you closest to the state before corruption:
RESTORE DATABASE mydb FROM mydb_full_backup
RESTORE LOG mydb FROM mydb_log1_backup
RESTORE LOG mydb FROM mydb_log2_backup
Restore to an Explicit Point In Time
If you know a specific point where the database was not corrupted, restore to that point:
RESTORE DATABASE mydb
FROM mydb_full_backup
WITH STOPAT = '2021-10-28 00:00:00'
Piecemeal Restore
Restore individual files or file groups if only portions are corrupted:
RESTORE DATABASE mydb FILE='mydb_data_1'
FROM mydb_file1_backup
Piecemeal Restore
Piecemeal restore is useful when corruption is limited to specific files or file groups within the database. You can restore just the damaged portions without needing to recover the entire database.
The general process is:
- Take the database offline and into emergency mode
- Backup the logical file or filegroup needing restore
- Restore the damaged files from backup, specifying the file list
- Bring database back online
For example:
USE master; ALTER DATABASE mydb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
BACKUP DATABASE mydb FILEGROUP='myfg1' TO DISK='myfg1.bak';
RESTORE DATABASE mydb FILE='myfg1' FROM DISK='myfg1.bak';
ALTER DATABASE mydb SET MULTI_USER;
Piecemeal restore works best when the damaged files or file groups are known and limited.
Page and File Restore
At a granular level, you can restore individual database pages and files using emergency mode. This targets just the corrupted portions.
The process involves:
- Detect corruption with DBCC CHECKDB
- Start database in emergency mode
- Identify file and page numbers of corruption
- Restore individual pages or files
To restore a file:
RESTORE DATABASE mydb FILE='file1' FROM mydb_file1_bak;
To restore pages:
RESTORE DATABASE mydb PAGE='123' FROM mydb_file1_bak;
RESTORE DATABASE mydb PAGE='456' FROM mydb_file2_bak;
Page level restore allows precise repair of damaged pages while minimizing data loss.
Point-in-Time Restore
When widespread corruption happens at an unknown point, restoring the database to a known good point-in-time can ensure recovery:
- Identify a point-in-time before damage
- Restore full and differential backups to precede that point
- Stop at the specified point-in-time
For example:
RESTORE DATABASE mydb FROM mydb_full_backup
RESTORE DATABASE mydb FROM mydb_diff_backup1
WITH STOPAT='2022-12-01 09:00 AM'
This rolls back the database to December 1 at 9am, eliminating any corruption after that point.
Automated Repair Options
SQL Server provides some automated recovery methods that can potentially repair corruption:
Auto-Close and Reopen Database
Databases marked as SUSPECT due to corruption may begin working again after SQL Server automatically closes and reopens them.
Auto-Recovery
Databases where auto-recovery is enabled will automatically run recovery steps after an unexpected shutdown, which can fix some issues.
Auto-Page Repair
SQL Server may detect bad pages automatically and try to recover data by reading from the page file on disk.
However, these automated options are not guaranteed to fix corruption problems. It is still best to use the targeted repair techniques described in this guide.
When to Repair vs. Rebuild
For widespread corruption, there comes a point where it is better to simply rebuild the database rather than trying to salvage it. Some scenarios where rebuilding from scratch makes more sense:
- Corruption exists across numerous files and objects
- The database is very large and repairs take too long
- You cannot pinpoint exact sources and timing of corruption
- DIfficulty restoring from backups
- Frequent reoccurring corruption issues
Rebuilding from scratch avoids cumulative data anomalies and gives you a fresh start. The decision depends on the extent and repeatability of corruption.
Best Practices for Backup and Recovery
While this guide focused on repair, recovering from backups is usually the safest approach. Some best practices include:
- Test restores regularly to validate backups
- Encrypt and protect backup files
- Have a defined backup schedule and retention policy
- Store backups off-site for disaster recovery
- Follow the 3-2-1 rule (3 copies, 2 media, 1 off-site copy)
- Monitor database consistency with DBCC CHECKDB
With rock-solid backups in place, you have insurance against data loss from corruption. Prioritize backup integrity and testing.
Conclusion
Repairing a corrupted SQL Server database can be complex, but following a systematic approach based on diagnosing the type and extent of corruption helps guide the process. Using targeted repair techniques like page restores, point-in-time recovery, and piecemeal restore you can often salvage the database. Backups are critical as a fall back option, so invest heavily in backup and recovery systems. Proper corruption handling methods along with solid backup/restore procedures will allow you to minimize downtime and data loss even with database corruption issues.