How to fix SQL Server database in recovery?

A SQL Server database may enter into a recovery state for various reasons, including hardware failures, power outages, memory issues, and database corruption. When a database is in recovery, it is in a special “redo” state where committed transaction logs are replayed to roll back any uncommitted transactions and restore the database into a consistent state. The recovery process happens automatically, but a database stuck in recovery typically requires manual intervention to successfully complete the recovery and bring the database back online.

Symptoms of a Database Stuck in Recovery

The most common symptoms of a SQL Server database stuck in recovery include:

  • The database is offline and unavailable to users
  • Status shows as “In Recovery” in SQL Server Management Studio
  • High CPU usage from the SQL Server process
  • Inability to manually bring the database online
  • Error messages about the database being in recovery

A database stuck in recovery will prevent access to the database files and data until the underlying issues can be resolved. Time is critical for a quick resolution, before transaction logs fill up the disk space or other severe errors occur.

Initial Troubleshooting Steps

When a SQL Server database enters recovery, allow sufficient time for the recovery process to complete automatically. Recovery time is influenced by factors such as hardware speed, database size, activity levels at time of failure, and number of transactions in the logs. However, if the recovery takes more than a few hours, manual intervention may be required.

As an initial troubleshooting step, check the SQL Server error log for clues about what caused the recovery state. Look for errors indicating hardware failures, out of disk space conditions, memory issues, or database corruption. The error log can point you towards the proper recovery steps.

Next, check there is adequate free disk space for the transaction logs on the server. Recovery can stall if the log files cannot expand as needed. You may need to free up disk space to allow recovery to finish.

Also verify that the SQL Server service account has read/write access to the database files and transaction log location. Incorrect permissions can prevent recovery from proceeding.

If the database is stuck in a loop trying to recover the same transactions, it may indicate log corruption. You can try restoring just the transaction log files from backup to resolve this loop scenario.

Put Database in Emergency Mode

A useful technique for debugging a stuck recovery state is to start the SQL Server database in emergency mode. This special start-up mode will prevent any uncommitted transactions from being rolled back or rolled forward. It allows read-only access to the database to help troubleshoot the underlying problems.

To start in emergency mode, locate the master database data and log files on disk, typically in C:\Program Files\Microsoft SQL Server\MSSQL\Data. Then start SQL Server from the command line with the -f parameter:

NET START MSSQLSERVER /f /T3608 

This will skip recovery steps for all databases and bring up the SQL Server instance in emergency mode. Now you can query the stuck database and inspect its files, data pages, indexes, etc. to identify any corruption issues. Finding and resolving damage to the database files or transaction logs is often the key to getting a database out of recovery.

Force Database Recovery

If you determine that log corruption is preventing recovery from completing, the nuclear option is to force recovery by truncating the transaction log manually. This will break the recovery process and lose all transactions that have not been hardened to the database files. But it can restore operations if the data loss is acceptable.

To forcibly recover the database, follow these general steps:

  1. Take the database offline in SQL Server.
  2. Stop the SQL Server service.
  3. Delete or rename the existing transaction log file(s).
  4. Start SQL Server, allowing it to recreate empty log files.
  5. Bring database online, which will force recovery to finish.

This process will break the recovery sequence and lose all uncommitted transactions. Only use this method if the data loss is acceptable and no other options exist to successfully complete database recovery.

Restore from Backups

If forcing recovery is not an option, the fallback is to fully restore the database from backups. Verify you have valid, recent full and transaction log backups for the database. Then restore the database to a point before the issues began:

  1. Restore the most recent full database backup.
  2. Apply transaction log backups in sequence, stopping at the last clean backup before recovery problems started.
  3. Bring the recovered database online.

This will replay all committed activity up until the recovery issue occurred. You will lose only the small amount of transactions after that point. This is the best way to recover with minimal data loss.

Restore Lost or Damaged Files

In cases where specific database files or logs are lost or damaged, you can also restore and recover only those individual files. SQL Server supports piecemeal restore operations:

  • Take the database offline to free up the files.
  • Restore just the missing or corrupt files from backup.
  • Bring database online and allow recovery to finish.

By selectively restoring only damaged files, you can minimize data loss and recovery time.

Rebuild Corrupted Databases

For severe database corruption problems, rebuilding the database from scratch may be required. This involves creating a new database, restoring a backup of the damaged database, and transferring the recovered data into the new database using Import and Export tools.

Follow these general steps to rebuild a corrupted database:

  1. Create a new, empty database to serve as the rebuilt database.
  2. Restore the last known good full backup of the damaged database to an alternate location.
  3. Use SQL Server Import/Export or other tools to export the recovered data into the new empty database.
  4. Redirect applications to use the new rebuilt database.
  5. Drop the original damaged database once migrated over to the rebuilt copy.

While rebuilding a corrupted database can be complex, it is sometimes the only way to recover and fix very extensive database errors and inconsistencies.

Resolve Underlying Issues

In addition to the immediate steps to restore a database stuck in recovery, be sure to investigate and resolve any underlying problems that led to the issue in the first place. Failing hardware, buggy software, memory leaks, or configuration issues can all cause repeated recovery failures if not addressed.

Analyze error logs to determine the root cause. Check for issues such as:

  • Disk failures or bad sectors
  • Faulty memory modules
  • Fragmented log files
  • Outdated drivers or SQL Server patches
  • Unoptimized queries slowing recovery performance
  • Transaction log configured for incomplete recovery model

Getting to the root factors that put the database into recovery can prevent future failures and avoid prolonged outage situations.

Monitor Database Closely

Once a database is recovered and back online, monitor it closely for any further anomalies. Some problems only manifest under load, so keep an eye on heavy usage periods. Watch for conditions such as:

  • Slow performance
  • Deadlocks
  • Blocking queries
  • Memory or CPU pressure
  • Transaction log growth spikes

Any of these could indicate that unresolved issues still lurk, and the database may fall back into recovery state under certain conditions. Continued monitoring and proactive maintenance is key to avoiding future recovery scenarios.

Implement Preventative Measures

While perfect uptime is impossible to guarantee, there are many best practices that can help avoid putting a SQL Server database into recovery status:

  • Use Enterprise hardware: Invest in quality SAN/NAS storage, redundant power supplies, ECC memory, and other fault-tolerant equipment.
  • Follow performance tuning guidelines: Monitor workloads and optimize queries, indexes, and server configuration.
  • Keep up with Patching: Install relevant updates, service packs, and hotfixes for SQL Server.
  • Configure database settings properly: Use full recovery model, test recovery scenarios, manage file growth.
  • Validate backups: Run regular test restores of database and log backups.

Building a resilient SQL Server environment better able to avoid and recover from disasters can ultimately help minimize downtime.

Conclusion

While a SQL Server database stuck in recovery may seem ominous, there are techniques both immediate and long-term that can help get it back online quickly and prevent future failures. Quick action to restore database files from backup, repair damaged pages, or forcibly recover the database can restore operations faster. But fully addressing root causes like hardware faults, memory issues, configuration problems, or corruption provides a more permanent fix and prevention plan.

With a combination of emergency restore options and ongoing preventative measures, even severely damaged databases that become stuck in recovery can often be repaired and returned to normal operation.

Leave a Comment