SMART status refers to the health report produced by Self-Monitoring, Analysis and Reporting Technology (SMART), which is built into computer storage drives. SMART tracks internal attributes of the drive that indicate its reliability and its likelihood of failure.
What is SMART Status?
SMART (Self-Monitoring, Analysis and Reporting Technology) is a monitoring system built into most modern hard disk drives (HDDs) and solid state drives (SSDs). Its purpose is to detect and report on various indicators of drive reliability and likelihood of failure.
SMART-capable drives have sensors that monitor things like drive temperature, mechanical wear, read/write errors, bad sectors, spin-up time, and other internal metrics. The drive firmware (software integrated into the electronics of the drive) analyzes these metrics over time to determine the drive health and likelihood of failure in the near future.
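SMART data and alerts are accessed through drive diagnostic software or system utilities. The SMART status indicates whether the drive is operating normally or is experiencing issues that could lead to failure. The drive itself reports only whether a failure threshold has been crossed; finer-grained labels come from the tools that read the SMART data, so the wording varies between them.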
Common SMART Statuses
- OK/Good – The drive is operating normally and has no indication of imminent failure.
- Caution – SMART has detected early signs of issues that bear monitoring but do not yet indicate imminent failure.
- Bad – SMART has detected conditions that indicate imminent drive failure is likely.
- Failing – SMART has detected conditions indicating failure is imminent or already in progress.
What Does “Bad” SMART Status Mean?
A SMART status of “bad” indicates the drive has detected indicators of imminent failure. This means failure could occur at any time and immediate action should be taken.
Some examples of SMART attributes that can cause a bad status include:
- High number of reallocated sectors – The drive has marked bad sectors and substituted spare good sectors, indicating physical media errors.
- High soft read error rate – The drive is encountering uncorrectable read errors due to physical media damage.
- High scan uncorrectable sector count – The drive is detecting unreadable sectors during scans and cannot recover the data.
- High hard read error rate – The drive is encountering mechanical issues reading from the platter surfaces.
- High Reallocation Event Count – The drive firmware has remapped a large number of problematic sectors.
- Reported uncorrectable errors – The drive has encountered errors that even its error correction code (ECC) cannot correct.
- High head flying height – Read/write heads are functioning out of normal specification.
- High seek error rate – The drive actuator is having trouble mechanically positioning heads.
A bad SMART status means it is prudent to immediately back up data from the drive and replace it, as failure could be imminent.
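As a rough illustration of how a monitoring tool might turn attribute counters into a status label, the sketch below classifies a drive from a few raw values. The attribute names, the `classify` function, and every numeric limit are assumptions made for this example; real tools and drive vendors apply their own thresholds.

```python
# Hypothetical limits chosen only for illustration; they are not vendor values.
CAUTION_LIMITS = {
    "reallocated_sector_count": 1,
    "reallocation_event_count": 1,
    "reported_uncorrectable_errors": 1,
}
BAD_LIMITS = {
    "reallocated_sector_count": 100,
    "reallocation_event_count": 100,
    "reported_uncorrectable_errors": 10,
}


def classify(raw_values):
    """Return "OK", "Caution", or "Bad" for a dict of raw attribute counters."""
    status = "OK"
    for name, value in raw_values.items():
        # Any single counter at or above its "bad" limit decides the result.
        if name in BAD_LIMITS and value >= BAD_LIMITS[name]:
            return "Bad"
        if name in CAUTION_LIMITS and value >= CAUTION_LIMITS[name]:
            status = "Caution"
    return status


if __name__ == "__main__":
    print(classify({"reallocated_sector_count": 0}))    # OK
    print(classify({"reallocated_sector_count": 8}))    # Caution
    print(classify({"reallocated_sector_count": 250}))  # Bad
```

Attributes that are not listed in either table are ignored, so the sketch never reports a problem for a counter it has no limit for.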
What Does “Bad Backup and Replace” Mean?
“Bad backup and replace” refers to the recommended actions to take when a hard drive shows a bad SMART status – immediately back up the data and replace the drive.
Why Back Up the Drive?
Backing up the data on a drive with a bad SMART status is critical because failure could occur at any time. Once the drive fails, the data may become inaccessible. An immediate backup preserves the data before the drive stops working.
Why Replace the Drive?
Replacing the drive is recommended because a bad SMART status means hardware damage or deterioration has occurred that will likely get worse. The drive has a high probability of complete failure in the near future. Replacing it with a new drive prevents permanent data loss when the declining drive finally reaches the end of its lifespan.
How to Check SMART Status
Checking the SMART status on a drive will reveal if the drive is OK or has a caution, bad, or failing status. Here are some ways to check SMART status:
Windows
- PowerShell – The Get-PhysicalDisk cmdlet reports a HealthStatus for each drive; piping its output to Get-StorageReliabilityCounter shows counters such as temperature and read errors.
- Disk Management, Device Manager, and DiskPart – These built-in tools show whether a disk is detected and online, but they generally do not display SMART attributes, so a normal-looking entry there does not rule out a bad SMART status.
- Third-party tools – Apps like Speccy, CrystalDiskInfo, and Hard Disk Sentinel show SMART info.
macOS
- System Information – Under Hardware > Storage, SMART Status is shown.
- Disk Utility – Select the drive and click Info to see SMART Status.
- Terminal – Use the smartctl command (part of smartmontools, which is installed separately) to view SMART data.
- Third-party tools – Apps like DriveDx and SMARTReporter display SMART data.
Linux
- GNOME Disks (formerly GNOME Disk Utility) – Select the drive and open the SMART Data & Self-Tests view.
- Terminal – Use the smartctl command to view SMART info.
- Third-party apps – Tools like GSmartControl and HardInfo check SMART.
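For scripted checks, smartctl can emit its report as JSON. The following is a minimal sketch, assuming smartmontools 7.0 or later (which added `--json`), an ATA/SATA drive, and enough privileges to query the device. The function names `read_smart_json` and `summarize` are made up for this example.

```python
import json
import subprocess


def read_smart_json(device):
    """Run smartctl on a device and return its JSON report as a dict."""
    result = subprocess.run(
        ["smartctl", "--json", "-H", "-A", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes to flag drive problems
    )
    return json.loads(result.stdout)


def summarize(report):
    """Return (passed, failing_attribute_names) from a smartctl JSON report."""
    passed = report.get("smart_status", {}).get("passed")
    table = report.get("ata_smart_attributes", {}).get("table", [])
    failing = [
        row["name"]
        for row in table
        if row.get("when_failed")  # empty when the attribute has never failed
    ]
    return passed, failing
```

A call such as `summarize(read_smart_json("/dev/sda"))` would return the overall pass/fail flag plus the names of any attributes that have crossed their thresholds; the device path is only an example and differs between systems.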
How to Repair or Back Up a Drive with Bad SMART Status
When a drive shows bad SMART status, repair options are limited. The best course of action is to back up the data and replace the drive. The following steps may nevertheless keep a failing drive working temporarily:
- Repair bad sectors – Utility tools like chkdsk or fsck can detect and isolate bad sectors.
- Update firmware – An updated firmware version may fix buggy SMART monitoring or work better with the drive.
- Clear SMART data – Resetting the SMART data makes the drive re-evaluate its health. Not recommended.
- Troubleshoot environmental issues – Check drive temperatures, cabling, power supplies or interference.
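- Full overwrite (often called a low-level format) – Writes over every sector, which gives the firmware a chance to remap sectors it cannot write reliably. A new filesystem must be created afterward. Destroys all data on the drive.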
None of these steps is guaranteed to restore a failing drive to full health. The safest option remains a complete data backup followed by replacement of the drive.
How to Back Up a Failing Drive
To properly back up a failing drive showing bad SMART status:
- Use drive cloning software like Clonezilla to make an exact copy to a new drive.
- Check the SMART status frequently during the backup to detect further deterioration.
- Prefer disk imaging over file-by-file copying.
- Back up to a different physical drive, not just another partition on the same drive.
- Verify that the backup is intact and can be restored.
- Store a backup copy offline or offsite to protect it from ransomware such as CryptoLocker.
Also consider replacing cables, power supplies, controllers or drive bays during backup if issues are detected.
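The "verify backup integrity" step can be sketched as a checksum comparison. The example below is a simplified illustration for image files, not a substitute for a cloning tool: it copies a source image while hashing it, then re-reads the copy and compares digests. The function names and the 1 MiB chunk size are assumptions, and the sketch does not attempt to recover from read errors.

```python
import hashlib

CHUNK = 1024 * 1024  # read 1 MiB at a time (arbitrary choice for this sketch)


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        while True:
            block = handle.read(CHUNK)
            if not block:
                break
            digest.update(block)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy source to destination, then confirm the copy by re-reading it."""
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(destination, "wb") as dst:
        while True:
            block = src.read(CHUNK)
            if not block:
                break
            digest.update(block)  # hash the data as it leaves the source
            dst.write(block)
    # Only the destination is read a second time.
    return digest.hexdigest() == sha256_of(destination)
```

Hashing during the copy means the failing source is read only once; the second pass touches only the new drive.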
Selecting a Replacement Drive
When replacing a failed or failing drive, consider compatibility, capacity and form factor:
- Compatibility – Choose a replacement drive whose connectors and interface (such as SATA or SAS) match the system.
- Capacity – Choose equal or larger capacity; a smaller drive only works if the data being copied fits on it.
- Form factor – Match the physical size and mounting of the old drive.
- Features – Look for the same or better speed, cache size, and warranty length.
An exact model match or upgrade is ideal. Consult hardware compatibility lists if needed.
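The selection criteria above can be written down as a simple checklist function. This is only an illustrative sketch: the `Drive` fields, the `replacement_problems` name, and the wording of the messages are all assumptions, and it ignores the optional feature comparison (speed, cache, warranty).

```python
from dataclasses import dataclass


@dataclass
class Drive:
    interface: str    # e.g. "SATA" or "SAS"
    form_factor: str  # e.g. "3.5-inch" or "2.5-inch"
    capacity_gb: int


def replacement_problems(old, new, used_gb):
    """Return reasons the new drive may not suit; an empty list means none."""
    problems = []
    if new.interface != old.interface:
        problems.append("interface differs")
    if new.form_factor != old.form_factor:
        problems.append("form factor differs")
    if new.capacity_gb < used_gb:
        problems.append("too small for the data in use")
    elif new.capacity_gb < old.capacity_gb:
        problems.append("smaller than the old drive")
    return problems
```

A smaller drive is reported separately from one that cannot hold the data at all, since the first case may still be workable with a file-level copy.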
How to Restore from a Backup
To restore a backup after replacing a failed drive:
- Connect the backup drive and boot into a live CD environment.
- Use recovery software to restore the backup image to the new drive.
- Verify partitions and filesystems before restoring files.
- Copy individual files and folders selectively if the OS is intact.
- Check the integrity of the restored data; restore any corrupted files again from the backup.
- Reconfigure the bootloader and OS as needed for the hardware change.
- Reinstall the OS if the cloned drive won’t boot.
Test the restored drive thoroughly before returning the system to production use.
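One way to check restored data is to compare every file in the backup against its restored counterpart. The sketch below assumes both are mounted as ordinary directory trees; the function names are hypothetical, and it checks file contents only, not permissions or timestamps.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of one file, read in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(block)
    return digest.hexdigest()


def find_mismatches(backup_root, restored_root):
    """Return relative paths that are missing or differ under restored_root."""
    backup_root, restored_root = Path(backup_root), Path(restored_root)
    problems = []
    for source in sorted(backup_root.rglob("*")):
        if not source.is_file():
            continue  # directories are covered through the files inside them
        relative = source.relative_to(backup_root)
        target = restored_root / relative
        if not target.is_file() or file_digest(source) != file_digest(target):
            problems.append(relative.as_posix())
    return problems
```

An empty result means every file in the backup has an identical copy on the restored drive; any paths returned are candidates for restoring again.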
Preventing Drive Failure
Some tips to keep drives healthy and avoid failures:
- Monitor SMART stats and replace drives that show a caution status early.
- Manage drive temperatures and airflow.
- Use surge protectors and UPS battery backup.
- Keep server rooms within the recommended temperature and humidity ranges.
- Implement RAID redundancy for fault tolerance.
- Do not jolt, bump, or move drives during operation.
- Perform regular backups and test restores.
Catching issues early and taking preventative measures will help avoid sudden drive failures.
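Catching issues early usually means watching how counters change rather than reading them once. The sketch below compares two readings and reports any counter that has grown. The function name is hypothetical and the sample figures are invented for illustration; the attribute names follow the style smartctl prints.

```python
def growing_counters(previous, current):
    """Return {name: increase} for counters that rose since the last reading."""
    return {
        name: current[name] - previous[name]
        for name in current
        if name in previous and current[name] > previous[name]
    }


if __name__ == "__main__":
    last_week = {"Reallocated_Sector_Ct": 0, "Current_Pending_Sector": 0}
    today = {"Reallocated_Sector_Ct": 16, "Current_Pending_Sector": 0}
    print(growing_counters(last_week, today))  # {'Reallocated_Sector_Ct': 16}
```

A counter that keeps climbing between readings is a stronger warning sign than a small value that stays constant.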
A bad SMART status indicates that a drive is likely to fail soon: back up its data immediately and replace the drive. Doing so prevents permanent data loss when the declining drive finally stops working. Monitoring SMART stats, controlling the drive's operating environment, and implementing fault tolerance such as RAID also help avoid sudden drive failures and data loss.