Do overwritten files get deleted?

When a file is overwritten, the original data is not always destroyed right away. If a program writes the new data in place, the old bytes are replaced as the new ones land on the same clusters. But many programs save by truncating the file, or by writing a new copy and renaming it over the old one. In those cases the file system may store the new data in different clusters and simply mark the old ones as available space. The old data then stays on the disk until something else is written over it.

How file overwriting works

To understand why overwritten files are not immediately deleted, it helps to understand a bit about how computer file systems work. When a file is saved to a disk, it gets stored in one or more clusters, which are fixed-size blocks of storage space. The file system keeps track of which clusters belong to which files in its metadata.

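When a program writes to an existing file, what happens on disk depends on how it saves. If it writes in place, the new data goes into the clusters the file already occupies and replaces the old contents as it goes. If it truncates the file first, or writes a new copy and renames it over the old one, the file system releases the old clusters and may allocate different ones for the new data. Released clusters are only marked as free in the metadata. Their contents stay intact, and potentially recoverable, until the clusters are reused.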
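The difference between writing in place and truncating can be seen at the level of file contents. The following is a minimal sketch in Python; the function names and the sample data are made up for illustration. It only shows what the file looks like to a program. Which physical clusters end up holding the data is decided by the file system and is not visible here.

```python
import os
import tempfile


def overwrite_in_place(path: str, data: bytes) -> None:
    """Write data over the start of an existing file without truncating it."""
    with open(path, "r+b") as f:  # "r+b" keeps the existing contents
        f.write(data)


def overwrite_truncating(path: str, data: bytes) -> None:
    """Discard the old contents first, then write data."""
    with open(path, "wb") as f:  # "wb" truncates the file to zero length
        f.write(data)


def demo() -> tuple:
    """Return the file contents after each kind of overwrite."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.bin")
        with open(path, "wb") as f:
            f.write(b"ORIGINAL-SECRET")

        overwrite_in_place(path, b"new")
        with open(path, "rb") as f:
            after_in_place = f.read()   # b"newGINAL-SECRET": the tail survives

        overwrite_truncating(path, b"new")
        with open(path, "rb") as f:
            after_truncate = f.read()   # b"new": the old tail is no longer part of the file

    return after_in_place, after_truncate
```

After the truncating write, the old tail is gone from the file as the program sees it, but the clusters that held it may still contain the old bytes until the file system reuses them.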

File slack space

One reason old data sticks around is file slack space: the unused space between the end of a file and the end of the last cluster allocated to it. A write only touches the bytes it actually stores, so when a shorter file replaces a longer one in the same clusters, the tail of the old data can survive in the slack area until something overwrites it as well.
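The amount of slack follows directly from the file size and the cluster size. Here is a small sketch of the arithmetic; the 4096-byte default is an assumption (a common cluster size, but the real value depends on how the volume was formatted), and the function names are illustrative.

```python
def clusters_needed(file_size: int, cluster_size: int = 4096) -> int:
    """Number of whole clusters required to hold file_size bytes."""
    if file_size < 0 or cluster_size <= 0:
        raise ValueError("file_size must be >= 0 and cluster_size must be > 0")
    return -(-file_size // cluster_size)  # ceiling division


def slack_bytes(file_size: int, cluster_size: int = 4096) -> int:
    """Unused bytes between the end of the file and the end of its last cluster."""
    return clusters_needed(file_size, cluster_size) * cluster_size - file_size


# A 10,000-byte file with 4096-byte clusters:
print(clusters_needed(10_000))  # 3 clusters, i.e. 12,288 bytes allocated
print(slack_bytes(10_000))      # 2288 bytes of slack
```

This ignores special cases such as file systems that store very small files inside their metadata rather than in a cluster.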

Cluster tips

Some wiping tools use the term cluster tips for this same leftover area at the end of a file's last cluster. Because file systems allocate space in whole clusters, any file whose size is not an exact multiple of the cluster size leaves part of its final cluster unused. Whatever was stored there earlier remains until that part of the cluster is overwritten, even if the rest of the file's clusters have been rewritten.

When is an overwritten file really deleted?

An overwritten file is not completely deleted until all traces of the original data are gone from the disk. There are a few scenarios when this occurs:

  • The entire file is overwritten in place by new data of the same size or larger, so that the beginning, middle, and end of the original are all replaced.
  • The file system reuses the original file’s clusters for a different file and that file's data fills them, including the area that was previously slack space.
  • The disk is securely erased using software or firmware utilities, which overwrite every sector or instruct the drive to purge its contents.

As long as the original clusters have not been reused, forensic analysis tools may be able to recover some or all of the overwritten data. But once every bit is overwritten with new data, the original file is essentially gone.

Research on recovering overwritten files

Whether data can be recovered after it has actually been overwritten has been debated for decades:

  • In 1996, computer scientist Peter Gutmann published a paper arguing that, on the magnetic drives of that era, traces of overwritten data might be detectable with techniques such as magnetic force microscopy.
  • The Gutmann method, a 35-pass overwrite scheme proposed in that paper, was designed to make such recovery as difficult as possible across the different drive encoding schemes then in use.
  • In a study published in 2008, Craig Wright, Dave Kleiman, and Shyaam Sundhar examined overwritten drives with magnetic force microscopy and reported that, after even a single overwrite, the chance of correctly recovering an individual bit was little better than guessing. Recovering meaningful amounts of data this way was not practical.

Taken together, this research suggests that data which has been overwritten even once on a modern hard drive is, for practical purposes, unrecoverable. What forensic tools do recover is data that was never actually overwritten: freed clusters that have not been reused, slack space, and leftover copies elsewhere on the disk.

Manually deleting files versus overwriting

There is a difference in how modern operating systems handle deleted files versus overwritten files. When you delete a file, most operating systems, including Windows and macOS, simply mark the file’s clusters as available rather than overwriting them. This makes recovery with undelete tools straightforward until those clusters get reused.

But when a file is overwritten in place, the original data has actually been replaced with new data. Undelete tools cannot bring it back; at best, forensic analysis may find fragments that escaped the overwrite. Recovering overwritten data is far harder than undeleting a deleted file.
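The contrast between deleting and overwriting can be illustrated with a toy model. The class below is a deliberately simplified sketch, not how any real file system is implemented: the ToyDisk name, its methods, and the one-chunk-per-cluster layout are all assumptions made for the example.

```python
class ToyDisk:
    """Toy model of a disk: raw clusters plus a table of which file owns which."""

    def __init__(self, cluster_count: int) -> None:
        self.clusters = [b""] * cluster_count  # raw contents of every cluster
        self.table = {}                        # file name -> list of cluster indexes

    def free_clusters(self) -> list:
        used = {i for owned in self.table.values() for i in owned}
        return [i for i in range(len(self.clusters)) if i not in used]

    def write(self, name: str, chunks: list) -> None:
        """Store one chunk per cluster, reusing the file's own clusters first."""
        own = self.table.get(name, [])
        candidates = own + self.free_clusters()
        if len(chunks) > len(candidates):
            raise OSError("toy disk is full")
        targets = candidates[:len(chunks)]
        for index, chunk in zip(targets, chunks):
            self.clusters[index] = chunk       # old contents are replaced here
        self.table[name] = targets

    def delete(self, name: str) -> None:
        """Remove only the table entry; cluster contents are left untouched."""
        del self.table[name]


disk = ToyDisk(4)
disk.write("a.txt", [b"old-1", b"old-2"])
disk.delete("a.txt")
print(disk.clusters)  # [b'old-1', b'old-2', b'', b'']: deleted, but still on disk

disk.write("b.txt", [b"new-1"])
print(disk.clusters)  # [b'new-1', b'old-2', b'', b'']: only the reused cluster is lost
```

In this model an undelete tool corresponds to scanning the raw clusters list: everything a delete leaves behind is still readable, while a cluster that has been rewritten holds only the new data.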

Securely overwriting data

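Because of concern about recovering overwritten data, some organizations follow guidelines that call for overwriting files multiple times. Older procedures, such as the 3-pass and 7-pass schemes commonly associated with the US Department of Defense manual DoD 5220.22-M, are still widely offered in wiping software. More recent guidance, notably NIST Special Publication 800-88, states that a single overwrite pass is generally enough for modern hard drives. Software tools are available that overwrite a file's disk area one or more times before deleting it.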
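As a rough illustration of what such tools do, here is a minimal sketch in Python. The function names and defaults are assumptions for the example, and it is not a substitute for a vetted wiping utility: it rewrites the file's current contents in place, but it cannot reach old copies held in journals, snapshots, copy-on-write file systems, backups, or the remapped blocks of an SSD.

```python
import os


def overwrite_contents(path: str, passes: int = 1, chunk_size: int = 1 << 20) -> None:
    """Overwrite every byte of an existing file with random data, in place."""
    if passes < 1:
        raise ValueError("passes must be at least 1")
    size = os.path.getsize(path)
    with open(path, "r+b") as f:          # "r+b" writes over the existing bytes
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())          # ask the OS to push this pass to the device


def overwrite_and_delete(path: str, passes: int = 1) -> None:
    """Overwrite the file's contents, then remove its directory entry."""
    overwrite_contents(path, passes)
    os.remove(path)
```

The file size is kept unchanged so that the write covers exactly the bytes the file occupied. The slack area past the end of the file is not touched by this approach.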

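Overwriting is less reliable on SSDs

On SSDs, overwriting a file does not reliably destroy the old data. SSDs handle writes differently from traditional spinning hard disks: flash memory cannot be rewritten in place, so the controller writes new data to a different physical location and remaps the logical address, while wear leveling spreads writes across the chips. The old copy can remain in flash until the drive's garbage collection erases it. For SSDs, the drive's built-in secure erase or sanitize command, or full-disk encryption with the key destroyed, is generally recommended over file-level overwriting.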

What happens when you copy a file?

Copying a file is handled differently than overwriting a file. When a file copy takes place, the operating system doesn’t overwrite or delete the original data. Instead, it allocates new free clusters to store the duplicate data. The original file remains intact until deleted or overwritten.

The same is true if you move a file from one folder to another on the same drive. The clusters storing the data remain allocated to the file; only the directory entries change, and no overwriting occurs. Moving a file to a different drive, by contrast, is a copy followed by a delete of the original.
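One way to observe this is to compare file identifiers before and after a copy and a move. The sketch below assumes a POSIX-style file system where st_ino (the inode number) identifies the underlying file; the function name and file names are illustrative, and results on other platforms or file systems may differ.

```python
import os
import shutil
import tempfile


def compare_copy_and_move() -> dict:
    """Copy and move a file within one directory tree and compare inode numbers."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        with open(original, "w") as f:
            f.write("data")
        original_id = os.stat(original).st_ino

        # A copy is a second, independent file with its own storage.
        copy = os.path.join(tmp, "copy.txt")
        shutil.copy2(original, copy)
        copy_id = os.stat(copy).st_ino

        # A move on the same file system only changes the directory entry.
        os.mkdir(os.path.join(tmp, "sub"))
        moved = os.path.join(tmp, "sub", "moved.txt")
        os.rename(original, moved)
        moved_id = os.stat(moved).st_ino

    return {
        "copy_is_new_file": copy_id != original_id,
        "move_kept_same_file": moved_id == original_id,
    }
```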

Can overwritten files be undeleted?

With consumer tools, it is virtually impossible for an average user to undelete or recover an overwritten file. Once new data has replaced the original data in a cluster, that original data is considered permanently gone on most systems.

Professional data recovery or forensic analysis can sometimes retrieve parts of the original data, but mostly when those parts were never overwritten: remnants in file slack, old clusters that were freed rather than rewritten, or temporary and backup copies. Acting quickly, before much new data is written to the disk, improves the odds. Data that has truly been overwritten should be considered unrecoverable and permanently deleted.

Preventing file overwrites

You can take steps to prevent accidental file overwriting:

  • Set file permissions or attributes such as read-only to restrict who can edit or delete files.
  • Configure backup software to store previous versions of files.
  • Use source code repositories like Git to preserve file change history.
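  • Keep copies of important data on separate physical drives. A live mirror such as RAID 1 replicates an overwrite immediately, so it protects against drive failure rather than accidental overwrites; a separate backup copy is what helps here.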
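Programs can also guard against accidental overwrites themselves. The following is a small sketch of two such safeguards in Python; the function names and the ".bak" suffix are assumptions chosen for the example.

```python
import os
import shutil


def write_new_file(path: str, data: bytes) -> None:
    """Create path and write data, refusing to touch an existing file."""
    with open(path, "xb") as f:  # "x" raises FileExistsError if path exists
        f.write(data)


def write_with_backup(path: str, data: bytes) -> None:
    """Overwrite path, but first keep the previous version as path + '.bak'."""
    if os.path.exists(path):
        shutil.copy2(path, path + ".bak")  # copy contents and timestamps
    with open(path, "wb") as f:
        f.write(data)
```

The first function suits files that should never be replaced silently. The second keeps only one previous version, so it complements rather than replaces proper backups or version control.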

Conclusion

An overwritten file is not always erased immediately. When new data lands on the same clusters, the old data in those clusters is replaced; when the file system allocates different clusters, the old ones are merely marked as free and their contents linger until they are reused. Recovery becomes steadily less likely as more new data is written to the disk. For all practical purposes, data that has actually been overwritten should be considered permanently deleted.

Term | Definition
File overwrite | Writing new data on top of existing data in a file. Data that is actually overwritten is replaced; data left behind in freed clusters remains recoverable until those clusters are reused.
Cluster | Fixed-size block of storage space on a disk used to store file contents.
File slack space | Unused space between the end of a file's data and the end of its last allocated cluster.
File system | Software responsible for organizing data storage on a disk and tracking which clusters belong to which files.
