Where do files go when replaced?

When files are replaced or overwritten on a computer system, the original data is erased and substituted with new data. File overwriting refers to replacing old data by writing new data over storage space that was previously occupied by other files (https://www.webopedia.com/definitions/overwrite/). Overwriting is common in computing and can happen intentionally or unintentionally.

Intentionally overwriting files allows you to update and replace outdated documents and data. Unintentionally overwriting important files can lead to data loss if the originals are not backed up elsewhere. When files are overwritten, the original data is generally no longer recoverable unless specialized forensic data recovery techniques are used. Understanding how file overwriting works helps you avoid permanent data loss and use overwriting deliberately when updating files.

File Systems and Storage

Files are stored on hard drives, solid state drives, and other disks using file systems. A file system handles how data is stored, organized, and retrieved on a drive. Common file systems include NTFS on Windows, APFS and the older HFS+ on macOS, ext4 on Linux, and FAT32, which is supported across operating systems.

At the core, files are stored on a drive as blocks of data. The specific blocks a file occupies are tracked in a file system’s metadata, along with other attributes like filename, timestamps, permissions, etc. This allows the file system to retrieve a file by locating its data blocks when requested. NTFS, for example, keeps these records in its Master File Table (MFT), while HFS+ uses a catalog file for the same purpose.
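To make the idea of per-file metadata concrete, here is a minimal Python sketch that asks the operating system for a few of the attributes a file system records. The helper name `describe_file` and the sample file name are illustrative assumptions. Note that `os.stat` exposes sizes, timestamps, and permissions, but not the addresses of the underlying data blocks.

```python
import os
import tempfile
import time


def describe_file(path):
    """Return a few metadata fields the file system keeps for `path`."""
    st = os.stat(path)
    return {
        "size_bytes": st.st_size,
        "inode": st.st_ino,            # identifier of the file's metadata record
        "hard_links": st.st_nlink,     # number of directory entries pointing at it
        "modified": time.ctime(st.st_mtime),
        "permissions": oct(st.st_mode & 0o777),
    }


def demo():
    # Create a throwaway file so the sketch does not touch real data.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "report.txt")
        with open(path, "w") as f:
            f.write("hello")
        for key, value in describe_file(path).items():
            print(f"{key}: {value}")


if __name__ == "__main__":
    demo()
```

The exact values differ between systems; the point is that the name, size, timestamps, and permissions live in metadata that is separate from the file's contents.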

When it comes to organization, most file systems use a hierarchical tree-like structure with folders and subfolders to categorize files. The location of a file in this directory structure is part of its recorded metadata. Some file systems also support links, which let the same file appear in multiple locations.
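As a rough illustration of a file appearing under more than one name, the following Python sketch creates a hard link with `os.link`. It assumes a file system that supports hard links (ext4 and NTFS do; FAT32 does not), and the function and file names are made up for the example.

```python
import os
import tempfile


def link_demo(directory):
    """Create a file, give it a second name, and edit it through that name."""
    original = os.path.join(directory, "notes.txt")
    alias = os.path.join(directory, "notes-alias.txt")

    with open(original, "w") as f:
        f.write("first draft")

    # A hard link is a second directory entry for the same underlying data.
    os.link(original, alias)

    # Appending through the alias changes what the original name shows too.
    with open(alias, "a") as f:
        f.write(" + edit")

    with open(original) as f:
        content = f.read()
    return content, os.stat(original).st_nlink, os.path.samefile(original, alias)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(link_demo(tmp))
```

Because both names refer to one set of data blocks, overwriting the contents through either name affects both.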

In addition, different file systems can use distinct methods for tasks like allocating disk space, preventing fragmentation, checksumming data, and journaling for recovery. The technical implementation varies across file systems optimized for aspects like performance, reliability, and operating system compatibility.

Overall, while the exact low-level details differ, at their core file systems provide an organized way to store file contents and metadata on a drive so the data can be efficiently located, identified, and retrieved by the operating system.

When a File is Replaced

When a file is replaced, the new file takes the place of the old one. This happens when you save a new file with the same name and location as an existing file in your folders. As Android Phonesoft puts it (https://www.androidphonesoft.com/blog/how-to-restore-replaced-files-windows-10/), “When a file is replaced, the new file overwrites the old file. The old file is no longer accessible and essentially disappears from the file system.”

The replacement is handled by the operating system and the file system. The operating system sees that a file with the same name and path already exists and, in the simplest case, writes the new file’s data over the old file’s data. Many applications and file systems instead write the new contents to different blocks and then update the metadata so the file name points to them. Either way, from the user’s perspective it appears as if the old file transforms into the new file.

This can happen intentionally when manually saving over a file, or unintentionally if an application saves new data to an existing file name without warning. Once saved, the previous version is no longer accessible through the file system. Recovering it requires specialized data recovery software or a backup or saved version of the file.
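Programs commonly replace a file in one of two ways, and a short Python sketch can show the difference. The function names `save_in_place` and `save_by_swap` are illustrative assumptions, not part of any particular application. The first truncates and rewrites the existing file; the second writes a temporary file and renames it over the old one with `os.replace`.

```python
import os
import tempfile


def save_in_place(path, text):
    """Replace a file's contents by truncating and rewriting the same file."""
    # Mode "w" empties the existing file first, then writes the new data.
    with open(path, "w") as f:
        f.write(text)


def save_by_swap(path, text):
    """Replace a file by writing a temporary file and renaming it over the old one."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as f:
            f.write(text)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        # The name now refers to the new file; the old data blocks are released.
        os.replace(tmp_path, path)
    finally:
        # Clean up the temporary file if the swap did not happen.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "doc.txt")
        save_in_place(target, "version 1")
        save_by_swap(target, "version 2")
        with open(target) as f:
            print(f.read())
```

In both cases the previous contents can no longer be reached through the file name. With the swap approach, the old data is typically left on the drive in blocks that are now marked as free, which is one reason recovery tools sometimes succeed.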

File Overwriting

The process of overwriting data involves replacing existing data stored on a storage device with new data. When a file is “deleted”, or when a newer copy is saved to different blocks, the file system marks the space that the old data occupied as available for new data. However, the actual data still resides there until it gets overwritten.

When a file needs to be overwritten, the operating system will write the new data over the clusters occupied by the old file. This overwriting process follows a sequential order, overwriting data on disk in the same order it was initially written. The new data simply replaces the old data bit-by-bit or sector-by-sector in that predefined order (see https://www.minitool.com/lib/overwrite.html).

So in summary, overwriting involves sequentially replacing older data with new data. The old data is fully gone only once every block it occupied has been overwritten. If the new data is smaller than the old file, or is written to a different area of the drive, remnants of the old data can be left behind.
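One caveat is worth illustrating: an in-place overwrite only covers as many bytes as the new data contains. The Python sketch below, which uses made-up file names and contents, writes three bytes over the start of an 18-byte file without truncating it.

```python
import os
import tempfile


def overwrite_in_place(path, new_bytes):
    """Write new_bytes over the start of an existing file without truncating it."""
    # "r+b" opens for reading and writing and keeps the existing contents.
    with open(path, "r+b") as f:
        f.seek(0)
        f.write(new_bytes)


def demo(directory):
    path = os.path.join(directory, "data.bin")
    with open(path, "wb") as f:
        f.write(b"OLD-SECRET-CONTENT")
    overwrite_in_place(path, b"new")
    with open(path, "rb") as f:
        return f.read()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(demo(tmp))  # b'new-SECRET-CONTENT'
```

Only the first three bytes changed; the rest of the old data is still present. A complete overwrite has to cover the full length of the old data.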

Recovery of Overwritten Files

When a file is replaced, the original data is not always erased from the storage device right away. If the new version was written to different blocks, the areas of the disk where the original data resided simply get marked as available for new data to be written. Until those areas are overwritten, the original data may still exist there intact. However, the file system no longer maintains the mapping of that original data to the original filename, making recovery challenging.

There are a few scenarios in which overwritten data can potentially be recovered:

File versioning: If a file was overwritten in a system that implements file versioning, such as Apple’s Time Machine, you may be able to restore a previous version of the file before it was overwritten. This depends on the versioning settings and how long file snapshots are retained (https://experience.dropbox.com/resources/recover-overwritten-files).

Data recovery software: Specialized data recovery programs can scan the raw contents of a drive and reconstruct files based on file signatures, metadata, and folder structures. This requires advanced algorithms to piece file fragments back together. Success depends on how thoroughly the original data areas have been overwritten by new data (https://nordic-backup.com/blog/how-to-recover-overwritten-files-quickly/).

Drive imaging: If a full forensic image of the drive was taken before the file was overwritten, this can provide a snapshot of the drive to extract the original file from later. However, this requires proactively imaging the drive using specialized tools (https://www.easeus.com/file-recovery/recover-overwritten-files.html).
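To give a sense of how signature-based recovery works, here is a heavily simplified Python sketch of file carving. It scans a raw byte buffer for the JPEG start marker (FF D8 FF) and end marker (FF D9) and extracts whatever lies between them. The function name `carve_jpegs` is an assumption for illustration. Real recovery tools also deal with fragmentation, embedded thumbnails, and many other formats, which this sketch ignores.

```python
JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
JPEG_END = b"\xff\xd9"        # end-of-image marker


def carve_jpegs(raw):
    """Return byte strings in `raw` that look like complete, contiguous JPEG files."""
    found = []
    pos = 0
    while True:
        start = raw.find(JPEG_START, pos)
        if start == -1:
            break
        end = raw.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break  # start marker without an end marker: data is incomplete
        end += len(JPEG_END)
        found.append(raw[start:end])
        pos = end
    return found


if __name__ == "__main__":
    # Simulated "raw disk" contents: filler bytes around one fake image.
    disk = b"\x00" * 16 + JPEG_START + b"\xe0fake image data" + JPEG_END + b"\x00" * 16
    print(len(carve_jpegs(disk)))
```

If new data has already been written over part of the region, the markers or the bytes between them are gone, which is why success depends on how much of the original area has been reused.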

In summary, while overwritten data may still exist on the storage medium, recovering it requires advanced techniques, prior backups, or proactive drive imaging. The success rate also decreases the more thoroughly the original data areas get overwritten over time.

Secure Deletion

Securely deleting files is meant to ensure that they cannot be recovered by anyone who later gains access to the storage device. There are a few techniques that can be used:

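Using a secure delete utility like CCleaner will overwrite the file’s data, making it far harder to recover. Multi-pass schemes such as DoD 5220.22-M or Peter Gutmann’s 35-pass method are often cited, and some sources recommend overwriting 7 times or more. Current guidance such as NIST SP 800-88, however, treats a single overwrite pass as generally sufficient for modern hard disk drives.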

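For solid state drives (SSDs), overwriting an individual file is unreliable because wear leveling can direct the new data to different flash cells than the old data. The ATA Secure Erase command, which can be issued on Linux with a tool such as hdparm, resets an entire drive, eliminating any trace of deleted files. However, this technique resets the whole drive, not just individual files [1].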

Encrypting files is another option. Encrypted data appears random without the key, so any remnants left behind after deletion are unreadable. Popular tools like VeraCrypt allow creating encrypted containers to store sensitive files [2].

Physically destroying the storage media is the only way to guarantee deleted files are unrecoverable. However, this is impractical for most everyday users.
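The overwrite-based technique described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a replacement for a dedicated utility: the function names are made up, the number of passes is a parameter rather than a recommendation, and the approach assumes the file system writes new data over the same physical blocks. That assumption often does not hold on SSDs or on copy-on-write and journaling file systems.

```python
import os
import tempfile


def overwrite_contents(path, passes, chunk_size=1024 * 1024):
    """Overwrite every byte of the file with random data, `passes` times."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                n = min(chunk_size, remaining)
                f.write(os.urandom(n))
                remaining -= n
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass to the device


def overwrite_and_delete(path, passes):
    """Overwrite the file's contents, then remove its directory entry."""
    overwrite_contents(path, passes)
    os.remove(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "secret.txt")
        with open(target, "w") as f:
            f.write("confidential")
        overwrite_and_delete(target, passes=1)
        print(os.path.exists(target))  # False
```

Note that this does nothing about other copies of the data, such as backups, earlier versions, or temporary files created by the application that produced the file.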

File Versioning

File versioning allows users to access previous versions of a file. It works by saving a new version of the file each time it is edited or saved. The previous versions are retained instead of being overwritten. This creates a detailed history of all changes made to the file over time.

Versioning systems store file history by keeping earlier copies of the file or a record of the changes between versions. There are a few common methods for doing this:

  • Save a completely new file with an incremented version number (e.g. fileV1.txt, fileV2.txt)
  • Save just the changes, not a full copy, as reverse deltas from the latest version
  • Copy-on-write – Create a snapshot first before edits are made

The full editing history can be recalled by combining the latest version with the reverse deltas or snapshots to reconstruct any previous iteration of the file. Advanced versioning systems minimize storage usage by only retaining changes between versions rather than full copies. However, plenty of storage is still required to support multiple live versions.
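The reverse-delta approach can be sketched with Python’s standard `difflib` module. This is a toy illustration with hypothetical names (`make_reverse_delta`, `apply_reverse_delta`, `VersionedText`), not how any particular versioning product is implemented. Only the latest version is stored in full; each older version is stored as instructions for rebuilding it from the version that came after it.

```python
import difflib


def make_reverse_delta(new_lines, old_lines):
    """Describe how to rebuild old_lines from new_lines."""
    matcher = difflib.SequenceMatcher(a=new_lines, b=old_lines, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))              # reuse lines from the newer version
        elif j2 > j1:
            delta.append(("insert", old_lines[j1:j2]))  # lines only the older version had
        # "delete": lines exist only in the newer version, so nothing is stored
    return delta


def apply_reverse_delta(new_lines, delta):
    """Rebuild the older version from the newer version plus its reverse delta."""
    old_lines = []
    for op in delta:
        if op[0] == "copy":
            old_lines.extend(new_lines[op[1]:op[2]])
        else:
            old_lines.extend(op[1])
    return old_lines


class VersionedText:
    """Keeps the latest version in full and older versions as reverse deltas."""

    def __init__(self):
        self.latest = None
        self.reverse_deltas = []  # oldest first

    def save(self, lines):
        if self.latest is not None:
            self.reverse_deltas.append(make_reverse_delta(lines, self.latest))
        self.latest = list(lines)

    def get(self, steps_back=0):
        if not 0 <= steps_back <= len(self.reverse_deltas):
            raise ValueError("no such version")
        lines = self.latest
        start = len(self.reverse_deltas) - steps_back
        # Walk backwards from the latest version, one delta at a time.
        for delta in reversed(self.reverse_deltas[start:]):
            lines = apply_reverse_delta(lines, delta)
        return lines


if __name__ == "__main__":
    doc = VersionedText()
    doc.save(["title", "first paragraph"])
    doc.save(["title", "first paragraph, revised", "second paragraph"])
    print(doc.get(1))  # ['title', 'first paragraph']
```

The trade-off is visible in `get`: the newest version is available immediately, while older versions cost more work to reconstruct the further back they are.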

According to an article on Bitcatcha, “When a modification is made to a document, file versioning pretty much automatically saves it at certain time intervals. It creates a separate copy of that file so you end up with multiple versions of that single document.” This allows users to easily roll back changes if needed.

Cloud Storage

When files are replaced on cloud storage services like Google Cloud Storage, the existing data is not modified in place. According to the Google Cloud Storage documentation, objects (files) in Cloud Storage are immutable, so replacing an object means uploading a new version under the same name, which then takes the place of the existing one.

The same principle shows up elsewhere in the service. As explained in the Cloud Storage API documentation, rewriting an object creates a new object and updates metadata, but does not modify the original source object. What happens to a replaced version depends on the bucket’s configuration: with Object Versioning enabled, the replaced object is retained as a noncurrent version that can be restored, while without versioning or another retention feature it is no longer accessible once the new upload completes.

Overall, replacing a file in cloud storage is more akin to uploading a new version, rather than directly overwriting the existing data. Whether old versions persist, and for how long, depends on the service’s versioning and retention settings, so rollback is only possible while a previous version is still retained.
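The idea of immutable objects that are replaced rather than edited can be modeled with a small in-memory Python class. This is a toy model, not the Google Cloud Storage API: the class `ToyBucket`, its methods, and the `versioning` switch are all hypothetical, with the switch loosely inspired by the Object Versioning setting that Cloud Storage documents.

```python
class ToyBucket:
    """In-memory stand-in for an object store with immutable objects."""

    def __init__(self, versioning=False):
        self.versioning = versioning
        self._live = {}          # name -> (generation, data)
        self._noncurrent = {}    # name -> list of (generation, data)
        self._next_generation = 1

    def upload(self, name, data):
        """Store data under name; an existing object is replaced, never edited."""
        previous = self._live.get(name)
        if previous is not None and self.versioning:
            # Keep the replaced object as a noncurrent version.
            self._noncurrent.setdefault(name, []).append(previous)
        generation = self._next_generation
        self._next_generation += 1
        self._live[name] = (generation, bytes(data))
        return generation

    def download(self, name, generation=None):
        """Return the live object, or a specific retained generation."""
        live_generation, live_data = self._live[name]
        if generation is None or generation == live_generation:
            return live_data
        for old_generation, old_data in self._noncurrent.get(name, []):
            if old_generation == generation:
                return old_data
        raise KeyError(f"{name} generation {generation} is not retained")


if __name__ == "__main__":
    bucket = ToyBucket(versioning=True)
    first = bucket.upload("report.txt", b"old contents")
    bucket.upload("report.txt", b"new contents")
    print(bucket.download("report.txt"))         # b'new contents'
    print(bucket.download("report.txt", first))  # b'old contents'
```

With `versioning=False` in this model, asking for the first generation after the second upload raises `KeyError`, which mirrors the point that rollback is only possible while an earlier version is retained.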

Best Practices

When replacing files, it’s important to follow best practices to properly manage the process. Here are some tips:

Use consistent and descriptive file naming conventions like dates, version numbers, or project names to distinguish different file versions (https://www.sec.state.ma.us/divisions/archives/records-management/file-best-practices.htm). This makes it easier to identify the current or relevant file.

Store previous versions of files in a separate archive folder or use your operating system’s versioning feature if available. This preserves the revision history (https://www.liberty.edu/web-services/wordpress/tutorials/file-replacement/).

Avoid replacing files that are in use or opened by other users to prevent data loss or corruption. Check who has a file open before replacing it.

Notify users when shared files will be replaced to minimize disruption. Allow time for them to back up or synchronize changes if needed.

Use cloud storage or collaboration tools with built-in version control like SharePoint or OneDrive to seamlessly manage file replacements (https://techcommunity.microsoft.com/t5/microsoft-teams/best-practice-for-replacing-file-servers-with-teams/td-p/1209154).

Back up files regularly in case you need to restore a previous version.

Review permissions and access controls whenever replacing shared files to ensure the appropriate users have access.
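The tip about keeping previous versions in a separate archive folder can be automated. The following Python sketch is one possible approach under stated assumptions: the function name `archive_then_replace`, the timestamp format, and the folder layout are illustrative choices rather than an established convention.

```python
import os
import shutil
import tempfile
from datetime import datetime


def archive_then_replace(target, new_file, archive_dir):
    """Copy the current target into archive_dir, then put new_file in its place.

    Returns the path of the archived copy, or None if there was nothing to archive.
    """
    os.makedirs(archive_dir, exist_ok=True)
    archived = None
    if os.path.exists(target):
        stamp = datetime.now().strftime("%Y%m%d-%H%M%S-%f")
        base, ext = os.path.splitext(os.path.basename(target))
        archived = os.path.join(archive_dir, f"{base}_{stamp}{ext}")
        shutil.copy2(target, archived)  # copy2 also preserves timestamps
    # os.replace expects both paths to be on the same file system.
    os.replace(new_file, target)
    return archived


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "report.txt")
        incoming = os.path.join(tmp, "incoming.txt")
        with open(target, "w") as f:
            f.write("version 1")
        with open(incoming, "w") as f:
            f.write("version 2")
        print(archive_then_replace(target, incoming, os.path.join(tmp, "archive")))
```

Copying the old file before the replacement means that a failure at any step still leaves at least one intact copy of the previous version.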

Conclusion

When a file is replaced on a storage medium, the file system makes the new data available under the file’s name, either by overwriting the original data blocks or by releasing them for reuse. Traces of the original data may therefore still exist until those storage sectors are reused. While replaced files can sometimes be recovered using advanced forensic tools, true secure deletion requires special techniques to completely purge file contents.

Cloud storage and file versioning systems now allow users to easily restore previous versions of files. So even if a file is accidentally replaced, the original may be retrievable. Overall, being mindful of proper file management and backup practices can help avoid unintended data loss.