How does virtualization change data protection methods?

Virtualization has transformed the way organizations store, access and protect their data. By abstracting workloads from the underlying physical infrastructure, virtualization provides greater flexibility, scalability and efficiency for IT operations. However, virtualized environments also come with their own unique data protection challenges that require new approaches to backup and recovery.

What is virtualization?

Virtualization refers to technologies that allow multiple virtual machines (VMs), operating systems, applications, and network resources to run on a single physical server. The physical server is known as the host, while the VMs running on top are known as guest machines. Each VM contains its own virtual CPU, memory, storage, network interfaces, and other components that make it act like a standalone computer.

Some key virtualization technologies include:

  • Hypervisors: Software layers like VMware ESXi, Microsoft Hyper-V, Citrix XenServer that create and manage VMs on the host hardware.
  • Virtual storage: Allowing physical storage resources to be pooled and allocated between VMs as needed.
  • Virtual networks: Enabling VMs to connect to each other and physical networks without dedicated NICs.

By pooling resources in this way, virtual infrastructure enables greater consolidation of workloads and provides the agility to provision, decommission, and scale VMs on demand. Multiple VMs can be run securely side-by-side on the same physical server.

Benefits of virtualization

Virtualization delivers a number of advantages for IT operations:

  • Improved hardware utilization: Virtualization allows organizations to reduce server sprawl and drive up utilization rates on existing hardware by consolidating multiple workloads.
  • Flexibility and scalability: Virtualized resources like storage and RAM can be dynamically allocated across VMs based on changing needs.
  • Better workload isolation: VMs provide stronger segregation and security domains between workloads running on the same physical host.
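  • Enhanced availability: If a physical host fails, its VMs can be automatically restarted on another host, usually after a brief outage while they reboot. Truly uninterrupted operation requires a fault-tolerance feature that keeps a live secondary copy of the VM running.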
  • Simplified disaster recovery: VMs can be backed up or replicated easily across sites for DR purposes.
  • Streamlined provisioning: New VMs can be deployed rapidly from templates without physical installation.

These capabilities allow organizations to become more agile, flexible and resilient when managing workloads and infrastructure.

Virtualization platforms

There are two main virtualization architecture models:

  • Type 1 hypervisor: Runs directly on the system hardware with no underlying OS. Examples include VMware ESXi and Microsoft Hyper-V.
  • Type 2 hypervisor: Runs as an application on top of a conventional OS. Examples include VMware Workstation and Oracle VM VirtualBox.

Type 1 hypervisors can provide better performance as they have direct access to physical resources. Type 2 solutions are typically used for development, testing and desktop virtualization scenarios.

Leading enterprise virtualization platforms include:

  • VMware vSphere: The dominant player providing a mature stack of hypervisor, management and orchestration tools.
  • Microsoft Hyper-V: Hypervisor built into Windows Server at no additional licensing cost.
  • Citrix XenServer: Hypervisor built on the open-source Xen Project, focused on scalability and integration with cloud infrastructure.
  • Red Hat Virtualization: Open source platform built on KVM hypervisor and Linux-based management framework.

Organizations typically implement virtualization using a mix of solutions tailored to their specific needs and environments.

Virtualization data protection challenges

While virtualization enables greater efficiency and flexibility, it also introduces new complexities and risks from a data protection standpoint:

  • Data concentration: Virtualization consolidates many workloads and their data onto fewer servers, so a single host or storage failure puts more data at risk.
  • Virtual machine mobility: VMs can move between physical hosts, complicating backup targeting and recovery.
  • Virtual networks: Connectivity changes as VMs come online or move between hosts.
  • Resource contention: Backups may impact production VMs if not properly orchestrated.
  • Virtualization-aware backups: Backup tools need hypervisor integration to capture VMs effectively; in-guest agents alone treat each VM as if it were a physical server.
  • Heterogeneous platform support: Environments may run multiple hypervisors and versions across hosts.

Traditional agent-based backup solutions can struggle with these scenarios. Newer data protection platforms purpose-built for virtual environments are usually required.

Critical capabilities for virtual backup and recovery

Data protection solutions for virtualized environments should provide:

  • VM-centric operations: Backup, recovery and monitoring workflows operate at the VM rather than host or storage volume level.
  • Hypervisor integration: Tight integration with major hypervisors through their APIs and plug-ins, such as the VMware vSphere Storage APIs - Data Protection (VADP).
  • Application consistency: Leverage snapshot APIs to create application-consistent images across the VMs that make up a distributed application.
  • Auto discovery: Automatically track VMs brought online and scale protection policies accordingly.
  • Agentless implementation: Eliminate the overhead of installing and maintaining agents within VMs where possible.
  • vMotion/Live Migration support: Keep protecting VMs without interruption as they migrate between hosts.
  • Virtual lab for test/dev: Spin up backed-up VMs instantly for test/dev purposes.

Solutions that provide these capabilities can overcome the data protection hurdles introduced by virtualization more effectively than traditional backup tools.
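
As an illustration of the auto discovery capability listed above, the following minimal sketch reconciles a hypervisor inventory against the set of VMs that are already protected. The tag names, policy names and the `reconcile` helper are all hypothetical; a real product would pull the inventory from the hypervisor's management API rather than from a hand-built list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    tags: frozenset = frozenset()


# Hypothetical rule table: the first tag that matches decides the policy.
POLICY_RULES = [
    ("tier-1", "hourly-app-consistent"),
    ("database", "hourly-app-consistent"),
    ("test", "weekly-crash-consistent"),
]
DEFAULT_POLICY = "daily-crash-consistent"


def assign_policy(vm: VM) -> str:
    """Return the policy for the first matching tag, else the default."""
    for tag, policy in POLICY_RULES:
        if tag in vm.tags:
            return policy
    return DEFAULT_POLICY


def reconcile(inventory: list, protected: dict) -> dict:
    """Build the new protection map from the current inventory.

    VMs that already have a policy keep it, newly discovered VMs get one
    from the rule table, and VMs that no longer exist are dropped.
    """
    return {vm.name: protected.get(vm.name, assign_policy(vm)) for vm in inventory}


inventory = [VM("db01", frozenset({"database"})), VM("web01")]
print(reconcile(inventory, {"web01": "custom", "retired-vm": "daily-crash-consistent"}))
```

Running the reconciliation on a schedule means a newly provisioned VM is protected by default instead of waiting for someone to add it to a job by hand.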

Backup repositories

Virtual environments demand storage targets that match their consolidated nature and performance requirements:

  • Deduplicating storage: Reduces backup data footprint by storing unique blocks only once.
  • Purpose-built backup appliances (PBBA): Pre-configured devices optimized for backup workloads.
  • All-flash arrays: Delivers performance for short backup windows and rapid restores.
  • Cloud object storage: Virtually unlimited capacity for long-term retention at low cost.

As a rough rule of thumb, size a disk-based repository for at least 10-15 MB/s of throughput per busy VM. Faster storage can enable shorter backup windows and quicker recovery.
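
To see what a per-VM throughput figure means for the backup window, the arithmetic can be sketched as below. The estate size, nightly change volume and number of concurrent streams are made-up inputs, and the function assumes the repository sustains the per-VM rate for every stream at once.

```python
def backup_window_hours(total_gb: float, mb_per_s_per_vm: float, concurrent_vms: int) -> float:
    """Estimate the hours needed to move total_gb at the given per-VM rate.

    Assumes decimal units (1 GB = 1000 MB) and that every concurrent
    stream really achieves mb_per_s_per_vm.
    """
    if mb_per_s_per_vm <= 0 or concurrent_vms <= 0:
        raise ValueError("throughput and concurrency must be positive")
    aggregate_mb_per_s = mb_per_s_per_vm * concurrent_vms
    seconds = (total_gb * 1000) / aggregate_mb_per_s
    return seconds / 3600


# Hypothetical estate: 100 VMs with 20 GB of changed data each per night,
# backed up 10 at a time at 10 MB/s per VM.
hours = backup_window_hours(total_gb=100 * 20, mb_per_s_per_vm=10, concurrent_vms=10)
print(f"{hours:.2f} hours")  # prints "5.56 hours"
```

Doubling either the per-VM rate or the number of concurrent streams halves the estimate, which is why repository throughput matters as much as capacity.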

New data protection strategies

Here are some key data protection strategies that align well with virtualized environments:

  • Agentless VM backup: Hypervisor integration avoids agent management overhead.
  • Incremental forever backups: Small daily backups are fast and storage efficient.
  • Application-consistent snapshots: Combine hypervisor snapshot APIs such as VMware VADP with in-guest quiescing through Microsoft VSS to produce app-consistent images.
  • Backup from storage snapshots: Offload backup load using storage array native snapshots.
  • Immutable backups: Retain fixed recovery points on immutable object storage.
  • Software-defined data protection: Converged scale-out backup repositories with policy-based automation.

Adopting these approaches provides flexible, scalable protection for dynamic virtual environments while optimizing backup performance.
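
The incremental forever idea from the list above can be sketched as follows: one full backup is taken once, every later backup stores only changed blocks, and any restore point is rebuilt by layering increments over the full. Representing a backup as a dictionary of block index to bytes is an assumption made for illustration, not how any particular product stores data.

```python
def restore_point(full: dict, increments: list, point: int) -> dict:
    """Rebuild the disk image as of restore point `point`.

    Point 0 is the original full backup; point N applies the first N
    increments in order, so later writes override earlier ones.
    """
    if not 0 <= point <= len(increments):
        raise ValueError("restore point out of range")
    image = dict(full)
    for delta in increments[:point]:
        image.update(delta)
    return image


def expire_oldest(full: dict, increments: list) -> tuple:
    """Merge the oldest increment into the full to drop one restore point.

    This keeps the chain short without ever taking a second full backup.
    """
    if not increments:
        return full, increments
    return restore_point(full, increments, 1), increments[1:]


full = {0: b"boot", 1: b"data-v1", 2: b"logs-v1"}
increments = [{1: b"data-v2"}, {2: b"logs-v2"}]

print(restore_point(full, increments, 1))  # state after the first increment
print(restore_point(full, increments, 2))  # latest state
```

The trade-off is that every restore depends on the whole chain being intact, which is one reason periodic backup validation matters.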

Backup target options

Some of the most common backup target options for virtualized workloads include:

  • Deduplicating disk appliances: Offer short-term retention with deduplication to reduce capacity demands.
  • Purpose-built backup appliances (PBBA): Provide turnkey backup storage with integration into backup software.
  • All-flash arrays: Deliver ultra-fast backup and restore performance but higher cost.
  • Converged systems: Pre-configured appliances combining storage, servers, networking and software.
  • Cloud object storage: For cost-effective long-term retention and offsite disaster recovery.
  • Immutable object storage: Retain fixed recovery points indefinitely using WORM storage.

Organizations may utilize a disk target for operational backups combined with object storage for archiving and disaster recovery copies.
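
The deduplication offered by several of the targets above comes down to storing each unique block once and keeping a list of references per backup. The sketch below uses fixed-size blocks and SHA-256 fingerprints; this is an assumption chosen for brevity, and commercial appliances may chunk and index data quite differently.

```python
import hashlib


class DedupStore:
    """Toy content-addressed store: identical blocks are kept only once."""

    def __init__(self, block_size: int = 4096) -> None:
        self.block_size = block_size
        self.blocks = {}  # fingerprint -> block bytes

    def write(self, data: bytes) -> list:
        """Store data and return its recipe (ordered list of fingerprints)."""
        recipe = []
        for offset in range(0, len(data), self.block_size):
            chunk = data[offset:offset + self.block_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(digest, chunk)  # skip blocks already stored
            recipe.append(digest)
        return recipe

    def read(self, recipe: list) -> bytes:
        """Reassemble the original data from its recipe."""
        return b"".join(self.blocks[digest] for digest in recipe)

    def stored_bytes(self) -> int:
        return sum(len(block) for block in self.blocks.values())


store = DedupStore()
image = b"A" * 4096 * 3 + b"B" * 4096      # four blocks, two of them unique
recipe = store.write(image)
print(len(image), store.stored_bytes())     # prints "16384 8192"
```

Because many VMs are cloned from the same templates, their backups share a large fraction of blocks, which is where most of the capacity savings come from.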

Backup methods

Some key backup methods and capabilities for virtual environments include:

  • Full backups: Initial complete copy of the VM image. Infrequent due to size.
  • Incremental backups: Copies changed blocks since the last backup. Faster and smaller.
  • Differential backups: Copies blocks changed since the last full backup. Balance of size and speed.
  • Backup from snapshots: Leverage hypervisor or storage array snapshots. Minimal impact.
  • Agentless backup: Uses hypervisor APIs instead of in-guest agents. Lower maintenance.
  • Application consistent backups: Quiesce apps to create consistent backup images.
  • Changed block tracking: Optimizes incremental backups by only reading changed blocks.
  • Compression and deduplication: Reduce backup data footprint through data reduction techniques.

Utilizing incremental forever backup strategies with application consistency typically offers the best balance of recovery points, performance and storage efficiency.
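
The difference between incremental and differential backups, and the role of changed block tracking, can be shown with a toy disk that remembers when each block was last written. The `TrackedDisk` class is a hypothetical stand-in for hypervisor-level tracking, not a real API.

```python
class TrackedDisk:
    """Toy virtual disk that records a sequence number for every write."""

    def __init__(self, block_count: int) -> None:
        self.blocks = [b""] * block_count
        self.last_write = [0] * block_count  # sequence number of last write per block
        self.seq = 0

    def write(self, index: int, data: bytes) -> None:
        self.seq += 1
        self.blocks[index] = data
        self.last_write[index] = self.seq

    def changed_since(self, seq: int) -> dict:
        """Return only the blocks written after sequence number `seq`."""
        return {i: self.blocks[i] for i, s in enumerate(self.last_write) if s > seq}


disk = TrackedDisk(block_count=8)
for i in range(3):
    disk.write(i, b"v1")

full = dict(enumerate(disk.blocks))       # full backup: every block is read
full_seq = last_seq = disk.seq

disk.write(1, b"v2")                      # day 1 change
incr_1 = disk.changed_since(last_seq)     # incremental: since the last backup
last_seq = disk.seq

disk.write(5, b"v1")                      # day 2 change
incr_2 = disk.changed_since(last_seq)     # incremental: only block 5
diff_2 = disk.changed_since(full_seq)     # differential: blocks 1 and 5

print(sorted(incr_2), sorted(diff_2))     # prints "[5] [1, 5]"
```

The incremental stays small every day, while the differential grows until the next full; in both cases the tracker avoids scanning the unchanged blocks.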

Data recovery options

Virtual environments offer several flexible options for recovering VMs:

  • Full VM recovery: Recover the entire VM image to its original or a new location.
  • File-level recovery: Retrieve specific files from within a VM backup image.
  • Instant VM recovery: Boot a VM directly from a compressed backup image.
  • Item-level recovery: Recover specific application objects like Exchange emails.
  • Restore to new hardware: Recover VMs to entirely different physical hosts, since a VM image is not tied to the server it originally ran on.

Advanced virtual backup solutions allow browsing and searching VM backups just like live environments for more flexible, granular restores.

Cloud integration

Hybrid and multi-cloud adoption is accelerating, so cloud-integrated data protection is a must for many organizations:

  • Cloud gateways: On-prem appliances that replicate backups to public cloud object storage.
  • Cloud Disaster Recovery (DR): Failover VMs to cloud Infrastructure-as-a-Service for DR.
  • Cloud backup-as-a-Service: Consume backup directly from a managed cloud service.
  • Cloud analytics: Cloud-based analytics examine backups across environments for insights and compliance.

This level of cloud integration enables more flexible, agile data protection and DR strategies across hybrid and multi-cloud infrastructure.

Security considerations

Data protection introduces potential security risks that must be evaluated:

  • Authentication and access controls: Control access to backup activities using role-based access control and multi-factor authentication.
  • Encryption: Encrypt data in flight and data at rest throughout the backup and recovery process.
  • Immutable storage and WORM disks: Make backup images unchangeable using physical or logical write-once capabilities.
  • Air-gapped backups: Isolate backups from network access completely by keeping them offline.
  • Ransomware protection: Detect and stop ransomware attacks trying to encrypt or delete backup repositories.

Backup platforms should provide security capabilities natively or integrate with broader IT security infrastructure tools and policies.
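
The write-once behaviour behind immutable storage can be modelled in a few lines: an object can be written once, and deletion is refused until its retention date has passed. `WormStore` and `RetentionLockError` are hypothetical names for this sketch; real WORM storage enforces the lock in the storage layer itself, out of reach of the backup software's credentials.

```python
from datetime import datetime, timedelta


class RetentionLockError(Exception):
    """Raised when a write or delete would violate immutability."""


class WormStore:
    """Toy write-once store with a retention date per object."""

    def __init__(self) -> None:
        self._objects = {}  # key -> (data, retain_until)

    def put(self, key: str, data: bytes, retain_until: datetime) -> None:
        if key in self._objects:
            raise RetentionLockError(f"{key} already exists and cannot be overwritten")
        self._objects[key] = (data, retain_until)

    def get(self, key: str) -> bytes:
        return self._objects[key][0]

    def delete(self, key: str, now: datetime) -> None:
        _, retain_until = self._objects[key]
        if now < retain_until:
            raise RetentionLockError(f"{key} is locked until {retain_until.isoformat()}")
        del self._objects[key]


store = WormStore()
taken = datetime(2024, 1, 1)
store.put("vm01/full", b"backup-data", retain_until=taken + timedelta(days=30))

try:
    store.delete("vm01/full", now=taken + timedelta(days=1))
except RetentionLockError as err:
    print("refused:", err)
```

An attacker who compromises the backup server can still request deletions, but the store rejects them until the retention period ends.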

Backup testing and monitoring

Capabilities such as the following help ensure backup integrity and availability:

  • Backup reporting: Reports detail backup status, size, performance and trends over time.
  • Alerting: Proactively notify administrators about backup failures or anomalies.
  • Auditing: Track detailed backup and recovery events including user, timestamp, etc.
  • Backup validation: Test recovery of specific files or full VMs from backups periodically.
  • Backup monitoring: Monitor backup infrastructure health, capacity, performance from central consoles.
  • SLA compliance reporting: Report on backup SLAs like RPOs and retention being met across VMs.

Monitoring and testing backup environments regularly is essential to ensuring recoverability and meeting SLAs.
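
An SLA compliance report of the kind described above reduces to comparing each VM's last successful backup time with its recovery point objective (RPO). The sketch below assumes a single RPO for all VMs and a hypothetical mapping of VM name to last success time; a real report would read these from the backup catalog.

```python
from datetime import datetime, timedelta


def rpo_violations(last_success: dict, rpo: timedelta, now: datetime) -> list:
    """Return the names of VMs whose newest backup is older than the RPO."""
    return sorted(vm for vm, finished in last_success.items() if now - finished > rpo)


now = datetime(2024, 1, 2, 12, 0)
last_success = {
    "db01": datetime(2024, 1, 2, 11, 0),   # 1 hour old
    "web01": datetime(2024, 1, 1, 6, 0),   # 30 hours old
    "file01": datetime(2024, 1, 1, 12, 0), # exactly 24 hours old
}

print(rpo_violations(last_success, rpo=timedelta(hours=24), now=now))  # prints "['web01']"
```

Feeding the result into the alerting channel turns a silent backup failure into an actionable notification before the gap becomes a recovery problem.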

Conclusion

Virtualization brings new data protection requirements. Meeting them calls for purpose-built backup solutions with VM-centric workflows, hypervisor integration, enhanced recoverability and cloud support. Integrating people, processes and technology is key to realizing the agility, efficiency and availability that virtualization promises without compromising data protection.