Friday, December 12, 2014

NetApp From the Ground Up - A Beginner's Guide Part 8

HA Pair


An HA pair is basically two controllers, each of which has connections to its own and its partner's shelves. When one of the controllers fails, the other one takes over; this is called Cluster Failover (CFO). The controllers' NVRAMs are mirrored over the NVRAM interconnect link, so even data which hasn't been committed to disk isn't lost.

Note: An HA pair can't fail over when a disk shelf fails, because the partner doesn't have a copy of that data to service requests from.

Mirrored HA Pair


You can think of a mirrored HA pair as an HA pair with SyncMirror running between the two systems. You can implement almost the same configuration on an HA pair with SyncMirror inside (not between) each system, because the odds of a whole storage system (controller + shelves) going down are very low. But mirroring between two systems can give you more peace of mind.

It cannot fail over the way MetroCluster does when one of the storage systems goes down; the whole process is manual. The reasonable question here is: why can't it fail over if it has a copy of all the data? Because automatic site failover is a separate MetroCluster capability, which performs all the checks and carries out a cutover to the mirror; it's called Cluster Failover on Disaster (CFOD). SyncMirror is only a mirroring facility and doesn't even know that the cluster exists.



MetroCluster provides failover on a storage system level. It uses the same SyncMirror feature beneath it to mirror data between two storage systems (instead of two shelves of the same system as in pure SyncMirror implementation). Now even if a storage controller fails together with all of its storage, you are safe. The other system takes over and continues to service requests.



  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio
After Disaster Recovery, let's consider Continuous Availability. Customers must be able to recover critical applications after a system failure seamlessly and instantaneously. Critical applications include financial applications and manufacturing operations, which must be continuously available with near-zero RTO and zero RPO. NetApp MetroCluster is a unique array-based clustering solution that can extend up to 200km and enable a zero RPO (no data loss) with zero or near-zero RTO.

MetroCluster enhances the built-in redundancy of NetApp hardware and software, providing an additional layer of protection for the entire storage and host environment. MetroCluster seamlessly maintains application and virtual infrastructure availability when storage outages occur (whether the outage is due to a network connectivity issue, loss of power, a problem with cooling systems, a storage array shutdown, or an operational error). Most MetroCluster customers report that their users experience no application interruption when a cluster recovery occurs.

MetroCluster enables single-command failover for seamless cutover of applications, transparent to the end user.

It supports a distance of up to 100 kilometers.

Integrated Data Protection


  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio
When discussing data protection strategies with your customers, you should consider the overall NetApp data protection portfolio, which we call NetApp Integrated Data Protection. NetApp Integrated Data Protection enables customers to deliver:
  • Backup
  • High availability
  • Business continuity
  • Continuous availability
all from a single platform. It is a single suite of integrated products that works across all NetApp solutions and with non-NetApp storage. Because customers can use a single platform for all data protection, the process of building, implementing, and managing data protection over time is simpler: they have fewer systems from fewer vendors to install and manage. And because the portfolio uses NetApp storage efficiency technology, cost and complexity can be up to 50% lower than for competitive solutions.

NetApp Snapshot copies are the answer to shrinking backup windows. They are nearly instantaneous and do not impact the application, so multiple Snapshot copies can be made per day, hourly or even more often. They are the primary solution for protecting against user errors and data corruption.

MetroCluster is NetApp's solution for Continuous Availability, enabling zero data loss in the event of a wide variety of failure scenarios. MetroCluster, in conjunction with VMware's HA and Fault Tolerance capabilities, gives your customers continuous availability of their ESX servers and storage. For single failures, the storage systems perform an automated, transparent failover. MetroCluster has been certified with VMware's High Availability and Fault Tolerance solutions.

SnapMirror provides asynchronous mirroring across unlimited distances to enable Disaster Recovery from a secondary site. SnapMirror, in conjunction with VMware's Site Recovery Manager (which integrates with SnapMirror and FlexClone), delivers automated, global failover of the entire virtual environment to a recovery site.

NetApp enables your customers to use less expensive SATA drives as nearline storage, and lower-cost controllers for asymmetric backups and backup consolidation from multiple sources. NetApp solutions enable rapid search and retrieval of backup data, and also support the re-use of backup data for other business uses via our unique, near-zero impact FlexClones.

NetApp enables your customers to perform flexible backup vaulting: disk-to-disk-to-tape, full tape management, and full cataloging of disk and tape backups. In addition, NetApp allows your customers to choose how they want to manage their data protection workflows. Your customers can use NetApp products such as SnapProtect for end-to-end backup management, including cataloging and disk-to-disk-to-tape; for specific applications, your customers can leverage the SnapManager software.

Considering the Overall Data Protection Portfolio

  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio
NetApp Integrated Data Protection enables customers to deliver high availability, business continuity, continuous availability, and backup and compliance from a single platform that works across all NetApp solutions and with non-NetApp storage. Because customers can use a single platform for all data protection, the process of building, implementing, and managing data protection over time is simpler: they have fewer systems from fewer vendors to install and manage. And because the portfolio uses NetApp storage efficiency technology, cost and complexity can be up to 50% lower than for competitive solutions. For example, a customer can use MetroCluster to provide a zero RPO and then replicate data with SnapVault software to a remote site for long-term backup and recovery. If the customer later decides to implement long-distance disaster recovery with a short RPO, the customer can use SnapMirror software to do so.

Other Posts in this Series:

As always, if you have any questions or have a topic that you would like me to discuss, please feel free to post a comment at the bottom of this blog entry, e-mail at, or drop me a message on Twitter (@OzNetNerd).

Note: This website is my personal blog. The opinions expressed in this blog are my own and not those of my employer.

Sunday, November 16, 2014

NetApp From the Ground Up - A Beginner's Guide Part 7



SyncMirror mirrors aggregates and works at the RAID level. You can configure mirroring between two shelves of the same system and prevent an outage in case of a shelf failure.

SyncMirror uses the concept of plexes to describe mirrored copies of data. You have two plexes: plex0 and plex1. Each plex consists of disks from a separate pool: pool0 or pool1. Disks are assigned to pools depending on cabling. Disks in each of the pools must be in separate shelves to ensure high availability. Once the shelves are cabled, you enable SyncMirror and create a mirrored aggregate using the following syntax:

aggr create aggr_name -m -d disk-list -d disk-list



Plex & Disk Pools

By default, Data ONTAP without a SyncMirror license keeps all disks in pool0, so you will have only one plex. You need a SyncMirror license to get two plexes, which enables RAID-level mirroring on your storage system.

The following FilerView online help gives more information on this.

Managing Plexes

The SyncMirror software creates mirrored aggregates that consist of two plexes, providing a higher level of data consistency through RAID-level mirroring. The two plexes are simultaneously updated; therefore, the plexes are always identical.

When SyncMirror is enabled, all the disks are divided into two disk pools, and a copy of the plex is created. The plexes are physically separated, (each plex has its own RAID groups and its own disk pool), and the plexes are updated simultaneously. This provides added protection against data loss if there is a double-disk failure or a loss of disk connectivity, because the unaffected plex continues to serve data while you fix the cause of the failure. Once the plex that has a problem is fixed, you can resynchronize the two plexes and reestablish the mirror relationship.
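The plex mechanics described above can be sketched in a few lines of Python (an illustrative model only, not NetApp code; all class and method names are invented for the example): every write lands on both plexes, a single-plex failure leaves the data accessible, and resynchronisation rebuilds the repaired plex from its healthy partner.

```python
# Toy model of a mirrored aggregate: two plexes updated simultaneously,
# reads served by any online plex, and resync after a plex failure.

class MirroredAggregate:
    def __init__(self):
        self.plexes = {"plex0": {}, "plex1": {}}   # block number -> data
        self.online = {"plex0": True, "plex1": True}

    def write(self, block, data):
        # Both plexes are updated together, so they stay identical.
        for name, plex in self.plexes.items():
            if self.online[name]:
                plex[block] = data

    def read(self, block):
        # Any online plex can serve the read; both hold the same data.
        for name, plex in self.plexes.items():
            if self.online[name]:
                return plex[block]
        raise IOError("no plex online")

    def fail(self, name):
        self.online[name] = False

    def resync(self, name):
        # Rebuild the repaired plex from its healthy partner, then bring
        # it back online to reestablish the mirror relationship.
        healthy = "plex1" if name == "plex0" else "plex0"
        self.plexes[name] = dict(self.plexes[healthy])
        self.online[name] = True

agg = MirroredAggregate()
agg.write(0, b"data")
agg.fail("plex0")            # single-plex failure: reads still succeed
assert agg.read(0) == b"data"
agg.write(1, b"more")        # writes continue on the surviving plex
agg.resync("plex0")          # repaired plex catches up
assert agg.plexes["plex0"] == agg.plexes["plex1"]
```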

You can create a mirrored aggregate in the following ways:
  • You can create a new aggregate that has two plexes.
  • You can add a plex to an existing, unmirrored aggregate.
An aggregate cannot have more than two plexes.

Note: Data ONTAP names the plexes of the mirrored aggregate. See the Data ONTAP Storage Management Guide for more information about the plex naming convention.

How Data ONTAP selects disks

Regardless of how you create a mirrored aggregate, Data ONTAP determines which disks to use. Data ONTAP uses the following disk-selection policies when selecting disks for mirrored aggregates:
  • Disks selected for each plex must come from different disk pools.
  • The number of disks selected for one plex must equal the number of disks selected for the other plex.
  • Disks are first selected on the basis of equivalent bytes per sector (bps) size, then on the basis of the size of the disk.
  • If there is no equivalent-sized disk, Data ONTAP selects a larger-capacity disk and uses only part of the disk.
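As a rough sketch of these selection policies (hypothetical Python, not Data ONTAP's actual algorithm), pairing sorted spare lists from the two pools yields equal disk counts per plex and matches sizes where possible:

```python
# Illustrative disk selection for a mirrored aggregate: equal numbers of
# disks per plex, drawn from different pools, matched by size where we can.

def select_mirrored_disks(pool0, pool1, count):
    """pool0/pool1 are lists of spare-disk sizes (GB); returns (plex0, plex1)."""
    if len(pool0) < count or len(pool1) < count:
        raise ValueError("not enough spares in one of the pools")
    # Sorting both pools makes equal-sized disks line up; if no equal-sized
    # disk exists, a larger one is paired and only part of it would be used.
    p0 = sorted(pool0)[:count]
    p1 = sorted(pool1)[:count]
    return p0, p1

plex0, plex1 = select_mirrored_disks([900, 900, 1800], [900, 1800, 1800], 2)
# Equal number of disks per plex, each plex from its own pool.
assert len(plex0) == len(plex1) == 2
```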

Disk selection policies if you select disks

Data ONTAP enables you to select disks when creating or adding disks to a mirrored aggregate. You should follow the same disk-selection policies that Data ONTAP follows when selecting disks for mirrored aggregates. See the Data ONTAP Storage Management Guide for more information.

More Information

A plex is a complete copy of an aggregate. If you do not have mirroring enabled, you'll only be using plex0. If you enable mirroring, plex1 will be created and will synchronise with plex0, so you will have two complete copies of the one aggregate. This provides full redundancy should plex0's shelf go offline or suffer a multi-disk failure.
  • SyncMirror protects against data loss by maintaining two copies of the data contained in the aggregate, one in each plex.
  • A plex is one half of a mirror (when mirroring is enabled). Mirrors are used to increase fault tolerance. A mirror means that whatever you write to one disk gets written to a second disk immediately (at least, that is the general idea). Mirroring is thus a way to prevent data loss from losing a disk.
  • If you do not mirror, there is no real reason to call the disks in an aggregate a plex. But it is easier (for consistency, etc.) to call the first bunch of disks that make up an aggregate plex0. Once you decide to create a mirror (again, to ensure fault tolerance), you need the same number of disks the aggregate is made of for the second half of the mirror. This second half is called plex1.
  • So, bottom line: unless you mirror an aggregate, plex0 is just a placeholder that should remind you of the ability to create a mirror if needed.
  • By default all your RAID groups are tied to plex0; the moment you enable SyncMirror, things change. After enabling the SyncMirror license, you move disks from the default pool (pool0) to pool1. Then, when you mirror your aggregate, you will find that pool0 disks are tied to plex0 and pool1 disks sit under plex1.
  • A plex is a physical copy of the WAFL storage within the aggregate. A mirrored aggregate consists of two plexes; unmirrored aggregates contain a single plex.
  • A plex is a physical copy of a file system or the disks holding the data. A Data ONTAP volume normally consists of one plex. A mirrored volume has two or more plexes, each with a complete copy of the data in the volume. Multiple plexes provide safety for your data: as long as you have one complete plex, you will still have access to all your data.
  • A plex is a physical copy of the WAFL storage within the aggregate. A mirrored aggregate consists of two plexes; unmirrored aggregates contain a single plex. In order to create a mirrored aggregate, you must have a filer configuration that supports RAID-level mirroring. When mirroring is enabled on the filer, the spare disks are divided into two disk pools. When an aggregate is created, all of the disks in a single plex must come from the same disk pool, and the two plexes of a mirrored aggregate must consist of disks from separate pools, as this maximizes fault isolation.

Protection provided by RAID and SyncMirror

Combining RAID and SyncMirror provides protection against more types of drive failures than using RAID alone.

You can use RAID in combination with the SyncMirror functionality, which also offers protection against data loss due to drive or other hardware component failure. SyncMirror protects against data loss by maintaining two copies of the data contained in the aggregate, one in each plex. Any data loss due to drive failure in one plex is repaired by the undamaged data in the other plex.

For more information about SyncMirror, see the Data ONTAP Data Protection Online Backup and Recovery Guide for 7-Mode.

The following tables show the differences between using RAID alone and using RAID with SyncMirror:

Lab Demo

 See this page for a lab demonstration.

SyncMirror Vs SnapVault

SyncMirror synchronously mirrors aggregates on the same system or, in the case of MetroCluster, on a remote system. While it is not exactly the same, it might help to think of it as being analogous to RAID-10. Since aggregates in the NetApp world store volumes, once you have a SyncMirrored aggregate, any volume (and the data subsequently placed in it) is automatically mirrored in a synchronous manner.

SnapVault is very different. It takes qtrees (directories managed by Data ONTAP) from a source system, and replicates them (asynchronously) to a volume on a destination system. Usually many source qtrees are replicated into one volume, compressed, deduplicated and then archived via a Snapshot on the destination. In general terms, SnapVault enables backup and archive of data from one or many source systems to a centralised backup system. This methodology enables many copies of production data to be retained on a secondary system with only the block differential data being transferred between each backup.

SyncMirror Vs SnapMirror

SyncMirror is for mirroring data between aggregates synchronously, usually on the same system, but can be on a remote system in the case of MetroCluster.

SnapMirror operates at the volume level (it can operate at the qtree level as well, though that differs slightly from volume SnapMirror), and is usually deployed for asynchronous replication. It has no distance limitations (whereas SyncMirror in a MetroCluster configuration is limited to 100km), replicates over IP (MetroCluster requires fibre links between the sites), has compression, and is dedupe-aware. If your source volume has been deduplicated, then SnapMirror will replicate it in its deduplicated format. SnapMirror also has features built in to make it easy to fail over, fail back, break the mirror, and resync the mirror. It's compatible with SRM, and also integrates with other NetApp features such as MultiStore for configuring DR at a vFiler level.

More Information

Absolutely you can do SyncMirror within the same system - you just need to create two identical aggregates for this purpose and you will have two synchronously mirrored aggregates with all the volumes they contain on the one system.

SnapMirror is a great asynchronous replication method and instead of replicating aggregates, it is set up at the volume layer. So you have a source volume and a destination volume, and there will be some lag time between them both based on how frequently you update the mirror (e.g. 15 minutes, 30 minutes, 4 hours, etc.). You can very easily make the SnapMirror destination read/write (it is read only while replication is taking place), and also resynchronise back to the original source after a fail over event.

One of the issues with mirroring data is that the system will happily mirror data that is corrupt from the application's perspective - it just does what it is told. SnapVault fits into the picture here, as it offers longer-term Snapshot retention of data on a secondary system that is purely for backup purposes. By this I mean that the destination copy of the data needs to be restored back to another system - it is generally read-only, never read/write. SnapMirror and SyncMirror destinations contain an exact copy of the source volume or aggregate, including any Snapshots that existed when replication occurred. SnapVault is different because the Snapshot that is retained on the secondary system is actually created after the replication update has occurred. So you can have a “fan in” type effect for many volumes or qtrees into a SnapVault destination volume, and once the schedule has completed from the source system(s), the destination will create a Snapshot for retention purposes, then deduplicate the data. You can end up with many copies of production data on your SnapVault destination - I have a couple of customers that are retaining a year's worth of backups using SnapVault.

It is extremely efficient as compression and deduplication work very well in this case, and allows customers to keep many generations of production data on disk. The customer can also stream data off to tape from the SnapVault destination if required. As with SnapMirror, the SnapVault destination does not need to be the same as source, so you can have a higher performance system in production (SAS or SSD drive) and a more economical, dense system for your backup (3TB SATA for example).

In a nutshell - SnapMirror/SyncMirror come under the banner of HA, DR and BCP, whereas SnapVault is really a backup and archive technology.

Open Systems SnapVault (OSSV)


  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio
Open Systems SnapVault provides the same features as SnapVault, but for storage not produced by NetApp. SnapVault disk-to-disk backup capability is a unique differentiator: no other storage vendor enables replication for long-term disk-to-disk backup within an array. Open Systems SnapVault (OSSV) enables customers to back up non-NetApp data to a secondary NetApp target. It leverages the block-level incremental backup technology found in SnapVault to protect Windows, Linux, UNIX, SQL Server, and VMware systems running on mixed storage.

More Information

Designed to safeguard data in open-storage platforms, NetApp Open Systems SnapVault (OSSV) software leverages the same block-level incremental backup technology and NetApp Snapshot copies found in our SnapVault solution. OSSV extends this data protection to Windows, Linux, UNIX, SQL Server, and VMware systems running mixed storage.

OSSV improves performance and enables more frequent data backups by moving data and creating backups from only changed data blocks rather than entire changed files. Because no redundant data is moved or stored, you need less storage capacity and a smaller storage footprint, giving you a cost-effective solution. OSSV is well suited for centralizing disk-to-disk (D2D) backups for remote offices.


NetApp From the Ground Up - A Beginner's Guide Part 6


  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio
Affordable NetApp protection software safeguards your data and business-critical applications.
Explore the range of NetApp protection software products available to protect your valuable data and applications and to provide optimal availability, IT efficiency, and peace of mind.
We have a number of different types of data protection applications: disk-to-disk backup and recovery, application-aware backup and recovery, and business continuity and high availability solutions.
Let's look at these in detail.

Disk-to-Disk Backup and Recovery Solutions

  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio 
  • Disk-to-Disk Back up and Recovery Solutions
  • SnapVault software speeds and simplifies backup and data recovery, protecting data at the block level.
  • Open Systems SnapVault (OSSV) software leverages block-level incremental backup technology to protect Windows, Linux, UNIX, SQL Server, and VMware systems running on mixed storage.
  • SnapRestore data recovery software uses stored Data ONTAP Snapshot copies to recover anything from a single file to multi-terabyte volumes, in seconds.

Application-Aware Backup and Recovery Solutions for Application and Backup Administrators

  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio 
  • Application-Aware Backup and Recovery Solutions for Application and Backup Admins
  • The SnapManager management software family streamlines storage management and simplifies configuration, backup, and restore operations.
  • SnapProtect management software accelerates and simplifies backup and data recovery for shared IT infrastructures.
  • OnCommand Unified Manager automates the management of physical and virtual storage for NetApp storage systems and clusters.

Business Continuity and High Availability Solutions

  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio 
  • Business continuity and High Availability Solutions
  • SnapMirror data replication technology provides disaster recovery protection and simplifies the management of data replication.
  • MetroCluster high-availability and disaster recovery software delivers continuous availability, transparent failover protection, and zero data loss.


Basic Snapshots

In the beginning, snapshots were pretty simple: a backup, only faster. Read everything on your primary disk, and copy it to another disk.

Simple. Effective. Expensive.

Think of these kinds of snapshots as being like a photocopier. You take a piece of paper, and write on it. When you want a snapshot, you stop writing on the paper, put it into the photocopier, and make a copy. Now you have 2 pieces of paper.

A big database might take up 50 pieces of paper. Taking a snapshot takes a while, because you have to copy each page. And the cost adds up. Imagine each piece of paper cost $5k, or $10k.

Still, it’s faster than hand-copying your address book into another book every week.

It’s not a perfect analogy, but it’s pretty close.

Copy-on-Write Snapshots

Having to copy all the data every time is a drag, because it takes up a lot of space, takes ages, and costs more. Both taking the snapshot, and restoring it, take a long time because you have to copy all the data.

But what if you didn’t have to? What if you could copy only the bits that changed?

Enter copy-on-write snapshots. The first snapshot records the baseline, before anything changes. Since nothing has changed yet, you don’t need to move data around.

But as soon as you want to change something, you need to take note of it somewhere. Copy-on-write does this by first copying the original data to a (special, hidden) snapshot area, and then overwriting the original data with the new data. Pretty simple, and effective.

And now it doesn’t take up as much space, because you’re just recording the changes, or deltas.

But there are some downsides.

Each time you change a block of data, the system has to read the old block, write it to the snapshot area, and then write the new block. So, for each write, the disk actually does two writes and one read. This slows things down.

It’s a tradeoff. You lose a bit of write performance, but you don’t need as much disk to get snapshots. With some clever caching and other techniques, you can reduce the performance impact; overall you save money and get some good benefits, so it was often worth it.
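The copy-on-write mechanics described above can be sketched in Python (a toy model, not any vendor's actual implementation; the class and method names are invented):

```python
# Toy copy-on-write snapshots: before a block is overwritten, its old
# contents are copied into a hidden snapshot area -- one read plus two
# writes per changed block, which is the slowdown described above.

class CowVolume:
    def __init__(self, blocks):
        self.blocks = dict(blocks)   # active data: block number -> bytes
        self.snapshots = []          # each snapshot: {block: old data}

    def snapshot(self):
        # Taking the snapshot moves no data; it just opens an empty delta.
        self.snapshots.append({})

    def write(self, block, data):
        for snap in self.snapshots:
            if block not in snap:                 # first change since snap:
                snap[block] = self.blocks[block]  # read old + write to snap area
        self.blocks[block] = data                 # then write the new data

    def read_snapshot(self, index, block):
        # Unchanged blocks are read from the active data; changed ones
        # come from the snapshot area.
        return self.snapshots[index].get(block, self.blocks[block])

vol = CowVolume({0: b"v1"})
vol.snapshot()
vol.write(0, b"v2")                  # old b"v1" is copied aside first
assert vol.read_snapshot(0, 0) == b"v1"
assert vol.blocks[0] == b"v2"
```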

But what if you didn’t have to copy the original data?

NetApp Snapshots

NetApp snapshots (and ZFS snapshots, incidentally) do things differently. Instead of copying the old data out of the way before it gets overwritten, the NetApp just writes the new information to a special bit of disk reserved for storing these changes, called the SnapReserve. Then, the pointers that tell the system where to find the data get updated to point to the new data in the SnapReserve.

That’s why the SnapReserve fills up when you change data on a NetApp. And remember that deleting is a change, so deleting a bunch of data fills up the SnapReserve, too.

This method has a bunch of advantages. You’re only recording the deltas, so you get the disk savings of copy-on-write snapshots. But you’re not copying the original block out of the way, so you don’t have the performance slowdown. There’s a small performance impact, but updating pointers is much faster, which is why NetApp performance is just fine with snapshots turned on, so they’re on by default.

It gets better.

Because the snapshot is just pointers, when you want to restore data (using SnapRestore) all you have to do is update the pointers to point to the original data again. This is faster than copying all the data back from the snapshot area over the original data, as in copy-on-write snapshots.

So taking a snapshot completes in seconds, even for really large volumes (like, terabytes) and so do restores. Seconds to snap back to a point in time. How cool is that?
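A minimal sketch of this pointer-based approach (loosely modelled on the description above, not actual WAFL code; all names are invented): new data goes to fresh blocks, the active pointer map is updated, and a snapshot is just a frozen copy of the old pointers, so a restore only swaps pointers back.

```python
# Toy pointer-based snapshots: one write per change, no read-before-write,
# and near-instant restores regardless of volume size.

class PointerVolume:
    def __init__(self):
        self.store = {}        # physical blocks: address -> data
        self.next_addr = 0
        self.active = {}       # logical block -> physical address
        self.snapshots = []    # frozen pointer maps

    def write(self, block, data):
        addr = self.next_addr          # write new data to a fresh location
        self.next_addr += 1
        self.store[addr] = data
        self.active[block] = addr      # just update the pointer

    def snapshot(self):
        # No data is copied: the snapshot is the old set of pointers.
        self.snapshots.append(dict(self.active))

    def snaprestore(self, index):
        # Restoring points back at the original blocks -- a pointer swap,
        # not a copy of data out of a snapshot area.
        self.active = dict(self.snapshots[index])

vol = PointerVolume()
vol.write(0, b"v1")
vol.snapshot()
vol.write(0, b"v2")                      # one write, old block untouched
assert vol.store[vol.active[0]] == b"v2"
vol.snaprestore(0)                       # seconds, even for huge volumes
assert vol.store[vol.active[0]] == b"v1"
```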

Snapshots Are Views

It’s much better to think of snapshots as a View of your data as it was at the time the snapshot was taken. It’s a time machine, letting you look into the past.

Because it’s all just pointers, you can actually look at the snapshot as if it was the active filesystem. It’s read-only, because you can’t change the past, but you can actually look at it and read the data.

This is incredibly cool.

Seriously. It’s amazing. You get snapshots with almost no performance overhead, and you can browse through the data to see what it looked like yesterday, or last week, or last month. Online.

So if you accidentally delete a file, you don’t have to restore the entire G:, or suck the data off a copy on tape somewhere. You can just wander through the .snapshot (or ~snapshot) directory and find the file, and read it. You can even copy it back out into the active file system if you want.

All without ringing the helpdesk.

inodes & pointers

A snapshot is a point-in-time copy of a volume or file system. Snapshots are useful for backup and recovery purposes. With snapshot technology, a file system or volume can be backed up in a matter of seconds. With traditional backups to tape, recovery of a file or directory involves checking which media the backup was written to, loading that media into the tape library, and restoring the file. This process takes a long time, and in some cases users or application developers need the file urgently to complete their tasks. With snapshot technology, the snapshot of a file/volume is stored on the system itself, and the administrator can restore the file in seconds, which helps users complete their tasks.

How snapshot works

A snapshot copies the file system/volume when requested to do so. If we had to copy all the data in a file system/volume using traditional OS mechanisms, it would take a lot of time and consume a lot of space on the system. Snapshots overcome this problem by copying only the blocks that have changed. This is explained below.

If we take a snapshot of a file system/volume, no new data is created and no new space is consumed on the system. Instead, the system copies the inode information of the file system to the snapshot volume. The inode information consists of:
  • file permissions
  • owner
  • group
  • access/modification times
  • pointers to data blocks
  • etc
The inode pointers of the snapshot volume point to the same data blocks as the file system for which the snapshot was created. In this way a snapshot consumes very minimal space (just the metadata of the original file system).

What happens if a block is changed in the original file system? Before changing the data block, the system copies it to the snapshot area and then overwrites the original data block with the new data. The inode pointer in the snapshot is updated to point to the data block written in the snapshot area. Changing a data block therefore involves reading the original block (one read operation), writing it to the snapshot area, and overwriting the original block with new data (two write operations). This causes performance degradation to some extent, but you don't need much disk space with this method, as only the changes made to the file system/volume are recorded. This is called a Copy-On-Write (COW) snapshot.

Now we will see how NetApp does it differently.

When changing a block in a volume that has a snapshot, instead of copying the original block to the snapshot area, NetApp writes the new block to the SnapReserve space. This involves only one write, instead of the two writes and one read of a COW snapshot. This approach also has some performance impact, because it involves updating the inode pointers of the file system/volume, but it is minimal compared to COW, which is why the NetApp snapshot method is superior to other vendors' snapshots.

During restores, too, NetApp simply changes the file system pointers back to the snapshot blocks, whereas with a COW snapshot we need to copy the data from the snapshot volume back to the original volume. So restores with NetApp are faster than with COW snapshots.

Thus the NetApp snapshot methodology is both superior to and faster than COW snapshots.


  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio
Snapshot copies are stored on the local disk to enable fast backup and fast recovery. But what if the local disk goes offline or fails? NetApp SnapVault software protects against this type of failure and enables long-term backup and recovery. SnapVault software delivers disk-to-disk backup, protecting NetApp primary storage by replicating Snapshot copies to inexpensive secondary storage. SnapVault software replicates only new or changed data using Snapshot copies, and stores the data in native format, keeping both backup and recovery times short and minimizing the storage footprint on secondary storage systems.

SnapVault software can be used to store months or even years of backups, so that fewer tape backups are needed and recovery times stay short. As with the rest of the NetApp Integrated Data Protection portfolio, SnapVault software uses deduplication to further reduce capacity requirements and overall backup costs. SnapVault software is available to all NetApp storage systems that run the Data ONTAP operating system.

More Information

Providing speed and simplicity for data backup and recovery, NetApp SnapVault software leverages block-level incremental replication and NetApp Snapshot copies for reliable, low-overhead disk-to-disk (D2D) backup.

NetApp’s flagship D2D backup solution, SnapVault provides efficient data protection by copying only the data blocks that have changed since the last backup, instead of entire files. As a result, you can back up more frequently while reducing your storage footprint, because no redundant data is moved or stored. And with direct backups between NetApp systems, SnapVault D2D minimises the need for external infrastructure and appliances.

By changing the backup paradigm, NetApp SnapVault software simplifies your adaptation to data growth and virtualisation, and it streamlines the management of terabytes to petabytes of data. Use the SnapVault backup solution as a part of NetApp's integrated data protection approach to help create a flexible and efficient shared IT infrastructure.

By transferring only new or changed blocks of data, traditional backups that would usually take hours or days to complete take only minutes. It also enables you to store months or years of point-in-time backup copies on disk. When used in conjunction with deduplication, tape backups are reduced or even eliminated.

Backups can be used for:
  • Development and Testing
  • Reporting
  • Cloning
  • DR Replication
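The block-level incremental idea behind SnapVault can be sketched in a few lines of Python. This is a simplified model with hypothetical helper names, not the actual SnapVault implementation: after the baseline transfer, only blocks that changed since the last Snapshot copy cross the wire.

```python
# Simplified sketch of block-level incremental replication:
# only blocks that differ from the last snapshot are transferred.

def changed_blocks(prev_snapshot, current):
    """Blocks that differ from (or didn't exist in) the last snapshot."""
    return {b: d for b, d in current.items() if prev_snapshot.get(b) != d}

def replicate(source, secondary, last_snapshot):
    delta = changed_blocks(last_snapshot, source)
    secondary.update(delta)             # ship only the delta
    return dict(source), len(delta)     # new snapshot + blocks transferred

primary = {0: "home", 1: "mail", 2: "docs"}
secondary = {}

# Baseline: everything is "changed" relative to an empty snapshot.
snap, sent = replicate(primary, secondary, {})
print(sent)        # 3 blocks transferred

# Incremental: one block modified, one added -> only 2 blocks move.
primary[1] = "mail-v2"
primary[3] = "new"
snap, sent = replicate(primary, secondary, snap)
print(sent)        # 2 blocks transferred
```

The secondary ends up identical to the primary, but the incremental update moved a fraction of the data, which is why backup windows stay short.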



SnapMirror is volume-level replication that normally works over an IP network (SnapMirror can also work over FC, but only with FC-VI cards, and this is not widely used).
  • SnapMirror Asynchronous replicates data according to a schedule.
  • SnapMirror Sync uses NVLOG shipping (described briefly in my previous post) to synchronously replicate data between two storage systems.
  • SnapMirror Semi-Sync sits in between and synchronizes writes at the Consistency Point (CP) level.
SnapMirror provides protection from data corruption inside a volume. But with SnapMirror you don't get automatic failover of any sort: you need to break the SnapMirror relationship and present the data to clients manually, then resynchronize the volumes once the problem is fixed.
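The difference between the modes comes down to when a write is acknowledged versus when it reaches the destination. Here is a toy sketch; the class names are mine, and semi-sync (which synchronizes at CP boundaries) is omitted for brevity.

```python
# Toy contrast of SnapMirror Async vs Sync acknowledgment behaviour.
# Purely illustrative; not the real replication engine.

class AsyncMirror:
    """Destination is updated only when the schedule fires."""
    def __init__(self):
        self.src, self.dst = [], []
    def write(self, data):
        self.src.append(data)           # ack immediately; destination lags
    def scheduled_update(self):
        self.dst = list(self.src)       # transfer the accumulated delta

class SyncMirror:
    """Every write lands on both systems before it is acknowledged."""
    def __init__(self):
        self.src, self.dst = [], []
    def write(self, data):
        self.src.append(data)
        self.dst.append(data)           # shipped to the partner before the ack

a, s = AsyncMirror(), SyncMirror()
for d in ("w1", "w2"):
    a.write(d)
    s.write(d)
print(len(a.dst), len(s.dst))  # 0 2 -- async has an RPO window, sync does not
a.scheduled_update()
print(a.dst == a.src)          # True once the scheduled transfer runs
```

Anything written between scheduled updates is the async mode's potential data loss on failover; sync trades that away for added write latency.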


  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio
Because customers require 24×7 operations, they must protect business applications and virtual infrastructures from site outages and other disasters.

SnapMirror software is the primary NetApp data protection solution. It is designed to reduce the risk, cost, and complexity of disaster recovery. It protects customers' business-critical data and enables customers to use their disaster recovery sites for other business activities which greatly increases the utilization of valuable resources. SnapMirror software is for disaster recovery and business continuity, whereas SnapVault software is for long-term retention of point-in-time backup copies.

SnapMirror software is built into the Data ONTAP operating system, enabling a customer to build a flexible disaster recovery solution. SnapMirror software replicates data over IP or Fibre Channel to different models of NetApp storage and to storage not produced by NetApp but managed by V Series systems. SnapMirror software can use deduplication and built-in network compression to minimize storage and network requirements, which reduces costs and accelerates data transfers.

Customers can also benefit from SnapMirror products if they have virtual environments, regardless of the vendors they use. NetApp software integrates with VMware, Microsoft Hyper-V, and Citrix XenServer to enable simple failover when outages occur.

More Information

SnapMirror data replication leverages NetApp unified storage to turn disaster recovery into a business accelerator.

Built on NetApp’s unified storage architecture, NetApp SnapMirror technology provides fast, efficient data replication and disaster recovery (DR) for your critical data, to get you back to business faster.

With NetApp’s flagship data replication product, use a single solution across all NetApp storage arrays and protocols, for any application, in both virtual and traditional environments, and in a variety of configurations. Tune SnapMirror technology to meet recovery point objectives ranging from zero seconds to hours.

Cost-effective SnapMirror capabilities include new network compression technology to reduce bandwidth utilization; accelerated data transfers to lower RPO; and improved storage efficiency in a virtual environment using NetApp deduplication. Integrated with FlexClone volumes for instantaneous, space-efficient copies, SnapMirror software also lets you use replicated data for DR testing, business intelligence, and development and test—without business interruptions.

Combine SnapMirror with NetApp MultiStore and Provisioning Manager software to gain application transparency with minimal planned downtime.

Disaster Recovery

  • Reference: NetApp Training - Fast Track 101: NetApp Portfolio
In addition to enabling rapid, cost-effective disaster recovery, SnapMirror software simplifies testing of disaster recovery processes to make sure they work as planned - before an outage occurs. Typically, testing is painful for organizations. Companies must bring in people on weekends, shut down production systems, and perform a failover to see if applications and data appear at the remote site. Studies indicate that disaster recovery testing can negatively impact customers and their revenues and that one in four disaster recovery tests fail.

Because of the difficulty of testing disaster recovery, many customers do not perform the testing that is necessary to ensure their safety. With SnapMirror products and FlexClone technology, customers can test failover any time without affecting production systems or using much storage.

To test failover with SnapMirror products and FlexClone technology, a customer first creates space-efficient copies of disaster recovery data instantaneously. The customer uses the copies for testing. After finishing testing, the customer can delete the clones in seconds.

SnapVault Vs SnapMirror

When I was getting into NetApp I had a lot of trouble understanding the difference between SnapVault and SnapMirror. I heard an explanation: SnapVault is a backup solution, whereas SnapMirror is a DR solution. And all I could do was say 'ooh, ok', still not fully understanding the difference…

The first idea that popped into my head was that SnapMirror is mainly set up at the volume level, whereas SnapVault is at the qtree level. But that is not always the case: you can easily set up QSM (Qtree SnapMirror).

The key to understanding the difference is really to understand that sentence: SnapVault is a backup solution, whereas SnapMirror is a DR solution.

What does it mean that SnapVault is a backup solution?

Let me use a picture to help explain:


The example has a few assumptions:
  • We've got filerA in one location and filerB in another location
  • The customer has a connection to both filerA and filerB, although all shares are presented to customers from filerA (via CIFS, NFS, iSCSI or FC)
  • All customer data is transferred to filerB via SnapVault

What can we do with SnapVault?

  • As a backup solution, we can have a longer snapshot retention time on filerB, so more historical data will be available there. If filerB has slower disks, this setup is smart, because slower disks are cheaper disks, and there is no need for 15k rpm disks on a filer that is not serving data to the customer.
  • If the customer has a network connection and access to the shares on filerB, they can restore some data to filerA themselves, even single files
  • If there is a disaster within filerA and we lose all the data, we can restore it from filerB

What can't we do with SnapVault?

  • In case of a disaster within filerA, we cannot make filerB the production site. We cannot "reverse the relationship", making the qtrees on filerB the source and making them read/write: they are SnapVault destinations, so they are read-only.
  • (With a SnapMirror license available on filerB, we can convert a SnapVault qtree to a SnapMirror qtree, which solves that 'issue'.)

What does it mean that SnapMirror is a DR solution?

Again, let me use a small picture to help explain:

The example has a few assumptions:
  • We've got filerA in one location and filerB in another location
  • The customer has a connection to both filerA and filerB, although all shares are presented to customers from filerA
  • All customer data is transferred to filerB via SnapMirror

What can we do with SnapMirror?

  • As a backup solution, we can restore accidentally deleted or lost data on filerA, provided the SnapMirror relationship has not been updated in the meantime
  • If there is some kind of issue with filerA (from a network problem to a total disaster), we can easily reverse the relationship. We make the volume or qtree on filerB the source, make it read-write, provide a network connection to the customer and voila, we are back online! After the issue has been resolved, we can resync the original source with the changes made at the destination and reverse the relationship again.

To sum up

This is not the only difference between SnapMirror and SnapVault, but I would say it is the main one. Some other differences: SnapMirror can also run in sync or semi-sync mode, and its async mode can be updated as often as once a minute, whereas a SnapVault relationship cannot be updated more often than once an hour. Also, if we have a few qtrees on the same volume with SnapVault, they share the same schedule, while with QSM they can have different schedules, etc. ;)

More Information

If you would like to know more check out this document: SnapVault Best Practices Guide

Also see this page for more information.

Other Posts in this Series:

As always, if you have any questions or have a topic that you would like me to discuss, please feel free to post a comment at the bottom of this blog entry, e-mail at, or drop me a message on Twitter (@OzNetNerd).

Note: This website is my personal blog. The opinions expressed in this blog are my own and not those of my employer.

NetApp From the Ground Up - A Beginner's Guide Part 5


  • Reference: NetApp University - Introduction to NetApp Products
Data production and data protection are the basic capabilities of any storage system. Data ONTAP software goes further by providing advanced storage efficiency capabilities. Traditional storage systems allocate data disk by disk, but the Data ONTAP operating system uses flexible volumes to drive higher rates of utilization and to enable thin provisioning. NetApp FlexVol technology gives administrators the flexibility to allocate storage at current capacity, rather than having to guess at future needs. When more space is needed, the administrator simply resizes the flexible volume to match the need. If the system is nearing its total current capacity, more storage can be added while in production, enabling just-in-time purchase and installation of capacity.

Note: You can't have a flexible volume without an aggregate.
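The thin-provisioning behaviour described above can be sketched as follows. This is an illustrative toy model; the class and field names are my own, not a Data ONTAP API.

```python
# Minimal thin-provisioning sketch: a FlexVol-style volume advertises
# a logical size but draws physical space from the shared aggregate
# only as data is written.

class Aggregate:
    def __init__(self, physical_gb):
        self.physical_gb = physical_gb
        self.used_gb = 0

class ThinVolume:
    def __init__(self, aggr, logical_gb):
        self.aggr = aggr
        self.logical_gb = logical_gb    # what clients see; nothing claimed upfront
        self.written_gb = 0

    def write(self, gb):
        if self.aggr.used_gb + gb > self.aggr.physical_gb:
            raise RuntimeError("aggregate full: add disks while in production")
        self.aggr.used_gb += gb         # space is consumed only on write
        self.written_gb += gb

    def resize(self, new_logical_gb):
        self.logical_gb = new_logical_gb  # instant; a metadata change only

aggr = Aggregate(physical_gb=100)
vol = ThinVolume(aggr, logical_gb=500)  # deliberately oversubscribed
vol.write(30)
print(aggr.used_gb)     # 30 -- only written data consumes the aggregate
vol.resize(1000)        # growing the volume moves no data
```

The point of the sketch is the asymmetry: the logical size can be resized freely to match demand, while physical capacity is only purchased and added when the aggregate actually approaches fullness.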


Infinite Volume

Target Workloads and Use Cases

Infinite Volume was developed to provide a scalable, cost-effective solution for big content workloads. Specifically, Infinite Volume addresses the requirements for large unstructured repositories of primary data, which are also known as enterprise content repositories.

Enterprise content repositories can be subdivided into workloads with similar access patterns, data protection requirements, protocol requirements, and performance requirements.

Infinite Volume is focused on use cases that can be characterized by input/output (I/O) patterns in which data is written once and seldom changed. However, this data is used for normal business operations, and therefore content must be kept online for fast retrieval, rather than being moved to secondary storage. One example of this type of workload is a video file archive. Libraries of large video files are kept in a repository from which they are periodically retrieved and sent to broadcast sites. These repositories typically grow as large as 5PB.

Another example is enterprise content management storage. This can be used to store large amounts of unstructured content such as documents, graphics, and scanned images. These environments commonly can contain a million or more files.

More Information

Enter the world of vServers and the quintessentially named “infinite volumes”. If you want to do seriously large file systems (like an Isilon normally does), there are some interesting restrictions in play.

In ONTAP 8.1.1, to get meaningfully large file systems, you start with a dedicated hardware/software partition within your 8.1.1 cluster. This partition will support one (and apparently only one) vServer, or visible file system. Between the two constructs exists a new entity: the "infinite volume" – an aggregator of, well, aggregates running on separate nodes.

This “partitioned world” of dedicated hardware, a single vServer and the new infinite volume is the only place where you can start talking about seriously large file systems.

Additional Information

Big content storage solutions can be categorized into three main categories based on the storage, management, and retrieval of the data into file services, enterprise content repositories, and distributed content repositories. NetApp addresses the business challenges of big content by providing the appropriate solution for all of these different environments.
  • File services represent the portion of the unstructured data market in which NetApp has traditionally been a leader, including project shares and home directory use cases.
  • The enterprise content repository market, by contrast, is less driven by direct end users and more by applications that require large container sizes with an increasing number of files.
  • Distributed content repositories take advantage of object protocols to provide a global namespace that spans numerous data centers.
Infinite Volume addresses the enterprise content repository market and is optimized for scale and ease of management. Infinite Volume is a cost-effective large container that can grow to PBs of storage and billions of files. It is built on NetApp’s reliable fabric-attached storage (FAS) and V-Series systems, and it inherits the advanced capabilities of clustered Data ONTAP.

By providing a single large container for unstructured data, e-mail, video, and graphics, Infinite Volume eliminates the need to build data management capabilities into applications with big content requirements. For these environments, Infinite Volume takes advantage of native storage efficiency features, such as deduplication and compression, to keep storage costs low.

Further, since Infinite Volume is built into clustered Data ONTAP, the customer is able to host both Infinite Volume(s) and FlexVol volumes together in a unified scale-out storage solution. This provides the customer with the ability to host a variety of different applications in a multi-tenancy environment, with nondisruptive operations and the ability to use both SAN and NAS in the same storage infrastructure leveraging the same hardware.

Advantages of Infinite Volume

Infinite Volume offers many business advantages for enterprise content repositories. For example, an Infinite Volume for an enterprise content repository solution can be used to:
  • Reduce the cost of scalability
    • Lower the effective cost per GB
    • Efficiently ingest, store, and deliver large amounts of data
  • Reduce complexity and management overhead
    • Simplify and automate storage management operations
    • Provide seamless operation and data and service availability
Infinite Volume leverages dense storage shelves from NetApp with the effective use of large-capacity storage disks. The solution is built on top of the proven foundation of Data ONTAP with storage efficiency features like deduplication and compression.

Infinite Volume gives customers a single, large, scalable container to help them manage huge amounts of growth in unstructured data that might be difficult to manage by using several containers. Data is automatically load balanced across the Infinite Volume at ingest. This manageability allows storage administrators to easily monitor the health state and capacity requirements of their storage systems.

Infinite Volumes are configured within a Data ONTAP cluster and do not require dedicated hardware. Infinite Volumes can share the same hardware with FlexVol volumes.

Overview of Infinite Volume

NetApp Infinite Volume is a software abstraction hosted over clustered Data ONTAP. It provides a single mountpoint that can scale to 20PB and 2 billion files, and it integrates with NetApp’s proven technologies and products, such as deduplication, compression, and NetApp SnapMirror® replication technology.

Infinite Volume writes an individual file in its entirety to a single node but distributes the files across several controllers within a cluster.

Figure 1 shows how an Infinite Volume appears as a single large container with billions of files stored in numerous data constituents.

In the first version of Infinite Volume, data access was provided over the NFSv3 protocol. Starting in clustered Data ONTAP 8.2, Infinite Volume added support for NFSv4.1, pNFS, and CIFS. Like a FlexVol volume, an Infinite Volume is protected by using NetApp Snapshot, RAID-DP, and SnapMirror technologies, and by NFS- or CIFS-mounted tape backups.
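The placement behaviour described earlier (each file written in its entirety to one node, with files balanced across the cluster) can be sketched like this. The least-used-first policy below is an illustrative guess, not NetApp's actual balancing algorithm.

```python
# Sketch of Infinite Volume-style placement: a single namespace over
# several data constituents, with whole files landing on one node each.

class InfiniteVolume:
    def __init__(self, n_constituents):
        self.constituents = [dict() for _ in range(n_constituents)]
        self.namespace = {}             # single mountpoint: path -> node index

    def write(self, path, size):
        # Whole file on one node; pick the constituent with the least data.
        node = min(range(len(self.constituents)),
                   key=lambda i: sum(self.constituents[i].values()))
        self.constituents[node][path] = size
        self.namespace[path] = node

    def read(self, path):
        node = self.namespace[path]     # the namespace hides the layout
        return self.constituents[node][path]

iv = InfiniteVolume(n_constituents=3)
for i, size in enumerate([50, 40, 30, 20]):
    iv.write(f"/video/clip{i}.mp4", size)

# Data spread across constituents, clients only ever see one mountpoint.
print(sorted(sum(c.values()) for c in iv.constituents))  # [40, 50, 50]
```

Clients see one large container; the mapping from path to constituent is internal, which is what lets the volume scale out across controllers without changing the client view.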

FlexVol Vs Infinite Vol

Both FlexVol volumes and Infinite Volumes are data containers. However, they have significant differences that you should consider before deciding which type of volume to include in your storage architecture.

The following table summarizes the differences and similarities between FlexVol volumes and Infinite Volumes:


In the IT world, there are countless situations in which it is desirable to create a copy of a dataset: Application development and test (dev/test) and the provisioning of new virtual machines are common examples. Unfortunately, traditional copies don’t come free. They consume significant storage capacity, server and network resources, and valuable administrator time and energy. As a result, your operation probably makes do with fewer, less up-to-date copies than you really need.

This is exactly the problem that NetApp FlexClone technology was designed to solve. FlexClone was introduced to allow you to make fast, space-efficient copies of flexible volumes (FlexVol volumes) and LUNs. A previous Tech OnTap article describes how one IT team used the NetApp rapid cloning capability built on FlexClone technology (now incorporated as part of the NetApp Virtual Storage Console, or VSC) to deploy a 9,000-seat virtual desktop environment with flexible, fast reprovisioning and using a fraction of the storage that would normally be required. NetApp uses the same approach for server provisioning in its own data centers.

Figure 1 FlexClone technology versus the traditional approach to data copies.

Using FlexClone technology instead of traditional copies offers significant advantages:
  • Fast. Traditional copies can take many minutes or hours to make. With FlexClone technology even the largest volumes can be cloned in a matter of seconds.
  • Space efficient. A clone uses a small amount of space for metadata, and then only consumes additional space as data is changed or added.
  • Reduces costs. FlexClone technology can cut the storage you need for dev/test or virtual environments by 50% or more.
  • Improves quality of dev/test. Make as many copies of your full production dataset as you need. If a test corrupts the data, start again in seconds. Developers and test engineers spend less time waiting for access to datasets and more time doing productive work.
  • Lets you get more from your DR environment. FlexClone makes it possible to clone and fully test your DR processes, or use your DR environment for dev/test without interfering with ongoing replication. You simply clone your DR copies and do dev/test on the clones.
  • Accelerates virtual machine and virtual desktop provisioning. Deploy tens or hundreds of new VMs in minutes with only a small incremental increase in storage.
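The "fast" and "space efficient" claims follow directly from how a clone shares blocks with its parent. Here is a toy model; the names are mine, not NetApp's implementation.

```python
# Toy model of FlexClone-style cloning: a clone is a new pointer map
# that shares every block with its parent, and consumes new physical
# space only for blocks it subsequently overwrites.

class Volume:
    disk = {}                           # shared physical block store
    _loc = 0

    def __init__(self, blockmap):
        self.blockmap = dict(blockmap)  # logical block -> physical location

    @classmethod
    def _alloc(cls, data):
        cls._loc += 1
        cls.disk[cls._loc] = data
        return cls._loc

    @classmethod
    def create(cls, blocks):
        return cls({b: cls._alloc(d) for b, d in blocks.items()})

    def clone(self):
        # Near-instant: copy only the pointer map, zero data blocks.
        return Volume(self.blockmap)

    def write(self, blkno, data):
        self.blockmap[blkno] = self._alloc(data)  # diverge just this block

    def read(self, blkno):
        return Volume.disk[self.blockmap[blkno]]

parent = Volume.create({0: "base", 1: "config"})
dev = parent.clone()                    # instant; space for metadata only
shared = sum(dev.blockmap[b] == parent.blockmap[b] for b in dev.blockmap)
print(shared)                           # 2 -- every block shared at first
dev.write(1, "config-dev")              # the clone diverges on one block
print(parent.read(1), dev.read(1))      # config config-dev -- parent untouched
```

Because cloning copies only metadata, making a dev/test copy of even the largest volume is a constant-time operation, and deleting the clone frees only the diverged blocks.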
Most Tech OnTap readers probably know about the use of FlexClone for cloning volumes. What’s less well known is that, starting with Data ONTAP 7.3.1, NetApp also gave FlexClone the ability to clone individual files and improved the capability for cloning LUNs.

This chapter of Back to Basics explores how NetApp FlexClone technology is implemented, the most common use cases, best practices for implementing FlexClone, and more.
