Received an interesting snapshot restore inquiry from the field and thought it was worth incorporating into a blog article. The scenario is this: a large amount of data needs to be restored on a cluster. Unfortunately, the SnapshotIQ policies are configured at the root /ifs level, and it is not feasible to restore every subdirectory under the snapshot. Although the files themselves are not particularly large, the subdirectories contain anywhere from thousands to tens of millions of files, and restores are taking a very long time when the directories are copied manually.


So, there are two main issues at play here:


  • Since the snapshot is taken higher up the directory tree (at /ifs) than the data that needs to be restored, and the entire snapshot cannot be reverted in place, using the SnapRevert job is not an option here.
  • The sheer quantity of files involved means that a manual, serial restore of the data would be incredibly time-consuming.


Fortunately, there is a solution that involves using replication. SyncIQ allows snapshot subdirectories to be included or excluded, and it also provides the performance benefit of parallel job processing.


SyncIQ contains an option, available only via the command line (CLI), which allows data to be replicated out of a snapshot.


The procedure is as follows:


1)     Create a snapshot of a root directory.

# isi snapshot snapshots create --name snaptest3 /ifs/data


2)     List the available snapshots and select the desired instance.

 

For example:


# isi snapshot list

ID Name                                  Path
----------------------------------------------------
6  FSAnalyze-Snapshot-Current-1529557209 /ifs
8  snaptest3                             /ifs/data
----------------------------------------------------
Total: 2
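
To double-check the details of a particular snapshot (path, creation time, expiration) before replicating from it, the snapshot can also be inspected individually. On recent OneFS releases:

# isi snapshot snapshots view snaptest3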


Note that there are a couple of caveats:


  • The subdirectory to be restored must still exist in the HEAD filesystem (i.e. it must not have been deleted since the snapshot was taken). A quick check for this is shown after this list.
  • You cannot replicate data from a SyncIQ generated snapshot.
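
As a quick sanity check for the first caveat, you can confirm that the subdirectory is present both in the live (HEAD) filesystem and inside the snapshot, via the virtual .snapshot directory. The paths below assume the ‘snaptest3’ snapshot of /ifs/data and the ‘local_qa’ subdirectory used in the examples that follow:

# ls -d /ifs/data/local_qa

# ls -d /ifs/data/.snapshot/snaptest3/local_qa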

 

3)     Create a local SyncIQ replication policy with the snapshot’s source path as the policy source and a new directory on ‘localhost’ as the destination. The ‘--source-include-directories’ argument lists the desired subdirectories to restore.

 

For example, via the CLI:

 

# isi sync policies create snapshot_sync3 sync /ifs/data localhost /ifs/file_sync3 --source-include-directories /ifs/data/local_qa

 

Or via the WebUI:

 

[WebUI screenshot: SyncIQ_snapshot_replication_1.png, showing the local replication policy configuration]

 

Note: The snapshot itself cannot be configured in the policy, nor can the policy source be set to a snapshot.
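
Once created, the policy configuration, including the included source directories, can be verified from the CLI before it is run (the exact output fields vary by OneFS release):

# isi sync policies view snapshot_sync3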


4)     Next, run the sync job to replicate a subset of the snapshot. This step is CLI only (not available in the WebUI), since the SyncIQ policy needs to be executed with the ‘--source-snapshot’ argument specified.

 

For example:


# isi sync job start snapshot_sync3 --source-snapshot=snaptest3


Note: This command essentially performs a change of root for this single run of the SyncIQ job.
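
The progress of the run can be tracked from the CLI as well. As a rough sketch, active jobs appear under ‘isi sync jobs’, and a report is generated once the job completes:

# isi sync jobs list

# isi sync reports list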


5)     Finally, rename the original directory to something else with mv, and then rename the restored copy to the original name.

 

For example:

 

# mv /ifs/data/local_qa /ifs/data/local_qa_old

# mv /ifs/file_sync3/local_qa /ifs/data/local_qa
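
To guard against renaming the original before the replicated copy is actually in place, the two mv commands can be chained so that the renames only run if the restored directory exists. This is just a defensive sketch using the same example paths:

# test -d /ifs/file_sync3/local_qa && mv /ifs/data/local_qa /ifs/data/local_qa_old && mv /ifs/file_sync3/local_qa /ifs/data/local_qa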


If you do not have a current replication license on your cluster, you can enable the OneFS SyncIQ trial license from the WebUI by browsing to Cluster Management > Licensing.


Using SyncIQ in this manner is a very efficient way to recover large amounts of data from within snapshots. However, this scenario also illustrates one of the drawbacks of taking snapshots at the root directory level. Consider whether it is more advantageous to configure snapshot schedules to capture data at the subdirectory level instead.