Last update: July 20, 2021
Migration Update (07/01/21)
The Stockyard Migration is now complete.
- The contents of your original /work directory can be accessed via /work_old. This file system is mounted only on each resource's login nodes.
- The /work2 file system has been renamed to /work.
Migration Update (06/23/2021)
TACC staff will complete the Stockyard migration on June 29th, 2021.
Currently the "/work
" file system is mounted as "read-only" on both the login and compute nodes of all TACC resources. On June 29th TACC will initiate the following switch over.
The /work file system will be dismounted from all TACC resources, on both the compute and login nodes. "/work" will then become a symbolic link pointing to the new /work2 file system. The following commands will then be equivalent, taking you to your resource-specific subdirectory of your new $WORK space:

login1$ cdw
login1$ cd $WORK
login1$ cd $WORK2

To allow for file transfers, the "/work" file system will then be re-mounted as "/work_old" on each resource's login nodes only.
After June 29, 2021, all data on the old /work file system will be inaccessible to any running jobs on any resource.
Introduction
TACC's Stockyard-hosted file system, currently synonymous with the "/work" file system, is approaching end-of-life. A new file system, /work2, is now available to facilitate data migration and will eventually replace /work. At this time, all users with files on /work must determine whether to migrate their data to /work2, move it to a non-TACC facility, or leave it on /work, where it will eventually become unavailable. The TACC Ranch archive resource is also available for data you no longer need for processing but wish to preserve long-term.
Before /work2 permanently replaces the current /work, users must migrate any data they wish to keep to /work2 while continuing to use their existing workflows. Users should complete the migration before June 29, 2021, when /work2 becomes the work file system mounted on compute resources.
Prior to migrating, review your data and identify files that can be deleted. The Ranch archive facility is also available for data that no longer needs to be in /work but must be retained long-term.
Stockyard Layout
The /work2 file system is mounted on all of the major TACC clusters and is becoming available anywhere else that /work is available. We expect /work2 to be mounted shortly on any remaining systems that currently have only /work. Data migrations may begin immediately wherever /work2 is mounted.
The $STOCKYARD environment variable points to the highest-level directory that you own on the Global Shared File System. This variable is consistent across all TACC resources that mount Stockyard. The $WORK environment variable, on the other hand, is resource-specific and varies across systems. $WORK is a subdirectory of $STOCKYARD, and its name corresponds to the associated TACC resource.
Figure 1. Stockyard, TACC's Global Shared File System
All subdirectories contained in your $STOCKYARD directory are available to you from any other TACC system that mounts the file system. If you have accounts on both Stampede2 and Maverick2, for example, the $STOCKYARD/stampede2 directory is available from your Maverick2 account, and $STOCKYARD/maverick2 is available from your Stampede2 account. Your quota and reported usage on the Global Shared File System reflect all files that you own on Stockyard, regardless of their actual location on the file system.
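For example (a sketch using the same placeholder account number 01234 and username bjones as the rsync examples later in this guide; the exact paths for your account will differ), on a Stampede2 login node the two variables might resolve as:

login1$ echo $STOCKYARD
/work/01234/bjones
login1$ echo $WORK
/work/01234/bjones/stampede2
login1$ ls $STOCKYARD
maverick2  stampede2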
Migration Timeline
The dates below are subject to change depending on the overall migration status.
Migration from /work to /work2 begins in March 2021. Both /work and /work2 will exist during the migration, but once the migration is complete and the original Stockyard file system goes offline, we intend to rename and mount the new file system as /work. For this reason, avoid building automation that targets /work2 locations: use references to /work2 sparingly, or make them easy to change.
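One way to keep such references easy to change, shown here only as a sketch (the account path is the same placeholder used in the rsync examples below), is to route your scripts through a single shell variable:

# Single point of reference for the new file system; edit this one line
# after /work2 is renamed back to /work.
NEW_WORK=/work2/01234/bjones
cp results.tar.gz "$NEW_WORK/stampede2/"   # hypothetical file and target subdirectory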
Phase | Time Period | /work permissions | /work2 permissions
---|---|---|---
Phase 1 | present - May 18, 2021 | read and write | read and write
Phase 2 | May 18 - June 15, 2021 | read only | read and write
Phase 3 | June 15 - June 29, 2021 | mounted only on HPC login nodes | read and write
Phase 4 | June 29, 2021 (migration period complete) | unmounted | /work2 renamed to /work
Final Destination Determination
Prior to any file migration, take stock of your existing data and migrate only what is needed for your current research to the new file system. This is the perfect time to delete or set aside non-relevant files, then compress and transfer the rest. All other data should either be migrated to long-term storage (Ranch), moved to a non-TACC facility, or deleted. During this process, take care not to exceed your quota if you are creating tarballs on /work.
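Before deciding what to migrate, archive, or delete, it can help to see how much data each resource-specific subdirectory holds. A hedged sketch using standard tools (paths reuse the placeholder account from the rsync examples below; the lfs command applies only where the Lustre client tools are installed):

login1$ du -sh /work/01234/bjones/*      # usage per resource-specific subdirectory
login1$ lfs quota -u bjones /work        # your Lustre quota and current usage on /work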
For all your data, and for every file, determine whether to:
- migrate it to /work2
- archive it to Ranch or another long-term storage facility
- move it to a non-TACC facility
- delete it
Stockyard is not backed up: neither /work nor /work2. Take care when deleting, especially if doing so recursively.
Migrating your Files from /work to /work2
If you have allocations on multiple TACC resources, then your /work directory will consist of several resource-specific subdirectories. See Figure 1.
At this time, we are recommending you migrate one Stockyard resource-specific subdirectory at a time, e.g. rsync /work/lonestar5, then /work/stampede2, etc. You may transfer any non-resource-specific directories at any time.
Use "rsync
" to Move Files
The current /work and the new /work2 are both Lustre file systems. There is no need to stripe the receiving /work2 directories, as any new files are not expected to exceed 2TB. Your quotas across /work and /work2 will remain consistent.
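If you want to confirm that a receiving directory is using the file system's default (unstriped) layout, a quick check is sketched below; the path is the placeholder account used in the examples that follow:

login1$ lfs getstripe -d /work2/01234/bjones/maverick2   # show the directory's default stripe settings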
Contrary to our usual file transfer recommendations, TACC staff advises you to use the "rsync" command to transfer your /work contents over to /work2. We advise against using the "tar" utility in order to avoid temporary storage issues on /work as much as possible.
Use "rsync
" to transfer files between the /work
and /work2
file systems. The corresponding directories on /work2
will already exist.
Example rsync Command
In this example command, user bjones transfers the maverick2 subdirectory to the new file system. The command options indicate:
- --partial : keep partially transferred files so an interrupted transfer can resume without re-copying data that has already arrived
- -a : archive mode; preserve permissions and ownership
- -z : compress data in transit, which costs some CPU cycles (possibly on login nodes in the worst case) but saves bandwidth
- -v : verbose mode
$ rsync --partial -azv /work/01234/bjones/maverick2 /work2/01234/bjones
See the rsync man page and take a deep dive into the rsync options to learn further techniques.
Please limit your rsync transfers to no more than two concurrent processes.
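A sketch of one way to stay within that limit: start at most two transfers in the background and wait for both before starting more (directories reuse the placeholder account from the example above).

login1$ rsync --partial -azv /work/01234/bjones/stampede2 /work2/01234/bjones &
login1$ rsync --partial -azv /work/01234/bjones/maverick2 /work2/01234/bjones &
login1$ wait    # returns when both background transfers have finished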
If your rsync session fails, or data integrity for specific files is a major concern, rerun the "rsync" command with the "--checksum" (or "-c") option to ensure the target files are written correctly.
$ rsync -azvc /work/01234/bjones/maverick2 /work2/01234/bjones
Migrating your Files to Ranch
Follow the instructions in the Ranch User Guide to transfer your data to Ranch.
If you have access to long-term storage at other facilities, you may transfer your data to that facility using the rsync command.
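A sketch only; the remote hostname, username, and destination path below are hypothetical placeholders for your own facility's endpoint:

login1$ rsync --partial -azv /work/01234/bjones/maverick2 username@storage.example.org:/archive/bjones/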
Migration Guidelines
During this migration period, thousands of TACC users will be accessing both file systems. Be aware of the following guidelines.
Don't Stress Stockyard
The TACC Global Shared File System, Stockyard, is mounted on most TACC HPC resources as the /work ($WORK) directory. This file system is accessible to all TACC users and therefore experiences a lot of I/O activity (reading and writing to disk, opening and closing files) as users run their jobs and read and generate data, including intermediate and checkpoint files. As TACC adds more users, the stress on the $WORK file system is increasing to the extent that TACC staff now recommends new job submission guidelines in order to reduce stress and I/O on Stockyard.
TACC staff now recommends that you run your jobs out of the $SCRATCH file system instead of the global $WORK file system. To run your jobs out of $SCRATCH (see the example job script after this list):

- Copy or move all job input files to $SCRATCH.
- Make sure your job script directs all output to $SCRATCH.
- Compute nodes should not access either /work or /work2.
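The example job script below is a minimal sketch of this pattern, assuming a Slurm batch job; the partition name, time limit, executable, and input file name are placeholders rather than TACC-specific values.

#!/bin/bash
#SBATCH -J scratch-job           # job name (placeholder)
#SBATCH -p normal                # partition/queue (placeholder; check your resource's queues)
#SBATCH -N 1
#SBATCH -n 1
#SBATCH -t 01:00:00

# Stage input from $WORK, run entirely on $SCRATCH, copy results back at the end.
cd $SCRATCH
cp $WORK/input.dat .                     # hypothetical input file staged from $WORK
./my_app input.dat > output.log          # hypothetical executable; all job I/O stays on $SCRATCH
cp output.log $WORK/                     # keep only the results you need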
File System Usage
Consider that $HOME and $WORK are for storage and keeping track of important items. Actual job activity, reading and writing to disk, should be offloaded to your resource's $SCRATCH file system (see Table 2. File System Usage Recommendations). You can start a job from anywhere, but the actual work of the job should occur only on the $SCRATCH partition. You can save original items to $HOME or $WORK so that you can copy them over to $SCRATCH if you need to re-generate results.
Table 2. File System Usage Recommendations

File System | Best Storage Practices | Best Activities
---|---|---
$HOME | cron jobs, small scripts, environment settings | compiling, editing
$WORK | software installations, original datasets that can't be reproduced, job scripts and templates | staging datasets
$SCRATCH | temporary storage: I/O files, job files, temporary datasets | all job I/O activity
The $SCRATCH file system, as its name indicates, is a temporary storage space. Files that have not been accessed in ten days are subject to purge. Deliberately modifying file access time (using any method, tool, or program) for the purpose of circumventing purge policies is prohibited.
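If you want to see which of your files are approaching the purge window, a hedged example using standard find is shown below; remember that resetting access times to dodge the purge is prohibited.

login1$ find $SCRATCH -type f -atime +10    # list files not accessed in more than ten days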
Limit Input/Output (I/O) Activity
Limit I/O intensive sessions (lots of reads and writes to disk, rapidly opening or closing many files).
Avoid opening and closing files repeatedly. Every open/close operation on the file system requires interaction with the MetaData Service (MDS). The MDS acts as a gatekeeper for access to files on Stockyard's file system. Overloading the MDS will affect other users on the system. If possible, open files once at the beginning of your program/workflow, then close them at the end.
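A small shell illustration of the same idea (the process command and file names are hypothetical): redirecting output inside a loop reopens the file on every iteration, while redirecting the whole loop opens it once.

# Reopens results.txt on every iteration -- many needless MDS interactions:
for f in *.dat; do
    process "$f" >> results.txt
done

# Opens results.txt once for the entire loop -- preferred:
for f in *.dat; do
    process "$f"
done > results.txt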
Don't get greedy. If you know or suspect your workflow is I/O intensive, don't submit a pile of simultaneous jobs. Please limit your rsync transfers to no more than two concurrent processes.