Last update: April 1, 2022
Leveraging each node's /tmp
directory space can effectively minimize the I/O load on the global Lustre file system and can also improve the performance of I/O work. Due to its limited size, the /tmp
space is appropriate for executables/binaries, frequently-used object files, and small size common files, e.g. the global configuration files or the initial/pre-processed data files.
Collect-Distribute (CDTools) has been designed and developed to distribute files or directories to or from the /tmp
directory. In CDTools, "distribute.bash
" can be used to copy/clone the binaries and frequently accessed input files to the local /tmp
space on each compute node when a job starts, and "collect.bash
" can be used to collect output files and log files back to $WORK
or $SCRATCH
before a job finishes.
Using CD Tools
CDTools is currently installed on TACC's Frontera and Stampede2 resources.
1. Initialize CD Tools Environment Variable
To set up CDTools on Frontera or Stampede2 system, initialize these environment variables:
$ export CDTools=/home1/apps/CDTools/1.1 # for Frontera or Stampede2 $ export PATH=${PATH}:${CDTools}/bin
2. Distribute Files to Each Node's /tmp
Space
Distribute your files/directories to the local /tmp
space of each compute node allotted for your job:
$ distribute.bash ${SCRATCH}/inputfile #put the full path of your input file hereor
$ distribute.bash ${SCRATCH}/inputdir #put the full path of the directory of your input files here
If you ssh
to those compute nodes after running the above command, you would find an identical copy of your input file or directory in the /tmp
directory on each node.
3. Collect your Output Files
Once your job completes, collect the job output files from the /tmp
space of each node:
$ collect.bash /tmp/outputdir ${SCRATCH}/output_collectedor
$ collect.bash /tmp/outputfile ${SCRATCH}/output_collected
You will obtain a list of output files or directories copied back to your target directories in $SCRATCH
. These output files or directories have been appended with an underscore and a number that indicates the rank of compute nodes. For example, given a job run on four nodes: files "outputfile_0
", "outputfile_1
", "outputfile_2
" and "outputfile_3
" will all be placed in the "/output_collected
" directory.
Notes
- This tool should work for both batch mode and interactive mode. An example job script can be found in ${CDTools}/test.
- When using the tool, users should test their workflow with CDTools before any productive runs to make sure required files are successfully distributed and collected.
- Users should still understand and respect the
/tmp
limit and other I/O rules when using it.