iRODS at TACC
iRODS is a data grid/data management tool. It allows you to store data in a unified namespace using multiple storage resources, to replicate data so that copies exist on multiple systems, and to store checksums and arbitrary metadata with a file. The TACC iRODS configuration supports accessing iRODS through either the native iRODS tools such as the UNIX "i-Commands" or via the HTTPS and WebDAV protocols.
Each storage system accessed through iRODS is referred to as a "resource". In TACC's current Corral configuration, there are two disk-based resources accessible through iRODS, one for each of the Corral filesystems. The unreplicated GPFS filesystem is referred to within iRODS as "gpfs-tacc" and the replicated filesystem is referred to as "gpfs-repl". Access to the /home3 filesystem on Ranch is also available through iRODS.
|Resource Name|Storage System|
|gpfs-tacc|Corral unreplicated GPFS filesystem|
|gpfs-repl|Corral replicated filesystem|
Access to the iRODS system on Corral is available only to allocated users who have requested it. If you wish to utilize iRODS for accessing Corral, please indicate so in your Corral allocation request. Users with existing allocations who wish to make use of iRODS should submit a user support request through the user portal. You may also e-mail firstname.lastname@example.org to discuss your data needs and we will be happy to make a recommendation on the best tools for managing your data.
Use of the Corral resource may be subject to allocation constraints. Please limit yourself to using the Ranch archive filesystems for long-term storage unless you have an allocation on Corral.
The iRODS command-line utilities, collectively referred to as "i-commands", use an environment file under a user's home directory to store iRODS initialization information. A template version of that file is shown in Example 1 below. To set up your computing environment, copy and paste this text into a file called ~/.irods/.irodsEnv, then edit the file, replacing each instance of "USERNAME" with your TACC username. You may also change the "irodsDefResource" line to use a different default resource if you so choose. Once you have created and saved this file, you can issue the "iinit" command to start your iRODS session, after which you can store and retrieve data normally using the i-commands as documented below. If you will be accessing iRODS only through the WebDAV mechanism, you do not need to create this configuration file.
    # iRODS personal configuration file.
    # iRODS server host name:
    irodsHost 'icat.corral.tacc.utexas.edu'
    # iRODS server port number:
    irodsPort 1247
    # Default storage resource name:
    irodsDefResource 'gpfs-tacc'
    # Home directory in iRODS:
    irodsHome '/corralZ/home/USERNAME'
    # Current directory in iRODS:
    irodsCwd '/corralZ/home/USERNAME'
    # Account name:
    irodsUserName 'USERNAME'
    # Zone:
    irodsZone 'corralZ'
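If you prefer not to edit the file by hand, the substitution step can be scripted. The following is a minimal sketch, assuming you have saved the template text to a file of your own choosing (the name irodsEnv.template used in the usage comment is hypothetical) and that your username contains only letters, digits and underscores. The function name make_irods_env is also made up for illustration; it is not part of iRODS.

```shell
# Hypothetical helper, not part of iRODS: fill in a saved copy of the
# template with your TACC username.
# Arguments: template file, username, output file
# (the output file defaults to ~/.irods/.irodsEnv).
make_irods_env() {
    template="$1"
    user="$2"
    target="${3:-$HOME/.irods/.irodsEnv}"
    # Create the ~/.irods directory (or the chosen parent) if needed.
    mkdir -p "$(dirname "$target")" || return 1
    # Replace every instance of the USERNAME placeholder.
    sed "s/USERNAME/$user/g" "$template" > "$target" || return 1
    # Keep the file readable and writable only by its owner.
    chmod 600 "$target"
}

# Usage (then start a session with iinit):
#   make_irods_env irodsEnv.template joeuser
```

Writing to a third argument instead of the default location lets you inspect the result before it replaces an existing environment file.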
We recommend that you use the "Cyberduck" utility rather than a web browser to access the WebDAV-based iRODS service: most web browsers have limitations on large data transfers, and Cyberduck supports drag-and-drop operations to copy data to and from your Windows or Mac desktop. If you do choose to use a web browser to upload or download small files, or simply to manage permissions and metadata, you can load the iRODS WebDAV URL in any web browser, with "PATH" replaced by the path to your data in iRODS.
Download Cyberduck for free from http://cyberduck.ch. The screenshot below shows the configuration options to select when opening a new connection to the TACC iRODS server; simply change the username and password to your TACC iRODS username and password, and replace the path shown with the path to your home or project directory. Note that your TACC iRODS username and password may be different from your TACC username and password. Once you have initiated a connection to the iRODS WebDAV server, you can drag files or folders into the Cyberduck window to upload them, and drag them from the window onto the desktop to download them. Cyberduck will perform recursive uploads and downloads, i.e. dragging a folder into the Cyberduck window will upload both the folder and any files and folders contained within that folder.
Once you have configured the "~/.irods/.irodsEnv" file, you can use i-commands to access and manipulate data in the system. The i-commands nomenclature mimics that of UNIX but with an "i" prepended to the command name, e.g. icd. Generally, i-commands are functionally equivalent to their UNIX counterparts. Complete i-commands documentation can be found on the iRODS site. The following table summarizes some of the most common i-commands.
Use the irsync command to synchronize a local directory with iRODS, similar to the Unix rsync command. It can be used to make an exact copy of a local directory hierarchy within iRODS, or to retrieve an exact copy of a directory hierarchy already stored in iRODS. It may also be used to create an exact copy of a file or directory within iRODS. iRODS paths are identified with an "i:" prefix.
For example, if you have created a directory within iRODS called "/tacc/home/joeuser/myproject", and you wish to retrieve an exact copy of that directory on Stampede, run the command:
login1$ irsync -r i:/tacc/home/joeuser/myproject /path/to/joeusers/workdir
After editing the files on Stampede, you can then synchronize the data back into iRODS using the command:
login1$ irsync -r /path/to/joeusers/workdir i:/tacc/home/joeuser/myproject
If you are storing data to or retrieving data from Ranch with the "-R ranch-main" option, you should also use the "-s" switch. This uses the size rather than the checksum of each file to determine whether synchronization is necessary, so the files do not all have to be retrieved from tape to compute checksums. This will greatly improve the performance of synchronization with Ranch.
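The Ranch guidance above can be captured in a small wrapper. The sketch below is a hypothetical helper (the name irsync_cmd is not part of iRODS) that prints the irsync command line for a source, a destination and an optional resource, adding "-s" whenever the resource is ranch-main. It relies only on the "-r", "-R" and "-s" options mentioned in this guide and assumes paths without embedded spaces.

```shell
# Hypothetical helper, not part of iRODS: print the irsync command for a
# source, a destination and an optional resource name.
irsync_cmd() {
    src="$1"
    dst="$2"
    resource="${3:-}"
    opts="-r"
    if [ -n "$resource" ]; then
        opts="$opts -R $resource"
    fi
    # Compare by size when Ranch is involved, so that files are not
    # retrieved from tape just to compute checksums.
    if [ "$resource" = "ranch-main" ]; then
        opts="$opts -s"
    fi
    echo "irsync $opts $src $dst"
}

# Usage:
#   irsync_cmd /path/to/joeusers/workdir i:/tacc/home/joeuser/myproject ranch-main
```

Printing the command rather than running it lets you review the options before starting a large transfer.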
Use the irm command to delete data entirely or just from a single resource.
irm without options deletes all copies of a file.
irm with the "-n #" switch deletes a specific replica. For example, if you have stored data initially in the "cache" resource and then replicated it to Ranch, replica 0 will be the copy stored on the cache, and the command:
login1$ irm -n 0 path_and_filename
will remove the file from the iRODS cache resource, while leaving the archived copy intact. Use "ils -l" before deleting a replica to ensure that you have a copy in more than one resource, and that you are deleting the correct replica.
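The check recommended above can be scripted as a guard. This is a minimal sketch, assuming that "ils -l" prints one line per replica when given a single file; count_replicas and has_multiple_replicas are hypothetical names, not i-commands. It only confirms that more than one replica exists, so you still need to read the listing to pick the correct replica number.

```shell
# Hypothetical helper: count replica lines read from standard input.
# Assumes one non-empty line per replica, as "ils -l FILE" is expected
# to print for a single file.
count_replicas() {
    grep -c '[^[:space:]]'
}

# Succeeds only when more than one replica is listed on standard input.
has_multiple_replicas() {
    [ "$(count_replicas)" -gt 1 ]
}

# Usage (requires an active iRODS session):
#   ils -l path_and_filename | has_multiple_replicas && irm -n 0 path_and_filename
```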
irm options include:
-f force data removal
All users can see all other users' collections and files, but cannot read, write or own them without the appropriate permissions. The [ichmod](https://www.irods.org/index.php/ichmod) command, like the UNIX chmod command, allows a user to give file access permission to other users or groups.
- Read Permission
login1$ ichmod read testuser testfile.txt
login1$ ils -A testfile.txt
The "ils -A" command shows the Access Control List (ACL) for the file "testfile.txt". Here, testuser has been given read permission on testfile.txt.
Giving another user ownership permission enables them to change the ACL for the file or folder. For example, giving testuser ownership permission means that testuser can then extend read, write or owner permissions to other users.
login1$ ichmod own testuser testfile.txt
You can assign "null" to remove permissions from the ACL for a file or folder:
login1$ ichmod null testuser testfile.txt
ichmod command options include:
You may also use the iRODS i-Commands to view and manipulate your data, as well as to store and retrieve it, by downloading and installing the iRODS client tools on any Linux/Unix system. iRODS source code downloads and installation instructions are at http://irods.org/download/.
Last update: April 10, 2015