Maverick User Guide

System Overview

Maverick, an HP/NVIDIA Interactive Visualization and Data Analytics System, is TACC's latest addition to its suite of advanced computing systems. It combines capacities for interactive advanced visualization and large-scale data analytics with traditional high performance computing. Recent exponential increases in the size and quantity of digital datasets necessitate new systems such as Maverick, capable of fast data movement and advanced statistical analysis. Maverick debuts the new NVIDIA K40 GPU for remote visualization and GPU computing to the national community.


  • 132 dual-socket nodes with Intel Xeon E5-2680 v2 Ivy Bridge processors running at 2.8 GHz
  • 132 NVIDIA Tesla K40 GPUs
  • TACC-developed remote vis software: ScoreVIS, DisplayCluster, GLuRay
  • Visualization software stack: ParaView, VisIt, EnSight, Amira


  • 132 nodes, each with 1/4 TB of memory
  • connected to a 20 PB file system
  • Mellanox FDR InfiniBand interconnect
  • comprehensive software includes: MATLAB, Parallel R

Maverick is intended primarily for interactive visualization and data analysis jobs to allow for interactive query of large-scale data sets. Normal batch queues will enable users to run simulations up to 6 hours for interactive jobs and 24 hours for GPGPU and HPC jobs. Jobs requiring longer run times or more cores than allowed by the normal queues will be run in a special queue after approval by TACC staff. Users will be able to run jobs using all 132 of the NVIDIA Tesla K40s, both for interactive graphics and for GPGPU jobs (at a lower priority).

System Configuration

The new Maverick HP/NVIDIA Interactive Visualization and Data Analytics System is configured with 132 HP ProLiant SL250s Gen8 compute nodes and 132 NVIDIA Tesla K40 GPU accelerators. In addition, with 256 GB of memory and 500 GB of storage per node, users have access to an aggregate of 33.7 TB of memory and 66 TB of local storage. Compute nodes have access to Stockyard, a 20 PB Lustre parallel file system. An FDR InfiniBand switch fabric interconnects the nodes, facilitating high-speed internode communication and I/O traffic.

  • Operating System - CentOS 6.4
  • CPU - Intel Xeon E5-2680 v2 Ivy Bridge, 2.80 GHz, 20 CPUs/node, 12.8 GB memory / core

File Systems

Maverick has several different file systems with distinct storage characteristics. There are predefined directories in these file systems for you to store your data. Since these file systems are shared with others, they are managed by quota limits. There is no purge policy on Maverick.

Two file systems are available: $HOME, an NFS file system, and $WORK, a Lustre file system on the TACC backbone, Stockyard. The $HOME directory has a 10GB quota. All file systems also impose an inode limit, which affects the number of files allowed.

The $WORK filesystem on Maverick is shared with Stampede, though a user's $WORK directory path on Stampede will differ from that on Maverick. For example, a user's $WORK directory on Stampede will have a path similar to "/work/01158/janeuser", and on Maverick a path similar to "/work/01158/janeuser/maverick".
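
The relationship between the two example paths can be sketched as a small shell helper. This is only an illustration: it assumes, based solely on the example paths shown here, that the Maverick directory is the Stampede $WORK path with "/maverick" appended. The function name is hypothetical, and the $WORK variable set on each system remains the authoritative value.

```shell
# Hypothetical helper: derive a Maverick $WORK path from a Stampede $WORK path.
# Assumption: the Maverick directory is the Stampede path plus "/maverick",
# as in the example paths in this section.
maverick_work_path() {
    stampede_work=${1%/}                 # drop a trailing slash, if any
    printf '%s/maverick\n' "$stampede_work"
}

maverick_work_path /work/01158/janeuser
```

On Maverick itself, simply using $WORK avoids any need for such a mapping; the helper is mainly useful when composing a destination path from a Stampede session.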

System Access

Maverick is accessed either using the secure-shell ssh program (for batch-mode access, but which can be used to initiate interactive VNC access) or via the TACC Visualization Portal (formerly Longhorn Visualization Portal).

SSH access

Unix-based systems, including Linux and Mac OS X, have an ssh client available. Free clients are also available for Windows; a popular choice is PuTTY.

To initiate an SSH connection to a Maverick login node from a UNIX or Linux system with an SSH client already installed, execute the following command:

login1$ ssh

where username is replaced with the Maverick user name assigned to you during the allocation process.

XSEDE Single-Sign-On Hub

We recommend XSEDE users make use of the XSEDE Single-Sign-On Hub, a single point of command-line access to XSEDE's High Performance Computing, High Throughput Computing and Visualization resources.

Establishing Interactive Access Via VNC

Please see Running Applications on the VNC Desktop.

Transferring files to Maverick

Maverick does NOT have a local parallel filesystem or additional nodes to run GridFTP services as on Ranch, Stampede, or Lonestar. Instead, Maverick shares the TACC backbone Stockyard's large $WORK parallel file system (1TB quota) with Stampede. Although a user's Stampede and Maverick $WORK directories are NOT in the same location on Stockyard, users may transfer files to Maverick's $WORK filesystem with globus-url-copy by using Stampede's GridFTP endpoint with their Maverick $WORK directory path. For a full list of XSEDE endpoints, please see XSEDE's Data Transfers & Management GridFTP Endpoints table.

The following Stampede session demonstrates Globus' globus-url-copy to copy "mybigfile" from NICS' Nautilus to the user's Maverick $WORK directory:

login1$ module load CTSSV4
login1$ myproxy-logon
Enter MyProxy pass phrase:
A credential has been received for user slindsey in /home1/01158/slindsey/.globus/userproxy.pem.
login1$ globus-url-copy -vb \
>      gsi \
>      gsi

Source: gsi
Dest:   gsi


Also, Maverick's small $HOME file system is not available to Stampede's GridFTP servers. Use rsync or scp to copy files to your Maverick $HOME directory.

Computing Environment


Maverick employs the environment modules system to manage a user's environment. To see all the software that is available across all compilers and MPI stacks, issue:

login1$ module spider

To see which software packages are available with your currently loaded compiler and MPI stack:

login1$ module avail

You may also consult the TACC Software page for a listing of all available software for all TACC resources.

Application Development

In general, application development on Maverick is identical to that on Stampede, including the availability and usage of compilers, the parallel development libraries (e.g. MPI and OpenMP), tuning and debugging.

Additional visualization-oriented libraries available on Maverick are made accessible through the modules system. Library and include-file search path environment variables are modified when modules are loaded. For detailed information on the effect of loading a module, use:

login1$ module help modulename

Running your applications

Jobs are run on Maverick using one of two methods: as batch jobs submitted from a Maverick login node, or interactively from a remotely accessed VNC desktop running on an allocated Maverick compute node.

Maverick Queues

Table 1. Maverick SLURM Queues

Queue Name   Purpose           Max Runtime   Max Nodes/Procs        Max Jobs in Queue   Node Pool
vis          Visualization     4 hrs         32 nodes (640 cores)   20                  all compute nodes
gpu          GPU               12 hrs        32 nodes (640 cores)   20                  all compute nodes except c225 rack
systest      special request   -             -                      -                   all compute nodes
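
Before submitting, a request can be checked against the limits in Table 1. The following is a minimal sketch with the vis and gpu limits copied from the table. The function name and the choice to express run time in minutes are assumptions for illustration, and the scheduler itself remains the final authority on what is accepted.

```shell
# Hypothetical pre-submission check against the Table 1 limits.
# Usage: check_queue_request QUEUE NODES MINUTES
check_queue_request() {
    queue=$1; nodes=$2; minutes=$3
    case $queue in
        vis) max_nodes=32; max_minutes=240 ;;   # 4 hrs
        gpu) max_nodes=32; max_minutes=720 ;;   # 12 hrs
        *)   echo "no published limits for queue: $queue" >&2; return 2 ;;
    esac
    if [ "$nodes" -gt "$max_nodes" ] || [ "$minutes" -gt "$max_minutes" ]; then
        echo "request exceeds $queue limits ($max_nodes nodes, $max_minutes min)" >&2
        return 1
    fi
    return 0
}

check_queue_request vis 4 180 && echo "vis request fits"
```

The systest queue is deliberately left out of the check because Table 1 lists no fixed limits for it.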

Running Batch Jobs on Maverick

Batch jobs are run on Maverick via the SLURM job scheduler. Please consult the Stampede User Guide's SLURM Batch Environment section for detailed information on the SLURM interface to batch job control.

The number of SUs billed depends on the total number of nodes used, not on the number of cores actually in use:
SUs billed = # nodes * 20 cores/node * wallclock time (hours)
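
The billing formula is easy to evaluate ahead of time. The sketch below assumes that wallclock time is expressed in hours (so that one SU corresponds to one core-hour); the function name is hypothetical.

```shell
# Hypothetical helper: estimate the SUs billed for a job.
# Assumptions: wallclock time is given in hours; 20 cores per node as stated above.
# Usage: su_billed NODES HOURS
su_billed() {
    awk -v nodes="$1" -v hours="$2" 'BEGIN { printf "%.1f\n", nodes * 20 * hours }'
}

su_billed 4 2.5    # 4 nodes for 2.5 hours: 4 * 20 * 2.5 = 200 SUs
```

Note that the charge is per node: a job that uses a single core on a node for one hour is still billed for all 20 cores under this formula.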

The TACC Visualization Portal

The TACC Visualization Portal provides a very simple mechanism to run interactive sessions on Maverick. It presents two choices: creating a VNC desktop (essentially wrapping the manual procedure described under Remote Desktop Access in a much simplified manner, though at the cost of some flexibility), or running RStudio Server and iPython Notebook sessions. Please see the TACC Visualization page for more information.

Pulldowns on this page enable a user to choose to create a Maverick VNC desktop, an RStudio Server session, or an iPython Notebook session. When VNC is selected, the user is presented with pulldowns for setting the various parameters of a VNC session, including the wayness, number of nodes, and desktop dimensions. The portal will then submit a VNC job to the Maverick vis queue. When the job starts, a VNC viewer will be established in the portal; alternatively, the Jobs tab will present a URL and port number that can be used to connect an external VNC viewer. Note that the portal provides access to only some of the options available through the sbatch interface, so creating a VNC session manually through sbatch will be necessary in some cases.

The TACC Visualization Portal jobs page also shows the current usage of Maverick, making it an easy way to find the status of jobs. All jobs submitted to Maverick, whether via sbatch or via the Portal, and whether running or waiting in a queue, will appear in the status information shown.

Visualization on Maverick

While batch visualization can be performed on any Maverick node, a set of nodes have been configured for hardware-accelerated rendering. The vis queue contains a subset of 132 compute nodes configured with one NVIDIA K40 GPU each.

Remote Desktop Access

Remote desktop access to Maverick is formed through a VNC connection to one or more visualization nodes.

You must have an account on Maverick in order to start a VNC session. University of Texas faculty, staff and affiliates may request an account by submitting a help desk ticket through the TACC User Portal. XSEDE users may submit a help desk ticket via the XSEDE User Portal (XUP) Help Desk.

Users must first connect to a Maverick login node (see System Access) and submit a special interactive batch job that:

  • allocates a set of Maverick visualization nodes
  • starts a vncserver process on the first allocated node
  • sets up a tunnel through the login node to the vncserver access port

Once the vncserver process is running on the visualization node and a tunnel through the login node is created, an output message identifies the access port for connecting a VNC viewer. A VNC viewer application is run on the user's remote system and presents the desktop to the user.

If this is your first time connecting to Maverick, you must run vncpasswd to create a password for your VNC servers. This should NOT be your XSEDE login or Maverick password! This mechanism only deters unauthorized connections; it is not fully secure, as only the first eight characters of the password are saved. All VNC connections are tunnelled through SSH for extra security, as described below.

Follow the steps below to start an interactive session.

  1. Start a Remote Desktop

    TACC has provided a VNC job script (/share/doc/slurm/job.vnc) that requests one node in the vis queue for four hours, creating a VNC session.

     login1$ sbatch /share/doc/slurm/job.vnc

    You may modify or overwrite script defaults with sbatch command-line options:

    • "-t hours:minutes:seconds" modifies the job runtime
    • "-A projectnumber" specifies the project to be charged
    • "-N nodes" sets the number of nodes needed
    • "-p partition" specifies an alternate partition (queue)

    See more sbatch options in Stampede User Guide Table 7.3.

    All arguments after the job script name are sent to the vncserver command. For example, to set the desktop resolution to 1440x900, use:

     login1$ sbatch /share/doc/slurm/job.vnc -geometry 1440x900

    The job.vnc script starts a vncserver process and writes the connect port for the vncviewer to the output file, vncserver.out, in the job submission directory. Watch for the "To connect via VNC client" message at the end of the output file, or watch the output stream in a separate window with the commands:

     login1$ touch vncserver.out ; tail -f vncserver.out

    The lightweight window manager, xfce, is the default VNC desktop and is recommended for remote performance. Gnome is available; to use gnome, open the "~/.vnc/xstartup" file (created after your first VNC session) and replace "startxfce4" with "gnome-session". Note that gnome may lag over slow internet connections.

  2. Create an SSH Tunnel to Maverick

    TACC requires users to create an SSH tunnel from the local system to the Maverick login node to ensure that the connection is secure. On a Unix or Linux system, execute the following command once the port has been opened on the Maverick login node:

     localhost$ ssh -f -N -L


    • "yyyy" is the port number given by the vncserver batch job
    • "xxxx" is a port on your local system. Generally, the port number specified on the Maverick login node, yyyy, is a good choice to use on your local system as well
    • "-f" puts the ssh command into the background after connecting
    • "-N" instructs SSH to only forward ports, not to execute a remote command
    • "-L" forwards the port

    On Windows systems, find the menu in the Windows SSH client where tunnels can be specified, enter the local and remote ports as required, then ssh to Maverick.

  3. Connecting vncviewer

    Once the SSH tunnel has been established, use a VNC client to connect to the local port you created, which will then be tunneled to your VNC server on Maverick. Connect the VNC client to localhost::xxxx, where xxxx is the local port you used for your tunnel. (Some VNC clients accept localhost:xxxx.)

    We recommend the TigerVNC VNC Client, a platform-independent client/server application.

    Once the desktop has been established, two initial xterm windows are presented (which may be overlapping). One, which is white-on-black, manages the lifetime of the VNC server process. Killing this window (typically by typing "exit" or "ctrl-D" at the prompt) will cause the vncserver to terminate and the original batch job to end. Because of this, we recommend that this window not be used for other purposes; it is just too easy to accidentally kill it and terminate the session.

    The other xterm window is black-on-white, and can be used to start either serial programs running on the node hosting the vncserver process or parallel jobs running across the set of cores associated with the original batch job. Additional xterm windows can be created using the window-manager left-button menu.
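
Steps 1 and 2 of the procedure can be sketched as two small shell helpers. The first waits for the "To connect via VNC client" message in vncserver.out; the second only prints the tunnel command so that it can be reviewed before it is run. Both function names, the timeout parameter, and the host and user arguments are hypothetical placeholders. The "-L local_port:host:remote_port" form is standard OpenSSH syntax, but the login node host name must be supplied by you, as it is not filled in here.

```shell
# Wait until the VNC job reports that it is ready for connections.
# $1: output file (vncserver.out in the job submission directory)
# $2: maximum number of seconds to wait (hypothetical parameter, default 300)
wait_for_vnc_message() {
    out=$1; timeout=${2:-300}; waited=0
    until grep -q "To connect via VNC client" "$out" 2>/dev/null; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}

# Print (do not run) the tunnel command.
# $1: local port xxxx   $2: login-node port yyyy
# $3: login node host name (placeholder)   $4: your user name (placeholder)
tunnel_command() {
    printf 'ssh -f -N -L %s:%s:%s %s@%s\n' "$1" "$3" "$2" "$4" "$3"
}
```

A typical use would be to run "wait_for_vnc_message vncserver.out 600" from the job submission directory on the login node, read the port from the end of the file, and then run the command printed by tunnel_command on your local system.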

Running Applications on the VNC Desktop

From an interactive desktop, applications can be run from icons or from xterm command prompts. Two special cases arise: running parallel applications, and running applications that use OpenGL.

Running Parallel Applications from the Desktop

Parallel applications are run on the desktop using the ibrun wrapper.

ibrun: Enables parallel MPI jobs to be started from the VNC desktop. ibrun uses information from the user's environment to start MPI jobs across the user's set of Maverick compute nodes. This information is determined by the initial SLURM job submission, and includes the location of the hostfile created by SLURM (found in the $PE_HOSTFILE environment variable).

c442-001$ ibrun [ibrun options] application [application options]

Running OpenGL/X Applications On The Desktop

Running OpenGL/X applications on Maverick visualization nodes requires that the native X server be running on each participating visualization node. Like other TACC visualization servers, on Maverick the X servers are started automatically on each node.

Once native X servers are running, several scripts are provided to enable rendering in different scenarios.

  • vglrun: Because VNC does not support OpenGL applications, VirtualGL is used to intercept OpenGL/X commands issued by application code and redirect them to a local native X display for rendering; rendered results are then automatically read back and sent to VNC as pixel buffers. To run an OpenGL/X application from a VNC desktop command prompt:
      c442-001$ vglrun [vglrun options] application [application-args]
  • tacc_xrun: Some visualization applications present a client/server architecture, in which every process of a parallel server renders to local graphics resources, then returns rendered pixels to a separate, possibly remote client process for display. By wrapping server processes in the tacc_xrun wrapper, the $DISPLAY environment variable is manipulated so that each server process renders to the GPU available on its node. For example,
      c442-001$ ibrun tacc_xrun application application-args

    will cause the tasks to utilize the GPU on each node for rendering, but will not render to any VNC desktop windows.

  • tacc_vglrun: Other visualization applications incorporate the final display function in the root process of the parallel application. This case is much like the one described above except for the root node, which must use vglrun to return rendered pixels to the VNC desktop. For example,
      c442-001$ ibrun tacc_vglrun application application-args

    will cause the tasks to utilize the GPU for rendering, but will transfer the root process' graphics results to the VNC desktop.
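
The choice among the three wrappers can be summarized in a short shell sketch. The scenario names and the function name are hypothetical labels introduced only for this illustration; the command prefixes are the ones shown in the list above.

```shell
# Hypothetical helper: print the launch prefix for each rendering scenario.
launch_prefix() {
    case $1 in
        desktop)         echo "vglrun" ;;             # one OpenGL/X application on the VNC desktop
        parallel-server) echo "ibrun tacc_xrun" ;;    # parallel server; pixels go to a separate client
        parallel-root)   echo "ibrun tacc_vglrun" ;;  # parallel application whose root process displays
        *) echo "unknown scenario: $1" >&2; return 1 ;;
    esac
}

echo "$(launch_prefix parallel-server) application application-args"
```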

Visualization Applications

Maverick provides a set of visualization-specific modules, described below.

Running Parallel VisIt on Maverick

VisIt is a free interactive parallel visualization and graphical analysis tool for viewing scientific data on Unix and PC platforms. Users can quickly generate visualizations from their data, animate them through time, manipulate them, and save the resulting images for presentations. VisIt contains a rich set of visualization features so that you can view your data in a variety of ways. It can be used to visualize scalar and vector fields defined on two- and three-dimensional (2D and 3D) structured and unstructured meshes. VisIt was designed to handle very large data set sizes in the terascale range and yet can also handle small data sets in the kilobyte range.

VisIt was compiled under the Intel compiler and the mvapich2 and MPI stacks.

After connecting to a VNC server on Maverick, as described above, load the VisIt module at the beginning of your interactive session before launching the VisIt application:

c221-102$ module load visit
c221-102$ vglrun visit

VisIt first loads a dataset and presents a dialog allowing for selecting either a serial or parallel engine. Select the parallel engine. Note that this dialog will also present options for the number of processes to start and the number of nodes to use; these options are actually ignored in favor of the options specified when the VNC server job was started.

Preparing data for Parallel VisIt

In order to take advantage of parallel processing, VisIt input data must be partitioned and distributed across the cooperating processes. This requires that the input data be explicitly partitioned into independent subsets at the time it is input to VisIt. VisIt supports SILO data, which incorporates a parallel, partitioned representation. Otherwise, VisIt supports a metadata file (with a .visit extension) that lists multiple data files of any supported format that are to be associated into a single logical dataset. In addition, VisIt supports a "brick of values" format, also using the .visit metadata file, which enables single files containing data defined on rectilinear grids to be partitioned and imported in parallel. Note that VisIt does not support VTK parallel XML formats (.pvti, .pvtu, .pvtr, .pvtp, and .pvts). For more information on importing data into VisIt, see Getting Data Into VisIt; though this documentation refers to VisIt version 2.0, it appears to be the most current available.
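
A .visit metadata file of the kind described above can be generated with a few lines of shell. This is a sketch under assumptions: it treats every listed file as one block of a single time step and writes a leading "!NBLOCKS n" line followed by one file name per line, which is how the .visit format is generally documented. The function name and the file names in the usage note are hypothetical, so check the result against the VisIt documentation for your version.

```shell
# Hypothetical helper: write a .visit file that groups several data files
# into one logical dataset (assumed layout: "!NBLOCKS n", then one file per line).
# Usage: make_visit_file OUTPUT.visit FILE...
make_visit_file() {
    out=$1
    shift
    {
        printf '!NBLOCKS %d\n' "$#"      # number of blocks = number of files given
        for f in "$@"; do
            printf '%s\n' "$f"
        done
    } > "$out"
}
```

For example, "make_visit_file dataset.visit part0.vtk part1.vtk" would produce a three-line file that can be opened in VisIt in place of the individual pieces.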

Running Parallel ParaView on Maverick

After connecting to a VNC server on Maverick, as described above, do the following:

  1. Set the $NO_HOSTSORT environment variable to 1:

    csh shell:   login1% setenv NO_HOSTSORT 1
    bash shell:  login1$ export NO_HOSTSORT=1
  2. Set up your environment with the necessary modules:

    If you intend to use the Python interface to ParaView via any of the following methods:

    • the Python scripting tool available through the ParaView GUI
    • pvpython
    • loading the paraview.simple module into python

    then load the python, qt and paraview modules in this order:

     c221-102$ module load python qt paraview

    Otherwise, load just the qt and paraview modules in this order:

     c221-102$ module load qt paraview

    Note that the qt module is always required and must be loaded prior to the paraview module.

  3. Launch ParaView:
     c221-102$ vglrun paraview [paraview client options]
  4. Click the "Connect" button, or select File -> Connect
  5. If this is the first time you've used ParaView in parallel (or failed to save your connection configuration in your prior runs):

    1. Select "Add Server"
    2. Enter a "Name" e.g. "ibrun"
    3. Click "Configure"
    4. For "Startup Type" in the configuration dialog, select "Command" and enter the command:
      ibrun tacc_xrun pvserver [paraview server options]

      and click "Save"

    5. Select the name of your server configuration, and click "Connect"

You will see the parallel servers being spawned and the connection established in the ParaView Output Messages window.

Running IDL on Maverick

IDL (Interactive Data Language) is a popular interpreted language for data processing and analysis. IDL includes:

  • A rich library of high-performance, multi-threaded routines to analyze your data
  • The ability to add your own specialized routines to the library by writing procedures more quickly than in other languages
  • Simple syntax, dynamic data typing, and array-oriented operations
  • Built-in functionality suitable for many data trends, with tools for 2- and 3-dimensional gridding and interpolation, routines for curve and surface fitting, and the ability to perform multi-threaded computations

To run IDL interactively in a VNC session, connect to a VNC server on Maverick as described above, then do the following:

  • load the vis and idl modules:
     c203-112$ module load vis idl
  • launch IDL
     c203-112$ idl

    or launch the IDL virtual machine:

     c203-112$ idl -vm

If you are running IDL in scripted form, without interaction, simply submit a SLURM job that loads IDL and runs your script.
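
A scripted IDL run of this kind might use a job script along the following lines. This is a sketch rather than a TACC-provided template: the generator function, job name, output file name, queue, node count, and time limit are all assumptions chosen for illustration, and passing a batch file name to the idl command is assumed to be how your script is invoked. The #SBATCH options themselves are standard sbatch options.

```shell
# Hypothetical generator for a minimal IDL batch job script.
# $1: project to charge   $2: IDL batch file to run
write_idl_job() {
    cat <<EOF
#!/bin/bash
#SBATCH -J idl_batch          # job name (assumed)
#SBATCH -o idl_batch.%j.out   # output file; %j expands to the job ID
#SBATCH -p vis                # queue (assumed)
#SBATCH -N 1                  # number of nodes
#SBATCH -n 20                 # total tasks: 20 cores on one node
#SBATCH -t 01:00:00           # run time (hh:mm:ss)
#SBATCH -A $1                 # project to charge

module load vis idl
idl $2
EOF
}

write_idl_job My-acct myscript.pro
```

Redirect the output into a file and submit that file with sbatch from a login node.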

If you need to run IDL interactively from an xterm from your local machine outside of a VNC session, you will need to run an interactive SLURM job in the vis queue to allocate a Maverick compute node. To do this, use the SLURM command srun to allocate an interactive shell. This command uses the same arguments as sbatch:

srun -A Acct -n num_cores_requested -p queue -t time --pty /bin/bash -l

for example:

login1$ srun -A My-acct -n 20 -p vis -t 2:0:0 --pty /bin/bash -l

will charge SUs to My-acct and request one node (20 cores) in the vis queue for two hours and run a bash login shell.

Note that any graphics windows opened from this command prompt may be significantly slower than when run through a VNC session.
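
Because Maverick nodes have 20 cores each, the core count passed to srun follows directly from the number of nodes wanted. The helper below is a hypothetical convenience that prints the srun command for a given node count; it assumes whole nodes are requested.

```shell
# Hypothetical helper: print the srun command for an interactive session.
# Usage: interactive_cmd ACCOUNT NODES QUEUE TIME
interactive_cmd() {
    acct=$1; nodes=$2; queue=$3; walltime=$4
    cores=$((nodes * 20))                # 20 cores per Maverick node
    echo "srun -A $acct -n $cores -p $queue -t $walltime --pty /bin/bash -l"
}

interactive_cmd My-acct 1 vis 2:0:0
```

With the arguments shown, the helper prints the same command as the example above.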

GPU Programming

Accelerator (CUDA) Programming

NVIDIA's CUDA compiler and libraries are accessed by loading the CUDA module:

login1$ module load cuda

Use the nvcc compiler on the login node to compile code, and run executables on nodes with GPUs; there are no GPUs on the login nodes. Maverick's K40 GPUs are compute capability 3.5 devices. When compiling your code, make sure to specify this level of capability with:

nvcc -arch=compute_35 -code=sm_35 ...

GPU nodes are accessible through the gpu queue for production work and the devel-gpu queue for development work. Production job scripts should include the "module load cuda" command before executing CUDA code; likewise, load the cuda module before or after acquiring an interactive development GPU node with the "srun" command.

The NVIDIA CUDA debugger is cuda-gdb. Applications must be debugged through a VNC session or an interactive srun session. Please see the relevant srun and VNC sections for more details.

The NVIDIA Compute Visual Profiler, computeprof, can be used to profile both CUDA and OpenCL programs that have been developed in NVIDIA CUDA/OpenCL programming environment. Since the profiler is X based, it must be run either within a VNC session or by ssh-ing into an allocated compute node with X-forwarding enabled. The profiler command and library paths are included in the $PATH and $LD_LIBRARY_PATH variables by the CUDA module. The computeprof executable and libraries can be found in the following respective directories:


For further information on the CUDA compiler, programming, the API, and debugger, please see:

  • $TACC_CUDA_DIR/doc/pdf/CUDA_Compiler_Driver_NVCC.pdf
  • $TACC_CUDA_DIR/doc/pdf/CUDA_C_Programming_Guide.pdf
  • $TACC_CUDA_DIR/doc/pdf/CUDA_Samples.pdf
  • $TACC_CUDA_DIR/doc/cuda-gdb.pdf

Heterogeneous (OpenCL) Programming

The OpenCL heterogeneous computing language is supported on all Maverick computing platforms. The Intel OpenCL environment will support the Xeon processors and Xeon Phi coprocessors, and the NVIDIA OpenCL environment supports the Tesla accelerators.

Using the Intel OpenCL Environment

The Intel OpenCL stack is not yet installed on Maverick. A user news announcement will be sent once it is installed.

Using the NVIDIA OpenCL Environment

The NVIDIA OpenCL environment supports the v1.1 API and is accessible through the cuda module:

login1$ module load cuda

For programming with NVIDIA OpenCL, please see the OpenCL specification.

Use the g++ compiler to compile NVIDIA-based OpenCL. The include files are located in the $TACC_CUDA_DIR/include subdirectory. The OpenCL library is installed in the /usr/lib64 directory, which is on the default library path. Use this path and g++ options to compile OpenCL code:

login1$ export OCL=$TACC_CUDA_DIR
login1$ g++ -I $OCL/include prog.cpp -lOpenCL

Last update: November 25, 2014
