Introduction to Wrangler

Wrangler is an innovative new XSEDE resource hosted at TACC, designed from the ground up to support data-driven workflows that are limited by the I/O rates, memory, and storage policies of most current XSEDE resources. Leveraging a novel high-speed flash-based storage system, Wrangler will provide up to 1 TB/s and 200 million IOPS, accessed directly via PCI across the 96 compute nodes at TACC. Alongside this component, Wrangler provides users with 10 PB of long-term storage replicated between TACC and Indiana University. Wrangler will support more traditional data-intensive jobs currently running on other HPC resources, such as R- and Python-based workflows. It will also host Hadoop MapReduce capabilities, SQL and NoSQL databases, and tools to help users better manage their data.

In this webinar, we will touch on the hardware components that make up Wrangler and how they will be allocated to users. We will explain how the Hadoop and database features will be supported on the system, and cover the tools available for more traditional workflows that can run on current HPC systems but may run significantly better in the Wrangler environment. Finally, we will talk about the ways Wrangler will help users manage their data collections.

The online chat session is available here:

The slides are available here: