NOTICE: Jetstream is in the beginning phases of production and staff are frequently updating the documentation. Please refer to the latest version of the Jetstream User Guide at Indiana University during this early user period.

 

Go here: Jetstream User Guide - latest version

Jetstream User Guide

System Overview

Jetstream is a user-friendly cloud environment designed to give researchers access to interactive computing and data analysis resources on demand, whenever and wherever they want to analyze their data. It provides a library of virtual machines designed for discipline-specific scientific analysis. Software creators and researchers will also be able to create their own customized virtual machines (VMs) or their own private computing system within Jetstream.

Jetstream features a web-based user interface based on the popular Atmosphere cloud computing environment, developed by CyVerse (formerly known as the iPlant Collaborative) and extended to support science and engineering research generally. The operational software environment is based on OpenStack.

Jetstream is accessed via web interface (https://use.jetstream-cloud.org/) using XSEDE credentials via Globus Auth.

Jetstream is meant primarily for interactive research, small-scale processing on demand, or as the backend to science gateways to send research jobs to other HPC or HTC resources. Jetstream is different in that you can work with GUIs that you couldn't otherwise use on most HPC systems.

Jetstream may be used for prototyping and for creating tailored workflows, either to run at small scale with a handful of CPUs or to port to larger environments once your proof-of-concept work is done. Jetstream is not a typical High Performance Computing (HPC) or High Throughput Computing (HTC) environment and is not intended for large-scale parallel processing or high-throughput computing.

Key Features

Jetstream utilizes Atmosphere, an easy-to-use, on-demand web application environment designed to accommodate computationally intensive and data-intensive research tasks. It offers Infrastructure as a Service (IaaS) with advanced APIs; Platform as a Service (PaaS) for developing and deploying software to the science community; and Software as a Service (SaaS).

Some of the key features include:

  • Access virtual machine images preconfigured with an operating system and software to help you do scientific computations in domain-specific tasks
  • Find and use tools with the intuitive self-service portal
  • Easily manage virtual machines
  • Publish your own software suites, create your own work environments, and run the software for community use
  • Integrate with existing infrastructure components using API services
  • Easily generate and manage statistical reporting of user resources for total CPU hours and memory usage, total instances and applications launched by user, cloud monitoring, and on-demand intelligent resource allocation

Within Atmosphere, you launch an instance (a launched image of a virtual machine), selecting from the list of available images (a template of a virtual machine containing an installed operating system, software, and configuration). It is recommended that you familiarize yourself with the Linux command-line as some actions require some degree of knowledge of the command-line interface.

System Configuration

The computing environment consists of two homogeneous clusters, at Indiana University and the Texas Advanced Computing Center (TACC), with a test environment at the University of Arizona. The system provides over half a petaFLOPS of computational capacity and 2 petabytes of block and object storage. Each node contains two Intel "Haswell" processors, 128 GB of RAM, 2 terabytes of local storage, and 10 gigabit Ethernet networking. The system leverages 40 gigabit Ethernet for network aggregation, and each production cluster connects to Internet2 at 100 Gbps. The physically distributed system allows Jetstream to be highly available and resilient. Globus is used for large-scale file transfer, and Globus Auth for authentication.

System Specs

Table 1. System specs

System Configuration   Aggregate information                 Per Node (Compute Node)
Machine type           Dell (Various)                        Dell M630 Nodes
Operating system       RHEL/CentOS/Ubuntu                    CentOS
Memory model           N/A                                   N/A
Processor cores        7,680 per site                        24
CPUs                   640 per site                          2
Nodes                  320 per site                          -
RAM                    40 TB                                 128 GB
Network                100 Gbps to Internet2                 10 Gbps
                       40 Gbps x 4 to local infrastructure
                       10 Gbps to XSEDE
Local Storage          640 TB                                2 TB

Storage information    Aggregate information                 Per Node
File systems           Block storage for VMs, other          Varies
                       storage options forthcoming
Total disk space       640 TB Local                          2 TB local
                       960 TB Attached Storage
Total scratch space    N/A                                   N/A

XSEDE Service Units and Jetstream

Jetstream allocations are measured in XSEDE Service Units (SUs). On Jetstream, SUs are consumed at a rate of 1 SU per vCPU hour. For more information on XSEDE Service Units, see the XSEDE KB document "On XSEDE, how are compute jobs charged?"

SU cost per hour for each Jetstream VM size is outlined in Virtual Machine (VM) Sizes and Configurations.

Virtual Machine (VM) Sizes and Configurations

Jetstream offers several virtual machine (VM) sizes, each charged in service units (SUs) according to how much of the total system resource it uses. The table below outlines the VM sizes created for Jetstream.

Table 2. VM Sizes

VM Size   vCPUs   RAM (GB)   Local Storage (GB)   SU cost per hour
Tiny      1       2          8                    1
Small     2       4          20                   2
Medium    6       16         60                   6
Large     10      30         120                  10
XLarge    22      60         240                  22
XXLarge   44      120        480                  44

This allocation information may be subject to changes in the future.

If your work requires 24 GB of RAM and 60 GB of local storage, then you would request 10 SUs per hour to cover a single Large VM instance.

If your work requires 10 GB of local storage in 1 thread using 3 GB of RAM, then you would request 2 SUs per hour for a Small VM instance. You would then multiply by the number of hours you will use that size VM in the next year and multiply by the number of VMs you will need.
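The two sizing examples above can be checked mechanically. The following is a minimal sketch, not an official Jetstream tool: the function name pick_vm_size is hypothetical, and the rows are copied by hand from Table 2, so they would need updating if the VM sizes change.

```shell
# Print the smallest VM size from Table 2 that meets the given needs,
# followed by its SU cost per hour.
# Usage: pick_vm_size MIN_VCPUS MIN_RAM_GB MIN_DISK_GB
pick_vm_size() (
    need_cpu=$1; need_ram=$2; need_disk=$3
    # Columns: name, vCPUs, RAM (GB), local storage (GB); smallest size first.
    while read -r name cpu ram disk; do
        if [ "$cpu" -ge "$need_cpu" ] && [ "$ram" -ge "$need_ram" ] &&
           [ "$disk" -ge "$need_disk" ]; then
            # 1 SU per vCPU hour, so the hourly cost equals the vCPU count.
            echo "$name $cpu"
            exit 0
        fi
    done <<'EOF'
Tiny 1 2 8
Small 2 4 20
Medium 6 16 60
Large 10 30 120
XLarge 22 60 240
XXLarge 44 120 480
EOF
    echo "no single VM size fits these requirements" >&2
    exit 1
)

pick_vm_size 1 24 60   # prints "Large 10"
pick_vm_size 1 3 10    # prints "Small 2"
```

The function body runs in a subshell, so its variables and the early exit do not affect the calling shell.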

To calculate the number of SUs you will need in the next year, first estimate the number of hours you expect to work on a particular project. For example, if you typically work 40 hours per week and expect to spend 25% of your time on this project that would be 10 hours per week. Next, calculate the total number of hours per year for this project:

Total hours = 10 hours per week * 52 weeks per year
Total hours = 520

Finally, calculate the total SUs for the year for a single Medium VM instance:

Total SUs = 520 hours per year * vCPUs
Total SUs = 520 hours per year * 6 vCPUs
Total SUs = 3120

If your project requires more than one Medium VM, multiply the total SUs by the number of VMs that you will need:

Total SUs needed for 3 medium size VMs = 3 * 3120
Total SUs = 9360
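The same arithmetic can be scripted so that it is easy to repeat with different assumptions. This is a small sketch; the function name estimate_sus and its argument order are illustrative choices, not part of any Jetstream or XSEDE tool.

```shell
# Estimate yearly SUs at the rate of 1 SU per vCPU hour.
# Usage: estimate_sus HOURS_PER_WEEK WEEKS_PER_YEAR VCPUS NUM_VMS
estimate_sus() {
    echo $(( $1 * $2 * $3 * $4 ))
}

estimate_sus 10 52 6 1   # one Medium VM: prints 3120
estimate_sus 10 52 6 3   # three Medium VMs: prints 9360
```

The estimate only holds if each VM is running for the stated hours and is not consuming SUs the rest of the time.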

Shut down your VM properly. The calculations above assume that your VM is shut down properly whenever it is not in use.

For information on submitting a Research Allocation Request, please see https://portal.xsede.org/successful-requests. Note that all allocations above the startup level require a strong justification for the time being requested.

System Access

After you have provisioned a VM you can access it by using:

  • the Jetstream web interface, https://use.jetstream-cloud.org
  • a VNC desktop interface (if a graphical user interface has been built into the image)
  • ssh to a publicly accessible IP address (requires provisioning your VM with such an IP address and setting up sshd)

Find the public IP assigned to your instance from the VM command line. At this time, all VMs are provisioned with a public IP address, but this will change in the future. To find the public IP address for your VM, use wget from the command line:

$ wget http://ipinfo.io/ip -qO -
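If you use this lookup in a script, it may be worth checking that the reply actually looks like an IP address before relying on it. The sketch below wraps the same wget call; the function names is_ipv4 and show_public_ip are hypothetical, and the timeout values are arbitrary choices.

```shell
# Succeeds only if $1 has the shape of a dotted-quad IPv4 address.
# (Shape check only: it does not verify that each number is <= 255.)
is_ipv4() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'
}

# Ask the lookup service used above for this machine's public address.
# -T 10 sets a 10-second timeout and -t 1 disables retries.
show_public_ip() {
    ip=$(wget -T 10 -t 1 -qO - http://ipinfo.io/ip)
    if is_ipv4 "$ip"; then
        echo "Public IP: $ip"
    else
        echo "Could not determine the public IP" >&2
        return 1
    fi
}
```

Run show_public_ip on the VM itself; on a machine without network access it reports a failure instead of printing an empty line.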

Step-by-step guide

  1. To start the VM provisioning process, navigate to https://use.jetstream-cloud.org.
  2. Click Login in the top right to authenticate using your XSEDE credentials.

  3. On the Globus Auth screen click Continue.

  4. Enter your XSEDE credentials.

  5. After you type in your XSEDE username and password, it will ask you to confirm whether you will allow your credentials to be used to access Jetstream. If you wish to use Jetstream, click Allow. You may wish to review the terms of service and privacy policies linked on that page. Generally, you will only see this screen the first time you log into Jetstream. However, changes to Globus Auth might mean you see this screen on a later login to Jetstream.

  6. To proceed, click Allow and the web interface to Jetstream will load.

  7. Once you are authenticated via Globus Auth, you will end up on the Jetstream landing page, also called the Dashboard. On this page you will be able to:

    • launch a new instance
    • browse help resources
    • change your settings
    • see your resources and usage history
    • view a Jetstream Community Activity feed

Adding SSH keys to the Jetstream Atmosphere environment

While Jetstream provides a web-based terminal for accessing your VM once it has been deployed, you might find that you wish to access your VM via SSH if you've provisioned it with a routable IP address. Please note that during early operations, all IP addresses offered will be routable; this will change in production.

If you need assistance creating SSH keys, please refer to the XSEDE KB article "How do I set up SSH public-key authentication to connect to a remote system?" Please note that to get your keys on Jetstream, you will not follow the instructions in that article on placing your keys on a remote system.
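For readers who already have OpenSSH installed, key creation can be sketched as follows. This is an illustration rather than the procedure from the KB article: the function name make_keypair, the file name, the RSA key type, and the 4096-bit size are all assumptions.

```shell
# Create a key pair without overwriting an existing one.
# Usage: make_keypair PRIVATE_KEY_PATH [COMMENT]
make_keypair() {
    if [ -e "$1" ]; then
        echo "refusing to overwrite $1" >&2
        return 1
    fi
    # -N "" sets an empty passphrase to keep this sketch non-interactive;
    # for real use, drop -N and enter a passphrase when prompted.
    ssh-keygen -q -t rsa -b 4096 -N "" -C "${2:-jetstream}" -f "$1"
}

# Example (hypothetical file name):
#   make_keypair "$HOME/.ssh/jetstream_rsa"
#   cat "$HOME/.ssh/jetstream_rsa.pub"   # the PUBLIC key to paste into Jetstream
```

Only the contents of the .pub file are pasted into the Jetstream Settings screen; the private key stays on your own machine.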

Step-by-step guide

  1. To add your ssh key(s) to Jetstream, click on your username in the upper right hand corner and then click Settings.

  2. On the Settings screen, under Advanced, click Show More, to expand the section for adding your SSH key. Check the box that says "Enable ssh access into launched instances" and then click the green plus sign to actually add your key.

  3. On the next screen give the key a descriptive name and then paste the contents of your PUBLIC ssh key into the dialog box.

  4. After you have pasted in your SSH key, click Confirm. You will then be back at the Settings screen with your key shown in the SSH Configuration section.

Launching your VM

  1. To get started using a Jetstream virtual machine, click Launch New Instance from the Dashboard screen. This will take you to a search screen where you can search by the name, description, or tags of the image you would like to use, or scroll through all of the images that you have permission to use.

    For instance, if you want images named or tagged with "CentOS", enter that text in the search bar. The search is not case sensitive. Once you find an image that you wish to use, click on the name or icon and it will take you to the image information screen.

  2. On the image information screen, you will see more details on the image, such as version history and what systems it is available on (Indiana, TACC, or both). On this screen you may add an image (photo) to your project, click the star to save it as a favorite image, or launch the image.

    Click Launch to begin the process of creating an instance.

  3. Give your instance a name, select the version if there are multiple versions available, and choose which provider you want to run on, Indiana or TACC. Click Continue.

  4. Choose the instance size. This determines the vCPUs, memory, and disk size for the VM. See Table 2 (VM Sizes) for the available options and the SUs consumed per hour. Check the projected resource usage, then click Continue to move to the next screen.

  5. Select or create a project to hold this instance. If you have any existing projects, they will be shown here and you can select one. If you don't have any existing projects, click Create New Project and fill in Project Name and Description. A detailed description is optional, but it is recommended to include any grant names or other easily identifying details so others working with you may easily find it. Click Create to create the new project.

  6. On the project selection screen, click Launch to start the initialization of your instance.

  7. On the last screen, review the choices for provisioning your instance. If you need to make changes, click Back to return to previous screens.

  8. If all of your choices are correct, click Launch instance to start the build process.

  9. Below are several screens you might see during the provisioning process.

  10. The instance will be ready for use when you see a green dot and "Active" in the Status column.

Please note that it may take some time for instances to become active, 5-10 minutes on average. The start up time also depends on how busy the system is and on the size of the VM you requested.

Logging in to your VM

Jetstream allows multiple methods for accessing and using your VM. The 'Open Web Shell' link available on https://use.jetstream-cloud.org is the preferred method. If 'Open Web Shell' is unavailable in Atmosphere, log in to your instance via SSH for your operating system.

Logging in with Web Shell

  1. Log into Jetstream via the web interface, https://use.jetstream-cloud.org, and launch the instance.
  2. When the status shows as Active, click the Open Web Shell link found on the lower right hand side of the screen. If this link is unavailable try refreshing your window. If the link is still not enabled, log in to your instance via SSH for your operating system.
  3. Enter your XSEDE username and password.
  4. Click Connect and then enter your password again.

    A successful Web Shell login will look similar to the following.

Logging in with SSH

SSH login for Mac OS X

  1. If you haven't already done so, add SSH keys to your account.
  2. Copy the instance IP address, either from the confirmation email or from the IP address displayed in the My Instances list.
  3. Open a terminal window for Mac OS X (from Finder, go to Applications, click Utilities, and then double-click Terminal).
  4. In the terminal window, enter the following command, using your XSEDE username and password, and the instance IP address:

    $ ssh xsede_username@ip_address
  5. Press Enter.
  6. If you need to log in as the root user, ssh in as root:

    $ ssh root@ip_address

A successful login will look similar to the following:

SSH login for Windows using PuTTY

You can use PuTTY for logging in to an SSH window on a Windows operating system. PuTTY is an SSH client for Windows, and operates a bit differently than Terminal to make the initial SSH connection. For a useful guide to using PuTTY, see PuTTY - Remote Terminal and SSH Connectivity.

  1. If you haven't already done so, add SSH keys to your account.
  2. Copy the instance IP address, either from the confirmation email or from the IP address displayed in the My Instances list.
  3. Download the PuTTY application.
  4. Launch PuTTY.
  5. Enter your XSEDE username.
  6. Enter the IP address, either copied from your My Instances list or from the confirmation email, and click Connect.
  7. Enter your XSEDE password and press Enter.
  8. To log in as the root user, make the username 'root' instead of your XSEDE username.

Logging in with VNC desktop

VNC is only available on certain images. Please look for the GUI or Desktop tags on the Featured images.

For VNC launches:

  1. Once your instance comes up, login via ssh.
  2. Become root with 'su', or prefix the next command with 'sudo'.
  3. Change the password with 'passwd your_username'.
  4. Use RealVNC Viewer (free) to connect to x.x.x.x:1 (your_ip:1) with your username and the password you just set.

    Please note that the default VNC session will only work for the user that launched the instance. Multiple virtual machine users, each with an individual desktop, are not currently supported; a solution is in development (no ETA presently).

  5. When finished with VNC, please disconnect your VNC session via RealVNC Viewer.

If you log out via the window manager, your X session may be left in an unusable state and require restarting or rebooting the VM.

Customizing and saving a VM

An image is a type of template for a virtual machine (VM).

You can launch an instance, install the software and files you want to use, then request an image of the instance. This will save all of the changes and updates within Atmosphere. Saving instances as images saves resources: because a saved image can be relaunched at any time, the instance does not need to keep running, and consuming resources, when it is not being used.

After you submit the image request form, the Jetstream Atmosphere support staff will review and process the request. Future versions of Atmosphere will allow users to initiate the VM imaging process automatically.

Important notes before you request an image

Please note that these instructions are very much under review and may not be entirely functional during early operations.

About your /home/ directory: All files, directories, and icons located under /home/username/Desktop will be deleted. To preserve them, email help@jetstream-cloud.org. Any files installed in /home must be saved to your volume, to iRODS, or to another storage device or system external to your VM.

Here are some tips to help ensure a viable importable image:

  • Operating system: Base the image on
    • CentOS 6, CentOS 7, and later
    • Ubuntu 14.04 and later, Long Term Support (LTS) versions of Ubuntu recommended
  • File system: Ext3, Ext4, or XFS.
  • Image format: RAW or QCOW2.
  • Software: Image must contain no licensed software that would prohibit use within a cloud or virtualized environment.

Before submitting a request for an image of your instance, remove the following software from the instance:

  • Licensed software that was not purchased by Jetstream.
  • Software whose licensing otherwise prevents use within a cloud or virtualized environment.

The following directories are deleted as part of the imaging process:

  • /home/
  • /mnt/
  • /tmp/
  • /root/

Volumes and iRODS FUSE mounts are not copied as part of the image.

The following system files are typically overwritten by the Jetstream imaging process for security and operational reasons:

  • /etc/fstab
  • /etc/group
  • /etc/hosts.allow
  • /etc/host.conf
  • /etc/hosts.deny
  • /etc/hosts
  • /etc/ldap.conf
  • /etc/passwd
  • /etc/resolv.conf
  • /etc/shadow
  • /etc/sshd/
  • /etc/sysconfig/iptables
  • /root/
  • /var/log

Request an image of an instance

You can request an image (a type of template for a virtual machine) of a running instance. This saves a complete copy of all changes and updates made to the instance since it was launched, so it can be reused at any time. It also saves resources, because you launch the instance only when you need it.

You also can see the list of all image requests you have made.

You can add a script before requesting the image that executes after an instance using the image is launched and active.

To get started:

  1. Log in to the Jetstream web interface at https://use.jetstream-cloud.org/.
  2. Detach all attached volumes from the instance. Detailed instructions for this will be coming later.
  3. Click Projects on the menu bar and open the project with the instance to use for the new image.
  4. Click the instance name. The instance must be in Active status.
  5. In the Actions list on the right, click Image.

Image Request - Image Info

The information you provide here will help others to discover this image.

  1. New Image Name (required): Enter the name, up to 30 characters, to assign to the new image.
  2. Description of the Image (required): The description should include key words that concisely describe the tools installed, the purpose of the tools (e.g., This image performs X analysis), and the initial intent of the machine image (e.g. designed for XYZ workshop). Include key words that will help users search for this image.
  3. Image Tags (optional): Click in the field and select tags that will enhance search results for this image. You may include the operating system, installed software, or configuration information (e.g. Ubuntu, NGS Viewers, MAKER, QIIME, etc.). Tags can be added and removed later, if needed.
  4. Click Next.

Image Request - Version Info

Versioning is an important part of the imaging process. Use this information to track how your image changes over time. This information will also be helpful to others that wish to use your image.

  1. New Version Name (required): Enter the new (unique) name or number of the image. Versioning helps users understand how your changes relate to the overall progress of the Application. Versions are alphanumeric (e.g. 2.0-stable, 2.1-beta, 2.2-testing). Limit the name to 30 characters and keep versioning consistent.
  2. Change Log (required): Concisely describe what you've changed in this specific version. This description will help users understand how your application has changed over time.
  3. Click Next.

Image Request - Provider

  1. Select the cloud provider to use for the image. If you would like the image to be available on multiple clouds, email help@jetstream-cloud.org.
  2. Indicate minimum CPU and memory requirements (optional).
  3. Click Next.

Image Request - Privacy

  1. Select the visibility for the image:

    • Public: The image will be visible to all users and anyone will be able to launch it.
    • Private: The image will be visible only to you and only you will be able to launch it.
    • Specific Users: The image will be visible to only you and the users you specify. Only you and those specific users will be able to launch it. If you chose Specific Users, select the users who will be able to launch the image.
  2. Click Advanced Options or Submit.

    Advanced Options will allow you to:

    • Exclude files from the image
    • Add a deployment script
    • Require the user to verify understanding of any license restrictions

Image Request - Exclude Files (advanced option)

Note the list of directories that will automatically be excluded from the image:

  • /home/
  • /mnt/
  • /tmp/
  • /root/

In the box provided, list any additional files or directories to be excluded from the image. Write one path per line.

Image Request - Boot Scripts & Licenses (advanced option)

Deployment scripts are executed when a user launches the image and each time an instance is 'Started', 'Resumed', or 'Restarted'. These scripts should be able to handle being run multiple times without adverse effects.
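One way to make a deployment script safe to run repeatedly is to have every step check the current state before changing anything. The following is a minimal sketch of that idea; the helper name ensure_line, the WORKDIR variable, and the paths are hypothetical and are not taken from Atmosphere.

```shell
# Append LINE to FILE only if FILE does not already contain it exactly.
# Usage: ensure_line FILE LINE
ensure_line() {
    touch "$1"
    grep -qxF -- "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Hypothetical setup steps; both give the same result on every run.
workdir="${WORKDIR:-/tmp/jetstream-demo}"
mkdir -p "$workdir/data"    # -p: no error if the directory already exists
ensure_line "$workdir/env.sh" 'export PROJECT_DATA="$HOME/data"'
```

Running the script a second time leaves env.sh with a single copy of the line, whereas a plain append with >> would add a duplicate on every start, resume, or restart.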

Click Next to continue to the next screen without adding a new script or a software license.

  1. To add a deployment script, click in the search field and search for the title of the script.
  2. To create a new deployment script:
    • Click Create New Script, enter a title for the script, then either click URL and enter the script URL or click Full Text and enter the deployment script.
    • When done, click Create and Add, then click Next.
  3. To list any licensed software used in the image and require users to agree to the license agreement before launching, click in the search field and search for the license title.
  4. To create a new license:
    • Click Create New License, enter a title for a license, then either click URL and enter the license URL or click Full Text and enter the full license text.
    • When done, click Create and Add, then click Next.

Image Request - Review

On the Review screen, verify the information entered on the previous screens. Click Back to return to the previous screens and make corrections. When all is OK, click the checkbox certifying that the image does not contain any license-restricted software that is prohibited from being distributed within a virtual or cloud environment.

  1. Click Request Image.

You will receive an email from Support when the image is completed. Please email questions to help@jetstream-cloud.org.

Viewing your list of images and image requests

You can view your lists of images and image requests.

  1. Click Images on the top menu bar.
  2. Click:
    • MY IMAGES to view the list of your images.
    • MY IMAGE REQUESTS to view the list of your image requests.

Request that an image be deleted

Currently, the only way to delete (archive) an image you requested is to email help@jetstream-cloud.org. You will receive an email confirmation when your image has been archived.

Shutting down your VM

Linux VMs need to be shut down gracefully and securely. In a GUI environment on Linux, the methods may vary. Launching a terminal and running the command:

$ /sbin/shutdown -h now

as root (or "sudo /sbin/shutdown -h now" if you don't have root but have sudoers rights) should work consistently across Linux versions. Please keep in mind that this will stop all operations and any other active users on your VM will be logged out as the system powers down.

Once shutdown is complete and your VM session has ceased, you will still need to suspend or stop your instance from the Jetstream web interface. Stopping the instance will free up resources but will continue to burn XSEDE Service Units (SUs). You will need to click Suspend from the right side menu in order to completely "turn off" your VM so it stops consuming your allocation. See the screenshot below to see the options available from the Actions menu.

When your VM instance is completely shut down and no longer consuming SUs and resources, the status will have a red dot icon and report itself as "Suspended" as the screen below shows.

Getting Scientific Software

If you are using a CentOS/rpm based VM, you can utilize software packaged by the XSEDE Campus Bridging team for the XSEDE National Integration Toolkit (XNIT).

The XSEDE National Integration Toolkit (XNIT), formerly known as the XSEDE Yum Repository, is a collection of Red Hat Package Manager (RPM) packages assembled to simplify the process of converting a "bare-bones" Linux cluster into a high-performance, parallel computing system that can be used to support scientific discovery. The packages included in the repository are specific versions and builds of scientific, mathematical, and visualization applications recommended by the Extreme Science and Engineering Discovery Environment (XSEDE) for optimal compatibility with XSEDE digital services.

Please see the Knowledge Base entry about XNIT for details on what software is available and how to set up the XNIT repository on your VM.

If you are using a Debian based system such as Ubuntu, you can use the Alien package to convert RPMs to DEB packages for installation. This is not supported and may not perform exactly as expected. For more information on Alien, see Converting .rpm Packages To Debian/Ubuntu .deb Format With Alien.

Globus Connect file transfer tools

Globus Online is a fast, reliable, and secure file transfer service for easily moving data to, from, and between digital resources on the Extreme Science and Engineering Discovery Environment (XSEDE).

Please refer to the Globus Connect Personal GUI documentation or Globus Connect Personal Command Line documentation for detailed instructions on installing and using Globus Connect Personal on your Jetstream VM.

You will need a Globus account (sign up at http://www.globus.org/signup) and a valid SSH key (see instructions here if necessary) prior to the installation of the Globus Connect Personal software.

Troubleshooting

Table 3. Troubleshooting

Problem: VM gets stuck trying to provision the network

Possible solution: Click Projects (on the top menu) and select the project with the instance that is stuck. From the Actions menu on the right, try Redeploy first, Reboot if that fails, and Hard Reboot if the first two actions fail.

If none of these actions work, don't delete the instance. Get the Alias (UID) and the IP address, if there is one, and email that information to Jetstream support.

Linux Resources

Table 4. Linux Resources

An Introduction to Linux   Cornell Virtual Workshop (full access to the training materials requires logging in with XSEDE credentials)
Learn Linux                section from the Linux.com website
HowtoForge                 user-friendly Linux tutorials
Linux Knowledge Base       tutorials, forums, and how-tos for Linux
Learning the Shell         learn the Linux command line

Policies

Good citizenship: Your VM burns SUs for the time it is in operation. It is beneficial to you and other Jetstream users to shut it down when it is not in active use. This frees up resources for other users and also preserves your SUs for future use.

Security: To ensure that you do not inadvertently allow others to access your Jetstream account, please remember to log out from the menu at the top right (where it shows your username).

Glossary

Table 5. Glossary

image                                template of a virtual machine containing an installed operating system, software, and configuration
instance                             launched image of a virtual machine
Infrastructure as a Service (IaaS)   form of cloud computing that provides virtualized computing resources over the Internet
Platform as a Service (PaaS)         cloud computing model that delivers applications over the Internet
Software as a Service (SaaS)         software distribution model in which applications are hosted by a vendor or service provider and made available to customers over a network, typically the Internet

Last update: May 17, 2016