Stampede2 Maintenance 16 January 2018

Posted by Jacob Getz on Jan 9, 2018 11:24:40 AM

Updated on Jan 23, 2018 10:32:44 PM

The Stampede2 maintenance is complete and the system is back in production. During the maintenance we upgraded the login nodes. This should have little to no adverse impact, but you may need to take minor actions to account for the new login nodes, for example rescheduling cron jobs or updating the known hosts entries on your client.

Updated on Jan 23, 2018 7:27:31 PM

Stampede2 maintenance has been extended. We will provide further updates as they become available.

Updated on Jan 22, 2018 11:04:59 AM

As a reminder, Stampede2 will undergo scheduled system maintenance on Tuesday, 23 Jan 2018 between 8:00AM and 7:30PM CST. This is the maintenance originally scheduled for 16 Jan 2018 that we delayed due to inclement weather. The system will be unavailable during this window.

Planned activities include upgrading the login nodes. This should have little to no adverse impact, but after the maintenance you may need to take minor actions to account for the new login nodes, for example rescheduling cron jobs or updating the known hosts entries on your client.

If you submit a job before the maintenance and the run time you request extends past the start of the maintenance, your job will not start until the maintenance is over. While the job waits, the squeue command will report the reason "ReqNodeNotAvailable" ("Required Node Not Available"). The showq utility will list the job as "BLOCKED" and report its status as "WaitNod" ("Waiting for Nodes"). Note that the hours leading up to the maintenance are an excellent time to submit shorter, smaller jobs that can complete before the maintenance begins: as the queues drain, many nodes will be available, and your wait time may be short.

Updated on Jan 15, 2018 9:51:38 PM

The University of Texas will be closed due to inclement weather tomorrow, January 16, 2018. For this reason, the Stampede2 maintenance scheduled for that date has been rescheduled for Tuesday, January 23, 2018, from 8 a.m. to 7:30 p.m. (CST).

Original Posting

Stampede2 will be unavailable from 8 a.m. to 7:30 p.m. (CT) on Tuesday, 16 January 2018. System maintenance will be performed during this time.

If you submit a job before the maintenance and the time you request exceeds the time remaining until the maintenance begins, your job will not start until the maintenance is over. While the job waits, the squeue command will report the reason "ReqNodeNotAvailable" ("Required Node Not Available"). The showq utility will list the job as "BLOCKED" and report its status as "WaitNod" ("Waiting for Nodes"). Note that the hours leading up to the maintenance are an excellent time to submit shorter, smaller jobs that can complete before the maintenance begins: as the queues drain, many nodes will be available, and your wait time may be short.
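
To make the rule above concrete, here is a minimal Python sketch of the arithmetic: a job can start before the maintenance only if the current time plus the requested time limit does not run past the start of the window. The function names, the constant, and the restriction to the "HH:MM:SS" and "D-HH:MM:SS" time-limit formats are assumptions made for illustration. The scheduler makes the real decision, and a job that passes this check may still have to wait in the queue.

```python
from datetime import datetime, timedelta

# Start of the window from the original posting: 8 a.m. on 16 January 2018
# (local system time). Adjust this if the maintenance is rescheduled.
MAINTENANCE_START = datetime(2018, 1, 16, 8, 0)


def parse_time_limit(limit: str) -> timedelta:
    """Parse a time limit written as HH:MM:SS or D-HH:MM:SS.

    Hypothetical helper: only these two formats are handled here.
    """
    days = 0
    if "-" in limit:
        day_part, limit = limit.split("-", 1)
        days = int(day_part)
    hours, minutes, seconds = (int(part) for part in limit.split(":"))
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def fits_before_maintenance(now: datetime, time_limit: str,
                            maintenance_start: datetime = MAINTENANCE_START) -> bool:
    """Return True if a job starting at `now` would end by `maintenance_start`."""
    return now + parse_time_limit(time_limit) <= maintenance_start


if __name__ == "__main__":
    evening_before = datetime(2018, 1, 15, 20, 0)
    for limit in ("02:00:00", "12:00:00", "1-00:00:00"):
        verdict = "fits" if fits_before_maintenance(evening_before, limit) else "waits"
        print(f"{limit}: {verdict}")
```

Under these assumptions, a job submitted at 8 p.m. the evening before with a 2-hour or 12-hour limit fits in the remaining time, while a 1-day request does not and would wait until the maintenance is over.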