Frontera /scratch1 filesystem Friday 15 May 2020

Posted by Mark Brueschke on May 15, 2020 10:50:47 AM

Updated on May 15, 2020 4:08:36 PM

The /scratch1 filesystem recovery has completed and all of the compute queues are now open for normal production.

Please submit a consulting ticket if you continue to see errors or problems with any files in the /scratch1 filesystem.

Original Posting

Two of the Lustre storage targets in Frontera's /scratch1 filesystem are currently offline due to errors. We are working with the vendor to restore them, but this might take some time. For now, the compute queues have been closed and the offline targets have been deactivated on the login nodes to prevent hangs when using the filesystem. Users will still get errors if they attempt to access files residing on the offline targets. If the repair takes more than a few hours, we will deactivate the targets on the compute nodes and re-open the queues; however, jobs will fail if they try to access a file on those targets. New files can be created on the other /scratch1 storage targets without any errors.