Updates


Event Date Summary
The planned upgrade to the project file system during the May 25th outage has been postponed. The backup system on Graham was down for a week and is still processing a backlog of files, and we do not want to upgrade the project file system without a complete backup of all data. We will still use the outage to perform other needed maintenance tasks.

Incident description

System | Incident status | Start Date | End Date
Graham | Closed | |
Created by Kaizaad Bilimorya on

Title


Planned Outage - Arrêt planifié


Summary


Starting Wednesday, May 25th, 2022, at 9 am ET, the Graham cluster will be unavailable to all users while we upgrade the project file system. All running jobs will be terminated, but queued jobs will remain in the queue. Queued jobs that would run past the beginning of the outage will not be allowed to start until after the maintenance. The work is expected to be completed by Thursday, May 26th, 2022, at 10 am ET.
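
To illustrate the rule for queued jobs, here is a minimal Python sketch that checks whether a job would finish before the outage begins. The function name `fits_before_outage` and the fixed UTC-4 offset for Eastern time in late May are assumptions made for this example; the sketch only restates the rule described above and is not the scheduler's actual logic.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Eastern Daylight Time (UTC-4) applies on these dates.
ET = timezone(timedelta(hours=-4), "EDT")

# Outage start from the notice: 9 am ET, Wednesday, May 25, 2022.
OUTAGE_START = datetime(2022, 5, 25, 9, 0, tzinfo=ET)


def fits_before_outage(start, walltime, outage_start=OUTAGE_START):
    """Return True if a job starting at `start` with the requested
    `walltime` would end no later than the outage start."""
    return start + walltime <= outage_start


# A 12-hour job starting at 8 pm ET on May 24 ends at 8 am ET on May 25.
print(fits_before_outage(datetime(2022, 5, 24, 20, 0, tzinfo=ET),
                         timedelta(hours=12)))
```

Under these assumptions, a job whose requested run time would carry it past 9 am ET on May 25 stays queued until after the maintenance, so requesting a shorter run time may let it start sooner.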

At the same time, we will reduce the maximum job duration on Graham to 7 days. This matches the scheduler configuration of the Narval and Beluga clusters. Cedar will continue to accept jobs longer than 7 days.

Please watch https://status.computecanada.ca for updates on the availability of Graham and all other national systems.


We are upgrading the project file system to provide better overall performance. The upgrade is not expected to cause data loss; however, as with any file system maintenance, issues could arise. If they do, we can recover files from the daily backups.

This outage will affect the cluster, login nodes, visualization nodes (VDI), and data transfer nodes (DTN). The Graham cloud will not be affected.

        Start Time: 9 am ET, Wednesday, May 25, 2022
        Anticipated End Time: 10 am ET, Thursday, May 26, 2022

Users will be notified by email when the cluster is up and running again.
For questions, or assistance migrating to other national systems, please email support@computecanada.ca


Updated by Kaizaad Bilimorya on