Updates


Event Date Summary
Cedar is available again as of 2020-04-20, 0:00 (Pacific). On April 15, 2020 it was determined that the Cedar system had been compromised and that crypto-mining binaries were running on the compute nodes. The symptoms of the compromise were slow research programs and an increased load on the compute nodes. After one research group reported that the slowness occurred only between 00:00 and 06:00 (Pacific), the local team discovered, around 2:00 am on April 15, a hidden executable running on the compute nodes. A trace attached to the executable showed that it sent data from the mining operation through a compromised server of the ATLAS project to an IP address in Estonia. The processes were started by a standard cronjob on the Linux system; however, the script executed by that cronjob had been replaced.
On April 16 it was discovered that the initial compromise happened between March 21 and 23 (March 23 being the most probable date) through a compromised user account. That user had sudo privileges on the ATLAS servers. These root privileges were used on the ATLAS storage servers to insert a suid-root binary into one of the NFS-exported directories. Since that directory was mounted on all compute nodes, the intruder was able to gain root privileges there as well and to install the crypto-mining binaries. The two headnodes (cedar1 and cedar5) do not mount this directory and therefore were never compromised.
The crypto-mining operation was disabled just before midnight on April 15 by removing the binary that the cronjob started. On April 16 the compromised user account was disabled and the suid binary in the NFS-exported directory was removed.
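The escalation path described above depended on a suid-root binary sitting in an NFS-exported directory. The following is a minimal Python sketch of how a directory tree could be scanned for such files. It is not the procedure the incident response team used; the function names `is_suid_root` and `find_suid_root` and the example path are hypothetical.

```python
import os
import stat


def is_suid_root(mode, uid):
    """Return True for a regular file that has the setuid bit set and is owned by root."""
    return stat.S_ISREG(mode) and bool(mode & stat.S_ISUID) and uid == 0


def find_suid_root(top):
    """Yield paths of suid-root regular files found under the directory `top`."""
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: inspect the entry itself, not a symlink target
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if is_suid_root(st.st_mode, st.st_uid):
                yield path


if __name__ == "__main__":
    # Hypothetical path; substitute the exported directory to be audited.
    for found in find_suid_root("/path/to/export"):
        print(found)
```

Every hit still needs manual review, since a system may legitimately ship some suid-root programs.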
The following steps were taken to reduce the risk of such incidents in the future:
a) servers where the intruder gained root privileges have been rebuilt; this includes all compute nodes and all ATLAS servers;
b) the NFS client options were changed to include the nosuid flag, which prevents privilege escalation via suid binaries;
c) since the intruder had access to all users’ SSH keys on the compute nodes, all SSH keys were removed and added to a revocation list so that they cannot be used in the future.
Users who use SSH keys to access Cedar need to create new SSH keys. It needs to be emphasized that while the use of SSH keys generally improves security, SSH keys with weak passphrases are a risk for the system; as a rule, the use of SSH keys without a passphrase is not permitted.
The incident response team was able to fully discover and understand the mechanism of the exploit. Nevertheless, a security consulting firm was brought in early in the process. That company advised that crypto-mining intruders usually try to disturb the system as little as possible to avoid detection. For that reason it is unlikely that any user data were affected by this compromise. A breach of privacy is not expected. The incident response team has found no indication that the intruder ever looked at users’ directories and files.
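As an illustration of the nosuid change, the sketch below checks mount-table text in the Linux `/proc/mounts` format and lists NFS mounts whose options do not include `nosuid`. It is an assumption-laden example rather than the check used on Cedar: the function name `nfs_mounts_without_nosuid` and the sample host and paths are hypothetical.

```python
def nfs_mounts_without_nosuid(mounts_text):
    """Return mount points of NFS filesystems whose options lack 'nosuid'.

    `mounts_text` uses the /proc/mounts layout:
    device mount_point fs_type options dump pass
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # blank or malformed line
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type in ("nfs", "nfs4") and "nosuid" not in options.split(","):
            missing.append(mount_point)
    return missing


# Hypothetical sample data for illustration only.
SAMPLE = """\
storage1:/export/project /project nfs4 rw,nosuid,relatime,vers=4.1 0 0
storage1:/export/software /software nfs rw,relatime,vers=3 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""

if __name__ == "__main__":
    print(nfs_mounts_without_nosuid(SAMPLE))
```

On a live Linux node the same function could be given the contents of `/proc/mounts`; an empty result would mean that every NFS mount carries the nosuid option.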

Incident description

System Incident status Start Date End Date
Cedar Closed
Created by Martin Siegert on

Title


Outage - Panne


Summary


Cedar is currently unavailable for emergency maintenance. We will update this incident when we have more details.


Updated by Martin Siegert on