Events:
The cause of the UPS shutdown was an overheat condition. It is not yet clear whether the UPS failed and caused the room to overheat, or the room cooling failed and caused the UPS to overheat first. The shutdown temperature was 42 degrees Celsius, which is lower than the specified maximum temperature of the UPS. Several rectifier modules in the UPS are broken and have to be replaced. Eaton is shipping replacement parts. Until the UPS is operational again, it is bypassed and all calculation nodes are connected directly to the KTH power distribution. All servers are also on a second, separate UPS system.
During debugging of an overload condition on fileserver glycine, the server aborted (crashed) after delivering some debug information.
File systems on glycine are currently being salvaged. Typical duration: 50 minutes.