HPC Node "Striker" Unavailability

Incident Report for Gobbato IT Solutions

Resolved

The instance has been fixed.
Posted Mar 12, 2025 - 21:15 CET

Investigating

We encountered another kernel panic on one of our high-performance compute nodes.
The system is currently unstable, and we are unable to export the systems on it.
Posted Mar 09, 2025 - 10:47 CET

Update

We encountered a kernel panic on one of our high-performance compute nodes.
We had to force-reset the node and are investigating the issue and the stability of the system.

[2556371.064301] BUG: unable to handle page fault for address: 0000000000010018
[2556371.064577] #PF: supervisor read access in kernel mode
[2556371.064791] #PF: error_code(0x0000) - not-present page
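
For readers unfamiliar with the log lines above, the following is a minimal sketch of how an x86 page-fault error code such as `0x0000` maps to the wording the kernel prints. The bit positions are the architectural ones for the low five bits; the function name `decode_pf_error_code` is hypothetical and chosen only for illustration, and the sketch ignores higher bits such as protection keys.

```python
# Low five bits of the x86 page-fault error code (architectural definitions).
PF_PRESENT = 0x01  # 0 = page not present, 1 = fault on a present page
PF_WRITE = 0x02    # 0 = read access, 1 = write access
PF_USER = 0x04     # 0 = supervisor-mode access, 1 = user-mode access
PF_RSVD = 0x08     # 1 = reserved bit set in a paging-structure entry
PF_INSTR = 0x10    # 1 = fault caused by an instruction fetch


def decode_pf_error_code(code: int) -> dict:
    """Hypothetical helper: describe the low bits of a #PF error code."""
    if code & PF_INSTR:
        access = "instruction fetch"
    elif code & PF_WRITE:
        access = "write"
    else:
        access = "read"

    if not code & PF_PRESENT:
        cause = "not-present page"
    elif code & PF_RSVD:
        cause = "reserved bit violation"
    else:
        cause = "permissions violation"

    return {
        "mode": "user" if code & PF_USER else "supervisor",
        "access": access,
        "cause": cause,
    }


if __name__ == "__main__":
    # The value logged in this incident.
    print(decode_pf_error_code(0x0000))
```

For the logged value, `decode_pf_error_code(0x0000)` describes a supervisor read of a not-present page, which matches the two `#PF:` lines in the excerpt. It says nothing about the root cause of the panic.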
Posted Mar 09, 2025 - 00:52 CET

Monitoring

The affected system is currently offline and we are investigating.
Posted Mar 09, 2025 - 00:25 CET
This incident affected: Cloud Platforms (Cloud Platform (EGH)).