Message boards : ATLAS application : Failed tasks not cleaning up and exiting in reasonable time
Glohr

Joined: 13 Jan 24
Posts: 39
Credit: 5,957,635
RAC: 19,527
Message 51915 - Posted: 28 May 2025, 16:48:25 UTC

Today I found two VirtualBox ATLAS tasks stuck in a zombie-like state: still running long after reporting a failure, with almost no CPU use. Their stderr contains:

2025-05-26 22:19:39 (7140): Guest Log: [INFO] Probing /cvmfs/atlas.cern.ch... OK
2025-05-26 22:19:39 (7140): Guest Log: [INFO] Detected branch: prod
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] Failed to copy ATLASJobWrapper-prod.sh
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] VM early shutdown initiated due to previous errors.
2025-05-26 22:21:58 (7140): Guest Log: [DEBUG] Cleanup will take a few minutes...
2025-05-26 23:46:54 (7140): Status Report: Elapsed Time: '6000.000000'
2025-05-26 23:46:54 (7140): Status Report: CPU Time: '31.187500'
[...]
2025-05-28 07:44:48 (7140): Status Report: Elapsed Time: '114000.000000'
2025-05-28 07:44:48 (7140): Status Report: CPU Time: '343.546875'

The other log is similar.
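For anyone who wants to spot this pattern without reading every log by hand, here is a minimal sketch that pulls the last "Status Report" values out of a stderr excerpt and flags a task whose CPU time is a tiny fraction of its elapsed time. The function names and both thresholds (`min_elapsed`, `max_cpu_fraction`) are my own assumptions for illustration, not anything defined by BOINC or the ATLAS wrapper; only the line format is taken from the log above.

```python
import re

# Matches the "Status Report" lines as they appear in the stderr excerpt above.
STATUS_RE = re.compile(r"Status Report: (Elapsed Time|CPU Time): '([0-9.]+)'")


def last_status(stderr_text):
    """Return the most recent (elapsed, cpu) pair in seconds, or None if either is missing."""
    latest = {}
    for line in stderr_text.splitlines():
        match = STATUS_RE.search(line)
        if match:
            # Later lines overwrite earlier ones, so the last report wins.
            latest[match.group(1)] = float(match.group(2))
    if "Elapsed Time" in latest and "CPU Time" in latest:
        return latest["Elapsed Time"], latest["CPU Time"]
    return None


def looks_stalled(stderr_text, min_elapsed=6000.0, max_cpu_fraction=0.01):
    """Heuristic only: long elapsed time combined with very little CPU time.

    Both thresholds are hypothetical defaults, chosen for this example.
    """
    status = last_status(stderr_text)
    if status is None:
        return False
    elapsed, cpu = status
    return elapsed >= min_elapsed and cpu / elapsed <= max_cpu_fraction


sample = """\
2025-05-26 23:46:54 (7140): Status Report: Elapsed Time: '6000.000000'
2025-05-26 23:46:54 (7140): Status Report: CPU Time: '31.187500'
2025-05-28 07:44:48 (7140): Status Report: Elapsed Time: '114000.000000'
2025-05-28 07:44:48 (7140): Status Report: CPU Time: '343.546875'
"""

if __name__ == "__main__":
    print(last_status(sample), looks_stalled(sample))
```

On the excerpt above, the last report works out to roughly 0.3% CPU time relative to elapsed time, which is why the tasks look idle rather than busy cleaning up.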

Cleanup seems to have failed, so I will abort both tasks:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=422888662
https://lhcathome.cern.ch/lhcathome/result.php?resultid=422887835

Other ATLAS tasks have been completing successfully on the same system. Does anyone have an explanation for why these two did not shut down after the failed copy?



©2025 CERN