Zombie Thread after the Workunit is finished.
Author | Message |
---|---|
Joined: 4 Mar 17 Posts: 20 Credit: 8,234,904 RAC: 12,546 |
I have the problem that zombie threads keep running after the workunit has finished and its slot has already been deleted. It very likely started this weekend. In htop I see them as /bin/bash ./runpilot2-wrapper.sh -q BOINC_MCORE -j managed --pilot-user ATLAS --harvester-submit-mode PUSH -w generic --job-type managed --resource-type SCORE_HIMEM --pilotversion 3.7.0.36 -z -t --piloturl local --mute --container They take ~40% CPU time per workunit, so after a few hours they use up quite a lot of CPU time and I need to restart BOINC to get rid of them. They are started around 6-8 minutes after the workunit starts, before the python runargs.EVNTtoHITS.py begins to use the CPU. Could this be something that is broken in the current batch (the last 2 days), or could it be caused by an Arch Linux update or by my currently unstable internet connection? Does anyone else have the same problem? |
Joined: 3 Nov 12 Posts: 36 Credit: 117,967,568 RAC: 128,018 |
+1 |
Joined: 3 Nov 12 Posts: 36 Credit: 117,967,568 RAC: 128,018 |
I see this on Manjaro Linux (an Arch derivative). Tested kernels 6.6 and 6.7, same result. |
Joined: 4 Mar 17 Posts: 20 Credit: 8,234,904 RAC: 12,546 |
It seems the zombie thread problem is gone on my device. It was fixed somewhere between 22 Jan 2024, 16:58:50 UTC (the time the task was sent) and 24 Jan 2024, 17:28:26 UTC (when the task ran). |
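
For anyone who runs into the same symptom, here is a rough sketch of how the leftover wrapper processes could be listed without htop. It is only an illustration, not something taken from the project: the function name `find_leftover_pilots` is made up, and it assumes (unverified) that a leftover runpilot2-wrapper.sh process still has the deleted slot directory as its working directory. On Linux, the `/proc/<pid>/cwd` link reads back with a trailing " (deleted)" in that case.

```python
import os

# Marker taken from the htop command line quoted in the first post.
MARKER = "runpilot2-wrapper.sh"


def find_leftover_pilots(proc_root="/proc", marker=MARKER):
    """Return sorted (pid, cwd) pairs for processes whose command line
    contains `marker` and whose working directory has been deleted.

    Assumption: a leftover wrapper still has the removed BOINC slot
    directory as its cwd. If the wrapper changed directory, it will
    not be found by this check.
    """
    leftovers = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self"
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "cmdline"), "rb") as f:
                # Arguments in cmdline are separated by NUL bytes.
                cmdline = f.read().replace(b"\0", b" ").decode(errors="replace")
            cwd = os.readlink(os.path.join(base, "cwd"))
        except OSError:
            continue  # process already exited, or we lack permission
        if marker in cmdline and cwd.endswith(" (deleted)"):
            leftovers.append((int(entry), cwd))
    return sorted(leftovers)
```

Calling `find_leftover_pilots()` as the user that runs BOINC (or as root) returns the matching PIDs, which can then be checked by hand before stopping anything. The sketch deliberately does not kill processes itself.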
©2024 CERN