Joined: 25 Mar 24 · Posts: 3 · Credit: 1,191,600 · RAC: 5,619
Thread: ATLAS jobs survive BOINC client restart: CVMFS orphan processes pegging CPU cores and the fix

Hi all,

I'm posting this because it took significant investigation to understand what was happening, and as far as I can tell there is no clear documentation of this behaviour or its mitigations anywhere. If you're running LHC@home ATLAS tasks and have ever noticed a CPU core that won't settle down even after stopping BOINC, this is why.

The Symptom

After stopping the BOINC client, one or more CPU cores remain pegged at 100%. The BOINC interface shows no running tasks, and pgrep boinc returns nothing. But ps aux shows a process, typically python runargs.EVNTtoHITS.py, consuming 200-300% CPU, owned by root, with PID 1 as its parent. BOINC has no idea it exists. It will run indefinitely until it is manually killed or the machine is rebooted.

Root Cause: CVMFS Process Re-parenting

ATLAS tasks run their simulation software via CVMFS (the CernVM File System), a distributed read-only software repository mounted at /cvmfs/atlas.cern.ch/.

At some point in the wrapper execution chain, a process launched via CVMFS gets re-parented to PID 1 (init/systemd). This is standard Unix behaviour: when a parent process exits before its child, the child is adopted by init. In the ATLAS execution chain this re-parenting is a consequence of how the CVMFS-hosted wrapper scripts manage their child processes. The Geant4 simulation process, python runargs.EVNTtoHITS.py, ends up with PID 1 as its parent, completely detached from BOINC's process tree.
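One way to confirm the symptom is to list processes whose parent is PID 1 and whose command line looks like an ATLAS job. This is only a minimal sketch, assuming a Linux procps ps; the names filter_orphans and list_atlas_orphans are hypothetical and are not part of the scripts described in this thread.

```bash
# Hypothetical helpers, not taken from the author's scripts.

# Reads "PID PPID USER %CPU COMMAND..." lines on stdin and keeps those whose
# parent is PID 1 and whose text matches the extended regex given as $1.
filter_orphans() {
    awk -v pat="$1" '$2 == 1 && $0 ~ pat'
}

# Applies the filter to a live process listing.
list_atlas_orphans() {
    ps -eo pid=,ppid=,user=,pcpu=,args= | filter_orphans 'EVNTtoHITS|runargs'
}
```

Running list_atlas_orphans after stopping BOINC should print nothing on a clean system. On setups where another process acts as a subreaper, an orphan may be adopted by that process rather than PID 1, so an empty result is not conclusive on its own.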
The result:

- The process is invisible to any tool that walks the process tree from the boinc PID downward.
- It survives BOINC client stops and restarts entirely.
- It runs as root via CVMFS, consuming 200-300% CPU (multi-threaded).
- It persists until it is manually killed or the machine reboots.

Why KillMode=process Makes It Worse

If you're using the systemd boinc-client.service with KillMode=process in a drop-in (a common recommendation for allowing BOINC to clean up gracefully), this setting explicitly tells systemd to kill only the main boinc process when the service stops, not the entire cgroup. This means even processes that haven't escaped the process tree can persist after boinc-client stops.

The fix is to change to KillMode=control-group (systemd's default) in your boinc-client drop-in:

```ini
# /etc/systemd/system/boinc-client.service.d/docker-dep.conf
[Service]
KillMode=control-group
```

This kills the entire cgroup when the service stops: every process associated with the boinc-client unit, regardless of its position in the process tree. Re-parented processes are caught too, because adoption by PID 1 changes a process's parent but not its cgroup.

Critical caveat: This only works when BOINC is started via systemctl start boinc-client. If BOINC is started manually (e.g. sudo boinc --redirectio &), the process is not in the boinc-client service cgroup and KillMode=control-group has no effect on it.

Detection: Name-Based Process Search

The standard approach to finding BOINC worker processes, walking the process tree from the boinc client PID using pgrep -P recursively, will never find orphaned ATLAS processes. They have PID 1 as their parent and are invisible to tree-based traversal. The workaround is parallel name-based detection.
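To make the limitation of tree-based traversal concrete, here is a minimal sketch of a descendant walk. The get_descendants function referenced in this thread is not reproduced in the post, so the version below is an assumption: it walks a single ps snapshot with awk instead of calling pgrep -P recursively, and descendants_of is a hypothetical helper name.

```bash
# Hypothetical sketch, not the author's implementation.

# stdin: one "PID PPID" pair per line; $1: root PID.
# Prints every PID that descends from the root, sorted numerically.
descendants_of() {
    awk -v root="$1" '
        { parent[$1] = $2 }
        END {
            want[root] = 1
            # Keep sweeping until a pass adds no new descendants.
            do {
                grew = 0
                for (pid in parent)
                    if ((parent[pid] in want) && !(pid in want)) {
                        want[pid] = 1
                        grew = 1
                    }
            } while (grew)
            for (pid in want)
                if (pid != root) print pid
        }' | sort -n
}

# Walks the live process table from the given PID.
get_descendants() {
    ps -eo pid=,ppid= | descendants_of "$1"
}
```

Fed a table in which a process has parent 1, descendants_of never reports it under the BOINC client's PID, which is why a re-parented ATLAS process drops out of any tree-based search.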
In our CPU affinity management script we added a dedicated function:

```bash
get_atlas_pids() {
    # pgrep matches extended regular expressions, so alternation is a bare |
    # (an escaped \| would match a literal pipe character instead).
    pgrep -f "runargs|EVNTtoHITS|AtlasG4|Sim_tf|Gen_tf|python.*atlas|python.*cern" 2>/dev/null
}
```

This runs alongside the normal descendant walk, and the results are merged:

```bash
mapfile -t ALL_DESCENDANTS < <({ get_descendants "$CLIENT_PID"; get_atlas_pids; } | sort -u)
```

Orphaned ATLAS processes are now visible to the affinity manager. They receive CPU affinity assignments and renice -n 19 treatment, which means even an escaped ATLAS simulation running as root at 238% CPU gets pushed to the lowest scheduling priority and pinned to a rotating window of cores rather than hammering the same cores indefinitely.

This doesn't kill the orphans. It manages them, constraining their impact while they run to completion.

Cleanup: Pre-Start pkill

Orphans from a previous session need to be eliminated before a new BOINC session begins. Add the following to the top of your startup script, before launching the boinc daemon:

```bash
sudo pkill -f "runargs.EVNTtoHITS" 2>/dev/null
sudo pkill -f "EVNTtoHITS" 2>/dev/null
```

The 2>/dev/null suppression keeps this silent when there's nothing to kill, which is most of the time. When there is something to kill, running this before boinc starts ensures the new session doesn't inherit stale orphans from the previous one.

Manual Cleanup

To kill an active orphan immediately:

```bash
# Find it
ps aux | grep -i "EVNTtoHITS\|runargs" | grep -v grep

# Kill it
sudo pkill -f "runargs.EVNTtoHITS"
sudo pkill -f "EVNTtoHITS"

# Verify it's gone
pgrep -af "EVNTtoHITS"
```

Known Remaining Limitations

With all mitigations in place the orphan problem is managed but not fully eliminated:

- KillMode=control-group only protects sessions started via systemctl. Manual startup sessions remain exposed to orphan creation if CVMFS re-parents a process mid-session.
- The affinity script detects and manages orphans by name pattern. New ATLAS process names introduced by CERN software updates may need to be added to the get_atlas_pids() pattern list.
- Orphans from a manually started session that wasn't cleanly shut down will persist until the next startup sequence runs the pre-start pkill.

In practice on our setup, orphans are cleaned up at the next start sequence. The combination of name-based detection, pre-start cleanup, and renice management keeps the system stable for long-running sessions.

Summary of Mitigations

| Mitigation | What It Does | Limitation |
|---|---|---|
| KillMode=control-group | Kills entire cgroup on service stop | Only works with systemctl start |
| get_atlas_pids() in affinity script | Detects and manages orphans by name | Requires pattern updates for new process names |
| Pre-start pkill in startup script | Eliminates previous session orphans | Reactive, not preventive |
| renice -n 19 on detected orphans | Limits impact of running orphans | Doesn't terminate them |

System Configuration

Tested on:

- Kubuntu 24.04 LTS
- BOINC 8.2.8
- LHC@home ATLAS tasks via CVMFS
- docker-ce 29.3.0

I'm not aware of a permanent upstream fix for the re-parenting behaviour itself. It appears to be a consequence of how CVMFS-hosted wrapper scripts manage process lifecycle, not a bug in any single component. If anyone has additional context on the CVMFS side of this, I'd be very interested to hear it.

Full documentation including scripts, systemd units, and troubleshooting history: github.com/black-vajra/sable-boinc_admin
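As a footnote to the KillMode=control-group limitation: whether a given process would be caught by a service stop can be checked from its cgroup membership. This is a minimal sketch, assuming Linux with /proc mounted and the unit name boinc-client.service; both function names are hypothetical and do not come from the linked repository.

```bash
# Hypothetical helpers, not taken from the author's scripts.

# Reads the contents of a /proc/<pid>/cgroup file on stdin and succeeds if
# any line places the process under the systemd unit named in $1.
cgroup_has_unit() {
    grep -qF "/$1"
}

# Succeeds if process $1 belongs to unit $2 (default: boinc-client.service).
pid_in_unit() {
    cgroup_has_unit "${2:-boinc-client.service}" < "/proc/$1/cgroup"
}
```

If pid_in_unit fails for the running boinc PID, the client was most likely started outside systemctl, and KillMode=control-group will not apply to it or to anything it spawns.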
©2026 CERN