Message boards : Theory Application : Error while computing
| Author | Message |
|---|---|
Guy Joined: 9 Feb 08 Posts: 64 Credit: 2,244,547 RAC: 1,751 |
All my Theory jobs are failing: https://lhcathome.cern.ch/lhcathome/results.php?userid=95350 For example: run time 5 sec. This line is common to all stderr outputs: msg="The cgroupv2 manager is set to systemd but there is no systemd user session available" I have set up lingering. Help would be welcome. Guy |
Guy Joined: 9 Feb 08 Posts: 64 Credit: 2,244,547 RAC: 1,751 |
Only the "v302.10 docker" apps are failing. The "v301.00 vbox64_theory" apps are ok. Linux kernel 6.12.66, BOINC 8.2.8, i7-4790K, 32GB, m.2, RTX 2060 mini 6GB |
|
Joined: 3 Nov 12 Posts: 97 Credit: 192,167,898 RAC: 63,982 |
|
|
Joined: 5 Apr 25 Posts: 82 Credit: 2,451,771 RAC: 8,756 |
In reply to Guy's message of 27 Mar 2026: Only the "v302.10 docker" apps are failing. The "v301.00 vbox64_theory" apps are ok. I have the same issue and haven't been able to fix it yet, not even with apparently very precise AI help. I still have some ATLAS tasks to finish off, then I'll try again.
|
|
Joined: 18 Dec 15 Posts: 1980 Credit: 160,690,625 RAC: 43,221 |
One of my PCs produces a lot of faulty tasks. They only run for 2 seconds before they fail, and stderr says "wsl_init(): no usable WSL distro": https://lhcathome.cern.ch/lhcathome/result.php?resultid=434332900 Any idea what the reason for this problem might be? |
|
Joined: 9 Apr 23 Posts: 3 Credit: 73,626 RAC: 1,499 |
I am crunching tasks using the LHC Theory application with BOINC 8.2.9 (Alpha test). This is a Windows 11 PC using Docker on WSL. These tasks use Theory Simulation v302.10 (docker). All of them (about 50) have completed successfully within a few minutes to a few hours, except one. This task is still running after 24 hours: https://lhcathome.cern.ch/lhcathome/result.php?resultid=434294656 On this PC, the VmmemWSL process is busy using CPU time and memory. Should I let it keep running? |
|
Joined: 4 Mar 17 Posts: 45 Credit: 12,938,644 RAC: 6,477 |
Click the link to that task. On that page, click Workunit 240197090; at the bottom of the workunit page you will find the "input file" (your browser may complain about an insecure connection). In the first few lines of that input file you can read echo "runspec=boinc pp z1j 8000 70 - herwig7 7.2.1 nlo-dipole 22000 733" Herwig7 tasks are long-running tasks (up to 10 days in some cases). The long-running Herwig tasks are discussed in this thread: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6251 In the BOINC Manager you can select the task and then press the [Show Graphics] button. That should open a web page in your browser where you can see the logs (reload the page to see more of the logs). There it first counts up from "Integrate 1 of 760", and in a second step it reports lines like "99000 events processed" (your linked task should reach 22,000 events after it has finished the "Integrate 1 of 760" step). |
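A minimal sketch of how the runspec line quoted above could be pulled out of an input file and split into its values. The field names used here (`mode`, `beam`, `process`, and so on) are assumptions chosen for illustration and are not confirmed in this thread; only the order of the values comes from the quoted line. The tenth value (22000) matches the figure mentioned in the post, and the seventh names the generator.

```python
import re

# Field names are assumptions for illustration; only the order of the values
# comes from the runspec line quoted in the post above.
FIELDS = ["mode", "beam", "process", "energy", "params", "specific",
          "generator", "version", "tune", "events", "seed"]

def parse_runspec(text):
    """Return the runspec values found in an input file as a dict, or None."""
    # Capture everything after "runspec=" up to the closing quote or line end.
    match = re.search(r'runspec=([^"\n]+)', text)
    if match is None:
        return None
    values = match.group(1).split()
    return dict(zip(FIELDS, values))

sample = 'echo "runspec=boinc pp z1j 8000 70 - herwig7 7.2.1 nlo-dipole 22000 733"'
spec = parse_runspec(sample)
print(spec["generator"], spec["events"])
```

Checking the generator value this way gives a quick hint whether a task belongs to the long-running herwig7 family before deciding to abort it.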
|
Joined: 27 Sep 08 Posts: 935 Credit: 781,495,911 RAC: 76,972 |
BOINC didn't detect podman, which it needs in order to run this work: https://github.com/BOINC/boinc/wiki/Installing-Docker-on-Windows See the bottom of that page for what should appear in the logs. Looking at this computer, it's set up with VirtualBox, so you need to remove that and set up podman/docker if you want to run these tasks: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10555784 Or, if you don't want to run docker, you can set this in the cc_config:
<cc_config>
<options>
<dont_use_docker>1</dont_use_docker>
</options>
</cc_config>
I suspect this is an issue in 8.2.9: with 8.2.8 you would not be sent docker WUs since you didn't have the right config, whereas now you are sent both. |
|
Joined: 9 Apr 23 Posts: 3 Credit: 73,626 RAC: 1,499 |
Thank you @Toggleton! I will keep that task running. I did glance at the work unit's input file briefly before. Your helpful guidance about the metadata helps me know what to look for and is appreciated. |
Magic Quantum Mechanic Joined: 24 Oct 04 Posts: 1302 Credit: 95,799,211 RAC: 21,759 |
In reply to Toby Broom's message of 31 Mar 2026:
I just found a host with BOINC version 8.2.4 that got a Docker task, although it is only supposed to get VirtualBox tasks: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10621464 In fact it got lots of them and is still getting more: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10621464 |
|
Joined: 18 Nov 17 Posts: 135 Credit: 59,156,305 RAC: 1,368 |
A lot of my Theory tasks fail with the error: -163 (0xFFFFFF5D) ERR_FILE_MISSING Example: https://lhcathome.cern.ch/lhcathome/result.php?resultid=434572037 |
|
Joined: 15 Jun 08 Posts: 2742 Credit: 302,356,212 RAC: 85,603 |
Your computer reports an outdated VirtualBox version. Please consider upgrading to a recent version. 2026-04-09 09:23:53 (45684): Detected: VirtualBox VboxManage Interface (Version: 5.2.44) |
|
Joined: 9 Apr 23 Posts: 3 Credit: 73,626 RAC: 1,499 |
Besides herwig, other tasks that can run for a long time include powheg-box and pythia. |
|
Joined: 10 Jul 18 Posts: 1 Credit: 857,859 RAC: 12,418 |
All Theory tasks are failing at random times, even after 3 hours of computation, with no stderr output. Any insight into what might be going on here? Is it something on my end? BOINC 8.2.9, Theory Simulation 302.10 (Docker) |
|
Joined: 15 Jun 08 Posts: 2742 Credit: 302,356,212 RAC: 85,603 |
In reply to Dave Studdert's message of 16 Apr 2026: All Theory tasks failing at random times even after 3 hours of computation with no Stderr output.
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=11089134 This computer did not return any log, did it? Hence, you may need to ensure that docker works independently of BOINC, or use Podman instead, like on your other computer.
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=11077476 This computer reports what's going wrong:
time="2026-04-15T15:21:03+09:30" level=warning msg="The cgroupv2 manager is set to systemd but there is no systemd user session available"
time="2026-04-15T15:21:03+09:30" level=warning msg="For using systemd, you may need to log in using a user session"
time="2026-04-15T15:21:03+09:30" level=warning msg="Alternatively, you can enable lingering with: `loginctl enable-linger 112` (possibly as root)"
time="2026-04-15T15:21:03+09:30" level=warning msg="Falling back to --cgroup-manager=cgroupfs"
So, either set 'cgroup_manager="cgroupfs"' in the [engine] section of the boinc user's containers.conf, or enable lingering for user ID 112 as shown above. A BOINC client restart may not be required but is recommended.
Additional hint: both computers report 40 cores. Please ensure you run a local CVMFS client on each of them. If no CVMFS client is present on the host, each Theory container will use its internal CVMFS. So, 40 containers on a host will increase the network load by a factor of 40. |
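A minimal sketch of how a task's stderr text could be checked for the cgroup warning quoted above, extracting the user ID from the `loginctl enable-linger` hint. The function name `diagnose_cgroup_warning` is hypothetical and not part of BOINC or Podman; the sample text is copied from the log lines in the post.

```python
import re

def diagnose_cgroup_warning(stderr_text):
    """Return the user id from the enable-linger hint, or None if not found."""
    # Only act when the systemd user session warning is present at all.
    if "no systemd user session available" not in stderr_text:
        return None
    match = re.search(r"loginctl enable-linger (\d+)", stderr_text)
    return int(match.group(1)) if match else None

# Two of the warning lines quoted in the post above.
sample = (
    'time="2026-04-15T15:21:03+09:30" level=warning msg="The cgroupv2 manager '
    'is set to systemd but there is no systemd user session available"\n'
    'time="2026-04-15T15:21:03+09:30" level=warning msg="Alternatively, you can '
    'enable lingering with: `loginctl enable-linger 112` (possibly as root)"\n'
)

uid = diagnose_cgroup_warning(sample)
if uid is not None:
    print(f"Suggested fix: loginctl enable-linger {uid}")
```

Reading the user ID from the log avoids guessing which account the BOINC client runs under, since that ID can differ between hosts.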
©2026 CERN