Message boards : Theory Application : Theory in containers

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 420
Credit: 240,048
RAC: 118
Message 52898 - Posted: 28 Jan 2026, 7:49:30 UTC
Last modified: 28 Jan 2026, 8:40:52 UTC

A new version of the Theory app, which runs in containers, is now available as a beta. To run it you will need BOINC client v8.2 or newer, and Podman should also be available on your system. The documentation for this can be found on the BOINC wiki.
ID: 52898
Schizm
Joined: 30 Sep 21
Posts: 2
Credit: 9,032,345
RAC: 23,533
Message 52900 - Posted: 28 Jan 2026, 9:26:17 UTC

Hi,

I have received a bunch of these workunits and they all seem to fail within seconds.
Normally I get native ATLAS workunits and those seem to work fine; e.g. CVMFS and podman seem to be working as intended.

Is there a setting I have to change on my end so that these workunits don't fail?

The gist of the error logs:

time="2026-01-28T09:59:58+01:00" level=warning msg="The cgroupv2 manager is set to systemd but there is no systemd user session available"
time="2026-01-28T09:59:58+01:00" level=warning msg="For using systemd, you may need to login using an user session"
time="2026-01-28T09:59:58+01:00" level=warning msg="Alternatively, you can enable lingering with: `loginctl enable-linger 126` (possibly as root)"
time="2026-01-28T09:59:58+01:00" level=warning msg="Falling back to --cgroup-manager=cgroupfs"
time="2026-01-28T09:59:58+01:00" level=warning msg="The cgroupv2 manager is set to systemd but there is no systemd user session available"
time="2026-01-28T09:59:58+01:00" level=warning msg="For using systemd, you may need to login using an user session"
time="2026-01-28T09:59:58+01:00" level=warning msg="Alternatively, you can enable lingering with: `loginctl enable-linger 126` (possibly as root)"
time="2026-01-28T09:59:58+01:00" level=warning msg="Falling back to --cgroup-manager=cgroupfs"
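The number in the suggested `loginctl enable-linger` command is a UID rather than a user name. A small sketch for pulling it out of a saved log and resolving it to an account follows; `linger_uid_from_log` is a hypothetical helper and `stderr.txt` is an assumed file name.

```shell
#!/bin/sh
# Hypothetical helper: read podman log lines on stdin and print each distinct
# UID that the "enable lingering" hint refers to.
linger_uid_from_log() {
    sed -n 's/.*loginctl enable-linger \([0-9][0-9]*\).*/\1/p' | sort -u
}

# Example (stderr.txt is an assumed file name for the saved task log):
#   uid=$(linger_uid_from_log < stderr.txt)
#   getent passwd "$uid" | cut -d: -f1    # account name that owns this UID
```

If the UID resolves to the boinc account, that is the user whose session podman could not find.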


For now I have disabled getting tasks so as not to flood you with broken workunits.

Kind regards
ID: 52900
Toggleton
Joined: 4 Mar 17
Posts: 37
Credit: 12,435,212
RAC: 6,778
Message 52901 - Posted: 28 Jan 2026, 9:34:22 UTC

On my device (Arch Linux with docker) they run longer and reach the finishing steps, but BOINC counts them as a computation failure:

********* Total number of errors, excluding junctions = 0 *************
********* Total number of errors, including junctions = 0 *************
********* Total number of warnings = 0 *************
********* Fraction of events that fail fragmentation cuts = 0.00000 *********


Generator run finished successfully
INFO: rivet analysis finished: numEvents=100000 crossSection=23.5899
The last line of the log:
data: REF_ALEPH_2004_I636645_d91-x01-y01.dat -> /scratch/dat/ALEPH_2004_I636645-ee-189/zhad-C-aleph1-d91-x01-y01/ALEPH_2004_I636645.dat

https://lhcathome.cern.ch/lhcathome/result.php?resultid=432008470
ID: 52901
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 420
Credit: 240,048
RAC: 118
Message 52902 - Posted: 28 Jan 2026, 9:49:25 UTC - in response to Message 52900.  

For Linux, you might have to enable linger for the boinc user.

sudo usermod --add-subuids 100000-165535 --add-subgids 100000-165535 boinc
sudo loginctl enable-linger boinc


cat /var/lib/boinc/.config/containers/containers.conf
[engine]
cgroup_manager = "cgroupfs"
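To double-check that the settings above took effect, something like the following sketch may help. It assumes the BOINC account is named boinc and that its containers.conf lives at the path shown above; `has_subid_range` and `uses_cgroupfs` are hypothetical helper names, not part of BOINC or podman.

```shell
#!/bin/sh
# Hypothetical helpers for verifying the setup; not part of BOINC or podman.

# Succeeds if USER ($1) has a range in a subid file ($2) such as /etc/subuid
# or /etc/subgid, whose lines have the form "name:start:count".
has_subid_range() {
    grep -q "^$1:[0-9][0-9]*:[0-9][0-9]*\$" "$2"
}

# Succeeds if a containers.conf file ($1) sets cgroup_manager to "cgroupfs".
uses_cgroupfs() {
    grep -Eq '^[[:space:]]*cgroup_manager[[:space:]]*=[[:space:]]*"cgroupfs"' "$1"
}

# Example checks (paths assume the layout used in this thread):
#   has_subid_range boinc /etc/subuid && echo "subuid range present"
#   has_subid_range boinc /etc/subgid && echo "subgid range present"
#   uses_cgroupfs /var/lib/boinc/.config/containers/containers.conf
#   loginctl show-user boinc --property=Linger   # prints Linger=yes when enabled
```

The usermod range 100000-165535 should show up in /etc/subuid as a line of the form boinc:100000:65536.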
ID: 52902
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 1491
Credit: 9,985,849
RAC: 991
Message 52903 - Posted: 28 Jan 2026, 9:55:35 UTC - in response to Message 52900.  

I'm a layman on Linux and containers, but I think you have to set cgroup_manager = "cgroupfs" in containers.conf.
ID: 52903
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 420
Credit: 240,048
RAC: 118
Message 52904 - Posted: 28 Jan 2026, 10:19:33 UTC - in response to Message 52901.  

I can't see any output from the job. This is strange since even if the job failed, we should still see some output from the client or docker wrapper.
ID: 52904
Toggleton
Joined: 4 Mar 17
Posts: 37
Credit: 12,435,212
RAC: 6,778
Message 52905 - Posted: 28 Jan 2026, 10:35:58 UTC - in response to Message 52904.  

Here are some logs of this task https://lhcathome.cern.ch/lhcathome/result.php?resultid=431994641
https://pastebin.com/jGVBsSBT stderr.txt
https://pastebin.com/m7YVeS1G runRivet.log

Once the current tasks have finished I will try switching to podman to see if that works better than docker.
ID: 52905
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 1491
Credit: 9,985,849
RAC: 991
Message 52906 - Posted: 28 Jan 2026, 12:27:24 UTC

@Laurence: Why are you running these 2 commands every 10 seconds:

ps --all -f
and
stats --no-stream --format "{{.CPUPerc}} {{.MemUsage}}"

Would you consider reducing this?
ID: 52906
Toggleton
Joined: 4 Mar 17
Posts: 37
Credit: 12,435,212
RAC: 6,778
Message 52908 - Posted: 28 Jan 2026, 12:54:00 UTC - in response to Message 52905.  

Not sure why docker did not work, but Podman with linger enabled works fine: multiple tasks have finished successfully.
ID: 52908
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 420
Credit: 240,048
RAC: 118
Message 52910 - Posted: 28 Jan 2026, 13:26:37 UTC - in response to Message 52906.  
Last modified: 28 Jan 2026, 13:27:42 UTC

Not sure, this is in the upstream code. Feel free to post the question in the issue tracker.
ID: 52910
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 420
Credit: 240,048
RAC: 118
Message 52911 - Posted: 28 Jan 2026, 13:29:11 UTC - in response to Message 52908.  

Great!
ID: 52911
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 1491
Credit: 9,985,849
RAC: 991
Message 52912 - Posted: 28 Jan 2026, 14:22:10 UTC - in response to Message 52910.  

Thanks Laurence. It is supposed to be a check, every 10 seconds, of whether the job has exited.
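A 10-second exit check of that kind can be sketched roughly as follows. This only illustrates the polling pattern and is not the docker_wrapper's actual code; `poll_until_exit` and `container_is_running` are hypothetical names.

```shell
#!/bin/sh
# Hypothetical sketch of a polling loop; not taken from docker_wrapper.

# Run PROBE ($1) every INTERVAL ($2) seconds until it fails, then print how
# many polls found the job still running.
poll_until_exit() {
    probe=$1
    interval=$2
    polls=0
    while "$probe"; do
        polls=$((polls + 1))
        sleep "$interval"
    done
    echo "$polls"
}

# Example probe (assumes podman and a container name in $NAME): succeeds
# while the container's status starts with "Up".
#   container_is_running() {
#       podman ps --all --filter "name=^${NAME}\$" --format '{{.Status}}' | grep -q '^Up'
#   }
#   poll_until_exit container_is_running 10
```

A longer interval would mean fewer `ps` calls, at the cost of noticing a finished job a little later.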
ID: 52912
[VENETO] boboviz
Joined: 7 May 08
Posts: 266
Credit: 2,118,600
RAC: 2,026
Message 52913 - Posted: 28 Jan 2026, 14:34:39 UTC

Up to now, no problems with my Win11 64bit
ID: 52913
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 420
Credit: 240,048
RAC: 118
Message 52914 - Posted: 28 Jan 2026, 14:43:57 UTC - in response to Message 52912.  

I think I can turn the logging verbosity down to remove this from the logs. For now it is good to see more details just in case we have any issues.
ID: 52914
Schizm
Joined: 30 Sep 21
Posts: 2
Credit: 9,032,345
RAC: 23,533
Message 52916 - Posted: 28 Jan 2026, 17:24:56 UTC - in response to Message 52902.  

I enabled linger locally, but this did not change anything.
Subuids and subgids were set for both the boinc group and the local user.

Using cgroupfs instead of systemd only seems to hide docker (and podman) from the BOINC client (for all projects); as far as I know, cgroupv2 was needed for rootless containers.
When systemd is used, LHC detects podman and World Community Grid detects docker.
BOINC client used on this machine: 8.2.8

podman version 3.4.4
Docker version 28.2.2, build 28.2.2-0ubuntu1~22.04.1


Running docker's hello-world results in this:

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.


How the running processes for podman and docker look:

schizm@Enceladus:~$ ps -ef|grep podman
boinc       1337    1243  0 13:15 ?        00:00:00 /usr/bin/podman
boinc       2510       1  0 13:15 ?        00:00:00 /usr/bin/slirp4netns --disable-host-loopback --mtu=65520 --enable-sandbox --enable-seccomp -c -e 3 -r 4 --netns-type=path /tmp/podman-run-126/netns/cni-c2ec05af-66c0-e5de-7d6e-27e58d346990 tap0
schizm      2888    2796  0 13:15 ?        00:00:00 /usr/bin/podman
schizm    143078    3676  0 16:29 pts/0    00:00:00 grep --color=auto podman
schizm@Enceladus:~$ ps -ef|grep docker
root        1631       1  0 13:15 ?        00:00:01 /usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
schizm    143090    3676  0 16:30 pts/0    00:00:00 grep --color=auto docker 


Another note to add: there was an issue with VirtualBox and podman running simultaneously. Could something similar be the case here?

I cannot test my latest changes for Theory since I'm getting ATLAS tasks again instead, but I might have to run the docker daemon under another uid as a fix. I will try to update here after I get new Theory tasks.
ID: 52916
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 898
Credit: 770,607,830
RAC: 175,526
Message 52920 - Posted: 28 Jan 2026, 19:19:57 UTC - in response to Message 52916.  
Last modified: 28 Jan 2026, 21:18:58 UTC

I didn't have any issues running podman and VirtualBox at the same time before.

I just got a few successful results running both.

You don't need both docker and podman; I chose podman.
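To see which runtime is installed on a host, a quick check along these lines can be used. The podman-first order below is an assumption made for illustration and is not necessarily how the BOINC client chooses; `detect_runtime` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print the first container runtime found on PATH.
# The podman-before-docker order is an assumption for this sketch.
detect_runtime() {
    for rt in podman docker; do
        if command -v "$rt" >/dev/null 2>&1; then
            echo "$rt"
            return 0
        fi
    done
    return 1
}

# Example:
#   detect_runtime || echo "no container runtime found"
```

Note that this checks the PATH of whoever runs it; the boinc service account may see a different environment.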
ID: 52920
[VENETO] boboviz
Joined: 7 May 08
Posts: 266
Credit: 2,118,600
RAC: 2,026
Message 52924 - Posted: 28 Jan 2026, 20:44:12 UTC - in response to Message 52913.  

Mmm, some errors, like 432010091

<message>
Incorrect function.
(0x1) - exit code 1 (0x1)</message>
<stderr_txt>
docker_wrapper 17 starting
docker_wrapper config:
workdir: /boinc_slot_dir
use GPU: no
create args: --cap-add=SYS_ADMIN --device /dev/fuse
verbose: 1
Using WSL distro Ubuntu
Using podman
running docker command: ps --all --filter "name=^boinc__lhcathome.cern.ch_lhcathome__theory_2922-4892047-474_0$" --format "{{.Names}}|{{.Status}}"
program: podman
command output:
EOM
creating container boinc__lhcathome.cern.ch_lhcathome__theory_2922-4892047-474_0
running docker command: images
program: podman
command output:
REPOSITORY TAG IMAGE ID CREATED SIZE
docker.io/library/almalinux 9 df3270cc8bc8 10 months ago 217 MB
EOM
building image
running docker command: build "." -t boinc__lhcathome.cern.ch_lhcathome__theory_2922-4892047-474 -f Dockerfile
program: podman
read_from_pipe() error: timeout
build_image() failed: -182
ID: 52924


©2026 CERN