New native version v300.08
Author | Message |
---|---|
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
The new native version (currently in beta) includes an updated cranky (v0.1.4) with support for cgroups v2. To achieve this cleanly, container execution (using runc) is now managed through systemd, employing the following command:

sudo [sudo options] systemd-run [systemd options] runc [runc options] [container]

As a result, sudo must be configured to allow the boinc user to run systemctl and systemd-run commands. To facilitate this setup, execute the following command once. It downloads and runs a setup script from this URL:

sudo /bin/bash -c "export script=\"prepare_theory_native_environment\" && wget https://lhcathome.cern.ch/lhcathome/download/\$script -O /tmp/\$script && chmod u+x /tmp/\$script && /tmp/\$script && rm /tmp/\$script"

If your system does not meet the requirements, cranky will revert to the legacy cgroups v1 mode.

An additional improvement is found in the logfile (stderr.txt), which now displays basic requirements that are missing, recommended options (e.g., for the local CVMFS client), and provides a hint on obtaining information about the running task via systemctl.

This is currently in beta to provide an opportunity to test and evaluate the setup. Please report any issues in this thread.

Special thanks to computezrmle, who contributed significantly to the porting work for this systemd approach. |
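For anyone who wants to check a host before switching, here is a minimal sketch of two of the requirements cranky lists in its log (systemd as init, cgroups v2 enabled). It is not cranky's own code: the function names and the optional root-prefix argument are hypothetical, and the two paths are the usual conventions (`/run/systemd/system` exists when systemd is the running init; `cgroup.controllers` sits at the top of `/sys/fs/cgroup` only with the unified v2 hierarchy).

```shell
#!/bin/sh
# Sketch only: common host checks, not taken from cranky itself.
# The optional argument is a hypothetical root prefix, handy for trying
# the functions against a test directory tree.

# systemd creates /run/systemd/system when it is the running init process.
init_is_systemd() {
    [ -d "${1:-}/run/systemd/system" ]
}

# With the unified (v2) hierarchy, cgroup.controllers exists directly
# under /sys/fs/cgroup; on v1 or hybrid setups it does not.
cgroup_v2_enabled() {
    [ -f "${1:-}/sys/fs/cgroup/cgroup.controllers" ]
}

report_host() {
    if init_is_systemd "${1:-}"; then
        echo "init: systemd"
    else
        echo "init: not systemd"
    fi
    if cgroup_v2_enabled "${1:-}"; then
        echo "cgroups: v2"
    else
        echo "cgroups: v1 or hybrid"
    fi
}

# Check the running system.
report_host ""
```

This does not cover the 'freezer' or sudoers requirements; the task's stderr.txt remains the authoritative report.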
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
This version will move out of beta on Monday. |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 252,322,195 RAC: 136,296 |
Stumbled over a few logs that show a possible issue if the group "boinc" doesn't exist and/or the user running the BOINC client is not a member of that group. This can easily be fixed by running the commands below as root. This should be done before the switch to the new version:

# if the group boinc does not yet exist
sudo groupadd boinc
# if the user (usually "boinc") is not a member of that group
sudo usermod -aG boinc boinc
# if the username running boinc is "someotheruser" run
sudo usermod -aG boinc someotheruser |
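The same fix can be written so that it only changes something when it is actually missing. This is a rough sketch, assuming `getent` and `id` are available; `has_group` and `ensure_boinc_group` are hypothetical helper names, not part of BOINC or the LHC@home setup script.

```shell
#!/bin/sh
# Sketch: add the group and the membership only when they are missing.
# The group name 'boinc' follows the post above.

# Succeeds if group $2 appears in the space-separated group list $1.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

ensure_boinc_group() {
    user="$1"
    # getent exits non-zero when the group is unknown.
    getent group boinc >/dev/null || sudo groupadd boinc
    # id -nG prints the user's group names on one line.
    if ! has_group "$(id -nG "$user")" boinc; then
        sudo usermod -aG boinc "$user"
    fi
}

# Run manually, e.g.:
#   ensure_boinc_group boinc
```

A process that is already running keeps its old group list, so the BOINC client has to be restarted before a new membership takes effect.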
Send message Joined: 2 May 07 Posts: 2229 Credit: 173,840,049 RAC: 18,079 |
"This version will move out of beta on Monday."

When running in production, I get 300.07 (native_theory) and not 300.08 (native_theory) (beta test). In the prefs, using -native is active. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10806714 |
Send message Joined: 4 Mar 17 Posts: 24 Credit: 10,058,832 RAC: 6,880 |
I get .7 from time to time too, but most are .8. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10819840&offset=0&show_names=0&state=6&appid= This will likely be no problem once the beta is pushed to production and no new .7 work is sent. Is "Run test applications?" set too? |
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
The beta tag has been removed so this is now the default version. |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 252,322,195 RAC: 136,296 |
I'm still getting v300.07. Did you restart all BOINC server instances? |
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
I have deprecated the old version and restarted the server. |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 252,322,195 RAC: 136,296 |
Looks like at least 1 server instance has not been restarted. Meanwhile I got v300.08 through 1 request and "No tasks are available for Theory Simulation" through other requests. |
Send message Joined: 3 Nov 12 Posts: 56 Credit: 139,448,155 RAC: 93,457 |
I tried to run cgroups v2 with v300.08. This gives the following error:

10:07:08 (5076): wrapper (7.15.26016): starting
10:07:08 (5076): wrapper (7.15.26016): starting
10:07:08 (5076): wrapper: running ../../projects/lhcathome.cern.ch_lhcathome/cranky-0.1.4 ()
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Detected Theory App
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] This application must have permanent access to
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] online repositories via a local CVMFS service.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] It supports suspend/resume if a couple of
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] requirements are fulfilled.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Most important:
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - init process is systemd
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - cgroups v2 is enabled and 'freezer' is available
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - the user running this application is a member of the 'boinc' group
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - sudo is at least version 1.9.10
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - sudoer file provided by LHC@home is installed
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Checking local requirements.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Found Sudo-Version 1.9.15p2.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Missing 'CVMFS_HTTP_PROXY="auto;DIRECT"' in '/etc/cvmfs/default.local'.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Can't find '/etc/cvmfs/domain.d/cern.ch.local'.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Can't find '/etc/cvmfs/config.d/cvmfs-config.cern.ch.local'.
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/alice.cern.ch... OK
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/cernvm-prod.cern.ch... OK
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/grid.cern.ch... OK
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/sft.cern.ch... OK
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] 2.10.1.0 http://s1cern-cvmfs.openhtc.io/cvmfs/alice.cern.ch http://192.168.101.42:3128
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'.
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Creating container filesystem.
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Starting runc container.
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] To get some details on systemd level run
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] systemctl status Theory_2390-1119669-1066_1.scope
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] mcplots runspec: boinc pp jets 7000 40 - pythia8 8.301 tune-2m 100000 1066
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
sudo: Ein Passwort ist notwendig
(English: sudo: a password is required)
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Container Theory_2390-1119669-1066_1 finished with status code 1.
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Preparing output.
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [ERROR] No output found.
10:07:11 (5076): cranky exited; CPU time 0.528149
10:07:11 (5076): app exit status: 0xce
10:07:11 (5076): called boinc_finish(195)

What's wrong? What password? |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 252,322,195 RAC: 136,296 |
You must apply an sudoer file that allows a few systemd commands to run without a password. Laurence already mentioned that in his OP and posted a command pointing to a script that does this automatically. Run that script once and all should be fine. https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6075&postid=48978 |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 252,322,195 RAC: 136,296 |
Some Other Hints

Many volunteers may have followed the old method to allow suspend/resume like:
- disable cgroups v2
- manually apply a boinc cgroup
- run a script to apply the required access right to "freezer" within the cgroups v1 hierarchy

All of that is no longer necessary and should not be used any more if the local Linux system is recent enough to run sudo 1.9.10 or higher.

If so
- pause all Theory native 300.08 tasks that are downloaded but not yet started
- finish all Theory native 300.07 tasks and all 300.08 tasks currently running
- stop BOINC
- undo all changes made for the old method (especially: reenable cgroups v2)
- apply the sudoer file already mentioned; this must be done as root, hence can't be done via BOINC
- reboot
- start BOINC
- resume paused tasks |
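To see whether a host is "recent enough", the installed sudo version can be compared against 1.9.10. The following is a minimal sketch, assuming GNU `sort -V` (version sort) is available; `version_ge` and `local_sudo_version` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: compare the local sudo version against the 1.9.10 minimum.
# Assumes GNU 'sort -V' (version sort).

# Succeeds if version $1 is at least version $2:
# the smaller of the two must then be $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# 'sudo -V' prints "Sudo version X" on its first line.
local_sudo_version() {
    sudo -V 2>/dev/null | sed -n '1s/^Sudo version //p'
}

if version_ge "$(local_sudo_version)" 1.9.10; then
    echo "sudo is recent enough for the new mode"
else
    echo "sudo is too old or not found; expect legacy mode"
fi
```

A plain string comparison would get this wrong, since "1.9.5" sorts after "1.9.10" alphabetically. That is why version sort is used here.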
Send message Joined: 3 Nov 12 Posts: 56 Credit: 139,448,155 RAC: 93,457 |
"You must apply an sudoer file that allows a few systemd commands to run without a password."

Did this before and now again; no success.

"Found Sudo-Version 1.9.15p2. /etc/sudoers.d/50-lhcathome_boinc_theory_native already exists. Will save it as /etc/sudoers.d/50-lhcathome_boinc_theory_native.backup-DGUPcJvU."

Again this "sudo: Ein Passwort ist notwendig" (English: "sudo: a password is required").

Other suggestions? |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 252,322,195 RAC: 136,296 |
Try if a reboot helps. |
Send message Joined: 3 Nov 12 Posts: 56 Credit: 139,448,155 RAC: 93,457 |
"Try if a reboot helps."

Already done... nope. Is there something else I can do to locate the problem? |
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
What user does the boinc client run under? |
Send message Joined: 2 May 07 Posts: 2229 Credit: 173,840,049 RAC: 18,079 |
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10806714

CentOS9-VM is running v300.08 successfully.

<core_client_version>7.20.2</core_client_version>
<![CDATA[
<stderr_txt>
13:05:48 (35410): wrapper (7.15.26016): starting
13:05:48 (35410): wrapper (7.15.26016): starting
13:05:48 (35410): wrapper: running ../../projects/lhcathome.cern.ch_lhcathome/cranky-0.1.4 ()
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Detected Theory App
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] This application must have permanent access to
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] online repositories via a local CVMFS service.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] It supports suspend/resume if a couple of
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] requirements are fulfilled.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Most important:
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - init process is systemd
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - cgroups v2 is enabled and 'freezer' is available
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - the user running this application is a member of the 'boinc' group
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - sudo is at least version 1.9.10
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - sudoer file provided by LHC@home is installed
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Checking local requirements.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Found Sudo-Version 1.9.5p2.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] To run this task in new mode
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Sudo-Version must be at least 1.9.10.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Missing 'CVMFS_HTTP_PROXY="auto;DIRECT"' in '/etc/cvmfs/default.local'.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Can't find '/etc/cvmfs/domain.d/cern.ch.local'.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Can't find '/etc/cvmfs/config.d/cvmfs-config.cern.ch.local'.
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/alice.cern.ch... OK
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/cernvm-prod.cern.ch... OK
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/grid.cern.ch... OK
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/sft.cern.ch... OK
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] 2.11.2.0 http://s1cern-cvmfs.openhtc.io/cvmfs/alice.cern.ch http://10.116.178.201:3128
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'.
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Minor requirements are missing. Will try to run this task in legacy mode.
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Checking runc.
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Creating container filesystem.
13:05:54 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4
mkdir: das Verzeichnis „/sys/fs/cgroup/unified“ kann nicht angelegt werden: Das Dateisystem ist nur lesbar
mkdir: das Verzeichnis „/sys/fs/cgroup/unified“ kann nicht angelegt werden: Das Dateisystem ist nur lesbar
mkdir: das Verzeichnis „/sys/fs/cgroup/unified“ kann nicht angelegt werden: Das Dateisystem ist nur lesbar
mkdir: das Verzeichnis „/sys/fs/cgroup/unified“ kann nicht angelegt werden: Das Dateisystem ist nur lesbar
mkdir: das Verzeichnis „/sys/fs/cgroup/unified“ kann nicht angelegt werden: Das Dateisystem ist nur lesbar
(English: mkdir: cannot create directory '/sys/fs/cgroup/unified': Read-only file system)
13:05:54 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Running Container 'runc'.
13:05:54 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] mcplots runspec: boinc pp elastic 7000 - - pythia6 6.427 a 100000 1070
13:08:28 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Container 'runc' finished with status code 0.
13:08:28 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Preparing output.
13:08:29 (35410): cranky exited; CPU time 129.095876
13:08:29 (35410): called boinc_finish(0)
</stderr_txt> |
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 252,322,195 RAC: 136,296 |
"CentOS9-VM is running v300.08 successfully."

Right, in legacy mode. The log explains why. |
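To spot this quickly in a long log, the two relevant lines can be pulled out with standard tools. This is only a sketch: the matched strings are copied from the logs posted in this thread and may be worded differently in other cranky versions, and `log_sudo_version` and `log_mode` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: extract the mode-relevant information from a saved task log.
# $1 is the path to a file containing the task's stderr text.

# Prints the sudo version cranky reported, without the trailing dot.
log_sudo_version() {
    sed -n 's/.*Found Sudo-Version \([^ ]*\)\.$/\1/p' "$1" | head -n 1
}

# Prints "legacy" if cranky announced the fallback, else "not reported".
log_mode() {
    if grep -q 'Will try to run this task in legacy mode' "$1"; then
        echo "legacy"
    else
        echo "not reported"
    fi
}

# Run manually against a saved log, e.g.:
#   log_sudo_version stderr.txt
#   log_mode stderr.txt
```

For the log above this points at the reported sudo version (1.9.5p2, below the required 1.9.10) and the legacy-mode message.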
Send message Joined: 15 Jun 08 Posts: 2520 Credit: 252,322,195 RAC: 136,296 |
"Try if a reboot helps."

Assuming 'boinc_user' is the user account running your boinc service, try this. If your boinc user has a different name, use that name.

su - boinc_user
# ensure you are 'boinc_user'
whoami
# next command intentionally without 'sudo'
ls -hal /etc/sudoers.d/50-lhcathome_boinc_theory_native
# if 'ls' fails run the next command
sudo grep -i 'includedir' /etc/sudoers

Post the output here. |
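The point of the last command is to see whether /etc/sudoers pulls in the /etc/sudoers.d directory at all. Here is a small sketch that narrows the grep down to real directives; `included_dirs` is a hypothetical helper name. In sudoers syntax a line starting with `#includedir` is a directive rather than a comment, and newer sudo versions also accept `@includedir`.

```shell
#!/bin/sh
# Sketch: list the directories a sudoers file includes.
# $1 is the path of the sudoers file to inspect.
# Lines like "# includedir ..." (with a space) are ordinary comments
# and are deliberately not matched.
included_dirs() {
    sed -n 's/^[#@]includedir[[:space:]]\{1,\}//p' "$1"
}

# Run manually as root, e.g.:
#   included_dirs /etc/sudoers
```

If /etc/sudoers.d does not show up in the output, files placed there are not read by sudo. /etc/sudoers is normally readable by root only, hence the need to run this as root.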
Send message Joined: 3 Nov 12 Posts: 56 Credit: 139,448,155 RAC: 93,457 |
$ su - boinc_user
su: Benutzer boinc_user existiert nicht oder der Benutzereintrag enthält nicht alle erforderlichen Felder
(English: su: user boinc_user does not exist or the user entry does not contain all the required fields)

Is this because the user is boinc, not boinc_user? But user boinc has no login.

$ groups boinc boinc |
©2024 CERN