Message boards : Theory Application : New native version v300.08
Laurence
Project administrator
Project developer

Send message
Joined: 20 Jun 14
Posts: 380
Credit: 238,712
RAC: 0
Message 48978 - Posted: 5 Dec 2023, 10:55:59 UTC
Last modified: 5 Dec 2023, 14:01:23 UTC

The new native version (currently in beta) includes an updated cranky (v0.1.4) with support for cgroups v2. To achieve this cleanly, container execution (using runc) is now managed through systemd, employing the following command:

sudo [sudo options] systemd-run [systemd options] runc [runc options] [container]


As a result, sudo must be configured to allow the boinc user to run systemctl and systemd-run commands. To facilitate this setup, execute the following command once. It downloads and runs a setup script from this URL:


sudo /bin/bash -c "export script=\"prepare_theory_native_environment\" && wget https://lhcathome.cern.ch/lhcathome/download/\$script -O /tmp/\$script && chmod u+x /tmp/\$script && /tmp/\$script && rm /tmp/\$script"
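For anyone who prefers to read a script before running it as root, the one-liner above can be split into separate steps. This is only a sketch: the function name `fetch_and_review` and the use of `mktemp` and a pager are assumptions, while the script name and URL are the ones from the command above.

```shell
#!/bin/sh
# Sketch: download the setup script, review it, then run it as root.
script="prepare_theory_native_environment"
url="https://lhcathome.cern.ch/lhcathome/download/$script"

# Hypothetical helper, not part of the project's tooling.
fetch_and_review() {
    workdir=$(mktemp -d) || return 1
    wget "$url" -O "$workdir/$script" || return 1
    "${PAGER:-less}" "$workdir/$script"     # read the script before it runs as root
    printf 'Run it as root now? [y/N] '
    read -r answer
    if [ "$answer" = "y" ]; then
        sudo /bin/bash "$workdir/$script"
    fi
    rm -r "$workdir"
}
```

Call `fetch_and_review` from an account that is allowed to use sudo. The result should match the original command, assuming the script does not depend on being started from /tmp.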


If your system does not meet the requirements, cranky will revert to the legacy cgroups v1 mode.
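The requirements that cranky lists in its log are: init is systemd, cgroups v2 is enabled, the user is a member of the 'boinc' group, sudo is at least 1.9.10, and the LHC@home sudoers file is installed. A rough pre-check could look like the sketch below. The individual tests are assumptions about how one might check each point, not cranky's own logic, and the 'freezer' availability is not covered.

```shell
#!/bin/sh
# Sketch: pre-check the requirements for the new systemd/cgroups v2 mode.

# Return 0 if version $1 (e.g. "1.9.15p2") is at least $2 (e.g. "1.9.10").
version_ge() {
    have=${1%%p*}            # drop a patch suffix such as "p2"
    want=$2
    lowest=$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)
    [ "$lowest" = "$want" ]
}

# report <description> <exit status of the check>
report() {
    if [ "$2" -eq 0 ]; then echo "OK:      $1"; else echo "MISSING: $1"; fi
}

boinc_user=${BOINC_USER:-boinc}   # adjust if the client runs under another account

[ -d /run/systemd/system ]
report "init process is systemd" $?

[ -f /sys/fs/cgroup/cgroup.controllers ]
report "cgroups v2 is enabled" $?

id -nG "$boinc_user" 2>/dev/null | tr ' ' '\n' | grep -qx boinc
report "user '$boinc_user' is a member of group 'boinc'" $?

sudo_ver=$(sudo -V 2>/dev/null | sed -n 's/^Sudo version //p')
version_ge "${sudo_ver:-0}" 1.9.10
report "sudo is at least 1.9.10 (found: ${sudo_ver:-none})" $?

# May report MISSING for non-root users if /etc/sudoers.d is not searchable.
[ -f /etc/sudoers.d/50-lhcathome_boinc_theory_native ]
report "LHC@home sudoers file is installed" $?
```

The version comparison relies on `sort -V` from GNU coreutils, which should be available on the Linux systems this thread is about.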

An additional improvement is found in the logfile (stderr.txt), which now displays basic requirements that are missing, recommended options (e.g., for the local CVMFS client), and provides a hint on obtaining information about the running task via systemctl.
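The systemctl hint can be followed by hand, or with something like the sketch below. It assumes the unit name is the task name plus `.scope`, a pattern taken from a single log posted later in this thread (`Theory_2390-1119669-1066_1.scope`). The function names are made up for illustration.

```shell
#!/bin/sh
# Sketch: look up the systemd scope units of running Theory tasks.

# Build the scope unit name for a task name (pattern assumed from one log line).
scope_unit() {
    printf '%s.scope\n' "$1"
}

# List all scope units whose name starts with "Theory_" and show their status.
show_theory_scopes() {
    systemctl list-units --type=scope --no-legend --plain 'Theory_*' |
    while read -r unit _rest; do
        systemctl status --no-pager "$unit"
    done
}
```

For a single task, `systemctl status "$(scope_unit Theory_2390-1119669-1066_1)"` does the same as the command printed in the log.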

This is currently in beta to provide an opportunity to test and evaluate the setup. Please report any issues in this thread.

Special thanks to computezrmle, who contributed significantly to the porting work for this systemd approach.
Laurence
Message 48995 - Posted: 8 Dec 2023, 12:22:46 UTC - in response to Message 48978.  

This version will move out of beta on Monday.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 15 Jun 08
Posts: 2519
Credit: 250,934,990
RAC: 127,970
Message 48996 - Posted: 8 Dec 2023, 12:50:47 UTC

Stumbled over a few logs that show a possible issue if the group "boinc" doesn't exist and/or the user running the BOINC client is not a member of that group.

This can easily be fixed by running the commands below as root.
It should be done before the switch to the new version:

# if the group boinc does not yet exist
sudo groupadd boinc


# if the user (usually "boinc") is not a member of that group
sudo usermod -aG boinc boinc


# if the username running boinc is "someotheruser" run
sudo usermod -aG boinc someotheruser
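The same fix can be written so that it only changes something when needed. This is a sketch under the assumption that `getent`, `id`, `groupadd` and `usermod` behave as on common Linux distributions. The helper names `has_group` and `ensure_boinc_group` are hypothetical.

```shell
#!/bin/sh
# Sketch: create the group and add the user only if that is still missing.

# Return 0 if the space-separated group list $1 contains group $2.
has_group() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# ensure_boinc_group <user>
ensure_boinc_group() {
    user=$1
    # create the group only if it does not exist yet
    getent group boinc >/dev/null || sudo groupadd boinc
    # add the user only if it is not already a member
    has_group "$(id -nG "$user")" boinc || sudo usermod -aG boinc "$user"
}
```

Run `ensure_boinc_group boinc`, or pass the name of the account that runs your BOINC client. The new membership usually only takes effect after the BOINC client has been restarted.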
maeax

Send message
Joined: 2 May 07
Posts: 2220
Credit: 173,696,209
RAC: 24,770
Message 48999 - Posted: 9 Dec 2023, 8:11:29 UTC - in response to Message 48995.  
Last modified: 9 Dec 2023, 8:27:36 UTC

This version will move out of beta on Monday.

With production settings I get 300.07 (native_theory) and not 300.08 (native_theory) (beta test).
In the prefs, the native option is active.
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10806714
Toggleton

Send message
Joined: 4 Mar 17
Posts: 23
Credit: 10,023,478
RAC: 9,154
Message 49000 - Posted: 9 Dec 2023, 8:58:51 UTC - in response to Message 48999.  

I get .7 from time to time too, but most are .8. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10819840&offset=0&show_names=0&state=6&appid=

This will likely be no problem once the beta is pushed to production and no new .7 work is sent anymore. Is "Run test applications?" set too?
Laurence
Message 49004 - Posted: 11 Dec 2023, 8:22:04 UTC - in response to Message 48995.  

The beta tag has been removed so this is now the default version.
computezrmle
Message 49005 - Posted: 11 Dec 2023, 8:35:52 UTC - in response to Message 49004.  

I'm still getting v300.07.
Did you restart all BOINC server instances?
Laurence
Message 49006 - Posted: 11 Dec 2023, 8:54:31 UTC - in response to Message 49005.  
Last modified: 11 Dec 2023, 8:54:44 UTC

I have deprecated the old version and restarted the server.
computezrmle
Message 49007 - Posted: 11 Dec 2023, 9:04:48 UTC - in response to Message 49006.  

Looks like at least 1 server instance has not been restarted.
Meanwhile I got v300.08 through 1 request and "No tasks are available for Theory Simulation" through other requests.
Saturn911

Send message
Joined: 3 Nov 12
Posts: 55
Credit: 138,571,173
RAC: 111,411
Message 49008 - Posted: 11 Dec 2023, 9:24:21 UTC

Tried to run cgroups v2 with v300.08.
This gives the following error:

10:07:08 (5076): wrapper (7.15.26016): starting
10:07:08 (5076): wrapper (7.15.26016): starting
10:07:08 (5076): wrapper: running ../../projects/lhcathome.cern.ch_lhcathome/cranky-0.1.4 ()
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Detected Theory App
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] This application must have permanent access to
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] online repositories via a local CVMFS service.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] It supports suspend/resume if a couple of
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] requirements are fulfilled.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Most important:
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - init process is systemd
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - cgroups v2 is enabled and 'freezer' is available
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - the user running this application is a member of the 'boinc' group
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - sudo is at least version 1.9.10
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - sudoer file provided by LHC@home is installed
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Checking local requirements.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Found Sudo-Version 1.9.15p2.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Missing 'CVMFS_HTTP_PROXY="auto;DIRECT"' in '/etc/cvmfs/default.local'.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Can't find '/etc/cvmfs/domain.d/cern.ch.local'.
10:07:08 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Can't find '/etc/cvmfs/config.d/cvmfs-config.cern.ch.local'.
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/alice.cern.ch... OK
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/cernvm-prod.cern.ch... OK
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/grid.cern.ch... OK
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/sft.cern.ch... OK
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
10:07:09 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] 2.10.1.0 http://s1cern-cvmfs.openhtc.io/cvmfs/alice.cern.ch http://192.168.101.42:3128
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'.
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Creating container filesystem.
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Starting runc container.
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] To get some details on systemd level run
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] systemctl status Theory_2390-1119669-1066_1.scope
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] mcplots runspec: boinc pp jets 7000 40 - pythia8 8.301 tune-2m 100000 1066
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
sudo: a password is required
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Container Theory_2390-1119669-1066_1 finished with status code 1.
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Preparing output.
10:07:10 CET +01:00 2023-12-11: cranky-0.1.4: [ERROR] No output found.
10:07:11 (5076): cranky exited; CPU time 0.528149
10:07:11 (5076): app exit status: 0xce
10:07:11 (5076): called boinc_finish(195)

What's wrong? What password??
computezrmle
Message 49009 - Posted: 11 Dec 2023, 9:41:46 UTC - in response to Message 49008.  

You must apply a sudoers file that allows a few systemd commands to run without a password.
Laurence already mentioned that in his OP and posted a command pointing to a script that does this automatically.
Run that script once and all should be fine.

https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6075&postid=48978
computezrmle
Message 49010 - Posted: 11 Dec 2023, 10:19:42 UTC

Some Other Hints

Many volunteers may have followed the old method to allow suspend/resume, such as:
- disable cgroups v2
- manually apply a boinc cgroup
- run a script to apply the required access right to "freezer" within the cgroups v1 hierarchy

All of that is no longer necessary and should not be used any more if the local Linux system is recent enough to run sudo 1.9.10 or higher.

If so
- pause all Theory native 300.08 tasks that are downloaded but not yet started
- finish all Theory native 300.07 tasks and all 300.08 tasks currently running
- stop BOINC
- undo all changes made for the old method (especially: reenable cgroups v2)
- apply the sudoer file already mentioned; this must be done as root, hence can't be done via BOINC
- reboot
- start BOINC
- resume paused tasks
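Before and after undoing the old setup it can help to check which cgroups mode the system actually runs. The sketch below is based on two assumptions: the filesystem type at /sys/fs/cgroup indicates the mode (`cgroup2fs` for v2), and cgroups v2 may have been disabled via the kernel parameter `systemd.unified_cgroup_hierarchy`. Your own setup may have used a different method.

```shell
#!/bin/sh
# Sketch: report the cgroups mode and a possible kernel command line override.

# Map the filesystem type mounted at /sys/fs/cgroup to a cgroups mode.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2" ;;             # unified hierarchy
        tmpfs)     echo "v1 or hybrid" ;;   # legacy layout
        *)         echo "unknown" ;;
    esac
}

fstype=$(stat -fc %T /sys/fs/cgroup 2>/dev/null)
echo "cgroups mode: $(cgroup_mode "$fstype")"

# A kernel parameter that forces the old hierarchy would show up here.
grep -o 'systemd.unified_cgroup_hierarchy=[^ ]*' /proc/cmdline 2>/dev/null \
    || echo "no systemd.unified_cgroup_hierarchy override on the kernel command line"
```

If the mode is still reported as "v1 or hybrid" after the reboot, some part of the old method is probably still active.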
Saturn911
Message 49011 - Posted: 11 Dec 2023, 11:07:16 UTC - in response to Message 49009.  

You must apply a sudoers file that allows a few systemd commands to run without a password.
Laurence already mentioned that in his OP and posted a command pointing to a script that does this automatically.
Run that script once and all should be fine.



Did this before and now again; no success.

"Found Sudo-Version 1.9.15p2.
/etc/sudoers.d/50-lhcathome_boinc_theory_native already exists.
Will save it as /etc/sudoers.d/50-lhcathome_boinc_theory_native.backup-DGUPcJvU."

Again I get "sudo: a password is required".

Other suggestions?
computezrmle
Message 49012 - Posted: 11 Dec 2023, 11:56:38 UTC - in response to Message 49011.  

Try if a reboot helps.
Saturn911
Message 49013 - Posted: 11 Dec 2023, 12:01:46 UTC - in response to Message 49012.  

Try if a reboot helps.


Already done....nope

Something else I can do to locate the problem?
Laurence
Message 49014 - Posted: 11 Dec 2023, 12:10:06 UTC - in response to Message 49013.  

What user does the BOINC client run under?
maeax
Message 49015 - Posted: 11 Dec 2023, 12:13:48 UTC
Last modified: 11 Dec 2023, 12:17:35 UTC

https://lhcathome.cern.ch/lhcathome/results.php?hostid=10806714
CentOS9-VM is running v300.08 successfully.
<core_client_version>7.20.2</core_client_version>
<![CDATA[
<stderr_txt>
13:05:48 (35410): wrapper (7.15.26016): starting
13:05:48 (35410): wrapper (7.15.26016): starting
13:05:48 (35410): wrapper: running ../../projects/lhcathome.cern.ch_lhcathome/cranky-0.1.4 ()
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Detected Theory App
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] This application must have permanent access to
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] online repositories via a local CVMFS service.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] It supports suspend/resume if a couple of
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] requirements are fulfilled.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Most important:
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - init process is systemd
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - cgroups v2 is enabled and 'freezer' is available
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - the user running this application is a member of the 'boinc' group
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - sudo is at least version 1.9.10
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] - sudoer file provided by LHC@home is installed
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Checking local requirements.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Found Sudo-Version 1.9.5p2.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] To run this task in new mode
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Sudo-Version must be at least 1.9.10.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Missing 'CVMFS_HTTP_PROXY="auto;DIRECT"' in '/etc/cvmfs/default.local'.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Can't find '/etc/cvmfs/domain.d/cern.ch.local'.
13:05:48 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Can't find '/etc/cvmfs/config.d/cvmfs-config.cern.ch.local'.
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/alice.cern.ch... OK
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/cernvm-prod.cern.ch... OK
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/grid.cern.ch... OK
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Probing /cvmfs/sft.cern.ch... OK
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] 2.11.2.0 http://s1cern-cvmfs.openhtc.io/cvmfs/alice.cern.ch http://10.116.178.201:3128
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'.
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Minor requirements are missing. Will try to run this task in legacy mode.
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Checking runc.
13:05:53 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Creating container filesystem.
13:05:54 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4
mkdir: cannot create directory '/sys/fs/cgroup/unified': Read-only file system
mkdir: cannot create directory '/sys/fs/cgroup/unified': Read-only file system
mkdir: cannot create directory '/sys/fs/cgroup/unified': Read-only file system
mkdir: cannot create directory '/sys/fs/cgroup/unified': Read-only file system
mkdir: cannot create directory '/sys/fs/cgroup/unified': Read-only file system
13:05:54 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Running Container 'runc'.
13:05:54 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] mcplots runspec: boinc pp elastic 7000 - - pythia6 6.427 a 100000 1070
13:08:28 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Container 'runc' finished with status code 0.
13:08:28 CET +01:00 2023-12-11: cranky-0.1.4: [INFO] Preparing output.
13:08:29 (35410): cranky exited; CPU time 129.095876
13:08:29 (35410): called boinc_finish(0)

</stderr_txt>
computezrmle
Message 49016 - Posted: 11 Dec 2023, 12:38:41 UTC - in response to Message 49015.  

CentOS9-VM is running v300.08 successfully.

Right, in legacy mode.
The log explains why.
computezrmle
Message 49017 - Posted: 11 Dec 2023, 13:08:27 UTC - in response to Message 49013.  

Try if a reboot helps.


Already done....nope

Something else I can do to locate the problem?

Assuming 'boinc_user' is the user account running your BOINC service, try this.
If your BOINC user has a different name, use that name instead.

su - boinc_user
# ensure you are 'boinc_user'
whoami
# next command intentionally without 'sudo'
ls -hal /etc/sudoers.d/50-lhcathome_boinc_theory_native
#if 'ls' fails run the next command
sudo grep -i 'includedir' /etc/sudoers

Post the output here.
Saturn911
Message 49018 - Posted: 11 Dec 2023, 14:08:10 UTC - in response to Message 49017.  


su - boinc_user
# ensure you are 'boinc_user'
whoami
# next command intentionally without 'sudo'
ls -hal /etc/sudoers.d/50-lhcathome_boinc_theory_native
#if 'ls' fails run the next command
sudo grep -i 'includedir' /etc/sudoers

Post the output here.

$ su - boinc_user
su: user boinc_user does not exist or the user entry does not contain all the required fields

Is this because the user is boinc, not boinc_user?
But the user boinc has no login.

$ groups boinc
boinc
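The `su` error above is what one would expect when no account named `boinc_user` exists. `su -` can also be refused for service accounts without a login shell. A possible way around both is to run the same checks through `sudo -u`, as in this sketch. The account name `boinc` and the helper names are assumptions, and the sudoers file path is the one reported by the setup script earlier in this thread.

```shell
#!/bin/sh
# Sketch: repeat the diagnostic steps without needing a login shell.
boinc_user=${BOINC_USER:-boinc}    # assumption: the account the BOINC client runs under
sudoers_file=/etc/sudoers.d/50-lhcathome_boinc_theory_native

# Read a sudoers file on stdin; succeed if it contains an includedir directive.
# sudo accepts both the "#includedir" and the newer "@includedir" spelling.
has_includedir() {
    grep -Eq '^[#@]includedir[[:space:]]+[^[:space:]]'
}

diagnose() {
    # sudo -u runs a single command as that user, so no login shell is needed
    sudo -u "$boinc_user" whoami
    sudo -u "$boinc_user" ls -hal "$sudoers_file"
    # list the sudo rules that apply to the BOINC user (needs root privileges)
    sudo -l -U "$boinc_user"
    if sudo cat /etc/sudoers | has_includedir; then
        echo "/etc/sudoers includes a drop-in directory"
    else
        echo "no includedir directive found in /etc/sudoers"
    fi
}
```

Call `diagnose` from an account that may use sudo and post the output, the same way as requested for the original commands.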


©2024 CERN