Message boards : Theory Application : New native version v300.08

maeax

Joined: 2 May 07
Posts: 2220
Credit: 173,696,209
RAC: 24,770
Message 49519 - Posted: 11 Feb 2024, 12:41:26 UTC

Theory is back with Tasks.
Thank you. Sunday!
ID: 49519
tazzduke
Joined: 24 Jun 10
Posts: 43
Credit: 6,078,645
RAC: 2,687
Message 49811 - Posted: 22 Mar 2024, 12:31:28 UTC - in response to Message 49136.  

Quoted from Message 49136:
Tried on both Linux Mint 20.3 (Ubuntu 20.04) and 21.2 (Ubuntu 22.04). I guess I can't run Theory any more until sudo 1.9.10 gets added to the Ubuntu repository.
Found Sudo-Version 1.9.9.
This sudo version is lower than 1.9.10.
It does not support regular expressions.
Hence, sudoers will not be modified.
Error running /tmp/prepare_theory_native_environment


I know I am most likely late to the party on this one, but I only just started again with Theory Native, and ran into the same problem as Aurum.

I went and visited the SUDO website at https://www.sudo.ws/, and grabbed the latest package for my distribution.

I installed the latest version and then re-ran the command from Laurence's original post.
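
For anyone who wants to check their own host before re-running the setup script, here is a minimal sketch of the version comparison. It assumes GNU "sort -V" is available; the helper names version_ge and sudo_version are made up for illustration and are not part of the LHC@home script.

```shell
#!/usr/bin/env bash
# Succeed if version $1 is at least version $2.
# Relies on GNU sort -V (version sort), which also orders suffixes like "p2".
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: take the version from the first line of `sudo -V`,
# which looks like "Sudo version 1.9.14p2".
sudo_version() {
    sudo -V 2>/dev/null | awk 'NR == 1 { print $3 }'
}

required="1.9.10"   # minimum quoted in the error message above
found="$(sudo_version)"
if [ -z "$found" ]; then
    echo "sudo not found"
elif version_ge "$found" "$required"; then
    echo "sudo $found is new enough (>= $required)"
else
    echo "sudo $found is older than $required"
fi
```

With the versions mentioned in this thread, 1.9.9 fails the check and 1.9.14p2 passes it.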

Here is one of my validated workunits:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=408095114

Cheers
ID: 49811
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2519
Credit: 250,934,990
RAC: 127,970
Message 49812 - Posted: 22 Mar 2024, 13:24:19 UTC - in response to Message 49811.  

Looks good.


Since a major goal of that version is to make suspend/resume work via systemd you may want to test this.

Select a currently running task in BOINC manager (or your preferred BOINC tool) and pause the task.
Test this with a task that has already started the container (see stderr.txt).
Then this should happen:

1. You should find a corresponding line in the task's stderr.txt.

2. Run the "systemctl status ..." command shown in stderr.txt (press 'q' to exit the pager).
   The output should mention the scope as "frozen".


A while later, resume the task via the BOINC management tool.
Check stderr.txt and the scope status again.


Hint:
Although it would be possible to manually freeze/thaw the scope via systemctl this should not be done because BOINC will not be notified.
Hence, always use BOINC for this.
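
As a read-only alternative to scanning the status pager, the freezer state can be queried directly. This is a minimal sketch, assuming that "systemctl show" exposes the FreezerState property on your systemd version; the helper names freezer_state and describe_state are hypothetical. It only reads the state and does not freeze or thaw anything.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the freezer state of a unit.
# Assumes the FreezerState property is available via `systemctl show`.
freezer_state() {
    systemctl show --property=FreezerState --value "$1" 2>/dev/null
}

# Map a freezer state string to a short verdict.
describe_state() {
    case "$1" in
        frozen)  echo "paused" ;;
        running) echo "running" ;;
        "")      echo "unknown" ;;
        *)       echo "other: $1" ;;
    esac
}

# Example unit name; replace it with the one printed in your task's stderr.txt.
unit="${1:-Theory_2743-2785673-11_0.scope}"
echo "$unit is $(describe_state "$(freezer_state "$unit")")"
```

After pausing the task in BOINC the verdict should be "paused", and after resuming it should be "running" again.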
ID: 49812
tazzduke
Joined: 24 Jun 10
Posts: 43
Credit: 6,078,645
RAC: 2,687
Message 49813 - Posted: 22 Mar 2024, 13:57:56 UTC - in response to Message 49812.  

Ok, will try and find some time on the weekend and give it a go.

Cheers
ID: 49813
tazzduke
Joined: 24 Jun 10
Posts: 43
Credit: 6,078,645
RAC: 2,687
Message 49819 - Posted: 23 Mar 2024, 1:05:11 UTC - in response to Message 49813.  

Good morning,

As suggested, I went and selected a running Theory task and paused it in BOINC.

I looked in the matching stderr.txt for the task to get the command.

Ran the command - systemctl status Theory_2743-2733248-9_1.scope

This was the output - Unit Theory_2743-2733248-9_1.scope could not be found.

I resumed the task; it reset itself back to 0.00 percent and then finished as a computation error.

Here is the task - https://lhcathome.cern.ch/lhcathome/result.php?resultid=408107778

Hopefully someone else has had success, which would mean my setup is partially correct but suspend/resume is not set up correctly.

Cheers
ID: 49819
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2519
Credit: 250,934,990
RAC: 127,970
Message 49820 - Posted: 23 Mar 2024, 9:07:53 UTC - in response to Message 49819.  

The log entries should look a bit like these:
06:14:37 CET +01:00 2024-03-23: cranky: [INFO] Starting runc container.
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] To get some details on systemd level run
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] systemctl status Theory_2743-2785673-11_0.scope
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] mcplots runspec: boinc pp jets 7000 80,-,1060 - herwig++ 2.7.1 UE-EE-5 100000 11
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
06:21:28 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope
06:22:27 CET +01:00 2024-03-23: cranky: [INFO] Resuming systemd unit Theory_2743-2785673-11_0.scope
06:32:49 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope
06:32:58 CET +01:00 2024-03-23: cranky: [INFO] Resuming systemd unit Theory_2743-2785673-11_0.scope
06:33:24 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope
06:34:03 CET +01:00 2024-03-23: cranky: [INFO] Resuming systemd unit Theory_2743-2785673-11_0.scope
07:42:57 CET +01:00 2024-03-23: cranky: [INFO] Container Theory_2743-2785673-11_0 finished with status code 0.
07:42:57 CET +01:00 2024-03-23: cranky: [INFO] Preparing output.
07:42:58 (102851): cranky exited; CPU time 5042.031816
07:42:58 (102851): called boinc_finish(0)
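
To compare a task's own log against the excerpt above, the pause and resume lines can be counted. This is a minimal sketch, assuming the "[INFO] Pausing/Resuming systemd unit" wording shown above; count_events is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Count pause and resume events in cranky log text read from stdin.
count_events() {
    awk '
        /\[INFO\] Pausing systemd unit/  { paused++ }
        /\[INFO\] Resuming systemd unit/ { resumed++ }
        END { printf "paused=%d resumed=%d\n", paused, resumed }
    '
}

# Sample lines copied from the log excerpt above.
sample='06:21:28 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope
06:22:27 CET +01:00 2024-03-23: cranky: [INFO] Resuming systemd unit Theory_2743-2785673-11_0.scope
06:32:49 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope'

printf '%s\n' "$sample" | count_events
```

On a real host, feed it the task's stderr.txt instead of the sample. A log with no such lines at all after a pause suggests that suspend/resume via systemd is not active.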



Yours look weird:
06:16:59 AWST +08:00 2024-03-23: cranky-0.1.4: [INFO] mcplots runspec: boinc pp jets 13000 260 - pythia6 6.428 ambt1 100000 9
06:16:59 AWST +08:00 2024-03-23: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
07:39:34 (590135): wrapper (7.15.26016): starting
07:39:34 (590135): wrapper (7.15.26016): starting
.
.
.
time="2024-03-23T07:39:38+08:00" level=error msg="container with id exists: Theory_2743-2733248-9_1"

It looks like the task started from scratch (for an unknown reason).
It finally failed because runc didn't remove the container id from the 1st attempt.


Which systemctl version do you use (must be at least v246)?
Please post the output of "systemctl --version" plus the status output of a currently running Theory task.
You get the latter via a command like this:
systemctl --no-pager status Theory_2743-2733248-9_1.scope
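
The version requirement can also be checked with a few lines of shell. This is a minimal sketch, assuming the first line of "systemctl --version" has the form "systemd <number> (...)"; the helper names systemd_major and systemd_new_enough are hypothetical.

```shell
#!/usr/bin/env bash
# Extract the major version from `systemctl --version` output on stdin,
# e.g. "systemd 249 (249.11-0ubuntu3.12)" -> 249.
systemd_major() {
    awk 'NR == 1 && $1 == "systemd" { print $2 }'
}

# Succeed if the given major version meets the v246 minimum named above.
systemd_new_enough() {
    [ -n "$1" ] && [ "$1" -ge 246 ] 2>/dev/null
}

# Example input; on a real host use: systemctl --version | systemd_major
major="$(printf '%s\n' 'systemd 249 (249.11-0ubuntu3.12)' | systemd_major)"
if systemd_new_enough "$major"; then
    echo "systemd $major meets the v246 minimum"
else
    echo "systemd version too old or not detected"
fi
```

Version strings that are not plain integers are treated as "not detected" by this sketch.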
ID: 49820
tazzduke
Joined: 24 Jun 10
Posts: 43
Credit: 6,078,645
RAC: 2,687
Message 49821 - Posted: 23 Mar 2024, 10:53:46 UTC - in response to Message 49820.  

Evening,

Thank you for the extra steps to look at and also an output of what it is supposed to look like.

I have to step out for the evening, but I can post this bit of info on my system, using the following version of systemd.

systemd 249 (249.11-0ubuntu3.12)

Regards
ID: 49821
tazzduke
Joined: 24 Jun 10
Posts: 43
Credit: 6,078,645
RAC: 2,687
Message 49822 - Posted: 23 Mar 2024, 23:52:48 UTC - in response to Message 49821.  

It's all working now, but I'm not too sure why; maybe a reboot did something, lol.

Here is a work unit from LHC-Dev (I had it running a few of the Theory tasks over there):

https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3310417

I am getting the pause and resume now, with no errors.

This is also the stderr.txt output from one of my Theory tasks here (not dev), where I tried it as well; you can see it is working there too.

05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] Starting runc container.
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] To get some details on systemd level run
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] systemctl status Theory_2743-2722097-13_1.scope
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] mcplots runspec: boinc pp jets 13000 160 - pythia8 8.308 tune-A2 100000 13
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
07:40:32 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] Pausing systemd unit Theory_2743-2722097-13_1.scope
07:43:01 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] Resuming systemd unit Theory_2743-2722097-13_1.scope

Regards
ID: 49822
M0CZY

Joined: 27 Apr 24
Posts: 10
Credit: 563,349
RAC: 1,808
Message 50129 - Posted: 6 May 2024, 11:09:47 UTC

I'm struggling to get native Theory working on my Ubuntu 22.04.
I've manually upgraded my sudo to version: Sudo version 1.9.14p2
I've run the sudoer's file, and rebooted.
Here is my latest computation error. I don't know how to proceed from here.
<core_client_version>7.24.1</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
11:58:44 (3508): wrapper (7.15.26016): starting
11:58:44 (3508): wrapper (7.15.26016): starting
11:58:44 (3508): wrapper: running ../../projects/lhcathome.cern.ch_lhcathome/cranky-0.1.4 ()
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Detected Theory App
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] This application must have permanent access to
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] online repositories via a local CVMFS service.
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] It supports suspend/resume if a couple of
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] requirements are fulfilled.
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Most important:
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - init process is systemd
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - cgroups v2 is enabled and 'freezer' is available
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - the user running this application is a member of the 'boinc' group
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - sudo is at least version 1.9.10
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - sudoer file provided by LHC@home is installed
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Checking local requirements.
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Found Sudo-Version 1.9.14p2.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/alice.cern.ch... OK
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/cernvm-prod.cern.ch... OK
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/grid.cern.ch... OK
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/sft.cern.ch... OK
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] 2.11.3.0 http://s1ral-cvmfs.openhtc.io/cvmfs/alice.cern.ch DIRECT
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Creating container filesystem.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Starting runc container.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] To get some details on systemd level run
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] systemctl status Theory_2743-2839118-133_2.scope
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] mcplots runspec: boinc pp z1j 13000 110 - pythia6 6.428 380 100000 133
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
time="2024-05-06T11:58:48+01:00" level=error msg="operation not permitted"
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Container Theory_2743-2839118-133_2 finished with status code 1.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Preparing output.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [ERROR] No output found.
11:58:49 (3508): cranky exited; CPU time 0.314435
11:58:49 (3508): app exit status: 0xce
11:58:49 (3508): called boinc_finish(195)

</stderr_txt>
]]>
ID: 50129
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2519
Credit: 250,934,990
RAC: 127,970
Message 50131 - Posted: 6 May 2024, 11:40:59 UTC - in response to Message 50129.  

This error is from runc:
time="2024-05-06T11:58:48+01:00" level=error msg="operation not permitted"

The log states it is a version from grid.cern.ch:
Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'


Please try to get/install a more recent runc from your Linux vendor.

In addition please check if this option is set in your boinc-client.service file:
ProtectSystem=strict
If so, change "strict" to "full", preferably via an override (drop-in) file (see the systemd manual or many posts here).
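
The override ("overlay") file mentioned above is what systemd calls a drop-in. This is a minimal sketch of the check and of the drop-in content, assuming the service is named boinc-client.service as in this post; the helper names and the sample unit text are hypothetical.

```shell
#!/usr/bin/env bash
# Succeed if unit-file text on stdin sets ProtectSystem=strict.
has_strict_protect() {
    grep -Eq '^[[:space:]]*ProtectSystem[[:space:]]*=[[:space:]]*strict[[:space:]]*$'
}

# Print drop-in content that overrides only this one setting.
override_conf() {
    printf '[Service]\nProtectSystem=full\n'
}

# Hypothetical unit text for illustration; on a real host inspect the
# effective unit with: systemctl cat boinc-client.service
sample_unit='[Service]
ProtectHome=true
ProtectSystem=strict'

if printf '%s\n' "$sample_unit" | has_strict_protect; then
    echo "ProtectSystem=strict found; suggested override.conf:"
    override_conf
fi
```

To apply it, "sudo systemctl edit boinc-client.service" opens an editor for the drop-in (normally override.conf under /etc/systemd/system/boinc-client.service.d/); paste the two lines, save, and restart the service.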
ID: 50131
M0CZY

Joined: 27 Apr 24
Posts: 10
Credit: 563,349
RAC: 1,808
Message 50132 - Posted: 6 May 2024, 12:51:33 UTC - in response to Message 50131.  

I've installed runc version 1.1.7-0ubuntu1~22.04.2, and checked that in the boinc-client.service file it says "ProtectSystem=full", then I rebooted my computer. This is from the latest error Stderr output.
This is the line that seems to be where the problem lies:
time="2024-05-06T13:38:35+01:00" level=error msg="runc run failed: fchown fd 7: operation not permitted"
It seems to be a permissions thing that I'm unaware of?
<core_client_version>7.24.1</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
13:38:33 (3775): wrapper (7.15.26016): starting
13:38:33 (3775): wrapper (7.15.26016): starting
13:38:33 (3775): wrapper: running ../../projects/lhcathome.cern.ch_lhcathome/cranky-0.1.4 ()
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Detected Theory App
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] This application must have permanent access to
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] online repositories via a local CVMFS service.
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] It supports suspend/resume if a couple of
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] requirements are fulfilled.
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Most important:
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - init process is systemd
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - cgroups v2 is enabled and 'freezer' is available
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - the user running this application is a member of the 'boinc' group
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - sudo is at least version 1.9.10
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - sudoer file provided by LHC@home is installed
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Checking local requirements.
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Found Sudo-Version 1.9.14p2.
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/alice.cern.ch... OK
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/cernvm-prod.cern.ch... OK
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/grid.cern.ch... OK
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/sft.cern.ch... OK
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] 2.11.3.0 http://s1ral-cvmfs.openhtc.io/cvmfs/alice.cern.ch DIRECT
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Found a local runc version 1.1.7-0ubuntu1~22.04.2.
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Creating container filesystem.
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Starting runc container.
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] To get some details on systemd level run
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] systemctl status Theory_2743-2749343-140_0.scope
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] mcplots runspec: boinc pp jets 13000 660 - herwig7 7.2.0 default 100000 140
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
time="2024-05-06T13:38:35+01:00" level=error msg="runc run failed: fchown fd 7: operation not permitted"
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Container Theory_2743-2749343-140_0 finished with status code 1.
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Preparing output.
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [ERROR] No output found.
13:38:35 (3775): cranky exited; CPU time 0.317553
13:38:35 (3775): app exit status: 0xce
13:38:35 (3775): called boinc_finish(195)

</stderr_txt>
]]>
ID: 50132
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2519
Credit: 250,934,990
RAC: 127,970
Message 50135 - Posted: 6 May 2024, 14:30:43 UTC - in response to Message 50132.  

You have another computer attached to the project that runs Arch Linux [6.8.9-arch1-1|libc 2.39].
That one successfully runs Theory native even with the runc version from grid.cern.ch.

So you can either compare the setups of both machines to find out what's different, or replace Ubuntu 22.04.4 with Arch Linux.
ID: 50135
Ryan Munro

Joined: 17 Aug 17
Posts: 81
Credit: 8,410,301
RAC: 4,238
Message 50186 - Posted: 15 May 2024, 17:59:21 UTC

I spun up a new machine with the latest version of Ubuntu and ran the script, which seemed to complete fine. However, tasks are failing instantly, for example:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=411110384

time="2024-05-15T18:49:52+01:00" level=fatal msg="nsexec-1[151937]: failed to unshare remaining namespaces (except cgroupns): Operation not permitted"
time="2024-05-15T18:49:52+01:00" level=fatal msg="nsexec-0[151931]: failed to sync with stage-1: next state: Success"
time="2024-05-15T18:49:52+01:00" level=error msg="container_linux.go:380: starting container process caused: process_linux.go:402: getting the final child's pid from pipe caused: EOF"

This seems to be where it fails.
ID: 50186
maeax

Joined: 2 May 07
Posts: 2220
Credit: 173,696,209
RAC: 24,770
Message 50187 - Posted: 15 May 2024, 18:15:54 UTC - in response to Message 50186.  
Last modified: 15 May 2024, 18:17:02 UTC

Debian Stretch
Enabling user namespace for every user permanently:

sudo sed -i '$ a\kernel.unprivileged_userns_clone = 1' /etc/sysctl.conf
sudo sysctl -p
These instructions can be found here -> Native Theory Application Setup (Linux only)
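
Note that the sed command above appends the line every time it is run, so repeated runs leave duplicate entries in /etc/sysctl.conf. This is a minimal sketch of an idempotent variant, demonstrated on a temporary file; ensure_line is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Append a line to a file only if that exact line is not already present.
ensure_line() {
    local file="$1" line="$2"
    grep -Fxq -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstrate on a temporary file rather than the real /etc/sysctl.conf.
tmp="$(mktemp)"
ensure_line "$tmp" 'kernel.unprivileged_userns_clone = 1'
ensure_line "$tmp" 'kernel.unprivileged_userns_clone = 1'   # second call is a no-op
cat "$tmp"
rm -f "$tmp"
```

On a real host the function would have to run as root against /etc/sysctl.conf, followed by "sudo sysctl -p" as shown above. The kernel.unprivileged_userns_clone setting is specific to Debian-style kernels, so it may not exist on other distributions.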
ID: 50187
PDW
Joined: 7 Aug 14
Posts: 22
Credit: 9,889,945
RAC: 23,094
Message 50683 - Posted: 2 Oct 2024, 10:02:35 UTC

Upgraded to Ubuntu 24.04.1 this morning on a host that was happily running Theory in Legacy mode, Squid installed and working.

During the upgrade it asked if I wanted to keep the Squid config file, which I did.

After upgrading, tasks failed immediately. At first I didn't have the sudoers file, so I installed that.
Then I saw that it said I should run "sudo sed -i '$ a\kernel.unprivileged_userns_clone = 1' /etc/sysctl.conf" and
"sudo sysctl -p".
I rebooted and tried again each time, but stderr still says...

<core_client_version>8.0.2</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
10:05:56 (4879): wrapper (7.15.26016): starting
10:05:56 (4879): wrapper (7.15.26016): starting
10:05:56 (4879): wrapper: running ../../projects/lhcathome.cern.ch_lhcathome/cranky-0.1.4 ()
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Detected Theory App
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] This application must have permanent access to
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] online repositories via a local CVMFS service.
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] It supports suspend/resume if a couple of
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] requirements are fulfilled.
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Most important:
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] - init process is systemd
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] - cgroups v2 is enabled and 'freezer' is available
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] - the user running this application is a member of the 'boinc' group
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] - sudo is at least version 1.9.10
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] - sudoer file provided by LHC@home is installed
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Checking local requirements.
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Found Sudo-Version 1.9.15p5.
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Missing 'CVMFS_HTTP_PROXY="auto;DIRECT"' in '/etc/cvmfs/default.local'.
10:05:58 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Probing /cvmfs/alice.cern.ch... OK
10:05:58 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Probing /cvmfs/cernvm-prod.cern.ch... OK
10:05:58 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Probing /cvmfs/grid.cern.ch... OK
10:05:58 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Probing /cvmfs/sft.cern.ch... OK
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] 2.11.5.0 http://s1cern-cvmfs.openhtc.io/cvmfs/alice.cern.ch http://192.168.0.17:3128
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'.
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Creating container filesystem.
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Starting runc container.
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] To get some details on systemd level run
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] systemctl status Theory_2743-2853749-473_0.scope
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] mcplots runspec: boinc pp z1j 7000 100 - pythia8 8.308 tune-AU2 100000 473
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
time="2024-10-02T10:05:59+01:00" level=fatal msg="nsexec-1[5881]: failed to unshare remaining namespaces (except cgroupns): Operation not permitted"
time="2024-10-02T10:05:59+01:00" level=fatal msg="nsexec-0[5879]: failed to sync with stage-1: next state: Success"
time="2024-10-02T10:05:59+01:00" level=error msg="container_linux.go:380: starting container process caused: process_linux.go:402: getting the final child's pid from pipe caused: EOF"
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Container Theory_2743-2853749-473_0 finished with status code 1.
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Preparing output.
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [ERROR] No output found.
10:05:59 (4879): cranky exited; CPU time 0.169476
10:05:59 (4879): app exit status: 0xce
10:05:59 (4879): called boinc_finish(195)

</stderr_txt>


As soon as the first task starts I get a System Error Report pop-up (for some reason you cannot copy the text from it) with these details...

Package: cvmfs 2.11.5-1+ubuntu22.04 [origin unknown]
Title: cvmfs2 crashed with SIGABRT in __gnu_cxx::__verbose_terminate_handler()
ProcCmdline: /usr/bin/cvmfs2 -o rw,system_mount,fsname=cvmfs2,allow_other,grab_mountpoint,uid=131,gid=139 cvmfs-config.cern.ch /cvmfs/cvmfs-config.cern.ch
ProcCwd: /var/lib/cvmfs/shared

I have tried other things including...

sudo mkdir -p /mnt/cvmfs [no problem]
sudo mount -t cvmfs repository.cern.ch /mnt/cvmfs [has problem...]

CERNVM-FS: running with credentials 131:139
CERNVM-FS: loading Fuse module... Failed to initialize root file catalog (16 - file catalog failure)

I've wiped the cache and cvmfs_config probe says everything is OK.

What next please ?
ID: 50683
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2519
Credit: 250,934,990
RAC: 127,970
Message 50684 - Posted: 2 Oct 2024, 10:17:11 UTC - in response to Message 50683.  

PDW wrote:
"What next please ?"

1. Make your computers visible to other volunteers to allow them to see the complete picture.
2. Post your boinc client service unit file (plus the override.conf if you use one).
3. Install a runc package from the official Ubuntu repo.
ID: 50684
PDW
Joined: 7 Aug 14
Posts: 22
Credit: 9,889,945
RAC: 23,094
Message 50685 - Posted: 2 Oct 2024, 10:31:45 UTC - in response to Message 50684.  
Last modified: 2 Oct 2024, 10:32:26 UTC

Thanks, it was the runc.
Because the logs showed...
Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'

...I'd assumed that would be the correct version to be using!
ID: 50685
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2519
Credit: 250,934,990
RAC: 127,970
Message 50686 - Posted: 2 Oct 2024, 10:48:22 UTC - in response to Message 50685.  

The log just reports what has been found.
It was a correct runc version in the past provided via CVMFS for CERN's own CentOS computers.
It may not run on other Linux distros (esp. more recent ones), mostly because it is linked to an older libc.

Hence, the general suggestion is to use runc from the Linux vendor.
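
To see which runc a task actually picked up, the "Found ..." lines in stderr.txt can be classified. This is a minimal sketch, assuming the two log wordings quoted in this thread; runc_source is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Report whether cranky found a local runc or the one provided via CVMFS.
# Reads stderr.txt text on stdin.
runc_source() {
    awk '
        /\[INFO\] Found a local runc version/ { src = "local" }
        /\[INFO\] Found .runc version spec:/ && /\/cvmfs\// { src = "cvmfs" }
        END { if (src == "") src = "unknown"; print src }
    '
}

# Sample line copied from a log earlier in this thread.
sample="10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'."
printf '%s\n' "$sample" | runc_source
```

If the result is "cvmfs" and tasks fail at container start, installing the distribution's runc package (on Ubuntu: "sudo apt install runc") should make later tasks log the local version instead.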
ID: 50686
PDW
Joined: 7 Aug 14
Posts: 22
Credit: 9,889,945
RAC: 23,094
Message 50687 - Posted: 2 Oct 2024, 11:15:15 UTC - in response to Message 50686.  

computezrmle wrote:
"The log just reports what has been found.
It was a correct runc version in the past provided via CVMFS for CERN's own CentOS computers.
It may not run on other Linux distros (esp. more recent ones), mostly because it is linked to an older libc."

Agreed, but it worked for 22 and I've seen nothing to state that runc should be upgraded.
Kudos to Laurence for the much improved reporting of what is going right or wrong in the logs.

computezrmle wrote:
"Hence, the general suggestion is to use runc from the Linux vendor."

Where is this general suggestion documented?
I saw your suggestion "Please try to get/install a more recent runc from your Linux vendor." only once, in response to someone with a different error.
ID: 50687
Lem Novantotto

Joined: 24 May 23
Posts: 33
Credit: 1,906,580
RAC: 29,556
Message 50688 - Posted: 2 Oct 2024, 11:37:04 UTC - in response to Message 50687.  

PDW wrote:
"Agreed, but it worked for 22 and I've seen nothing to state that runc should be upgraded."

Just to let you know that, on my Ubuntus, both runc 1.0.2-dev and 1.1.12-0ubuntu3.1 work flawlessly.
--
Bye
ID: 50688


©2024 CERN