New native version v300.08
Author | Message |
---|---|
Joined: 2 May 07 Posts: 2220 Credit: 173,696,209 RAC: 24,770 |
Theory is back with Tasks. Thank you. Sunday! |
Joined: 24 Jun 10 Posts: 43 Credit: 6,078,645 RAC: 2,687 |
"Tried on both Linux Mint 20.3 (Ubuntu 20.04) and 21.2 (Ubuntu 22.04). I guess I can't run Theory any more until sudo 1.9.10 gets added to the Ubuntu repository."
I know I am most likely late to the party on this one, but I only just started again with Theory Native and ran into the same problem as Aurum. I visited the sudo website at https://www.sudo.ws/ and grabbed the latest package for my distribution. I installed it and then reran the command in Laurence's OP.
Here is one of my validated workunits: https://lhcathome.cern.ch/lhcathome/result.php?resultid=408095114
Cheers |
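A quick way to see whether the installed sudo meets the 1.9.10 minimum mentioned above is to compare version strings. This is only a sketch: the helper names `version_ge` and `sudo_version_from` are made up for illustration, and it assumes GNU `sort` with `-V` support and that `sudo -V` prints a first line of the form "Sudo version X".

```shell
# Returns 0 (true) if version $1 is greater than or equal to version $2.
# Assumes GNU sort, whose -V option sorts version strings naturally.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Extracts the version from the first line of "sudo -V" output,
# e.g. "Sudo version 1.9.14p2" -> "1.9.14p2".
sudo_version_from() {
    printf '%s\n' "$1" | sed -n '1s/^Sudo version //p'
}

# Hypothetical usage on a real host:
#   installed=$(sudo_version_from "$(sudo -V)")
#   if version_ge "$installed" "1.9.10"; then
#       echo "sudo $installed is recent enough"
#   else
#       echo "sudo $installed is older than 1.9.10"
#   fi
```

The comparison goes through `sort -V` rather than plain string comparison because "1.9.9" would otherwise sort after "1.9.10".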
Joined: 15 Jun 08 Posts: 2519 Credit: 250,934,990 RAC: 127,970 |
Looks good. Since a major goal of that version is to make suspend/resume work via systemd, you may want to test this.
Select a currently running task in BOINC Manager (or your preferred BOINC tool) and pause it. Test this with a task that has already started the container (see stderr.txt). Then this should happen:
1. You should find a corresponding line in the task's stderr.txt.
2. Run the "systemctl status ..." command shown in stderr.txt (press 'q' to exit the pager). The output should mention the scope as "frozen".
A while later, resume the task via the BOINC management tool. Check stderr.txt and the scope status again.
Hint: Although it would be possible to manually freeze/thaw the scope via systemctl, this should not be done because BOINC will not be notified. Hence, always use BOINC for this. |
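The check described above can be partly scripted. The sketch below pulls the scope name out of a task's stderr.txt so it does not have to be copied by hand. The helper name `scope_from_stderr` is hypothetical, and the path to stderr.txt depends on the BOINC slot directory of the task.

```shell
# Prints the most recent scope name that cranky logged in a stderr.txt file,
# e.g. "Theory_2743-2785673-11_0.scope". $1 is the path to stderr.txt.
scope_from_stderr() {
    grep -o 'systemctl status [^ ]*\.scope' "$1" | tail -n 1 | cut -d ' ' -f 3
}

# Hypothetical usage after pausing the task in BOINC:
#   scope=$(scope_from_stderr /path/to/slot/stderr.txt)
#   systemctl --no-pager status "$scope" | grep -i frozen
# No match from grep suggests the scope was not frozen.
```

This only reads state. Pausing and resuming should still be done through BOINC, as noted in the hint above.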
Joined: 24 Jun 10 Posts: 43 Credit: 6,078,645 RAC: 2,687 |
Ok, will try and find some time on the weekend and give it a go. Cheers |
Joined: 24 Jun 10 Posts: 43 Credit: 6,078,645 RAC: 2,687 |
Good morning,
As suggested, I selected a running Theory task and paused it in BOINC. I looked in the matching stderr.txt for the task to get the command and ran it:
systemctl status Theory_2743-2733248-9_1.scope
This was the output:
Unit Theory_2743-2733248-9_1.scope could not be found.
I resumed the task; it reset itself to 0.00 percent and then finished with a computation error. Here is the task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=408107778
Hopefully someone else has had success, which would mean my setup is partially correct but suspend/resume is not set up correctly.
Cheers |
Joined: 15 Jun 08 Posts: 2519 Credit: 250,934,990 RAC: 127,970 |
The log entries should look a bit like these:
06:14:37 CET +01:00 2024-03-23: cranky: [INFO] Starting runc container.
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] To get some details on systemd level run
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] systemctl status Theory_2743-2785673-11_0.scope
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] mcplots runspec: boinc pp jets 7000 80,-,1060 - herwig++ 2.7.1 UE-EE-5 100000 11
06:14:38 CET +01:00 2024-03-23: cranky: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
06:21:28 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope
06:22:27 CET +01:00 2024-03-23: cranky: [INFO] Resuming systemd unit Theory_2743-2785673-11_0.scope
06:32:49 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope
06:32:58 CET +01:00 2024-03-23: cranky: [INFO] Resuming systemd unit Theory_2743-2785673-11_0.scope
06:33:24 CET +01:00 2024-03-23: cranky: [INFO] Pausing systemd unit Theory_2743-2785673-11_0.scope
06:34:03 CET +01:00 2024-03-23: cranky: [INFO] Resuming systemd unit Theory_2743-2785673-11_0.scope
07:42:57 CET +01:00 2024-03-23: cranky: [INFO] Container Theory_2743-2785673-11_0 finished with status code 0.
07:42:57 CET +01:00 2024-03-23: cranky: [INFO] Preparing output.
07:42:58 (102851): cranky exited; CPU time 5042.031816
07:42:58 (102851): called boinc_finish(0)
Yours look weird:
06:16:59 AWST +08:00 2024-03-23: cranky-0.1.4: [INFO] mcplots runspec: boinc pp jets 13000 260 - pythia6 6.428 ambt1 100000 9
06:16:59 AWST +08:00 2024-03-23: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
07:39:34 (590135): wrapper (7.15.26016): starting
07:39:34 (590135): wrapper (7.15.26016): starting
. . .
time="2024-03-23T07:39:38+08:00" level=error msg="container with id exists: Theory_2743-2733248-9_1"
It looks like the task started from scratch (for an unknown reason). It finally failed because runc didn't remove the container id from the 1st attempt.
Which systemd version do you use (must be at least v246)? Please post the output of "systemctl --version" plus the status output of a currently running Theory task. You get the latter via a command like this:
systemctl --no-pager status Theory_2743-2733248-9_1.scope |
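The v246 minimum named above can be checked from the first line of the "systemctl --version" output. A minimal sketch, with hypothetical helper names, assuming that line has the form "systemd NNN (...)":

```shell
# Prints the major systemd version from "systemctl --version" output,
# e.g. "systemd 249 (249.11-0ubuntu3.12)" -> "249".
systemd_major_from() {
    printf '%s\n' "$1" | awk 'NR == 1 { print $2 }'
}

# Returns 0 if the given major version meets the v246 minimum.
systemd_supports_freeze() {
    [ "$1" -ge 246 ]
}

# Hypothetical usage on a real host:
#   major=$(systemd_major_from "$(systemctl --version)")
#   if systemd_supports_freeze "$major"; then
#       echo "systemd $major is recent enough"
#   fi
```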
Joined: 24 Jun 10 Posts: 43 Credit: 6,078,645 RAC: 2,687 |
Evening,
Thank you for the extra steps to look at and also an output of what it is supposed to look like. I have to step out for the evening, but I can post this bit of info on my system. I am using the following version of systemd:
systemd 249 (249.11-0ubuntu3.12)
Regards |
Joined: 24 Jun 10 Posts: 43 Credit: 6,078,645 RAC: 2,687 |
It's all working now, but I'm not too sure why; maybe a reboot did something, lol. Here is a work unit from LHC-Dev (I had it running a few of the Theory tasks over there): https://lhcathomedev.cern.ch/lhcathome-dev/result.php?resultid=3310417
I am getting the pause and resume now, with no errors. This is also the stderr.txt output of one of my Theory tasks from here (not dev), where I tried it as well; you can see it is working there too.
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] Starting runc container.
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] To get some details on systemd level run
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] systemctl status Theory_2743-2722097-13_1.scope
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] mcplots runspec: boinc pp jets 13000 160 - pythia8 8.308 tune-A2 100000 13
05:49:29 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
07:40:32 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] Pausing systemd unit Theory_2743-2722097-13_1.scope
07:43:01 AWST +08:00 2024-03-24: cranky-0.1.4: [INFO] Resuming systemd unit Theory_2743-2722097-13_1.scope
Regards |
Joined: 27 Apr 24 Posts: 10 Credit: 563,349 RAC: 1,808 |
I'm struggling to get native Theory working on my Ubuntu 22.04. I've manually upgraded my sudo to version:
Sudo version 1.9.14p2
I've run the sudoers file and rebooted. Here is my latest computation error. I don't know how to proceed from here.
<core_client_version>7.24.1</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
11:58:44 (3508): wrapper (7.15.26016): starting
11:58:44 (3508): wrapper (7.15.26016): starting
11:58:44 (3508): wrapper: running ../../projects/lhcathome.cern.ch_lhcathome/cranky-0.1.4 ()
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Detected Theory App
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] This application must have permanent access to
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] online repositories via a local CVMFS service.
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] It supports suspend/resume if a couple of
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] requirements are fulfilled.
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Most important:
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - init process is systemd
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - cgroups v2 is enabled and 'freezer' is available
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - the user running this application is a member of the 'boinc' group
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - sudo is at least version 1.9.10
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - sudoer file provided by LHC@home is installed
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Checking local requirements.
11:58:44 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Found Sudo-Version 1.9.14p2.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/alice.cern.ch... OK
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/cernvm-prod.cern.ch... OK
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/grid.cern.ch... OK
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/sft.cern.ch... OK
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] 2.11.3.0 http://s1ral-cvmfs.openhtc.io/cvmfs/alice.cern.ch DIRECT
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Creating container filesystem.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Starting runc container.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] To get some details on systemd level run
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] systemctl status Theory_2743-2839118-133_2.scope
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] mcplots runspec: boinc pp z1j 13000 110 - pythia6 6.428 380 100000 133
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
time="2024-05-06T11:58:48+01:00" level=error msg="operation not permitted"
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Container Theory_2743-2839118-133_2 finished with status code 1.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Preparing output.
11:58:48 BST +01:00 2024-05-06: cranky-0.1.4: [ERROR] No output found.
11:58:49 (3508): cranky exited; CPU time 0.314435
11:58:49 (3508): app exit status: 0xce
11:58:49 (3508): called boinc_finish(195)
</stderr_txt>
]]> |
Joined: 15 Jun 08 Posts: 2519 Credit: 250,934,990 RAC: 127,970 |
This error is from runc:
time="2024-05-06T11:58:48+01:00" level=error msg="operation not permitted"
The log states it is a version from grid.cern.ch:
Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'
Please try to get/install a more recent runc from your Linux vendor.
In addition, please check if this option is set in your boinc-client.service file:
ProtectSystem=strict
If so, change "strict" to "full", preferably via an overlay file (see the systemd manual or many posts here). |
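One way to apply the ProtectSystem change above without editing the packaged unit file is a systemd drop-in. The sketch below writes such a file. The function name is hypothetical, and the directory /etc/systemd/system/boinc-client.service.d is the usual drop-in location for a unit named boinc-client.service, which is an assumption about your installation.

```shell
# Writes a drop-in that sets ProtectSystem=full for a unit.
# $1 is the drop-in directory of the unit to override.
write_protectsystem_override() {
    mkdir -p "$1" &&
    printf '[Service]\nProtectSystem=full\n' > "$1/override.conf"
}

# Hypothetical usage as root:
#   write_protectsystem_override /etc/systemd/system/boinc-client.service.d
#   systemctl daemon-reload
#   systemctl restart boinc-client
#   systemctl show -p ProtectSystem boinc-client
```

Running "systemctl edit boinc-client" and typing the two lines by hand should give the same result. The function form is used here only so the file content is explicit.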
Joined: 27 Apr 24 Posts: 10 Credit: 563,349 RAC: 1,808 |
I've installed runc version 1.1.7-0ubuntu1~22.04.2 and checked that the boinc-client.service file says "ProtectSystem=full"; then I rebooted my computer. This is from the latest error stderr output. This is the line that seems to be where the problem lies:
time="2024-05-06T13:38:35+01:00" level=error msg="runc run failed: fchown fd 7: operation not permitted"
It seems to be a permissions thing that I'm unaware of?
<core_client_version>7.24.1</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
13:38:33 (3775): wrapper (7.15.26016): starting
13:38:33 (3775): wrapper (7.15.26016): starting
13:38:33 (3775): wrapper: running ../../projects/lhcathome.cern.ch_lhcathome/cranky-0.1.4 ()
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Detected Theory App
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] This application must have permanent access to
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] online repositories via a local CVMFS service.
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] It supports suspend/resume if a couple of
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] requirements are fulfilled.
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Most important:
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - init process is systemd
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - cgroups v2 is enabled and 'freezer' is available
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - the user running this application is a member of the 'boinc' group
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - sudo is at least version 1.9.10
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] - sudoer file provided by LHC@home is installed
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Checking local requirements.
13:38:33 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Found Sudo-Version 1.9.14p2.
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/alice.cern.ch... OK
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/cernvm-prod.cern.ch... OK
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/grid.cern.ch... OK
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Probing /cvmfs/sft.cern.ch... OK
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] 2.11.3.0 http://s1ral-cvmfs.openhtc.io/cvmfs/alice.cern.ch DIRECT
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Found a local runc version 1.1.7-0ubuntu1~22.04.2.
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Creating container filesystem.
13:38:34 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Starting runc container.
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] To get some details on systemd level run
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] systemctl status Theory_2743-2749343-140_0.scope
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] mcplots runspec: boinc pp jets 13000 660 - herwig7 7.2.0 default 100000 140
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
time="2024-05-06T13:38:35+01:00" level=error msg="runc run failed: fchown fd 7: operation not permitted"
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Container Theory_2743-2749343-140_0 finished with status code 1.
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [INFO] Preparing output.
13:38:35 BST +01:00 2024-05-06: cranky-0.1.4: [ERROR] No output found.
13:38:35 (3775): cranky exited; CPU time 0.317553
13:38:35 (3775): app exit status: 0xce
13:38:35 (3775): called boinc_finish(195)
</stderr_txt>
]]> |
Joined: 15 Jun 08 Posts: 2519 Credit: 250,934,990 RAC: 127,970 |
You have another computer attached to the project that runs Arch Linux [6.8.9-arch1-1|libc 2.39]. That one successfully runs Theory native even with the runc version from grid.cern.ch. So you can either compare the setup of both to find out what's different, or replace Ubuntu 22.04.4 with Arch Linux. |
Joined: 17 Aug 17 Posts: 81 Credit: 8,410,301 RAC: 4,238 |
I spun up a new machine with the latest version of Ubuntu and ran the script, which seemed to complete fine. However, tasks are failing instantly. For example: https://lhcathome.cern.ch/lhcathome/result.php?resultid=411110384
time="2024-05-15T18:49:52+01:00" level=fatal msg="nsexec-1[151937]: failed to unshare remaining namespaces (except cgroupns): Operation not permitted"
time="2024-05-15T18:49:52+01:00" level=fatal msg="nsexec-0[151931]: failed to sync with stage-1: next state: Success"
time="2024-05-15T18:49:52+01:00" level=error msg="container_linux.go:380: starting container process caused: process_linux.go:402: getting the final child's pid from pipe caused: EOF"
This seems to be where it fails. |
Joined: 2 May 07 Posts: 2220 Credit: 173,696,209 RAC: 24,770 |
Debian Stretch
Enabling user namespace for every user permanently:
sudo sed -i '$ a\kernel.unprivileged_userns_clone = 1' /etc/sysctl.conf
sudo sysctl -p
You can find these instructions here -> Native Theory Application Setup (Linux only) |
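The sed command above appends the line every time it is run, so repeating it leaves duplicates in /etc/sysctl.conf. A small sketch of an append-only-if-missing variant follows. The function name `ensure_sysctl_line` is made up for this example, and the key is matched as a loose regular expression, so the dots in it match any character.

```shell
# Appends "key = value" to a sysctl-style file unless an active
# (uncommented) line for that key already exists.
# $1 = file, $2 = key, $3 = value
ensure_sysctl_line() {
    file="$1"; key="$2"; value="$3"
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$file" 2>/dev/null; then
        printf '%s = %s\n' "$key" "$value" >> "$file"
    fi
}

# Hypothetical usage as root:
#   ensure_sysctl_line /etc/sysctl.conf kernel.unprivileged_userns_clone 1
#   sysctl -p
#   sysctl kernel.unprivileged_userns_clone
```

The last command prints the value the running kernel actually uses, which helps confirm that the setting took effect.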
Joined: 7 Aug 14 Posts: 22 Credit: 9,889,945 RAC: 23,094 |
I upgraded to Ubuntu 24.04.1 this morning on a host that was happily running Theory in legacy mode, with Squid installed and working. During the upgrade it asked whether I wanted to keep the Squid config file, which I did. After the upgrade, tasks failed immediately. First, the sudoers file was missing, so I installed it. Then I saw that I should run "sudo sed -i '$ a\kernel.unprivileged_userns_clone = 1' /etc/sysctl.conf" and "sudo sysctl -p". I rebooted and tried again each time, but stderr still says...
<core_client_version>8.0.2</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
10:05:56 (4879): wrapper (7.15.26016): starting
10:05:56 (4879): wrapper (7.15.26016): starting
10:05:56 (4879): wrapper: running ../../projects/lhcathome.cern.ch_lhcathome/cranky-0.1.4 ()
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Detected Theory App
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] This application must have permanent access to
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] online repositories via a local CVMFS service.
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] It supports suspend/resume if a couple of
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] requirements are fulfilled.
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Most important:
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] - init process is systemd
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] - cgroups v2 is enabled and 'freezer' is available
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] - the user running this application is a member of the 'boinc' group
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] - sudo is at least version 1.9.10
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] - sudoer file provided by LHC@home is installed
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Checking local requirements.
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Found Sudo-Version 1.9.15p5.
10:05:56 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Missing 'CVMFS_HTTP_PROXY="auto;DIRECT"' in '/etc/cvmfs/default.local'.
10:05:58 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Probing /cvmfs/alice.cern.ch... OK
10:05:58 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Probing /cvmfs/cernvm-prod.cern.ch... OK
10:05:58 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Probing /cvmfs/grid.cern.ch... OK
10:05:58 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Probing /cvmfs/sft.cern.ch... OK
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] 2.11.5.0 http://s1cern-cvmfs.openhtc.io/cvmfs/alice.cern.ch http://192.168.0.17:3128
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'.
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Creating container filesystem.
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Using /cvmfs/cernvm-prod.cern.ch/cvm4
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Starting runc container.
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] To get some details on systemd level run
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] systemctl status Theory_2743-2853749-473_0.scope
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] mcplots runspec: boinc pp z1j 7000 100 - pythia8 8.308 tune-AU2 100000 473
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] ----,^^^^,<<<~_____---,^^^,<<~____--,^^,<~__;_
time="2024-10-02T10:05:59+01:00" level=fatal msg="nsexec-1[5881]: failed to unshare remaining namespaces (except cgroupns): Operation not permitted"
time="2024-10-02T10:05:59+01:00" level=fatal msg="nsexec-0[5879]: failed to sync with stage-1: next state: Success"
time="2024-10-02T10:05:59+01:00" level=error msg="container_linux.go:380: starting container process caused: process_linux.go:402: getting the final child's pid from pipe caused: EOF"
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Container Theory_2743-2853749-473_0 finished with status code 1.
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [INFO] Preparing output.
10:05:59 BST +01:00 2024-10-02: cranky-0.1.4: [ERROR] No output found.
10:05:59 (4879): cranky exited; CPU time 0.169476
10:05:59 (4879): app exit status: 0xce
10:05:59 (4879): called boinc_finish(195)
</stderr_txt>
As soon as the first task starts I get a System Error Report pop-up (which for some reason does not let you copy its text) with these details...
Package: cvmfs 2.11.5-1+ubuntu22.04 [origin unknown]
Title: cvmfs2 crashed with SIGABRT in __gnu_cxx::__verbose_terminate_handler()
ProcCmdline: /usr/bin/cvmfs2 -o rw,system_mount,fsname=cvmfs2,allow_other,grab_mountpoint,uid=131,gid=139 cvmfs-config.cern.ch /cvmfs/cvmfs-config.cern.ch
ProcCwd: /var/lib/cvmfs/shared
I have tried other things including...
sudo mkdir -p /mnt/cvmfs [no problem]
sudo mount -t cvmfs repository.cern.ch /mnt/cvmfs [has problem...]
CERNVM-FS: running with credentials 131:139
CERNVM-FS: loading Fuse module... Failed to initialize root file catalog (16 - file catalog failure)
I've wiped the cache and cvmfs_config probe says everything is OK. What next please? |
Joined: 15 Jun 08 Posts: 2519 Credit: 250,934,990 RAC: 127,970 |
"What next please ?"
1. Make your computers visible to other volunteers so they can see the complete picture.
2. Post your BOINC client service unit file (plus the override.conf if you use one).
3. Install a runc package from the official Ubuntu repo. |
Joined: 7 Aug 14 Posts: 22 Credit: 9,889,945 RAC: 23,094 |
Thanks, it was the runc. I'd assumed, because the logs showed...
Found 'runc version spec: 1.0.2-dev' at '/cvmfs/grid.cern.ch/vc/containers/runc.new'
...that this would be the correct version to be using! |
Joined: 15 Jun 08 Posts: 2519 Credit: 250,934,990 RAC: 127,970 |
The log just reports what has been found. It was a correct runc version in the past provided via CVMFS for CERN's own CentOS computers. It may not run on other Linux distros (esp. more recent ones), mostly because it is linked to an older libc. Hence, the general suggestion is to use runc from the Linux vendor. |
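Since the log only reports what was found, it can help to check which runc a task actually picked up. The sketch below classifies a stderr.txt by the two "Found ..." messages quoted earlier in this thread. The function name is hypothetical, and the message wording is taken from cranky-0.1.4 logs, so other versions may phrase it differently.

```shell
# Prints "local" if cranky reported a runc installed on the host,
# "cvmfs" if it fell back to the copy under /cvmfs, else "unknown".
# $1 is the path to a task's stderr.txt.
runc_source_from_stderr() {
    if grep -q "Found a local runc version" "$1"; then
        echo "local"
    elif grep -q "runc version spec: .* at '/cvmfs/" "$1"; then
        echo "cvmfs"
    else
        echo "unknown"
    fi
}

# Hypothetical usage:
#   runc_source_from_stderr /path/to/slot/stderr.txt
# If this prints "cvmfs" on a recent distro, installing the vendor
# package (on Ubuntu the package is named runc) is the suggested fix.
```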
Joined: 7 Aug 14 Posts: 22 Credit: 9,889,945 RAC: 23,094 |
"The log just reports what has been found."
Agreed, but it worked for 22 and I've seen nothing to state that runc should be upgraded. Kudos to Laurence for the much improved reporting of what is going right or wrong in the logs.
"Hence, the general suggestion is to use runc from the Linux vendor."
Where is this general suggestion documented? I saw your suggestion "Please try to get/install a more recent runc from your Linux vendor." only once, in response to someone with a different error. |
Joined: 24 May 23 Posts: 33 Credit: 1,906,580 RAC: 29,556 |
"Agreed, but it worked for 22 and I've seen nothing to state that runc should be upgraded."
Just to let you know that, on my Ubuntus, both runc 1.0.2-dev and 1.1.12-0ubuntu3.1 work flawlessly.
-- Bye |
©2024 CERN