Message boards : ATLAS application : Creation of container failed
Author | Message |
---|---|
Send message Joined: 12 Jul 11 Posts: 95 Credit: 1,129,876 RAC: 0 |
@David: ATLAS tasks are not failing anymore, thanks a lot for the tip! |
Send message Joined: 20 Aug 10 Posts: 4 Credit: 2,216,637 RAC: 0 |
Hi, all ATLAS jobs are still failing for me. Here is an example: https://lhcathome.cern.ch/lhcathome/result.php?resultid=314031488

<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
19:03:12 (1338628): wrapper (7.7.26015): starting
19:03:12 (1338628): wrapper: running run_atlas (--nthreads 1)
[2021-04-21 19:03:12] Arguments: --nthreads 1
[2021-04-21 19:03:12] Threads: 1
[2021-04-21 19:03:12] Checking for CVMFS
[2021-04-21 19:03:17] Probing /cvmfs/atlas.cern.ch... OK
[2021-04-21 19:03:18] Probing /cvmfs/atlas-condb.cern.ch... OK
[2021-04-21 19:03:19] Probing /cvmfs/grid.cern.ch... OK
[2021-04-21 19:03:19] Probing /cvmfs/cernvm-prod.cern.ch... OK
[2021-04-21 19:03:21] Probing /cvmfs/sft.cern.ch... OK
[2021-04-21 19:03:22] Probing /cvmfs/alice.cern.ch... OK
[2021-04-21 19:03:22] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2021-04-21 19:03:22] 2.7.5.0 1338840 0 24652 83131 3 1 30021779 33554433 0 65024 0 0 n/a 527 139 http://cvmfs-stratum-one.cern.ch:8000/cvmfs/atlas.cern.ch DIRECT 1
[2021-04-21 19:03:22] CVMFS is ok
[2021-04-21 19:03:22] Efficiency of ATLAS tasks can be improved by the following measure(s):
[2021-04-21 19:03:22] The CVMFS client on this computer should be configured to use Cloudflare's openhtc.io.
[2021-04-21 19:03:22] Small home clusters do not require a local http proxy but it is suggested if
[2021-04-21 19:03:22] more than 10 cores throughout the same LAN segment are regularly running ATLAS like tasks.
[2021-04-21 19:03:22] Further information can be found at the LHC@home message board.
[2021-04-21 19:03:22] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img
[2021-04-21 19:03:22] Checking for singularity binary...
[2021-04-21 19:03:22] Singularity is not installed, using version from CVMFS
[2021-04-21 19:03:22] Checking singularity works with /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname
[2021-04-21 19:03:29] INFO: Converting SIF file to temporary sandbox... UBUNTU-FALOURD INFO: Cleaning up image...
[2021-04-21 19:03:29] Singularity works
[2021-04-21 19:03:29] Starting ATLAS job with PandaID=5031802625
[2021-04-21 19:03:29] Running command: /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec --pwd /var/lib/boinc-client/slots/4 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img sh start_atlas.sh
[2021-04-21 19:03:39] Job failed
[2021-04-21 19:03:39] INFO: Converting SIF file to temporary sandbox...
[2021-04-21 19:03:39] INFO: Cleaning up image...
[2021-04-21 19:03:39] FATAL: container creation failed: hook function for tag prelayer returns error: failed to create /var/lib/alternatives directory: mkdir /var/lib/alternatives: permission denied
[2021-04-21 19:03:39] ./runtime_log
[2021-04-21 19:03:39] ./runtime_log.err
19:13:40 (1338628): run_atlas exited; CPU time 15.787750
19:13:40 (1338628): app exit status: 0x1
19:13:40 (1338628): called boinc_finish(195)
</stderr_txt>
]]>

What can I do? Thanks for your help. |
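One way to check whether the failing mkdir is a plain filesystem permission issue, rather than a CVMFS or Singularity problem, is to test whether the account running BOINC can create a directory under /var/lib. This is only a diagnostic sketch; it assumes the client runs as the user 'boinc', and the test directory name is made up:

# Show the permissions on /var/lib (drwxr-xr-x root root means only root can create entries there)
ls -ld /var/lib
# Try to create (and immediately remove) a directory as the boinc user;
# a "Permission denied" here matches the FATAL error in the task log above
sudo -u boinc mkdir /var/lib/atlas-perm-test && sudo rmdir /var/lib/atlas-perm-test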
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0 |
The feedback I got said there was a configuration change related to setuid in the latest Singularity, and he pointed me to this page. I didn't try to install a local Singularity, but I did try the instructions there:

"openSUSE and SUSE have a small difference with upstream default. This means the SUID root binaries distributed by singularity are executable only by users belonging to the group 'singularity'. Otherwise, users will get an error message like this one:
FATAL: while executing /usr/lib/singularity/bin/starter-suid: permission denied
To add a user to the group singularity, execute (as root):
# usermod -a -G singularity <user_login>"

But when I execute:

sudo usermod -a -G singularity aurum

it responds with:

usermod: group 'singularity' does not exist

OK, trying again:

sudo groupadd singularity
sudo usermod -a -G singularity aurum

OK, so far so good. The task just broke 1.1%, which is a first. |
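If it helps, a minimal check that the group workaround above actually took effect (the user name 'aurum' is taken from the post; group changes only apply to new login sessions):

# List the members of the singularity group
getent group singularity
# List the groups the user belongs to after logging in again
id -nG aurum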
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0 |
Failed again. A perfect record: 100% failures for ATLAS.

[2021-04-24 09:34:41] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img
[2021-04-24 09:34:41] Checking for singularity binary...
[2021-04-24 09:34:41] Singularity is not installed, using version from CVMFS
[2021-04-24 09:34:41] Checking singularity works with /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname
[2021-04-24 09:34:42] Singularity isnt working: INFO: Converting SIF file to temporary sandbox...
[2021-04-24 09:34:42] FATAL: while extracting /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img: root filesystem extraction failed: extract command failed: WARNING: passwd file doesn't exist in container, not updating
[2021-04-24 09:34:42] WARNING: group file doesn't exist in container, not updating
[2021-04-24 09:34:42] WARNING: Skipping mount /etc/hosts [binds]: /etc/hosts doesn't exist in container
[2021-04-24 09:34:42] WARNING: Skipping mount /etc/localtime [binds]: /etc/localtime doesn't exist in container
[2021-04-24 09:34:42] WARNING: Skipping mount proc [kernel]: /proc doesn't exist in container
[2021-04-24 09:34:42] WARNING: Skipping mount /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/3.7.2/var/singularity/mnt/session/tmp [tmp]: /tmp doesn't exist in container
[2021-04-24 09:34:42] WARNING: Skipping mount /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/3.7.2/var/singularity/mnt/session/var/tmp [tmp]: /var/tmp doesn't exist in container
[2021-04-24 09:34:42] WARNING: Skipping mount /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/3.7.2/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
[2021-04-24 09:34:42]
[2021-04-24 09:34:42] FATAL ERROR:write_file: failed to create file /image/root/usr/include/c++/4.8.2/ext/pb_ds/detail/cc_hash_table_map_/erase_fn_imps.hpp, because Too many open files
[2021-04-24 09:34:42] Parallel unsquashfs: Using 36 processors
[2021-04-24 09:34:42] 41269 inodes (41434 blocks) to write
[2021-04-24 09:34:42]
[2021-04-24 09:34:42] : exit status 1
09:44:43 (47648): run_atlas exited; CPU time 1.082061
09:44:43 (47648): app exit status: 0x1
09:44:43 (47648): called boinc_finish(195)
</stderr_txt>
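The "Too many open files" failure during the sandbox extraction looks like a per-process open-file limit rather than anything ATLAS-specific. A hedged sketch of how to check and raise it for a systemd-managed boinc-client; the service name and the 65536 value are assumptions, not project recommendations:

# Current soft limit in an interactive shell
ulimit -n
# Limit actually applied to the running BOINC service
systemctl show boinc-client -p LimitNOFILE
# Raise the limit via a drop-in (opens an editor; add the two lines shown), then restart:
#   [Service]
#   LimitNOFILE=65536
sudo systemctl edit boinc-client
sudo systemctl daemon-reload
sudo systemctl restart boinc-client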
Send message Joined: 7 Jan 07 Posts: 41 Credit: 16,102,983 RAC: 17 |
Looking into the errors, I found that the problem is the container using a wrong path:

09:56:16 (130425): wrapper (7.7.26015): starting
09:56:16 (130425): wrapper: running run_atlas (--nthreads 6)
[2021-05-06 09:56:16] Arguments: --nthreads 6
[2021-05-06 09:56:16] Threads: 6
[2021-05-06 09:56:16] Checking for CVMFS
[2021-05-06 09:56:16] Probing /cvmfs/atlas.cern.ch... OK
[2021-05-06 09:56:16] Probing /cvmfs/atlas-condb.cern.ch... OK
[2021-05-06 09:56:16] Running cvmfs_config stat atlas.cern.ch
[2021-05-06 09:56:16] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2021-05-06 09:56:16] 2.8.1.0 129901 5 25280 84062 2 3 4718377 6144001 0 130560 0 36 99.909 528 724 http://s1ral-cvmfs.openhtc.io/cvmfs/atlas.cern.ch http://192.168.2.1:3128 1
[2021-05-06 09:56:16] CVMFS is ok
[2021-05-06 09:56:16] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img
[2021-05-06 09:56:16] Checking for singularity binary...
[2021-05-06 09:56:16] Singularity is not installed, using version from CVMFS
[2021-05-06 09:56:16] Checking singularity works with /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname
[2021-05-06 09:56:18] INFO: Converting SIF file to temporary sandbox... fonck INFO: Cleaning up image...
[2021-05-06 09:56:18] Singularity works
[2021-05-06 09:56:18] Set ATHENA_PROC_NUMBER=6
[2021-05-06 09:56:18] Starting ATLAS job with PandaID=5047215526
[2021-05-06 09:56:18] Running command: /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec --pwd /var/lib/boinc-client/slots/7 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img sh start_atlas.sh
[2021-05-06 09:56:20] Job failed
[2021-05-06 09:56:20] INFO: Converting SIF file to temporary sandbox...
[2021-05-06 09:56:20] INFO: Cleaning up image...
[2021-05-06 09:56:20] FATAL: container creation failed: mount /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/3.7.2/var/singularity/mnt/session/rootfs/var/lib/package-list->/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/3.7.2/var/singularity/mnt/session/underlay/var/lib/package-list error: while mounting /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/3.7.2/var/singularity/mnt/session/rootfs/var/lib/package-list: destination /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/3.7.2/var/singularity/mnt/session/underlay/var/lib/package-list doesn't exist in container
[2021-05-06 09:56:20] ./runtime_log.err
[2021-05-06 09:56:20] ./runtime_log

For testing, I temporarily added write permission on /var/lib:

sudo chmod o+w /var/lib/

As you can see, all the files are created there:

$ ls -lh /var/lib/|grep boinc
drwxr-xr-x 2 boinc boinc 4,0K mai  5 23:49 alternatives
lrwxrwxrwx 1 boinc boinc   12 mai 22  2018 boinc -> boinc-client
drwxr-xr-x 8 boinc boinc 4,0K mai  6 09:57 boinc-client
drwxr-xr-x 2 boinc boinc 4,0K mai  6 00:12 condor
drwxr-xr-x 2 boinc boinc 4,0K mai  6 00:28 cs
drwxr-xr-x 2 boinc boinc 4,0K mai  6 09:47 games
drwxr-xr-x 2 boinc boinc 4,0K mai  6 09:56 gssproxy
drwxr-xr-x 2 boinc boinc 4,0K mai  6 09:56 initramfs
drwxr-xr-x 2 boinc boinc 4,0K mai  6 09:56 machines
drwxr-xr-x 2 boinc boinc 4,0K mai  6 09:56 ntp
-rw-r--r-- 1 boinc boinc    0 mai  6 09:56 package-list
drwxr-xr-x 2 boinc boinc 4,0K mai  6 09:56 rpcbind
drwxr-xr-x 2 boinc boinc 4,0K mai  6 09:56 rpm
drwxr-xr-x 2 boinc boinc 4,0K mai  6 09:56 rpm-state
drwxr-xr-x 2 boinc boinc 4,0K mai  6 09:56 texmf

There are some parts missing at the mount point:

$ ls /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/3.7.2/var/singularity/mnt/session/
$ |
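The "Running command" line in the log above shows why /var ends up inside the container: the wrapper bind-mounts /cvmfs and /var because the BOINC slot directory lives under /var/lib/boinc-client. A small sketch for checking where your own client's working directory is; the service name and paths follow the Debian/Ubuntu packaging seen above and may differ on other distributions:

# Where does the packaged service run the client from?
systemctl cat boinc-client | grep -iE 'WorkingDirectory|ReadWritePaths'
# Resolve the usual data-dir symlink
readlink -f /var/lib/boinc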
Send message Joined: 23 Jul 05 Posts: 53 Credit: 2,707,793 RAC: 0 |
Looking into the errors, I found that the problem is the container using a wrong path:

Same problem here; I've been posting about it all night long here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5633&postid=45024#45024

So is a local installation of Singularity the only available solution out there? |
Send message Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,355 |
When you read the messages in the folders, you can find the answer! Singularity is downloaded from the ATLAS server when no local Singularity can be found! |
Send message Joined: 23 Jul 05 Posts: 53 Credit: 2,707,793 RAC: 0 |
When you read the messages in the folders, you can find the answer!

Yes, and that's exactly where the message is coming from... from the Singularity downloaded from the server, and apparently there is something wrong with the package, since it can't find some folder from the downloaded image.

@maeax, do you maybe want the task unit to prove it to you? Here it is: https://lhcathome.cern.ch/lhcathome/result.php?resultid=317300586

As you can see, Singularity is downloaded and probed as working fine, but there are still pieces missing, which echoes the messages posted earlier in this thread, for example this one: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5647&postid=44897

So since I'm not the only one, let's not pretend that the package is okay... OR some information is missing, then |
Send message Joined: 26 Oct 04 Posts: 6 Credit: 1,696,248 RAC: 0 |
I'm having issues with the native ATLAS app showing these same behaviors: error messages that a given directory under /var/lib isn't present and can't be created. I tried adding the singularity group and adding both my username and boinc to it. This is happening on two Linux boxes of mine... |
Send message Joined: 7 Jan 07 Posts: 41 Credit: 16,102,983 RAC: 17 |
Working workarounds can be found in these posts. |
Send message Joined: 17 Feb 17 Posts: 42 Credit: 2,589,736 RAC: 0 |
Having the same issue here; it has been a while. Steps run:

Installed CVMFS.
Created default.local.
sudo apt-get install singularity
sudo apt-get install squashfs-tools

Ubuntu 20.04. Plenty of RAM - set to run 1 ATLAS task, the machine has 16 GB of RAM. Initially I thought there was an error while changing the number of CPUs per task.

Task in question: https://lhcathome.cern.ch/lhcathome/result.php?resultid=323595231

Has anyone found a working fix for this? |
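For comparison, a minimal sketch of what /etc/cvmfs/default.local can look like for the "create default.local" step; the repository list matches the repositories probed in the logs earlier in the thread, while the cache size and proxy setting are assumptions to adapt to your own setup:

# /etc/cvmfs/default.local -- minimal sketch, adjust values to your machine
CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch
CVMFS_QUOTA_LIMIT=10000      # local cache size in MB
CVMFS_HTTP_PROXY=DIRECT      # fine for a single host; use a local squid if many cores run ATLAS

After editing, sudo cvmfs_config setup followed by cvmfs_config probe atlas.cern.ch should report OK, matching the "Probing ... OK" lines in the task logs.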
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
sudo apt-get install singularity

Even though it says "Singularity works", I haven't had that version of Singularity work in years, on Ubuntu 20.04.2 and earlier. This is what worked for me about six months ago, though it is a fairly safe bet that things have changed by now. But it might get you started. (First uninstall the version you have, with sudo apt remove singularity, though it probably won't find it anyway.)

First: install the dependencies: |
Send message Joined: 17 Feb 17 Posts: 42 Credit: 2,589,736 RAC: 0 |
sudo apt-get install singularity

First off - thank you so much. We're currently well above 10 minutes of CPU time, so it appears to be working. This was all very much above my head and I simply copied and pasted a lot of this, with substitutions of version numbers and directories. Hopefully nothing goes wrong in the future, as I am still very much new to Linux for the most part.

Hopefully this, along with a few other things, can be pinned. I'm finding a lot of things are very scattered - especially as they relate to native - and there are a lot of different instructions that are now out of date or do not work for this project.

For reference, this is what just worked for me. I still have a few questions, but that's for another topic.

sudo apt remove singularity
sudo apt autoremove (not sure if this is needed, but it didn't break anything, yet)

First: install the dependencies:

sudo apt-get update && sudo apt-get install -y \
    build-essential \
    libssl-dev \
    uuid-dev \
    libgpgme11-dev \
    squashfs-tools \
    libseccomp-dev \
    wget \
    pkg-config \
    git \
    cryptsetup

To correct broken packages (examples):

sudo apt install libseccomp2=2.4.3-1ubuntu1
sudo apt install libssl1.1=1.1.1f-1ubuntu2

Second: install Go:

sudo snap install go --classic

Result: go 1.16.6 from Michael Hudson-Doyle (mwhudson) installed

Check the version:

go version

go get -d github.com/hpcng/singularity

export VERSION=3.8.0 && \
    wget https://github.com/hpcng/singularity/releases/download/v${VERSION}/singularity-${VERSION}.tar.gz && \
    tar -xzf singularity-${VERSION}.tar.gz && \
    cd singularity-3.8.0

./mconfig && \
    make -C ./builddir && \
    sudo make -C ./builddir install

Good luck and thank you again. |
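After installing a local build like this, it may be worth running the same test the ATLAS wrapper performs (the exec command below is copied from the wrapper output earlier in the thread, just pointed at the locally installed binary); if it prints the hostname, the next task should pick up the local Singularity instead of the one from CVMFS:

# Confirm the locally built binary is on PATH and works against the ATLAS image
singularity --version
singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname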
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
You are quite welcome. Nothing is easy with native, especially singularity. It all depends on which version of Linux you have, and what libraries it contains. I am glad it worked this time. |
Send message Joined: 17 Feb 17 Posts: 42 Credit: 2,589,736 RAC: 0 |
You are quite welcome. Nothing is easy with native, especially singularity.

As am I. I usually install the standard Ubuntu 20.04 along with any updates. I know I had major issues with Debian. |
Send message Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0 |
Hi all, I recently switched one of my CentOS 7 machines to use Singularity from CVMFS and got the same problem described in this thread:

FATAL: container creation failed: hook function for tag prelayer returns error: failed to create /var/lib/condor directory: mkdir /var/lib/condor: permission denied

I am running the boinc client via systemd in /var/lib/boinc, and it seems the issue comes from mounting the whole /var directory into the container. I worked around the problem by moving the BOINC data dir to /home/boinc as described here: https://boinc.berkeley.edu/forum_thread.php?id=13919

However, those instructions didn't work exactly out of the box: I had to set ProtectHome=false to allow access to /home and add the new dir to ReadWritePaths. My working settings are:

# /etc/systemd/system/boinc-client.service.d/override.conf
[Service]
ProtectHome=false
BindPaths=/home/boinc
WorkingDirectory=/home/boinc
ReadWritePaths=-/etc/boinc-client /home/boinc |
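For anyone repeating this, a hedged sketch of the surrounding steps on a systemd host; the rsync/chown part is an assumption about how to move an existing data dir, not something taken from the linked instructions, so double-check paths and the boinc user name before running it:

sudo systemctl stop boinc-client
sudo mkdir -p /home/boinc
sudo rsync -a /var/lib/boinc/ /home/boinc/      # copy the existing data dir
sudo chown -R boinc:boinc /home/boinc
sudo systemctl edit boinc-client                # paste the override shown above
sudo systemctl daemon-reload
sudo systemctl restart boinc-client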
Send message Joined: 12 Jul 11 Posts: 95 Credit: 1,129,876 RAC: 0 |
Jesus. The wonderful world of Linux :D We are quite far from the "initial BOINC spirit" here: anybody can install, subscribe to a project, and it works. I fully agree it is necessary and great to have people involved in such tests (I am a humble one of them), but what's the point of all this if it can't be packaged in a "simple and workable" way for BOINC users? |