Message boards : ATLAS application : ATLAS native version 2.72

David Cameron
Project administrator
Project developer
Project scientist
Message 40093 - Posted: 9 Oct 2019, 8:33:26 UTC

We just released a new version of ATLAS native, which updates the image used by singularity to CentOS 7. This is the only difference from 2.71, and it was done to stay in sync with the recent update of the vbox version to CentOS 7.
ID: 40093

Jim1348
Message 40105 - Posted: 10 Oct 2019, 1:38:38 UTC - in response to Message 40093.  

The 2.72 tasks are all failing for me after 10 minutes, with
195 (0x000000C3) EXIT_CHILD_FAILED.
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10609230&offset=0&show_names=0&state=6&appid=14

I had no problems with 2.71.
ID: 40105

maeax
Message 40106 - Posted: 10 Oct 2019, 2:38:29 UTC

2019-10-10 04:03:26,698: Checking Singularity...
2019-10-10 04:03:26,712: Singularity is installed, version 2.6.1-dist
2019-10-10 04:03:26,712: Testing the function of Singularity...
2019-10-10 04:03:26,712: Checking singularity with cmd:singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname
2019-10-10 04:03:26,741: Singularity isnt working: ERROR : Unknown image format/type: /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img
ABORT : Retval = 255
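
The log shows how the wrapper tests Singularity: it runs `hostname` inside the CentOS 7 image and treats a non-zero return value as a failure. A minimal Python sketch for repeating that test by hand, assuming the image path from the log. The function names `build_check_cmd` and `singularity_works` are made up for illustration and are not part of the ATLAS wrapper:

```python
import subprocess

# Image path copied from the log above.
IMAGE = "/cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img"


def build_check_cmd(singularity="singularity", image=IMAGE):
    """Return the argv list for the 'hostname' test seen in the log."""
    return [singularity, "exec", "-B", "/cvmfs", image, "hostname"]


def singularity_works(singularity="singularity", image=IMAGE):
    """Run the test command; True only if it exits with status 0."""
    try:
        result = subprocess.run(
            build_check_cmd(singularity, image),
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
        )
    except OSError:
        # The binary is missing or cannot be executed.
        return False
    return result.returncode == 0
```

Passing a different `singularity` path lets the same test be pointed at another binary instead of the one on the PATH.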
ID: 40106

gyllic
Message 40108 - Posted: 10 Oct 2019, 4:45:38 UTC - in response to Message 40105.  
Last modified: 10 Oct 2019, 5:11:31 UTC

Your singularity version (2.6.1) is very old. Maybe the new image needs a more current version in order to work. You should update your singularity version.

For me, native version 2.72 looks like it is working without a problem:

2019-10-09 20:09:23,818: singularity image is /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img
2019-10-09 20:09:23,819: sys.argv = ['run_atlas', '--nthreads', '2']
2019-10-09 20:09:23,820: THREADS=2
2019-10-09 20:09:23,821: Checking for CVMFS
2019-10-09 20:09:39,404: CVMFS is installed
2019-10-09 20:09:39,404: Checking Singularity...
2019-10-09 20:09:40,399: Singularity is installed, version singularity version 3.4.1+324-g54b182afd
2019-10-09 20:09:40,399: Testing the function of Singularity...
2019-10-09 20:09:40,399: Checking singularity with cmd:singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname
2019-10-09 20:10:03,571: Singularity Works...
2019-10-09 20:10:03,572: copy /home/boinc/boinc1/slots/0/shared/ATLAS.root_0
2019-10-09 20:10:03,872: copy /home/boinc/boinc1/slots/0/shared/RTE.tar.gz
2019-10-09 20:10:03,873: copy /home/boinc/boinc1/slots/0/shared/input.tar.gz
2019-10-09 20:10:03,873: copy /home/boinc/boinc1/slots/0/shared/start_atlas.sh
2019-10-09 20:10:03,873: export ATHENA_PROC_NUMBER=2;
2019-10-09 20:10:04,243: start atlas job with PandaID=4503174146
2019-10-09 20:10:04,243: cmd = singularity exec --pwd /home/boinc/boinc1/slots/0 -B /cvmfs,/home /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img sh start_atlas.sh > runtime_log 2> runtime_log.err

The first 2.72 task is running for over 10 hours now and everything looks fine.
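
To compare hosts quickly, the version string can be parsed out of the `singularity --version` output. A rough Python sketch follows. The threshold of major version 3 is only an assumption drawn from the two versions reported in this thread (2.6.1 fails, 3.4.1 works), not a documented requirement, and the function names are hypothetical:

```python
import re


def singularity_major(version_text):
    """Extract the major version from `singularity --version` output.

    Handles both forms seen in this thread, e.g. "2.6.1-dist" and
    "singularity version 3.4.1+324-g54b182afd". Returns None if no
    dotted version number is found.
    """
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", version_text)
    return int(match.group(1)) if match else None


def new_enough(version_text, minimum_major=3):
    """True if the major version is at least minimum_major (assumed: 3)."""
    major = singularity_major(version_text)
    return major is not None and major >= minimum_major
```

With the strings from the logs above, `new_enough("2.6.1-dist")` is false and `new_enough("singularity version 3.4.1+324-g54b182afd")` is true.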
ID: 40108

computezrmle
Message 40110 - Posted: 10 Oct 2019, 7:34:38 UTC - in response to Message 40108.  

Your singularity version (2.6.1) is very old. Maybe the new image needs a more current version in order to work.

Indeed.
Looks like v2.72 requires a more recent singularity than 2.6.1 (which is also the version shipped with openSUSE Leap 15.1).

To solve the problems on my hosts I removed the local singularity and fresh tasks now use singularity from CVMFS.
ID: 40110

Jim1348
Message 40112 - Posted: 10 Oct 2019, 7:56:34 UTC - in response to Message 40110.  

To solve the problems on my hosts I removed the local singularity and fresh tasks now use singularity from CVMFS.

What commands did you use to remove it? I tried that on two other machines, and ended up with nothing that worked.
"singularity --version" then showed nothing.
ID: 40112

computezrmle
Message 40113 - Posted: 10 Oct 2019, 8:21:16 UTC - in response to Message 40112.  
Last modified: 10 Oct 2019, 8:29:34 UTC

I started long ago with a manual installation, which may have become mixed with an installation from the distribution package, so I removed the files and directories manually.

On my hosts the following places were affected:
/etc/singularity
/usr/bin/singularity
/usr/local/bin/singularity
/usr/local/bin/run-singularity
/usr/local/lib/singularity

Renaming all of them should do the trick.
Then request a fresh ATLAS task and check the stderr.txt for the following lines (my shell reports in German; "Kommando nicht gefunden" means "command not found"):
2019-10-10 09:59:42,014: Singularity seems to be installed but not working: sh: singularity: Kommando nicht gefunden.
2019-10-10 09:59:42,014: Will use version from CVMFS
2019-10-10 09:59:42,014: Testing the function of Singularity...
2019-10-10 09:59:42,014: Checking singularity with cmd:/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img hostname
2019-10-10 09:59:55,072: Singularity Works...
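
If you have several hosts, scanning stderr.txt for those two messages can be scripted. A small Python sketch, assuming the wording of the messages stays exactly as in the log lines above. The function name is made up for this example:

```python
def used_cvmfs_singularity(stderr_text):
    """True if the log shows the CVMFS fallback was chosen and then worked.

    Looks for "Will use version from CVMFS" and, at or after that line,
    "Singularity Works...". Both strings are copied from the log above.
    """
    lines = stderr_text.splitlines()
    fallback_at = next(
        (i for i, line in enumerate(lines)
         if "Will use version from CVMFS" in line),
        None,
    )
    if fallback_at is None:
        return False
    return any("Singularity Works..." in line for line in lines[fallback_at:])
```

Call it with the contents of a task's stderr.txt. A log that never mentions the fallback, or where the test fails afterwards, gives `False`.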


<edit>
forgot to mention
/usr/local/lib64/singularity
</edit>
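
The renaming itself could be done along these lines. This is only a sketch in Python, not what I actually ran: the `.disabled` suffix and the function name are arbitrary choices, the default is a dry run that changes nothing, and a real run on these paths needs root:

```python
import os

# Paths listed in this post, including the one added in the edit.
SINGULARITY_PATHS = [
    "/etc/singularity",
    "/usr/bin/singularity",
    "/usr/local/bin/singularity",
    "/usr/local/bin/run-singularity",
    "/usr/local/lib/singularity",
    "/usr/local/lib64/singularity",
]


def disable_local_singularity(paths=SINGULARITY_PATHS, suffix=".disabled",
                              dry_run=True):
    """Rename each existing path to path + suffix.

    Returns the list of (old, new) pairs. With dry_run=True nothing is
    renamed; the list only shows what would happen.
    """
    renamed = []
    for path in paths:
        if not os.path.lexists(path):
            continue  # nothing installed at this location
        target = path + suffix
        if os.path.lexists(target):
            raise FileExistsError(target)  # refuse to overwrite a backup
        if not dry_run:
            os.rename(path, target)
        renamed.append((path, target))
    return renamed
```

Renaming rather than deleting keeps the old installation around, so it can be restored by renaming the paths back.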
ID: 40113

David Cameron
Project administrator
Project developer
Project scientist
Message 40114 - Posted: 10 Oct 2019, 9:42:25 UTC

Looking around the web, it seems that singularity does not guarantee backwards compatibility, so an image created with one version is not guaranteed to work with earlier versions. I think the CentOS 7 image was built with version 3, so it will probably not work with 2.x.

If you have an old 2.x version I would recommend uninstalling it as described above so that the one from CVMFS is used, since this should be guaranteed to work with ATLAS images. If you do this please remember to check that user namespaces are enabled as described here for the theory native app.
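
As a quick sanity check, the kernel limit on user namespaces can be read directly. The sketch below is in Python and assumes the limit is exposed at `/proc/sys/user/max_user_namespaces`, which is the usual place on recent Linux kernels but is not something stated in this thread. Some distributions have additional switches, so the linked instructions remain the reference. The function names are made up for this example:

```python
from pathlib import Path

# Assumed location of the kernel limit on Linux; not taken from this thread.
MAX_USER_NS = Path("/proc/sys/user/max_user_namespaces")


def namespaces_enabled(limit_text):
    """True if the limit read from the sysctl file is a positive integer."""
    try:
        return int(limit_text.strip()) > 0
    except ValueError:
        return False


def check_user_namespaces(path=MAX_USER_NS):
    """Read the limit from `path`; a missing file is reported as disabled."""
    try:
        return namespaces_enabled(path.read_text())
    except OSError:
        return False
```

A value of 0 in that file means no user namespaces can be created, which is the case that needs fixing.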
ID: 40114

Jim1348
Message 40116 - Posted: 10 Oct 2019, 12:52:47 UTC - in response to Message 40114.  
Last modified: 10 Oct 2019, 12:54:22 UTC

OK, this is a bit of a long story, but after removing various directories I never could get the CVMFS version to work, as far as I can see.
So I used the one-line install from https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5151&postid=40040, and at least got singularity version 3.2.1-1.
But even that did not appear to run, though I may not have waited long enough; it was slow to start.

So I just did "sudo apt install singularity", and got a large bunch of files I had never seen before, which are probably not necessary.
But it now works, so that will do.

Thanks to all.
ID: 40116

Jim1348
Message 40118 - Posted: 10 Oct 2019, 19:52:24 UTC - in response to Message 40116.  

While 2.72 is running fine on my Ubuntu 16.04 machine now, the same cannot be said for my Ubuntu 18.04 machines.
The work units basically do not run, but error out in 10 minutes.

i7-8700:https://lhcathome.cern.ch/lhcathome/results.php?hostid=10612434
Note that the stderr output says that "Singularity is installed, version singularity version 3.2.1-1".

i7-9700:https://lhcathome.cern.ch/lhcathome/results.php?hostid=10607999
Note that the stderr output says that "Singularity is not installed, using version from CVMFS".

Also note that there is a high error rate on all of these work units on other machines too, but I see a few that work.
Something is amiss.
ID: 40118

Gunde
Message 40129 - Posted: 11 Oct 2019, 17:23:53 UTC - in response to Message 40118.  
Last modified: 11 Oct 2019, 17:24:45 UTC

Try removing singularity and using the built-in one instead. It worked for me with 18.10.

$ sudo rm -rf \
    /usr/local/libexec/singularity \
    /usr/local/var/singularity \
    /usr/local/etc/singularity \
    /usr/local/bin/singularity \
    /usr/local/bin/run-singularity \
    /usr/local/etc/bash_completion.d/singularity


https://sylabs.io/guides/3.0/user-guide/installation.html#remove-an-old-version
ID: 40129

Jim1348
Message 40131 - Posted: 12 Oct 2019, 0:02:34 UTC - in response to Message 40129.  

At this point, switching between one singularity and another probably won't save it. The problem seems to be at a deeper level.
I will wipe out one of the machines and start over. That might fix it. Thanks
ID: 40131

maeax
Message 40166 - Posted: 16 Oct 2019, 9:52:22 UTC

LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.6.1810 (Core)
Release: 7.6.1810
Boinc 7.16.1

2019-10-15 11:19:31,539: start atlas job with PandaID=4505354066
2019-10-15 11:19:31,539: cmd = singularity exec --pwd /var/lib/boinc/slots/0 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img sh start_atlas.sh > runtime_log 2> runtime_log.err
2019-10-16 11:31:58,538: running cmd return value is 0
2019-10-16 11:31:58,540: Moving ./HITS.19056301._009925.pool.root.1 to shared/HITS.pool.root.1
2019-10-16 11:31:58,540: HITS result file:
2019-10-16 11:31:58,558: -rw-------. 1 boinc boinc 243621250 16. Okt 11:30 shared/HITS.pool.root.1

Is this the first native task on CentOS 7.6 with CentOS inside the container for an ATLAS task?
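
The run time of the task can be worked out from the timestamps at the start of the log lines. A short Python sketch, assuming every line begins with a timestamp in the `YYYY-MM-DD HH:MM:SS,mmm` form shown above. The function names are hypothetical:

```python
from datetime import datetime


def log_time(line):
    """Parse the leading 'YYYY-MM-DD HH:MM:SS,mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S,%f")


def elapsed(start_line, end_line):
    """Return the time between two log lines as a datetime.timedelta."""
    return log_time(end_line) - log_time(start_line)
```

Applied to the "cmd = singularity exec ..." line and the "running cmd return value is 0" line above, this gives a little over 24 hours.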
ID: 40166

Henry Nebrensky
Message 40175 - Posted: 17 Oct 2019, 8:41:13 UTC - in response to Message 40166.  

2019-10-15 11:19:31,539: cmd = singularity exec --pwd /var/lib/boinc/slots/0 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img sh start_atlas.sh > runtime_log 2> runtime_log.err
2019-10-16 11:31:58,540: Moving ./HITS.19056301._009925.pool.root.1 to shared/HITS.pool.root.1

Is this the first native task on CentOS 7.6 with CentOS inside the container for an ATLAS task?
Operating System: Linux CentOS Linux CentOS Linux 7 (Core) [3.10.0-1062.1.2.el7.x86_64|libc 2.17 (GNU libc)]:

2019-10-14 18:50:34,710: cmd = singularity exec --pwd /var/lib/boinc/slots/0 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/images/singularity/x86_64-centos7.img sh start_atlas.sh > runtime_log 2> runtime_log.err
2019-10-15 00:34:57,894: running cmd return value is 0
2019-10-15 00:34:57,895: Moving ./HITS.19056367._007754.pool.root.1 to shared/HITS.pool.root.1

:)
ID: 40175

maeax
Message 40177 - Posted: 17 Oct 2019, 9:12:14 UTC

Henry,
you are one step ahead and already have one PC with CentOS 8. ;-)
ID: 40177

Henry Nebrensky
Message 40178 - Posted: 17 Oct 2019, 9:28:21 UTC - in response to Message 40177.  

Yes, but Vbox only: I'm still waiting for the official CVMFS release for CentOS 8 - I'm not brave enough to get into building/installing it myself! :(
ID: 40178

©2019 CERN