Message boards : ATLAS application : Singularity errors

Dark Angel
Joined: 7 Aug 11
Posts: 93
Credit: 23,511,565
RAC: 21,760
Message 46716 - Posted: 3 May 2022, 9:29:38 UTC

One of my four machines, which has a completely standard CVMFS and Singularity installation and had been working fine for the last few days, started throwing errors after a reboot.

This machine: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10795724

Nothing else changed on the machine: no updates, no configuration changes, just the reboot, which all my machines had at the same time due to a power outage.
Singularity simply will not start.
singularity --version

returns
2.6.1-dist

but the work unit log shows:

<core_client_version>7.14.2</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
18:45:39 (14457): wrapper (7.7.26015): starting
18:45:39 (14457): wrapper: running run_atlas (--nthreads 7)
[2022-05-03 18:45:39] Arguments: --nthreads 7
[2022-05-03 18:45:39] Threads: 7
[2022-05-03 18:45:39] Checking for CVMFS
[2022-05-03 18:45:40] Probing /cvmfs/atlas.cern.ch... OK
[2022-05-03 18:45:41] Probing /cvmfs/atlas-condb.cern.ch... OK
[2022-05-03 18:45:41] Running cvmfs_config stat atlas.cern.ch
[2022-05-03 18:45:41] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2022-05-03 18:45:41] 2.9.2.0 14677 0 22940 103863 2 1 3688109 4096000 0 130560 1 0 100.000 0 0 http://s1ral-cvmfs.openhtc.io/cvmfs/atlas.cern.ch http://192.168.1.3:3128 1
[2022-05-03 18:45:41] CVMFS is ok
[2022-05-03 18:45:41] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2022-05-03 18:45:41] Checking for singularity binary...
[2022-05-03 18:45:41] Using singularity found in PATH at /usr/bin/singularity
[2022-05-03 18:45:41] Running /usr/bin/singularity --version
[2022-05-03 18:45:41] 2.6.1-dist
[2022-05-03 18:45:41] Checking singularity works with /usr/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname
[2022-05-03 18:45:41] Singularity isnt working: /.singularity.d/actions/exec: line 5: /.singularity.d/env/94-appsbase.sh: Input/output error
18:55:41 (14457): run_atlas exited; CPU time 0.108469
18:55:41 (14457): app exit status: 0x1
18:55:41 (14457): called boinc_finish(195)

</stderr_txt>
]]>


Two other machines with identical configurations are working just fine.
I have put it on a separate profile to run vbox tasks instead, but something is clearly wrong.
Any help would be appreciated.
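
For anyone reading the log above: the two `cvmfs_config stat` lines are a header row and a value row, and they are easier to read once paired up. The sketch below does that with a small awk helper (the name `stat_pairs` is made up for this example) using the exact lines from the log. It shows NOIOERR=1 right after the reboot. I take NOIOERR to be the client's I/O error counter; that is my reading of the column name, not something stated in this thread.

```shell
#!/usr/bin/env bash
# Pair each column name from `cvmfs_config stat` with its value.
# Expects two lines on stdin: the header line, then the value line.
stat_pairs() {
    awk 'NR == 1 { for (i = 1; i <= NF; i++) name[i] = $i; next }
         NR == 2 { for (i = 1; i <= NF; i++) printf "%s=%s\n", name[i], $i }'
}

# Header and values copied from the work unit log above (timestamps removed).
header='VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE'
values='2.9.2.0 14677 0 22940 103863 2 1 3688109 4096000 0 130560 1 0 100.000 0 0 http://s1ral-cvmfs.openhtc.io/cvmfs/atlas.cern.ch http://192.168.1.3:3128 1'

# Show only the columns that matter for an I/O error investigation.
printf '%s\n%s\n' "$header" "$values" | stat_pairs \
    | grep -E '^(NOIOERR|CACHEUSE\(K\)|CACHEMAX\(K\))='
```

On a live host the helper could presumably be fed from `cvmfs_config stat atlas.cern.ch` directly, assuming that command prints only the two lines seen in the log.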
ID: 46716
maeax
Joined: 2 May 07
Posts: 2189
Credit: 173,308,789
RAC: 66,579
Message 46717 - Posted: 3 May 2022, 9:39:04 UTC

There is information from David in the ATLAS thread about Singularity:
[2022-05-03 18:45:41] Checking singularity works with /usr/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname
[2022-05-03 18:45:41] Singularity isnt working: /.singularity.d/actions/exec: line 5: /.singularity.d/env/94-appsbase.sh: Input/output error
Maybe this version is too old.
ID: 46717
Dark Angel
Joined: 7 Aug 11
Posts: 93
Credit: 23,511,565
RAC: 21,760
Message 46718 - Posted: 3 May 2022, 9:41:23 UTC

As I said, it's still working fine with three other machines, only this one is having any issue.
ID: 46718
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2500
Credit: 248,498,755
RAC: 128,182
Message 46719 - Posted: 3 May 2022, 10:08:48 UTC

The last time David Cameron mentioned the Singularity version was here:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5647&postid=44739
Hence v3.7.2 should be considered to be the minimum requirement.

Also be aware that "Singularity" has been renamed and is now available as "Apptainer".
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5817
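
A minimal sketch of how the two points above could be checked in a script: compare whatever `singularity --version` prints against the 3.7.2 minimum, and accept any Apptainer release outright because its version numbers restart at 1.x after the rename. The function names are invented for this example, the "any Apptainer is fine" rule is an assumption rather than a project statement, and `sort -V` requires GNU coreutils.

```shell
#!/usr/bin/env bash
# Return 0 if dotted version $1 is greater than or equal to $2 (GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Classify the text printed by `singularity --version`.
# ASSUMPTION: any Apptainer release counts as new enough, because Apptainer
# is the renamed Singularity and restarted its numbering at 1.x.
check_runtime_version() {
    local output="$1" minimum="3.7.2" version
    case "$output" in
        *apptainer*) echo "ok: Apptainer ($output)"; return 0 ;;
    esac
    # Take the first dotted number, e.g. "2.6.1" from "2.6.1-dist".
    version="$(printf '%s\n' "$output" | grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1)"
    if [ -n "$version" ] && version_ge "$version" "$minimum"; then
        echo "ok: Singularity $version"
    else
        echo "too old or unrecognised: $output (want >= $minimum)"
        return 1
    fi
}

# The two version strings that appear in the logs in this thread.
check_runtime_version "2.6.1-dist" || true
check_runtime_version "apptainer version 1.0.1" || true
```

On a real host the argument would be `"$(singularity --version)"` instead of a literal string.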
ID: 46719
Dark Angel
Joined: 7 Aug 11
Posts: 93
Credit: 23,511,565
RAC: 21,760
Message 46720 - Posted: 3 May 2022, 10:13:43 UTC - in response to Message 46719.  

Will this still work if I wipe the local Singularity, reboot, and let BOINC download it via CVMFS, or do I have to do something else?
I've been very busy and not overly well lately so if you could keep the instructions to a simple script I'd really appreciate it.
ID: 46720
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2500
Credit: 248,498,755
RAC: 128,182
Message 46721 - Posted: 3 May 2022, 10:26:20 UTC - in response to Message 46720.  

Since you are using Mint and Ubuntu, the recommended way would be to install the most recent Singularity/Apptainer release (>= 3.x) from your distribution's maintainers.
ID: 46721
Dark Angel
Joined: 7 Aug 11
Posts: 93
Credit: 23,511,565
RAC: 21,760
Message 46722 - Posted: 3 May 2022, 10:33:42 UTC - in response to Message 46721.  

Since you are using Mint and Ubuntu, the recommended way would be to install the most recent Singularity/Apptainer release (>= 3.x) from your distribution's maintainers.

There isn't one.
ID: 46722
Dark Angel
Joined: 7 Aug 11
Posts: 93
Credit: 23,511,565
RAC: 21,760
Message 46723 - Posted: 3 May 2022, 11:03:15 UTC

I have installed this on two machines, one that was completing ATLAS units successfully using Singularity and the one that was failing:

https://github.com/apptainer/apptainer/releases/download/v1.0.1/apptainer_1.0.1_amd64.deb

After rebooting both, the one that was working fine is still working fine, while the other is now returning invalid work after 30-90 seconds per unit.
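
For reference, the install described above could be scripted roughly as follows. This is a sketch, not a tested procedure: the URL is the one from this post, while the `/tmp` download location, the `run` helper and the dry-run default are choices made for this example. With `DRY_RUN` unset it only prints the commands.

```shell
#!/usr/bin/env bash
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# URL taken from the post above; the local file name is derived from it.
deb_url="https://github.com/apptainer/apptainer/releases/download/v1.0.1/apptainer_1.0.1_amd64.deb"
deb_file="/tmp/${deb_url##*/}"

install_apptainer_deb() {
    run wget -O "$deb_file" "$deb_url"
    run sudo dpkg -i "$deb_file"
    # Pull in any dependencies that dpkg reported as missing.
    run sudo apt-get install -f
    run apptainer --version
}

install_apptainer_deb
```

Running it as `DRY_RUN=0 bash script.sh` would execute the commands for real, so it is worth reading the printed plan first.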
ID: 46723
Dark Angel
Joined: 7 Aug 11
Posts: 93
Credit: 23,511,565
RAC: 21,760
Message 46724 - Posted: 3 May 2022, 11:05:22 UTC

Stderr from a failed unit:


<core_client_version>7.14.2</core_client_version>
<![CDATA[
<stderr_txt>
20:43:53 (24546): wrapper (7.7.26015): starting
20:43:53 (24546): wrapper: running run_atlas (--nthreads 7)
[2022-05-03 20:43:53] Arguments: --nthreads 7
[2022-05-03 20:43:53] Threads: 7
[2022-05-03 20:43:53] Checking for CVMFS
[2022-05-03 20:43:53] Probing /cvmfs/atlas.cern.ch... OK
[2022-05-03 20:43:53] Probing /cvmfs/atlas-condb.cern.ch... OK
[2022-05-03 20:43:53] Running cvmfs_config stat atlas.cern.ch
[2022-05-03 20:43:53] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2022-05-03 20:43:53] 2.9.2.0 1777 2 34460 103863 0 21 3779468 4096000 65 130560 15 67983 99.983 0 0 http://s1ral-cvmfs.openhtc.io/cvmfs/atlas.cern.ch http://192.168.1.3:3128 1
[2022-05-03 20:43:53] CVMFS is ok
[2022-05-03 20:43:53] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2022-05-03 20:43:53] Checking for singularity binary...
[2022-05-03 20:43:53] Using singularity found in PATH at /usr/bin/singularity
[2022-05-03 20:43:53] Running /usr/bin/singularity --version
[2022-05-03 20:43:53] apptainer version 1.0.1
[2022-05-03 20:43:53] Checking singularity works with /usr/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname
[2022-05-03 20:43:54] source: open /.singularity.d/env/94-appsbase.sh: input/output error Orac
[2022-05-03 20:43:54] Singularity works
[2022-05-03 20:43:54] Set ATHENA_PROC_NUMBER=7
[2022-05-03 20:43:54] Starting ATLAS job with PandaID=5435646183
[2022-05-03 20:43:54] Running command: /usr/bin/singularity exec --pwd /var/lib/boinc-client/slots/1 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 sh start_atlas.sh
[2022-05-03 20:44:17] *** The last 200 lines of the pilot log: ***
[2022-05-03 20:44:17] ALRB_cvmfs_repo=/cvmfs/atlas.cern.ch/repo
[2022-05-03 20:44:17] ALRB_cvmfs_sft_repo=/cvmfs/sft.cern.ch/lcg
[2022-05-03 20:44:17] ALRB_cvmfs_sftnight_repo=/cvmfs/sft-nightlies.cern.ch/lcg
[2022-05-03 20:44:17] ALRB_cvmfs_unpacked_repo=/cvmfs/unpacked.cern.ch
[2022-05-03 20:44:17] ALRB_envPython=python3
[2022-05-03 20:44:17] ALRB_gridType=emi
[2022-05-03 20:44:17] ALRB_infoProc=model name : AMD Ryzen 7 2700X Eight-Core Processor
[2022-05-03 20:44:17] ALRB_infoUname=Linux Orac 5.13.0-40-generic #45~20.04.1-Ubuntu SMP Mon Apr 4 09:38:31 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
[2022-05-03 20:44:17] ALRB_motdExclusions=
[2022-05-03 20:44:17] ALRB_noGridMW=NO
[2022-05-03 20:44:17] ALRB_printHelpMain=
[2022-05-03 20:44:17] ALRB_requestedVersions=davix:0.8.1-x86_64-centos7 xrootd:5.4.2-x86_64-centos7 rucio:1.28.1 emi:4.0.2-1_200423.fix4a python:3.8.13-fix1-x86_64-centos7 rcsetup:00-04-18 acm:0.1.28 asetup:V02-00-39
[2022-05-03 20:44:17] ALRB_sedExclusions=
[2022-05-03 20:44:17] ALRB_testPath=,,,,
[2022-05-03 20:44:17] ALRB_tmpScratch=/tmp/boinc/.alrb
[2022-05-03 20:44:17] ALRB_useGridSW=emi
[2022-05-03 20:44:17] ALRB_userMenuFmtSkip=YES
[2022-05-03 20:44:17] AMI_envPython=python3
[2022-05-03 20:44:17] APPTAINER_APPNAME=
[2022-05-03 20:44:17] APPTAINER_BIND=/cvmfs,/var
[2022-05-03 20:44:17] APPTAINER_COMMAND=exec
[2022-05-03 20:44:17] APPTAINER_CONTAINER=/cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2022-05-03 20:44:17] APPTAINER_ENVIRONMENT=/.singularity.d/env/91-environment.sh
[2022-05-03 20:44:17] APPTAINER_NAME=x86_64-centos7
[2022-05-03 20:44:17] ATHENA_PROC_NUMBER=7
[2022-05-03 20:44:17] ATLAS_LOCAL_ACM_VERSION=0.1.28
[2022-05-03 20:44:17] ATLAS_LOCAL_AREA=/var/lib/boinc-client/slots/1/
[2022-05-03 20:44:17] ATLAS_LOCAL_ASETUP_PATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/AtlasSetup/V02-00-39
[2022-05-03 20:44:17] ATLAS_LOCAL_ASETUP_VERSION=V02-00-39
[2022-05-03 20:44:17] ATLAS_LOCAL_DAVIX_PATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/davix/0.8.1-x86_64-centos7
[2022-05-03 20:44:17] ATLAS_LOCAL_DAVIX_VERSION=0.8.1-x86_64-centos7
[2022-05-03 20:44:17] ATLAS_LOCAL_EMI_VERSION=4.0.2-1_200423.fix4a
[2022-05-03 20:44:17] ATLAS_LOCAL_PYTHON_VERSION=3.8.13-fix1-x86_64-centos7
[2022-05-03 20:44:17] ATLAS_LOCAL_RCSETUP_PATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/rcSetup/00-04-18
[2022-05-03 20:44:17] ATLAS_LOCAL_RCSETUP_VERSION=00-04-18
[2022-05-03 20:44:17] ATLAS_LOCAL_ROOT=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64
[2022-05-03 20:44:17] ATLAS_LOCAL_ROOT_ARCH=x86_64
[2022-05-03 20:44:17] ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
[2022-05-03 20:44:17] ATLAS_LOCAL_ROOT_PACOPT=
[2022-05-03 20:44:17] ATLAS_LOCAL_RUCIOCLIENTS_VERSION=1.28.1
[2022-05-03 20:44:17] ATLAS_LOCAL_SETUP_OPTIONS=--quiet -3
[2022-05-03 20:44:17] ATLAS_LOCAL_XROOTD_VERSION=5.4.2-x86_64-centos7
[2022-05-03 20:44:17] ATLAS_POOLCOND_PATH=/cvmfs/atlas.cern.ch/repo/conditions
[2022-05-03 20:44:17] AtlasSetup=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/AtlasSetup/V02-00-39/AtlasSetup
[2022-05-03 20:44:17] AtlasSetupSite=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/AtlasSetup/.config/.asetup.site
[2022-05-03 20:44:17] AtlasSetupSiteCMake=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/AtlasSetup/.configCMake/.asetup.site
[2022-05-03 20:44:17] BDII_LIST=lcg-bdii.cern.ch:2170
[2022-05-03 20:44:17] BOINC_APP=ATLAS
[2022-05-03 20:44:17] CCTOOLS_PATH=/cvmfs/atlas.cern.ch/repo/sw/cctools/3.0.1
[2022-05-03 20:44:17] CMTUSERCONTEXT=/cvmfs/atlas.cern.ch/repo/tools/slc6/cmt
[2022-05-03 20:44:17] CVMFSBASE=/cvmfs
[2022-05-03 20:44:17] DQ2_LOCAL_SITE_ID=ROAMING
[2022-05-03 20:44:17] DYLD_LIBRARY_PATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/xrootd/5.4.2-x86_64-centos7/lib64
[2022-05-03 20:44:17] EMI_MINBUILDVER_GCC=gcc48
[2022-05-03 20:44:17] EMI_MINBUILDVER_PYTHON=2.7.5
[2022-05-03 20:44:17] EMI_PYTHONBIN=python3
[2022-05-03 20:44:17] EMI_TARBALL_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a
[2022-05-03 20:44:17] EMI_UI_CONF=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a
[2022-05-03 20:44:17] FRONTIER_LOG_LEVEL=warning
[2022-05-03 20:44:17] FRONTIER_SERVER=(serverurl=http://atlascern-frontier.openhtc.io:8080/atlr)(serverurl=http://atlasfrontier-ai.cern.ch:8000/atlr)(serverurl=http://ccfrontier.in2p3.fr:23128/ccin2p3-AtlasFrontier)(proxyu
[2022-05-03 20:44:17] GFAL_CONFIG_DIR=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/etc/gfal2.d/
[2022-05-03 20:44:17] GFAL_PLUGIN_DIR=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/usr/lib64/gfal2-plugins/
[2022-05-03 20:44:17] GFAL_PYTHONBIN=python3
[2022-05-03 20:44:17] GLITE_LOCATION=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/usr
[2022-05-03 20:44:17] GLITE_LOCATION_VAR=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/var
[2022-05-03 20:44:17] GLITE_SD_PLUGIN=file,bdii
[2022-05-03 20:44:17] GLITE_SD_SERVICES_XML=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/etc/services.xml
[2022-05-03 20:44:17] GLOBUS_LOCATION=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/usr
[2022-05-03 20:44:17] GLOBUS_TCP_PORT_RANGE=20000,25000
[2022-05-03 20:44:17] GRID_ENV_LOCATION=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/usr/libexec
[2022-05-03 20:44:17] GRID_GLOBAL_JOBHOST=arc-boinc-03.cern.ch
[2022-05-03 20:44:17] GRID_GLOBAL_JOBID=Yr9KDmWe150n9Rq4apoT9bVoABFKDmABFKDmTPjSDmW9SKDmAU2N8n
[2022-05-03 20:44:17] GRID_GLOBAL_JOBINTERFACE=org.nordugrid.arcrest
[2022-05-03 20:44:17] GRID_GLOBAL_JOBURL=https://arc-boinc-03.cern.ch:443/arex/Yr9KDmWe150n9Rq4apoT9bVoABFKDmABFKDmTPjSDmW9SKDmAU2N8n
[2022-05-03 20:44:17] GTAG=http://aipanda403.cern.ch/data/jobs/2022-05-03/BOINC_MCORE/5435646183.out
[2022-05-03 20:44:17] GT_PROXY_MODE=rfc
[2022-05-03 20:44:17] HEP_OSLIBS_VER=7.3.1-2.el7.cern
[2022-05-03 20:44:17] HOME=/var/lib/boinc-client/slots/1
[2022-05-03 20:44:17] INVOCATION_ID=fe80c08fc7204b5eadb93019cc24b2b4
[2022-05-03 20:44:17] JOURNAL_STREAM=8:44076
[2022-05-03 20:44:17] LANG=en_AU.UTF-8
[2022-05-03 20:44:17] LANGUAGE=en_AU:en
[2022-05-03 20:44:17] LCG_GFAL_INFOSYS=lcg-bdii.cern.ch:2170
[2022-05-03 20:44:17] LCG_LOCATION=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/usr
[2022-05-03 20:44:17] LD_LIBRARY_PATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/davix/0.8.1-x86_64-centos7/lib64:/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/xrootd/5.4.2-x86_64-centos7/lib64:/cvmfs/atlas.
[2022-05-03 20:44:17] LIBRARY_PATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.8.13-fix1-x86_64-centos7/lib
[2022-05-03 20:44:17] LOGNAME=boinc
[2022-05-03 20:44:17] MANPATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/usr/share/man:/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/glite/share/man:/usr/loc
[2022-05-03 20:44:17] MYPROXY_SERVER=myproxy.cern.ch
[2022-05-03 20:44:17] OLDPWD=/var/lib/boinc-client/slots/1
[2022-05-03 20:44:17] PAC_ANCHOR=/cvmfs/atlas.cern.ch/repo/sw/cctools/latest
[2022-05-03 20:44:17] PANDA_JSID=harvester-CERN_central_ACTA
[2022-05-03 20:44:17] PANDA_PY3=1
[2022-05-03 20:44:17] PANDA_PYTHON_EXEC=python3
[2022-05-03 20:44:17] PATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/davix/0.8.1-x86_64-centos7/bin:/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/xrootd/5.4.2-x86_64-centos7/bin:/cvmfs/atlas.cern.ch/repo/AT
[2022-05-03 20:44:17] PERL5LIB=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/usr/lib64/perl5/vendor_perl:/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/usr/lib/p
[2022-05-03 20:44:17] PILOT_NOKILL=YES
[2022-05-03 20:44:17] PROJECT_ROOT=/boincdata/boinc/project/lhcathome
[2022-05-03 20:44:17] PROMPT_COMMAND=PS1="Singularity> "; unset PROMPT_COMMAND
[2022-05-03 20:44:17] PWD=/var/lib/boinc-client/slots/1
[2022-05-03 20:44:17] PYTHONPATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/rucio-clients/1.28.1/lib/python3.6/site-packages:/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/usr/lib64/p
[2022-05-03 20:44:17] RESULT_TEMPLATE=templates/ATLAS_OUT_2
[2022-05-03 20:44:17] RUCIO_ACCOUNT=pilot
[2022-05-03 20:44:17] RUCIO_AUTH_TYPE=x509_proxy
[2022-05-03 20:44:17] RUCIO_HOME=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/rucio-clients/1.28.1
[2022-05-03 20:44:17] RUCIO_LOCAL_SITE_ID=BOINC
[2022-05-03 20:44:17] RUCIO_PYTHONBIN=python3
[2022-05-03 20:44:17] RUCIO_PYTHONBINPATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.8.13-fix1-x86_64-centos7/bin/python3.8
[2022-05-03 20:44:17] RUNTIME_CONFIG_DIR=/var/lib/boinc-client/slots/1/
[2022-05-03 20:44:17] SHLVL=3
[2022-05-03 20:44:17] SINGULARITY_BIND=/cvmfs,/var
[2022-05-03 20:44:17] SINGULARITY_CONTAINER=/cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2022-05-03 20:44:17] SINGULARITY_ENVIRONMENT=/.singularity.d/env/91-environment.sh
[2022-05-03 20:44:17] SINGULARITY_NAME=x86_64-centos7
[2022-05-03 20:44:17] SRM_PATH=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/usr/share/srm
[2022-05-03 20:44:17] SSL_CERT_DIR=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/etc/grid-security-emi/certificates
[2022-05-03 20:44:17] UMD_REL_VER=4.1.3-1.el7.centos
[2022-05-03 20:44:17] USER=boinc
[2022-05-03 20:44:17] USER_PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin
[2022-05-03 20:44:17] VIEWSDIR=/cvmfs/sft.cern.ch/lcg/views
[2022-05-03 20:44:17] VOMS_USERCONF=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/etc/vomses
[2022-05-03 20:44:17] VO_ATLAS_AGIS_SITE=BOINC
[2022-05-03 20:44:17] VO_ATLAS_NIGHTLIES_DIR=/cvmfs/atlas-nightlies.cern.ch/repo/sw/nightlies
[2022-05-03 20:44:17] VO_ATLAS_SW_DIR=/cvmfs/atlas.cern.ch/repo/sw
[2022-05-03 20:44:17] WU_TEMPLATE=templates/ATLAS_IN_DYNAMIC
[2022-05-03 20:44:17] X509_CERT_DIR=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/etc/grid-security-emi/certificates
[2022-05-03 20:44:17] X509_USER_PROXY=/tmp/x509up_u124
[2022-05-03 20:44:17] X509_VOMSES=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/etc/vomses
[2022-05-03 20:44:17] X509_VOMS_DIR=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/emi/4.0.2-1_200423.fix4a/etc/grid-security/vomsdir
[2022-05-03 20:44:17] XRDSYS=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/xrootd/5.4.2-x86_64-centos7
[2022-05-03 20:44:17] _=/usr/bin/printenv
[2022-05-03 20:44:17] rcSetup=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/rcSetup/00-04-18
[2022-05-03 20:44:17]
[2022-05-03 20:44:17] 2022-05-03 10:44:05,284 [wrapper] Content of /var/lib/boinc-client/slots/1//setup.sh.local
[2022-05-03 20:44:17] export FRONTIER_SERVER="(serverurl=http://atlascern-frontier.openhtc.io:8080/atlr)(serverurl=http://atlasfrontier-ai.cern.ch:8000/atlr)(serverurl=http://ccfrontier.in2p3.fr:23128/ccin2p3-AtlasFrontier
[2022-05-03 20:44:17]
[2022-05-03 20:44:17] ---- Build pilot cmd ----
[2022-05-03 20:44:17] /cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.8.13-fix1-x86_64-centos7/bin/python3 pilot3/pilot.py -q BOINC_MCORE -i PR -j managed --pilot-user=ATLAS --pilot-user ATLAS -w generic --job
[2022-05-03 20:44:17]
[2022-05-03 20:44:17] ---- Ready to run pilot ----
[2022-05-03 20:44:17]
[2022-05-03 20:44:17] 2022-05-03 10:44:05,290 [wrapper] ==== pilot stdout BEGIN ====
[2022-05-03 20:44:17] 2022-05-03 10:44:05,437 | INFO | ****************************************
[2022-05-03 20:44:17] 2022-05-03 10:44:05,437 | INFO | *** PanDA Pilot version 3.3.0 (39) ***
[2022-05-03 20:44:17] 2022-05-03 10:44:05,437 | INFO | ****************************************
[2022-05-03 20:44:17] 2022-05-03 10:44:05,437 | INFO |
[2022-05-03 20:44:17] 2022-05-03 10:44:05,437 | INFO | architecture information:
[2022-05-03 20:44:17] 2022-05-03 10:44:05,481 | INFO |
[2022-05-03 20:44:17] LSB Version: :core-4.1-amd64:core-4.1-noarch
[2022-05-03 20:44:17] Distributor ID: CentOS
[2022-05-03 20:44:17] Description: CentOS Linux release 7.9.2009 (Core)
[2022-05-03 20:44:17] Release: 7.9.2009
[2022-05-03 20:44:17] Codename: Core
[2022-05-03 20:44:17] 2022-05-03 10:44:05,482 | INFO | ****************************************
[2022-05-03 20:44:17] 2022-05-03 10:44:05,482 | WARNING | Failed to initialize SSL context .. skipped, error: certfile should be a valid filesystem path
[2022-05-03 20:44:17] 2022-05-03 10:44:05,496 | WARNING | cache file=/var/lib/boinc-client/slots/1/cric_pandaqueues.json is not available: [Errno 2] No such file or directory: '/var/lib/boinc-client/slots/1/cric_pandaqueu
[2022-05-03 20:44:17] 2022-05-03 10:44:05,496 | INFO | [attempt=1/1] loading data from file=/cvmfs/atlas.cern.ch/repo/sw/local/etc/cric_pandaqueues.json
[2022-05-03 20:44:17] 2022-05-03 10:44:05,506 | INFO | saved data from "/cvmfs/atlas.cern.ch/repo/sw/local/etc/cric_pandaqueues.json" resource into file=/var/lib/boinc-client/slots/1/agis_schedconf.cvmfs.json, length=9
[2022-05-03 20:44:17] 2022-05-03 10:44:05,518 | INFO | queuedata: following keys will be overwritten by config values: {'maxwdir_broken': '14336 MB', 'es_stageout_gap': 601}
[2022-05-03 20:44:17] 2022-05-03 10:44:05,520 | INFO | [attempt=1/3] loading data from url=https://atlas-cric.cern.ch/cache/ddmendpoints.json
[2022-05-03 20:44:17] 2022-05-03 10:44:14,640 | INFO | saved data from "https://atlas-cric.cern.ch/cache/ddmendpoints.json" resource into file=/var/lib/boinc-client/slots/1/agis_ddmendpoints.agis.ALL.json, length=2231.
[2022-05-03 20:44:17] 2022-05-03 10:44:14,678 | INFO | pilot arguments: Namespace(abort_job=<threading.Event object at 0x7f467f467880>, allow_other_country=False, allow_same_user=True, cacert=None, capath=None, cleanup
[2022-05-03 20:44:17] Traceback (most recent call last):
[2022-05-03 20:44:17] File "pilot3/pilot.py", line 612, in <module>
[2022-05-03 20:44:17] trace = main()
[2022-05-03 20:44:17] File "pilot3/pilot.py", line 91, in main
[2022-05-03 20:44:17] workflow = __import__('pilot.workflow.%s' % args.workflow, globals(), locals(), [args.workflow], 0)
[2022-05-03 20:44:17] File "/var/lib/boinc-client/slots/1/pilot3/pilot/workflow/generic.py", line 29, in <module>
[2022-05-03 20:44:17] from pilot.control import job, payload, data, monitor
[2022-05-03 20:44:17] File "/var/lib/boinc-client/slots/1/pilot3/pilot/control/job.py", line 51, in <module>
[2022-05-03 20:44:17] from pilot.util.realtimelogger import cleanup as rtcleanup
[2022-05-03 20:44:17] File "/var/lib/boinc-client/slots/1/pilot3/pilot/util/realtimelogger.py", line 17, in <module>
[2022-05-03 20:44:17] from pilot.util.transport import HttpTransport
[2022-05-03 20:44:17] File "/var/lib/boinc-client/slots/1/pilot3/pilot/util/transport.py", line 16, in <module>
[2022-05-03 20:44:17] from requests.auth import HTTPBasicAuth
[2022-05-03 20:44:17] File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/rucio-clients/1.28.1/lib/python3.6/site-packages/requests/__init__.py", line 43, in <module>
[2022-05-03 20:44:17] import urllib3
[2022-05-03 20:44:17] File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/rucio-clients/1.28.1/lib/python3.6/site-packages/urllib3/__init__.py", line 11, in <module>
[2022-05-03 20:44:17] from . import exceptions
[2022-05-03 20:44:17] File "/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/rucio-clients/1.28.1/lib/python3.6/site-packages/urllib3/exceptions.py", line 3, in <module>
[2022-05-03 20:44:17] from .packages.six.moves.http_client import IncompleteRead as httplib_IncompleteRead
[2022-05-03 20:44:17] File "<frozen importlib._bootstrap>", line 991, in _find_and_load
[2022-05-03 20:44:17] File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
[2022-05-03 20:44:17] File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
[2022-05-03 20:44:17] File "<frozen importlib._bootstrap_external>", line 839, in exec_module
[2022-05-03 20:44:17] File "<frozen importlib._bootstrap_external>", line 975, in get_code
[2022-05-03 20:44:17] File "<frozen importlib._bootstrap_external>", line 1032, in get_data
[2022-05-03 20:44:17] OSError: [Errno 5] Input/output error: '/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/rucio-clients/1.28.1/lib/python3.6/site-packages/urllib3/packages/__init__.py'
[2022-05-03 20:44:17] 2022-05-03 10:44:17,693 [wrapper] ==== pilot stdout END ====
[2022-05-03 20:44:17] 2022-05-03 10:44:17,696 [wrapper] ==== wrapper stdout RESUME ====
[2022-05-03 20:44:17] 2022-05-03 10:44:17,698 [wrapper] pilotpid: 31187
[2022-05-03 20:44:17] 2022-05-03 10:44:17,701 [wrapper] Pilot exit status: 1
[2022-05-03 20:44:17] 2022-05-03 10:44:17,704 [wrapper] File not found: /var/lib/boinc-client/slots/1/pilot3/pandaIDs.out, no payload
[2022-05-03 20:44:17] 2022-05-03 10:44:17,706 [wrapper] File not found: /var/lib/boinc-client/slots/1/pilot3/pandaIDs.out, no payload
[2022-05-03 20:44:17] 2022-05-03 10:44:17,712 [wrapper] apfmon messages muted
[2022-05-03 20:44:17] 2022-05-03 10:44:17,714 [wrapper] Test setup, not cleaning
[2022-05-03 20:44:17] 2022-05-03 10:44:17,717 [wrapper] ==== wrapper stdout END ====
[2022-05-03 20:44:17] 2022-05-03 10:44:17,719 [wrapper] ==== wrapper stderr END ====
[2022-05-03 20:44:17] 2022-05-03 10:44:17,724 [wrapper] wrapperexiting ec=0, duration=20
[2022-05-03 20:44:17] 2022-05-03 10:44:17,727 [wrapper] apfmon messages muted
[2022-05-03 20:44:17] *** Listing of results directory ***
[2022-05-03 20:44:17] total 460712
[2022-05-03 20:44:17] drwx------ 4 boinc boinc 4096 Apr 29 00:00 pilot3
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 356725 May 3 16:36 pilot2.tar.gz
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 6384 May 3 16:57 queuedata.json
[2022-05-03 20:44:17] -rwx------ 1 boinc boinc 27117 May 3 16:57 runpilot2-wrapper.sh
[2022-05-03 20:44:17] -rw------- 1 boinc boinc 2869 May 3 16:57 pandaJobData.out
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 100 May 3 20:43 wrapper_26015_x86_64-pc-linux-gnu
[2022-05-03 20:44:17] -rwxr-xr-x 1 boinc boinc 6994 May 3 20:43 run_atlas
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 105 May 3 20:43 job.xml
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 9176 May 3 20:43 init_data.xml
[2022-05-03 20:44:17] drwxrwx--x 2 boinc boinc 4096 May 3 20:43 shared
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 0 May 3 20:43 boinc_lockfile
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 8192 May 3 20:43 boinc_mmap_file
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 16493 May 3 20:43 start_atlas.sh
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 368405 May 3 20:43 input.tar.gz
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 467533538 May 3 20:43 EVNT.28801559._000098.pool.root.1
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 2869 May 3 20:43 pandaJob.out
[2022-05-03 20:44:17] -rw------- 1 boinc boinc 461 May 3 20:43 setup.sh.local
[2022-05-03 20:44:17] -rw------- 1 boinc boinc 999083 May 3 20:44 agis_schedconf.cvmfs.json
[2022-05-03 20:44:17] -rw------- 1 boinc boinc 2285335 May 3 20:44 agis_ddmendpoints.agis.ALL.json
[2022-05-03 20:44:17] -rw------- 1 boinc boinc 3258 May 3 20:44 pilotlog.txt
[2022-05-03 20:44:17] -rw------- 1 boinc boinc 25884 May 3 20:44 log.28835167._004793.job.log.1
[2022-05-03 20:44:17] -rw------- 1 boinc boinc 542 May 3 20:44 Yr9KDmWe150n9Rq4apoT9bVoABFKDmABFKDmTPjSDmW9SKDmAU2N8n.diag
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 9109 May 3 20:44 runtime_log.err
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 554 May 3 20:44 runtime_log
[2022-05-03 20:44:17] -rw------- 1 boinc boinc 30720 May 3 20:44 result.tar.gz
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 19673 May 3 20:44 stderr.txt
[2022-05-03 20:44:17] No HITS result produced
[2022-05-03 20:44:17] *** Contents of shared directory: ***
[2022-05-03 20:44:17] total 456988
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 467533538 May 3 20:43 ATLAS.root_0
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 368405 May 3 20:43 input.tar.gz
[2022-05-03 20:44:17] -rw-r--r-- 1 boinc boinc 16493 May 3 20:43 start_atlas.sh
[2022-05-03 20:44:17] -rw------- 1 boinc boinc 30720 May 3 20:44 result.tar.gz
20:44:18 (24546): run_atlas exited; CPU time 4.118071
20:44:18 (24546): called boinc_finish(0)

</stderr_txt>
]]>
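
Logs this long are easier to triage by pulling out only the files that failed with an I/O error. The helper below is a rough sketch (the name `io_error_paths` and the two message shapes it recognises are assumptions based solely on the lines seen in this thread). Fed the three relevant lines, it reports the container's `94-appsbase.sh` and the `urllib3` file under `/cvmfs`, which hints that the problem sits in CVMFS rather than in the container runtime.

```shell
#!/usr/bin/env bash
# Print the unique paths that a log reports as failing with an I/O error.
# Reads log text on stdin. Two message shapes are handled:
#   ... Input/output error: '/some/path'      (Python OSError)
#   ... /some/path: input/output error ...    (shell and Apptainer messages)
io_error_paths() {
    sed -nE \
        -e "s#.*[Ii]nput/output error: '([^']+)'.*#\1#p" \
        -e "s#.* (/[^ :]+): [Ii]nput/output error.*#\1#p" \
        | sort -u
}

# Three lines copied from the logs in this thread.
io_error_paths <<'EOF'
[2022-05-03 18:45:41] Singularity isnt working: /.singularity.d/actions/exec: line 5: /.singularity.d/env/94-appsbase.sh: Input/output error
[2022-05-03 20:43:54] source: open /.singularity.d/env/94-appsbase.sh: input/output error Orac
[2022-05-03 20:44:17] OSError: [Errno 5] Input/output error: '/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/rucio-clients/1.28.1/lib/python3.6/site-packages/urllib3/packages/__init__.py'
EOF
```

The same function could be pointed at a saved log with `io_error_paths < stderr.txt`.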
ID: 46724
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2500
Credit: 248,498,755
RAC: 128,182
Message 46726 - Posted: 3 May 2022, 11:13:03 UTC - in response to Message 46722.  

There are plenty directly from the developers.
https://github.com/apptainer/apptainer/releases
ID: 46726
Dark Angel
Joined: 7 Aug 11
Posts: 93
Credit: 23,511,565
RAC: 21,760
Message 46727 - Posted: 3 May 2022, 11:21:41 UTC - in response to Message 46726.  

There are plenty directly from the developers.
https://github.com/apptainer/apptainer/releases


So the one I linked to, then.

The result is in the post above.
ID: 46727
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2500
Credit: 248,498,755
RAC: 128,182
Message 46728 - Posted: 3 May 2022, 11:39:04 UTC - in response to Message 46727.  

OK.
You may have stumbled over a CERN configuration error.
The script mentioned in at least some of your failed logs has a size of "0":
/cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7/.singularity.d/env/94-appsbase.sh

This needs to be clarified by David Cameron.
I will send him an email.
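
To compare hosts, something like the sketch below could be run on both a working and a failing machine. It reports the file size separately from whether the contents can be read, since this thread shows a size of 0 on one side and an I/O error on the other. The function name `inspect_file` is invented for the example, and `stat -c %s` is the GNU coreutils form.

```shell
#!/usr/bin/env bash
# Report the size of a file and whether its contents can actually be read.
# Exit codes: 0 = readable, 1 = stat works but reading fails, 2 = cannot stat.
inspect_file() {
    local path="$1" size
    if ! size="$(stat -c %s "$path" 2>/dev/null)"; then
        echo "$path: cannot stat"
        return 2
    fi
    if ! cat "$path" >/dev/null 2>&1; then
        echo "$path: size $size, read failed"
        return 1
    fi
    echo "$path: size $size, read ok"
}

# Path quoted in the post above.
script=/cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7/.singularity.d/env/94-appsbase.sh
inspect_file "$script" || true
```

If the working hosts print "size 0, read ok" and the failing host prints "read failed", the empty file itself would be less likely to be the cause than the local CVMFS client.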
ID: 46728
Dark Angel
Joined: 7 Aug 11
Posts: 93
Credit: 23,511,565
RAC: 21,760
Message 46746 - Posted: 5 May 2022, 3:39:56 UTC

Ok, further to this issue: I seem to have it working again.

In the end I had to completely wipe every trace of Singularity, Apptainer, and CVMFS from the system (deleting all of their files and folders), reboot, then reinstall CVMFS from scratch.
At this point it had a massive hissy fit about missing folders and keys, and for some reason refused to populate the /etc/cvmfs directory structure, so I copied that directory whole from a working system, changed the owner to cvmfs:cvmfs, restarted autofs, and finally got cvmfs_config probe to return something usable.
Then I installed Apptainer and it appears to be working normally again.
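
In case it helps someone else, the repair part of the above (after CVMFS has been reinstalled) looks roughly like this as a script. It is a sketch of what I described, not a verified procedure: the host name `working-host`, the use of `rsync` over SSH and the `run` dry-run helper are all assumptions made for this example, and the last command is simply the check the ATLAS wrapper runs in the logs earlier in the thread. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# HYPOTHETICAL: a host with a working CVMFS setup, reachable over SSH.
good_host="${GOOD_HOST:-working-host}"
# Container image path as shown in the work unit logs.
image=/cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7

restore_cvmfs_config() {
    # Copy the whole /etc/cvmfs tree from the working machine.
    run sudo rsync -a "${good_host}:/etc/cvmfs/" /etc/cvmfs/
    # Ownership as described in the post.
    run sudo chown -R cvmfs:cvmfs /etc/cvmfs
    run sudo systemctl restart autofs
    run cvmfs_config probe
    # Same smoke test the ATLAS wrapper performs.
    run singularity exec -B /cvmfs "$image" hostname
}

restore_cvmfs_config
```

The wipe and reinstall steps are left out on purpose, because which files and packages need removing depends on how Singularity and CVMFS were installed in the first place.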
ID: 46746
