1) Message boards : ATLAS application : All new tasks failing after about 5 minutes (Message 45700)
Posted 15 Nov 2021 by Cody
Post:
Thanks for pointing that out. I have three boxes and missed that it was misconfigured on that system; it is now corrected.

I have two Linux hosts and one Windows 10 box, and ATLAS is failing on all of them. I've double-checked that the VirtualBox Extension Pack is installed on all systems and that it matches the installed VirtualBox version. I'll check overnight to see how that affects the tasks.

C
2) Message boards : ATLAS application : All new tasks failing after about 5 minutes (Message 45696)
Posted 14 Nov 2021 by Cody
Post:
My ATLAS tasks are also failing after 10 minutes, give or take 40 seconds. CMS VirtualBox tasks complete on the same host.

System setup: Windows 10 (recent install), current BOINC version with VirtualBox installed.
12 CPUs, 32 GB memory, twin 1050 Ti cards.

One example of a failed task is below:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=332861223

Looking at the log, it appears to fail almost immediately, just 4 seconds after starting, so I'm not sure where that 10 minutes is calculated from.


[2021-11-14 15:19:18] Running command: /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec --pwd /var/lib/boinc-client/slots/16 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 sh start_atlas.sh
[2021-11-14 15:19:18] Job failed
[2021-11-14 15:19:18] FATAL: container creation failed: hook function for tag prelayer returns error: failed to create /var/lib/alternatives directory: mkdir /var/lib/alternatives: read-only file system
[2021-11-14 15:19:18] ./runtime_log.err
[2021-11-14 15:19:18] ./runtime_log

Any ideas where I should look to fix this?
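
Since the FATAL line complains about a read-only file system when creating /var/lib/alternatives, one place to start might be checking whether that path sits on a read-only mount. The sketch below is only a rough check under assumptions: the helper names nearest_existing and is_read_only are made up for illustration, and the result reflects the view of whoever runs the script, which may differ from what the BOINC client service or the container sees.

```python
import os

def nearest_existing(path):
    """Walk up from path until an existing directory or file is found."""
    path = os.path.abspath(path)
    while not os.path.exists(path):
        parent = os.path.dirname(path)
        if parent == path:  # reached the filesystem root
            break
        path = parent
    return path

def is_read_only(path):
    """Return True if the mount holding path (or its nearest existing
    ancestor) has the read-only flag set. Unix only."""
    flags = os.statvfs(nearest_existing(path)).f_flag
    return bool(flags & os.ST_RDONLY)

if __name__ == "__main__":
    # Paths taken from the log; adjust for your own host.
    for p in ("/var/lib/alternatives", "/var/lib/boinc-client"):
        print(p, "read-only" if is_read_only(p) else "writable mount")
```

If this reports a writable mount when run from a normal shell, the read-only view is probably specific to the environment the task runs in, so the check would need to be repeated from there.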

C


Full log below.

<core_client_version>7.16.6</core_client_version>
<![CDATA[
<message>
process exited with code 195 (0xc3, -61)</message>
<stderr_txt>
15:19:14 (64254): wrapper (7.7.26015): starting
15:19:14 (64254): wrapper: running run_atlas (--nthreads 8)
[2021-11-14 15:19:14] Arguments: --nthreads 8
[2021-11-14 15:19:14] Threads: 8
[2021-11-14 15:19:14] Checking for CVMFS
[2021-11-14 15:19:15] Probing /cvmfs/atlas.cern.ch... OK
[2021-11-14 15:19:15] Probing /cvmfs/atlas-condb.cern.ch... OK
[2021-11-14 15:19:15] Running cvmfs_config stat atlas.cern.ch
[2021-11-14 15:19:15] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2021-11-14 15:19:15] 2.8.2.0 64387 0 24772 95926 4 1 2997157 4096000 0 130560 0 0 100.000 0 0 http://cernvmfs.gridpp.rl.ac.uk:8000/cvmfs/atlas.cern.ch DIRECT 1
[2021-11-14 15:19:15] CVMFS is ok
[2021-11-14 15:19:15] Efficiency of ATLAS tasks can be improved by the following measure(s):
[2021-11-14 15:19:15] The CVMFS client on this computer should be configured to use Cloudflare's openhtc.io.
[2021-11-14 15:19:15] Small home clusters do not require a local http proxy but it is suggested if
[2021-11-14 15:19:15] more than 10 cores throughout the same LAN segment are regularly running ATLAS like tasks.
[2021-11-14 15:19:15] Further information can be found at the LHC@home message board.
[2021-11-14 15:19:15] Using singularity image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2021-11-14 15:19:15] Checking for singularity binary...
[2021-11-14 15:19:15] Singularity is not installed, using version from CVMFS
[2021-11-14 15:19:15] Checking singularity works with /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname
[2021-11-14 15:19:16] thor
[2021-11-14 15:19:16] Singularity works
[2021-11-14 15:19:18] Set ATHENA_PROC_NUMBER=8
[2021-11-14 15:19:18] Starting ATLAS job with PandaID=5254591102
[2021-11-14 15:19:18] Running command: /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity exec --pwd /var/lib/boinc-client/slots/16 -B /cvmfs,/var /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 sh start_atlas.sh
[2021-11-14 15:19:18] Job failed
[2021-11-14 15:19:18] FATAL: container creation failed: hook function for tag prelayer returns error: failed to create /var/lib/alternatives directory: mkdir /var/lib/alternatives: read-only file system
[2021-11-14 15:19:18] ./runtime_log.err
[2021-11-14 15:19:18] ./runtime_log
15:29:18 (64254): run_atlas exited; CPU time 0.330064
15:29:18 (64254): app exit status: 0x1
15:29:18 (64254): called boinc_finish(195)

</stderr_txt>
]]>
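
The wrapper timestamps in the log above hint at where the 10 minutes comes from: the job fails at 15:19:18, but run_atlas only exits at 15:29:18. A minimal sketch to check that arithmetic, using three lines copied from the log (the function names are illustrative, and the log does not say why run_atlas waits before exiting):

```python
import re

# Three lines copied from the log above.
LOG = """\
15:19:14 (64254): wrapper (7.7.26015): starting
[2021-11-14 15:19:18] Job failed
15:29:18 (64254): run_atlas exited; CPU time 0.330064
"""

# Matches an HH:MM:SS clock time anywhere in a line.
CLOCK = re.compile(r"\b(\d{2}):(\d{2}):(\d{2})\b")

def seconds_of_day(log, needle):
    """Clock time of the first line containing needle, in seconds since midnight."""
    for line in log.splitlines():
        if needle in line:
            match = CLOCK.search(line)
            if match:
                h, m, s = (int(g) for g in match.groups())
                return h * 3600 + m * 60 + s
    raise ValueError(f"no timestamped line containing {needle!r}")

def seconds_between(log, first, second):
    """Elapsed seconds between two log events (assumes both fall on the same day)."""
    return seconds_of_day(log, second) - seconds_of_day(log, first)

print(seconds_between(LOG, "wrapper (7.7.26015): starting", "Job failed"))  # 4
print(seconds_between(LOG, "Job failed", "run_atlas exited"))               # 600
```

So the task itself fails after about 4 seconds, and the remaining 600 seconds is the gap before run_atlas exits and boinc_finish(195) is called.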


