Name GzvMDmyOy56nsSi4ap6QjLDmwznN0nGgGQJmkKkKDmTctKDmn3sbZm_0
Workunit 230507733
Created 22 Feb 2025, 6:42:57 UTC
Sent 22 Feb 2025, 12:01:27 UTC
Report deadline 2 Mar 2025, 12:01:27 UTC
Received 24 Feb 2025, 23:45:26 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x00000000)
Computer ID 10687519
Run time 2 days 11 hours 35 min 54 sec
CPU time 3 days 22 hours 17 min 38 sec
Validate state Valid
Credit 7,638.07
Device peak FLOPS 31.54 GFLOPS
Application version ATLAS Simulation v3.01 (native_mt) x86_64-pc-linux-gnu
Peak working set size 2.46 GB
Peak swap size 31.84 GB
Peak disk usage 1.31 GB

Stderr output

<core_client_version>7.7.0</core_client_version>
<![CDATA[
<stderr_txt>
07:02:17 (8053): wrapper (7.7.26015): starting
07:02:17 (8053): wrapper: running run_atlas (--nthreads 10)
[2025-02-22 07:02:18] Arguments: --nthreads 10
[2025-02-22 07:02:18] Threads: 10
[2025-02-22 07:02:18] Checking for CVMFS
[2025-02-22 07:02:18] Probing /cvmfs/atlas.cern.ch... OK
[2025-02-22 07:02:19] Probing /cvmfs/atlas-condb.cern.ch... OK
[2025-02-22 07:02:19] Running cvmfs_config stat atlas.cern.ch
[2025-02-22 07:02:20] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2025-02-22 07:02:20] 2.11.2.0 26780 10978 329348 142938 1 349 16697880 18432000 15455 130560 0 36521296 98.846 103308643 38438 http://cvmfs-s1fnal.opensciencegrid.org:8000/cvmfs/atlas.cern.ch http://192.41.237.109:6081 1
[2025-02-22 07:02:20] CVMFS is ok
[2025-02-22 07:02:20] Efficiency of ATLAS tasks can be improved by the following measure(s):
[2025-02-22 07:02:20] The CVMFS client on this computer should be configured to use Cloudflare's openhtc.io.
[2025-02-22 07:02:20] Further information can be found at the LHC@home message board.
[2025-02-22 07:02:20] Using apptainer image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2025-02-22 07:02:20] Checking for apptainer binary...
[2025-02-22 07:02:20] Using apptainer found in PATH at /usr/bin/apptainer
[2025-02-22 07:02:20] Running /usr/bin/apptainer --version
[2025-02-22 07:02:20] apptainer version 1.3.2-1.el7
[2025-02-22 07:02:20] Checking apptainer works with /usr/bin/apptainer exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname
[2025-02-22 07:02:38] c-211-22.aglt2.org
[2025-02-22 07:02:38] apptainer works
[2025-02-22 07:02:38] Set ATHENA_PROC_NUMBER=10
[2025-02-22 07:02:38] Set ATHENA_CORE_NUMBER=10
[2025-02-22 07:02:38] Starting ATLAS job with PandaID=6525230404
[2025-02-22 07:02:38] Running command: /usr/bin/apptainer exec -B /cvmfs,/tmp/boinchome/slots/0 /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 sh start_atlas.sh
18:40:04 (8053): BOINC client no longer exists - exiting
18:40:04 (8053): timer handler: client dead, exiting
18:46:58 (36366): wrapper (7.7.26015): starting
18:46:58 (36366): wrapper: running run_atlas (--nthreads 10)
[2025-02-22 18:46:58] Arguments: --nthreads 10
[2025-02-22 18:46:58] Threads: 10
[2025-02-22 18:46:58] This job has been restarted, cleaning up previous attempt
[2025-02-22 18:46:58] Checking for CVMFS
[2025-02-22 18:46:58] Probing /cvmfs/atlas.cern.ch... OK
[2025-02-22 18:46:58] Probing /cvmfs/atlas-condb.cern.ch... OK
[2025-02-22 18:46:58] Running cvmfs_config stat atlas.cern.ch
[2025-02-22 18:46:59] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2025-02-22 18:46:59] 2.11.2.0 26780 11683 333860 142956 1 316 17042722 18432001 25403 130560 0 38937808 98.851 108959562 38251 http://cvmfs-s1fnal.opensciencegrid.org:8000/cvmfs/atlas.cern.ch http://192.41.237.109:6081 1
[2025-02-22 18:46:59] CVMFS is ok
[2025-02-22 18:46:59] Efficiency of ATLAS tasks can be improved by the following measure(s):
[2025-02-22 18:46:59] The CVMFS client on this computer should be configured to use Cloudflare's openhtc.io.
[2025-02-22 18:46:59] Further information can be found at the LHC@home message board.
[2025-02-22 18:46:59] Using apptainer image /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7
[2025-02-22 18:46:59] Checking for apptainer binary...
[2025-02-22 18:46:59] Using apptainer found in PATH at /usr/bin/apptainer
[2025-02-22 18:46:59] Running /usr/bin/apptainer --version
[2025-02-22 18:46:59] apptainer version 1.3.2-1.el7
[2025-02-22 18:46:59] Checking apptainer works with /usr/bin/apptainer exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 hostname
[2025-02-22 18:47:03] c-211-22.aglt2.org
[2025-02-22 18:47:03] apptainer works
[2025-02-22 18:47:03] Set ATHENA_PROC_NUMBER=10
[2025-02-22 18:47:03] Set ATHENA_CORE_NUMBER=10
[2025-02-22 18:47:04] Starting ATLAS job with PandaID=6525230404
[2025-02-22 18:47:04] Running command: /usr/bin/apptainer exec -B /cvmfs,/tmp/boinchome/slots/0 /cvmfs/atlas.cern.ch/repo/containers/fs/singularity/x86_64-centos7 sh start_atlas.sh
[2025-02-24 18:45:04]  *** The last 200 lines of the pilot log: ***
[2025-02-24 18:45:04] 2025-02-24 23:41:54,368 | INFO     | executing command: lscpu
[2025-02-24 18:45:04] 2025-02-24 23:41:55,108 | INFO     | found 20 cores (10 cores per socket, 2 sockets)
[2025-02-24 18:45:04] 2025-02-24 23:41:55,368 | INFO     | executing command: export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase;source ${ATLAS_LOCAL_ROOT_BASE}/user/atlasLocalSetup.sh --quiet;lsetup
[2025-02-24 18:45:04] 2025-02-24 23:41:56,799 | INFO     | PID=4615 has CPU usage=4.8% CMD=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.20-x86_64-centos7/bin/python3 pilot3/pilot.py -q BOINC_MCORE -i PR -
[2025-02-24 18:45:04] 2025-02-24 23:41:56,799 | INFO     | .. there are 41 such processes running
[2025-02-24 18:45:04] 2025-02-24 23:42:24,369 | INFO     | PID=4615 has CPU usage=3.5% CMD=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.20-x86_64-centos7/bin/python3 pilot3/pilot.py -q BOINC_MCORE -i PR -
[2025-02-24 18:45:04] 2025-02-24 23:42:24,370 | INFO     | .. there are 41 such processes running
[2025-02-24 18:45:04] 2025-02-24 23:42:52,115 | INFO     | PID=4615 has CPU usage=2.9% CMD=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.20-x86_64-centos7/bin/python3 pilot3/pilot.py -q BOINC_MCORE -i PR -
[2025-02-24 18:45:04] 2025-02-24 23:42:52,115 | INFO     | .. there are 41 such processes running
[2025-02-24 18:45:04] 2025-02-24 23:43:15,585 | INFO     | CPU arch script returned: x86-64-v2
[2025-02-24 18:45:04] 2025-02-24 23:43:15,585 | INFO     | using path: /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/memory_monitor_output.txt (trf name=prmon)
[2025-02-24 18:45:04] 2025-02-24 23:43:15,869 | INFO     | extracted standard info from memory monitor json
[2025-02-24 18:45:04] 2025-02-24 23:43:15,869 | WARNING  | standard memory fields were not found in memory monitor json (or json doesn't exist yet): 'totRCHAR'
[2025-02-24 18:45:04] 2025-02-24 23:43:16,612 | INFO     | fitting pss+swap vs Time
[2025-02-24 18:45:04] 2025-02-24 23:43:16,648 | INFO     | model: linear, x: [1740268129.0, 1740268190.0, 1740268251.0, 1740268312.0, 1740268373.0, 1740268434.0, 1740268495.0, 1740268556.0, 1740268617.0, 1740268678.0, 1740
[2025-02-24 18:45:04] 2025-02-24 23:43:16,650 | INFO     | sum of square deviations: 7013243834509.861
[2025-02-24 18:45:04] 2025-02-24 23:43:21,616 | INFO     | sum of deviations: 6307229708808.4795
[2025-02-24 18:45:04] 2025-02-24 23:43:21,617 | INFO     | mean x: 1740354353.5201557
[2025-02-24 18:45:04] 2025-02-24 23:43:21,640 | INFO     | mean y: 2650861.5265205093
[2025-02-24 18:45:04] 2025-02-24 23:43:21,640 | INFO     | -- intersect: -1562504285.135173
[2025-02-24 18:45:04] 2025-02-24 23:43:21,640 | INFO     | intersect: -1562504285.135173
[2025-02-24 18:45:04] 2025-02-24 23:43:21,645 | INFO     | chi2: 7.806351765752151
[2025-02-24 18:45:04] 2025-02-24 23:43:21,648 | INFO     | model: linear, x: [1740268129.0, 1740268190.0, 1740268251.0, 1740268312.0, 1740268373.0, 1740268434.0, 1740268495.0, 1740268556.0, 1740268617.0, 1740268678.0, 1740
[2025-02-24 18:45:04] 2025-02-24 23:43:21,649 | INFO     | sum of square deviations: 6976110648754.888
[2025-02-24 18:45:04] 2025-02-24 23:43:23,478 | INFO     | PID=4615 has CPU usage=3.0% CMD=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.20-x86_64-centos7/bin/python3 pilot3/pilot.py -q BOINC_MCORE -i PR -
[2025-02-24 18:45:04] 2025-02-24 23:43:23,479 | INFO     | .. there are 41 such processes running
[2025-02-24 18:45:04] 2025-02-24 23:43:23,529 | INFO     | sum of deviations: 6290969477871.12
[2025-02-24 18:45:04] 2025-02-24 23:43:23,530 | INFO     | mean x: 1740354201.0198371
[2025-02-24 18:45:04] 2025-02-24 23:43:23,530 | INFO     | mean y: 2650794.7481402764
[2025-02-24 18:45:04] 2025-02-24 23:43:23,530 | INFO     | -- intersect: -1566778893.3051436
[2025-02-24 18:45:04] 2025-02-24 23:43:23,530 | INFO     | intersect: -1566778893.3051436
[2025-02-24 18:45:04] 2025-02-24 23:43:23,534 | INFO     | chi2: 7.805938804263543
[2025-02-24 18:45:04] 2025-02-24 23:43:23,535 | INFO     | current chi2=7.805938804263543 (change=0.005290070201799444 %)
[2025-02-24 18:45:04] 2025-02-24 23:43:23,535 | INFO     | right removable region: 2822
[2025-02-24 18:45:04] 2025-02-24 23:43:23,537 | INFO     | model: linear, x: [1740268434.0, 1740268495.0, 1740268556.0, 1740268617.0, 1740268678.0, 1740268739.0, 1740268800.0, 1740268861.0, 1740268922.0, 1740268983.0, 1740
[2025-02-24 18:45:04] 2025-02-24 23:43:23,539 | INFO     | sum of square deviations: 6976109923464.638
[2025-02-24 18:45:04] 2025-02-24 23:43:24,272 | INFO     | sum of deviations: 5425315101617.163
[2025-02-24 18:45:04] 2025-02-24 23:43:24,272 | INFO     | mean x: 1740354506.0219624
[2025-02-24 18:45:04] 2025-02-24 23:43:24,272 | INFO     | mean y: 2654482.787460149
[2025-02-24 18:45:04] 2025-02-24 23:43:24,272 | INFO     | -- intersect: -1350817823.021497
[2025-02-24 18:45:05] 2025-02-24 23:43:24,273 | INFO     | intersect: -1350817823.021497
[2025-02-24 18:45:05] 2025-02-24 23:43:24,277 | INFO     | chi2: 4.722842534653857
[2025-02-24 18:45:05] 2025-02-24 23:43:24,277 | INFO     | current chi2=4.722842534653857 (change=39.50000363327458 %)
[2025-02-24 18:45:05] 2025-02-24 23:43:24,280 | INFO     | model: linear, x: [1740268739.0, 1740268800.0, 1740268861.0, 1740268922.0, 1740268983.0, 1740269044.0, 1740269105.0, 1740269166.0, 1740269227.0, 1740269288.0, 1740
[2025-02-24 18:45:05] 2025-02-24 23:43:24,282 | INFO     | sum of square deviations: 6939107317206.892
[2025-02-24 18:45:05] 2025-02-24 23:43:24,495 | INFO     | 172555s have passed since pilot start
[2025-02-24 18:45:05] 2025-02-24 23:43:24,775 | INFO     | sum of deviations: 4747637469592.111
[2025-02-24 18:45:05] 2025-02-24 23:43:24,776 | INFO     | mean x: 1740354658.5237758
[2025-02-24 18:45:05] 2025-02-24 23:43:24,776 | INFO     | mean y: 2657275.677430802
[2025-02-24 18:45:05] 2025-02-24 23:43:24,776 | INFO     | -- intersect: -1188068362.2872643
[2025-02-24 18:45:05] 2025-02-24 23:43:24,776 | INFO     | intersect: -1188068362.2872643
[2025-02-24 18:45:05] 2025-02-24 23:43:24,781 | INFO     | chi2: 2.9916409532296
[2025-02-24 18:45:05] 2025-02-24 23:43:24,781 | INFO     | current chi2=2.9916409532296 (change=36.65592423887872 %)
[2025-02-24 18:45:05] 2025-02-24 23:43:24,784 | INFO     | model: linear, x: [1740269044.0, 1740269105.0, 1740269166.0, 1740269227.0, 1740269288.0, 1740269349.0, 1740269410.0, 1740269471.0, 1740269532.0, 1740269593.0, 1740
[2025-02-24 18:45:05] 2025-02-24 23:43:24,785 | INFO     | sum of square deviations: 6902235783174.155
[2025-02-24 18:45:05] 2025-02-24 23:43:27,634 | INFO     | sum of deviations: 4105435483340.5645
[2025-02-24 18:45:05] 2025-02-24 23:43:27,634 | INFO     | mean x: 1740354811.0255954
[2025-02-24 18:45:05] 2025-02-24 23:43:27,635 | INFO     | mean y: 2659931.756132243
[2025-02-24 18:45:05] 2025-02-24 23:43:27,635 | INFO     | -- intersect: -1032499488.9354026
[2025-02-24 18:45:05] 2025-02-24 23:43:27,635 | INFO     | intersect: -1032499488.9354026
[2025-02-24 18:45:05] 2025-02-24 23:43:27,646 | INFO     | chi2: 1.4431459619723896
[2025-02-24 18:45:05] 2025-02-24 23:43:27,647 | INFO     | current chi2=1.4431459619723896 (change=51.76072314378322 %)
[2025-02-24 18:45:05] 2025-02-24 23:43:27,655 | INFO     | model: linear, x: [1740269349.0, 1740269410.0, 1740269471.0, 1740269532.0, 1740269593.0, 1740269654.0, 1740269715.0, 1740269776.0, 1740269837.0, 1740269898.0, 1740
[2025-02-24 18:45:05] 2025-02-24 23:43:27,662 | INFO     | sum of square deviations: 6865495088803.884
[2025-02-24 18:45:05] 2025-02-24 23:43:28,935 | INFO     | sum of deviations: 3851777622365.9775
[2025-02-24 18:45:05] 2025-02-24 23:43:28,936 | INFO     | mean x: 1740354963.5274217
[2025-02-24 18:45:05] 2025-02-24 23:43:28,936 | INFO     | mean y: 2660984.344373219
[2025-02-24 18:45:05] 2025-02-24 23:43:28,936 | INFO     | -- intersect: -973737690.0091126
[2025-02-24 18:45:05] 2025-02-24 23:43:28,936 | INFO     | intersect: -973737690.0091126
[2025-02-24 18:45:05] 2025-02-24 23:43:28,944 | INFO     | chi2: 1.2055081331925313
[2025-02-24 18:45:05] 2025-02-24 23:43:28,944 | INFO     | current chi2=1.2055081331925313 (change=16.46665237209075 %)
[2025-02-24 18:45:05] 2025-02-24 23:43:28,949 | INFO     | left removable region: 40
[2025-02-24 18:45:05] 2025-02-24 23:43:28,952 | INFO     | model: linear, x: [1740270569.0, 1740270630.0, 1740270691.0, 1740270752.0, 1740270813.0, 1740270874.0, 1740270935.0, 1740270996.0, 1740271057.0, 1740271118.0, 1740
[2025-02-24 18:45:05] 2025-02-24 23:43:28,988 | INFO     | sum of square deviations: 6676544878702.307
[2025-02-24 18:45:05] 2025-02-24 23:43:30,871 | INFO     | sum of deviations: 2961673925408.428
[2025-02-24 18:45:05] 2025-02-24 23:43:30,871 | INFO     | mean x: 1740355390.5337887
[2025-02-24 18:45:05] 2025-02-24 23:43:30,871 | INFO     | mean y: 2664610.660316319
[2025-02-24 18:45:05] 2025-02-24 23:43:30,872 | INFO     | -- intersect: -769346253.4514452
[2025-02-24 18:45:05] 2025-02-24 23:43:30,872 | INFO     | intersect: -769346253.4514452
[2025-02-24 18:45:05] 2025-02-24 23:43:30,876 | INFO     | chi2: 0.5402423203205676
[2025-02-24 18:45:05] 2025-02-24 23:43:30,876 | INFO     | -- intersect: -769346253.4514452
[2025-02-24 18:45:05] 2025-02-24 23:43:30,877 | INFO     | current memory leak: 0.44 B/s (using 2782 data points, chi2=0.54)
[2025-02-24 18:45:05] 2025-02-24 23:43:33,388 | INFO     | monitor loop #7076: job 0:6525230404 is in state 'running'
[2025-02-24 18:45:05] 2025-02-24 23:43:44,939 | INFO     | system is under heavy CPU load
[2025-02-24 18:45:05] 2025-02-24 23:43:44,939 | INFO     | CPU consumption time changed by a factor of 1.0009748601339832 (below the limit of 10)
[2025-02-24 18:45:05] 2025-02-24 23:43:44,940 | INFO     | (instant) CPU consumption time for pid=44115: 208438)
[2025-02-24 18:45:05] 2025-02-24 23:43:44,940 | INFO     | using path: /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/memory_monitor_output.txt (trf name=prmon)
[2025-02-24 18:45:05] 2025-02-24 23:43:45,695 | INFO     | using path: /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/memory_monitor_output.txt (trf name=prmon)
[2025-02-24 18:45:05] 2025-02-24 23:43:45,957 | INFO     | max memory (maxPSS) used by the payload is within the allowed limit: 2688925 B (2 * maxRSS = 81920000 B, memkillgrace = 100%)
[2025-02-24 18:45:05] 2025-02-24 23:43:45,958 | INFO     | reaping zombies for max 20 seconds
[2025-02-24 18:45:05] 2025-02-24 23:43:46,003 | INFO     | checking for looping job (in state=running)
[2025-02-24 18:45:05] 2025-02-24 23:43:46,004 | INFO     | using looping job limit: 7200 s
[2025-02-24 18:45:05] 2025-02-24 23:43:46,004 | INFO     | executing command: find /tmp/boinchome/slots/0/PanDA_Pilot-6525230404 -mmin -120
[2025-02-24 18:45:05] 2025-02-24 23:43:46,310 | INFO     | found 3 files that were recently updated
[2025-02-24 18:45:05] 2025-02-24 23:43:46,311 | INFO     | file /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/log.EVNTtoHITS is the most recently updated file (at time=1740440108)
[2025-02-24 18:45:05] 2025-02-24 23:43:46,330 | INFO     | files were last touched 0h 8m 38s ago (current time: 1740440626)
[2025-02-24 18:45:05] 2025-02-24 23:43:46,351 | INFO     | payload log (log.EVNTtoHITS) within allowed size limit (2147483648 B): 4038712 B
[2025-02-24 18:45:05] 2025-02-24 23:43:46,351 | INFO     | payload log (payload.stdout) within allowed size limit (2147483648 B): 9339 B
[2025-02-24 18:45:05] 2025-02-24 23:43:46,391 | INFO     | executing command: df -mP /tmp/boinchome/slots/0
[2025-02-24 18:45:05] 2025-02-24 23:43:46,471 | INFO     | sufficient remaining disk space (57358155776 B)
[2025-02-24 18:45:05] 2025-02-24 23:43:46,507 | INFO     | work directory size check will use 2537553920 B as a max limit (10% grace limit added)
[2025-02-24 18:45:05] 2025-02-24 23:43:46,514 | INFO     | size of work directory /tmp/boinchome/slots/0/PanDA_Pilot-6525230404: 283795562 B (within 2537553920 B limit)
[2025-02-24 18:45:05] 2025-02-24 23:43:46,526 | INFO     | total size of present files: 278311977 B (workdir size: 283795562 B)
[2025-02-24 18:45:05] 2025-02-24 23:43:46,526 | INFO     | output file /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/HITS.43092792._002813.pool.root.1 is within allowed size limit (278311977 B < 536870912000 B)
[2025-02-24 18:45:05] 2025-02-24 23:43:50,508 | INFO     | number of running child processes to parent process 44115: 6
[2025-02-24 18:45:05] 2025-02-24 23:43:50,509 | INFO     | maximum number of monitored processes: 6
[2025-02-24 18:45:05] 2025-02-24 23:43:50,842 | INFO     | PID=4615 has CPU usage=9.7% CMD=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.20-x86_64-centos7/bin/python3 pilot3/pilot.py -q BOINC_MCORE -i PR -
[2025-02-24 18:45:05] 2025-02-24 23:43:50,842 | INFO     | .. there are 41 such processes running
[2025-02-24 18:45:05] 2025-02-24 23:43:51,051 | INFO     | running: iteration=2830 pid=44115 exit_code=None
[2025-02-24 18:45:05] 2025-02-24 23:43:53,032 | INFO     | monitor loop #7077: job 0:6525230404 is in state 'running'
[2025-02-24 18:45:05] 2025-02-24 23:44:05,281 | INFO     | system is under heavy CPU load
[2025-02-24 18:45:05] 2025-02-24 23:44:05,281 | INFO     | CPU consumption time changed by a factor of 1.0002110939464013 (below the limit of 10)
[2025-02-24 18:45:05] 2025-02-24 23:44:05,281 | INFO     | (instant) CPU consumption time for pid=44115: 208482)
[2025-02-24 18:45:05] 2025-02-24 23:44:05,282 | INFO     | using path: /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/memory_monitor_output.txt (trf name=prmon)
[2025-02-24 18:45:05] 2025-02-24 23:44:07,727 | INFO     | number of running child processes to parent process 44115: 6
[2025-02-24 18:45:05] 2025-02-24 23:44:07,727 | INFO     | maximum number of monitored processes: 6
[2025-02-24 18:45:05] 2025-02-24 23:44:10,236 | INFO     | monitor loop #7078: job 0:6525230404 is in state 'running'
[2025-02-24 18:45:05] 2025-02-24 23:44:16,283 | INFO     | PID=4615 has CPU usage=5.0% CMD=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/python/3.9.20-x86_64-centos7/bin/python3 pilot3/pilot.py -q BOINC_MCORE -i PR -
[2025-02-24 18:45:05] 2025-02-24 23:44:16,284 | INFO     | .. there are 41 such processes running
[2025-02-24 18:45:05] 2025-02-24 23:44:23,449 | INFO     | system is under heavy CPU load
[2025-02-24 18:45:05] 2025-02-24 23:44:23,449 | INFO     | CPU consumption time changed by a factor of 1.0001486938920385 (below the limit of 10)
[2025-02-24 18:45:05] 2025-02-24 23:44:23,509 | INFO     | (instant) CPU consumption time for pid=44115: 208513)
[2025-02-24 18:45:05] 2025-02-24 23:44:23,536 | INFO     | using path: /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/memory_monitor_output.txt (trf name=prmon)
[2025-02-24 18:45:05] 2025-02-24 23:44:26,070 | INFO     | number of running child processes to parent process 44115: 6
[2025-02-24 18:45:05] 2025-02-24 23:44:26,071 | INFO     | maximum number of monitored processes: 6
[2025-02-24 18:45:05] 2025-02-24 23:44:28,595 | INFO     | monitor loop #7079: job 0:6525230404 is in state 'running'
[2025-02-24 18:45:05] 2025-02-24 23:44:31,459 | CRITICAL | max running time (172800s) minus grace time (180s) has been exceeded - time to abort pilot
[2025-02-24 18:45:05] 2025-02-24 23:44:31,459 | INFO     | setting REACHED_MAXTIME and graceful stop
[2025-02-24 18:45:05] 2025-02-24 23:44:31,459 | INFO     | [monitor] control thread has ended
[2025-02-24 18:45:05] 2025-02-24 23:44:31,467 | WARNING  | since job:queue_monitor is responsible for sending job updates, we sleep for 20 s
[2025-02-24 18:45:05] 2025-02-24 23:44:31,587 | INFO     | breaking -- sending SIGTERM to pid=44115
[2025-02-24 18:45:05] 2025-02-24 23:44:31,587 | INFO     | breaking -- sleep 10 s before sending SIGKILL pid=44115
[2025-02-24 18:45:05] 2025-02-24 23:44:32,475 | INFO     | all data control threads have been joined
[2025-02-24 18:45:05] 2025-02-24 23:44:32,583 | INFO     | [data] copytool_in thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:32,728 | INFO     | all payload control threads have been joined
[2025-02-24 18:45:05] 2025-02-24 23:44:32,785 | INFO     | [payload] failed_post thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:32,989 | INFO     | all job control threads have been joined
[2025-02-24 18:45:05] 2025-02-24 23:44:33,481 | INFO     | [data] control thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:33,496 | INFO     | [job] create_data_payload thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:33,546 | INFO     | [job] validate thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:33,734 | INFO     | [payload] control thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:33,936 | INFO     | [payload] validate_pre thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:33,949 | INFO     | [job] retrieve thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:33,995 | INFO     | [job] control thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:34,005 | INFO     | [payload] validate_post thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:34,492 | INFO     | [data] copytool_out thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:36,344 | INFO     | [data] queue_monitor thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:37,550 | INFO     | job.realtimelogging is not enabled
[2025-02-24 18:45:05] 2025-02-24 23:44:38,555 | INFO     | [payload] run_realtimelog thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:39,644 | INFO     | system is under heavy CPU load
[2025-02-24 18:45:05] 2025-02-24 23:44:39,644 | INFO     | CPU consumption time changed by a factor of 2.8775184281076e-05 (below the limit of 10)
[2025-02-24 18:45:05] 2025-02-24 23:44:39,645 | INFO     | (instant) CPU consumption time for pid=44115: 6)
[2025-02-24 18:45:05] 2025-02-24 23:44:39,645 | INFO     | using path: /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/memory_monitor_output.txt (trf name=prmon)
[2025-02-24 18:45:05] 2025-02-24 23:44:40,902 | INFO     | number of running child processes to parent process 44115: 1
[2025-02-24 18:45:05] 2025-02-24 23:44:40,903 | INFO     | maximum number of monitored processes: 6
[2025-02-24 18:45:05] 2025-02-24 23:44:40,903 | INFO     | will abort loop
[2025-02-24 18:45:05] 2025-02-24 23:44:41,638 | INFO     | 
[2025-02-24 18:45:05] 
[2025-02-24 18:45:05] finished pid=44115 exit_code=None state=failed
[2025-02-24 18:45:05] 
[2025-02-24 18:45:05] 2025-02-24 23:44:41,638 | WARNING  | detected unset exit_code from wait_graceful - reset to -1
[2025-02-24 18:45:05] 2025-02-24 23:44:41,640 | INFO     | using pid=17313 to kill prmon
[2025-02-24 18:45:05] 2025-02-24 23:44:41,641 | INFO     | stopping utility process 'MemoryMonitor' with signal 10
[2025-02-24 18:45:05] 2025-02-24 23:44:41,641 | INFO     | process 17313 no longer exists
[2025-02-24 18:45:05] 2025-02-24 23:44:41,641 | INFO     | utility process 44127 cleanup finished with status=True
[2025-02-24 18:45:05] 2025-02-24 23:44:41,641 | INFO     | taking a short nap (3 s) to allow the memory monitor to finish writing to the summary file (#0/#20)
[2025-02-24 18:45:05] 2025-02-24 23:44:41,910 | INFO     | [job] job monitor thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:44,657 | INFO     | copied /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/memory_monitor_summary.json to /tmp/boinchome/slots/0
[2025-02-24 18:45:05] 2025-02-24 23:44:44,786 | INFO     | found no lingering processes
[2025-02-24 18:45:05] 2025-02-24 23:44:44,786 | INFO     | CPU consumption time: 5511.33 s (rounded to 5511 s)
[2025-02-24 18:45:05] 2025-02-24 23:44:44,786 | WARNING  | main payload execution returned non-zero exit code: -1
[2025-02-24 18:45:05] 2025-02-24 23:44:44,787 | INFO     | scanning dmesg message for subprocess=4669 for memory errors
[2025-02-24 18:45:05] 2025-02-24 23:44:44,787 | INFO     | executing command: dmesg|grep 4669
[2025-02-24 18:45:05] 2025-02-24 23:44:45,206 | WARNING  | job report does not exist: /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/jobReport.json
[2025-02-24 18:45:05] 2025-02-24 23:44:45,206 | WARNING  | metadata does not exist: /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/metadata.xml
[2025-02-24 18:45:05] 2025-02-24 23:44:45,206 | WARNING  | file does not exist: /tmp/boinchome/slots/0/PanDA_Pilot-6525230404/metadata.xml
[2025-02-24 18:45:05] 2025-02-24 23:44:45,207 | INFO     | generated guid for lfn=HITS.43092792._002813.pool.root.1: 8EA7F4B2-B1A3-4330-87F0-E21993C4C1AE
[2025-02-24 18:45:05] 2025-02-24 23:44:45,207 | WARNING  | aborting payload error diagnosis since an error has already been set: [1315, 1187]
[2025-02-24 18:45:05] 2025-02-24 23:44:46,410 | INFO     | [payload] execute_payloads thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:47,610 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140194755405632)>', '<ExcThread(queue_monitor, started 140194082703104)>']
[2025-02-24 18:45:05] 2025-02-24 23:44:49,622 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140194755405632)>', '<ExcThread(queue_monitor, started 140194082703104)>']
[2025-02-24 18:45:05] 2025-02-24 23:44:51,635 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140194755405632)>', '<ExcThread(queue_monitor, started 140194082703104)>']
[2025-02-24 18:45:05] 2025-02-24 23:44:53,647 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140194755405632)>', '<ExcThread(queue_monitor, started 140194082703104)>']
[2025-02-24 18:45:05] 2025-02-24 23:44:54,576 | INFO     | waiting for thread to finish: ['<_MainThread(MainThread, started 140194755405632)>', '<ExcThread(queue_monitor, started 140194082703104)>']
[2025-02-24 18:45:05] 2025-02-24 23:44:54,576 | INFO     | [job] queue monitor thread has finished
[2025-02-24 18:45:05] 2025-02-24 23:44:55,660 | INFO     | caller=run is remaining thread - safe to abort (names=['<_MainThread(MainThread, started 140194755405632)>'])
[2025-02-24 18:45:05] 2025-02-24 23:45:00,686 | INFO     | all workflow threads have been joined
[2025-02-24 18:45:05] 2025-02-24 23:45:00,690 | INFO     | end of generic workflow (traces error code: 0)
[2025-02-24 18:45:05] 2025-02-24 23:45:00,690 | INFO     | traces error code: 0
[2025-02-24 18:45:05] 2025-02-24 23:45:00,690 | INFO     | pilot has finished (exit code=0, shell exit code=0)
[2025-02-24 18:45:05] 2025-02-24 23:45:01,560 [wrapper] ==== pilot stdout END ====
[2025-02-24 18:45:05] 2025-02-24 23:45:01,610 [wrapper] ==== wrapper stdout RESUME ====
[2025-02-24 18:45:05] 2025-02-24 23:45:01,676 [wrapper] pilotpid: 4615
[2025-02-24 18:45:05] 2025-02-24 23:45:01,716 [wrapper] Pilot exit status: 0
[2025-02-24 18:45:05] 2025-02-24 23:45:01,917 [wrapper] pandaids: 6525230404 6525230404
[2025-02-24 18:45:05] 2025-02-24 23:45:02,657 [wrapper] cleanup supervisor_pilot  9726 4616
[2025-02-24 18:45:05] 2025-02-24 23:45:02,694 [wrapper] Test setup, not cleaning
[2025-02-24 18:45:05] 2025-02-24 23:45:02,767 [wrapper] apfmon messages muted
[2025-02-24 18:45:05] 2025-02-24 23:45:02,795 [wrapper] ==== wrapper stdout END ====
[2025-02-24 18:45:05] 2025-02-24 23:45:02,831 [wrapper] ==== wrapper stderr END ====
[2025-02-24 18:45:05]  *** Error codes and diagnostics ***
[2025-02-24 18:45:05]  *** Listing of results directory ***
[2025-02-24 18:45:05] total 783256
[2025-02-24 18:45:05] drwx------ 4 boincer umatlas      4096 Dec 18 06:03 pilot3
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas    491065 Feb 22 01:14 pilot3.tar.gz
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas      5118 Feb 22 01:41 queuedata.json
[2025-02-24 18:45:05] -rwx------ 1 boincer umatlas     35865 Feb 22 01:42 runpilot2-wrapper.sh
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas       100 Feb 22 07:02 wrapper_26015_x86_64-pc-linux-gnu
[2025-02-24 18:45:05] -rwxr-xr-x 1 boincer umatlas      7986 Feb 22 07:02 run_atlas
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas       105 Feb 22 07:02 job.xml
[2025-02-24 18:45:05] -rw-r--r-- 3 boincer umatlas 350014465 Feb 22 07:02 EVNT.43092790._000015.pool.root.1
[2025-02-24 18:45:05] -rw-r--r-- 3 boincer umatlas 350014465 Feb 22 07:02 ATLAS.root_0
[2025-02-24 18:45:05] -rw-r--r-- 2 boincer umatlas    503559 Feb 22 07:02 input.tar.gz
[2025-02-24 18:45:05] -rw-r--r-- 2 boincer umatlas     17569 Feb 22 07:02 start_atlas.sh
[2025-02-24 18:45:05] drwxrwx--x 2 boincer umatlas      4096 Feb 22 07:02 shared
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas         0 Feb 22 07:02 boinc_lockfile
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas      2554 Feb 22 18:47 pandaJob.out
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas       467 Feb 22 18:47 setup.sh.local
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas    987744 Feb 22 18:47 agis_schedconf.cvmfs.json
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas   1582569 Feb 22 18:47 agis_ddmendpoints.agis.ALL.json
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas      6038 Feb 24 18:40 init_data.xml
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas       921 Feb 24 18:43 heartbeat.json
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas        96 Feb 24 18:43 pilot_heartbeat.json
[2025-02-24 18:45:05] drwxrwx--- 2 boincer umatlas      4096 Feb 24 18:44 PanDA_Pilot-6525230404
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas      1065 Feb 24 18:44 memory_monitor_summary.json
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas       534 Feb 24 18:44 boinc_task_state.xml
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas  32722390 Feb 24 18:45 pilotlog.txt
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas  32754461 Feb 24 18:45 log.43092792._002813.job.log.1
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas       571 Feb 24 18:45 runtime_log
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas  32768000 Feb 24 18:45 result.tar.gz
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas     10834 Feb 24 18:45 runtime_log.err
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas       784 Feb 24 18:45 GzvMDmyOy56nsSi4ap6QjLDmwznN0nGgGQJmkKkKDmTctKDmn3sbZm.diag
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas      8192 Feb 24 18:45 boinc_mmap_file
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas        30 Feb 24 18:45 wrapper_checkpoint.txt
[2025-02-24 18:45:05] -rw-r--r-- 1 boincer umatlas     27806 Feb 24 18:45 stderr.txt
[2025-02-24 18:45:05] No HITS result produced
[2025-02-24 18:45:05]  *** Contents of shared directory: ***
[2025-02-24 18:45:05] total 374328
[2025-02-24 18:45:05] -rw-r--r-- 3 boincer umatlas 350014465 Feb 22 07:02 ATLAS.root_0
[2025-02-24 18:45:05] -rw-r--r-- 2 boincer umatlas    503559 Feb 22 07:02 input.tar.gz
[2025-02-24 18:45:05] -rw-r--r-- 2 boincer umatlas     17569 Feb 22 07:02 start_atlas.sh
[2025-02-24 18:45:05] -rw------- 1 boincer umatlas  32768000 Feb 24 18:45 result.tar.gz
18:45:06 (36366): run_atlas exited; CPU time 7692.725359
18:45:06 (36366): called boinc_finish(0)

</stderr_txt>
]]>
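
The CRITICAL line in the pilot log above reports that "max running time (172800s) minus grace time (180s)" was exceeded, and an earlier line reports "172555s have passed since pilot start". The following is a minimal sketch of that arithmetic. It assumes the pilot compares elapsed time against the difference of the two logged values, which is inferred from the message wording and not from the pilot source. The function names are hypothetical, and the timestamps are copied from the log with milliseconds dropped.

```python
from datetime import datetime, timedelta

# Values copied from the CRITICAL log line.
MAX_RUNNING_TIME_S = 172800  # "max running time (172800s)"
GRACE_TIME_S = 180           # "grace time (180s)"


def abort_threshold(max_running_s: int, grace_s: int) -> int:
    """Elapsed seconds after which the abort would trigger (assumed rule)."""
    return max_running_s - grace_s


def estimated_pilot_start(log_time: datetime, elapsed_s: int) -> datetime:
    """Back-compute the pilot start from an 'Ns have passed' log line."""
    return log_time - timedelta(seconds=elapsed_s)


# "172555s have passed since pilot start" was logged at 2025-02-24 23:43:24.
sample_time = datetime(2025, 2, 24, 23, 43, 24)
start = estimated_pilot_start(sample_time, 172555)

threshold = abort_threshold(MAX_RUNNING_TIME_S, GRACE_TIME_S)
expected_abort = start + timedelta(seconds=threshold)

print("threshold (s):  ", threshold)
print("estimated start:", start)
print("expected abort: ", expected_abort)
```

Under these assumptions the threshold is 172620 s, the estimated pilot start is 2025-02-22 23:47:29, and the expected abort time is 2025-02-24 23:44:29, about two seconds before the CRITICAL line logged at 23:44:31.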


©2025 CERN