Message boards : ATLAS application : Failed to execute payload:/bin/bash: Sim_tf.py: command not found

wujj123456

Joined: 14 Sep 08
Posts: 52
Credit: 66,850,956
RAC: 31,909
Message 51029 - Posted: 9 Nov 2024, 5:27:37 UTC

I got a whole lot of invalid results due to this error across multiple machines starting today. The tasks show up as invalid rather than as errors. Example: https://lhcathome.cern.ch/lhcathome/result.php?resultid=416180255

My normal troubleshooting command shows cvmfs is OK on the host. Occasionally a WU does actually crunch for a while instead of giving up right away.
$ cvmfs_config probe
Probing /cvmfs/atlas.cern.ch... OK
Probing /cvmfs/atlas-condb.cern.ch... OK
Probing /cvmfs/grid.cern.ch... OK
Probing /cvmfs/cernvm-prod.cern.ch... OK
Probing /cvmfs/sft.cern.ch... OK
Probing /cvmfs/alice.cern.ch... OK

Is this a problem with my setup or is the latest batch of tasks bad?
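
A passing probe only shows that the repositories mount. The "command not found" message comes from bash itself and means it could not resolve Sim_tf.py on PATH when the payload was launched. Here is a minimal sketch of that lookup, assuming the payload environment is supposed to put Sim_tf.py on PATH (the thread does not confirm how the task sets this up). The helper names find_command and report are made up for illustration.

```python
import shutil
from typing import Optional


def find_command(name: str, search_path: Optional[str] = None) -> Optional[str]:
    """Return the full path of an executable called `name`, or None if it is not found."""
    # shutil.which walks each entry of the search path (PATH by default)
    # and returns the first executable match, much like the shell does.
    return shutil.which(name, path=search_path)


def report(name: str = "Sim_tf.py") -> str:
    """Describe whether `name` can be resolved in the current environment."""
    location = find_command(name)
    if location is None:
        return f"{name}: not found on PATH (bash would report 'command not found')"
    return f"{name}: found at {location}"


if __name__ == "__main__":
    print(report())
```

Run on the host outside a task, this will normally report "not found", because the ATLAS release environment is only set up inside the job. It is only meaningful when run in the same environment the payload uses.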
ID: 51029
Saturn911

Joined: 3 Nov 12
Posts: 68
Credit: 150,128,081
RAC: 123,299
Message 51030 - Posted: 9 Nov 2024, 5:37:54 UTC - in response to Message 51029.  

+1
https://lhcathome.cern.ch/lhcathome/result.php?resultid=416182268

[2024-11-09 06:29:49] 2024-11-09 05:29:35,541 | INFO | extracted standard info from prmon json
[2024-11-09 06:29:49] 2024-11-09 05:29:35,541 | INFO | extracted standard memory fields from prmon json
[2024-11-09 06:29:49] 2024-11-09 05:29:35,542 | WARNING | GPU info not found in prmon json: 'gpu'
[2024-11-09 06:29:49] 2024-11-09 05:29:35,542 | WARNING | found no stored workdir sizes
[2024-11-09 06:29:49] 2024-11-09 05:29:35,542 | INFO | will not add max space = 0 B to job metrics
[2024-11-09 06:29:49] 2024-11-09 05:29:35,542 | WARNING | wrong length of table data, x=[1731130164.0], y=[0.0] (must be same and length>=4)

What's prmon json?
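
As far as I know, prmon is the process monitor that runs next to the payload and writes its resource usage to a JSON summary, which the pilot then reads. The last warning appears to say only that there were too few samples to work with: one timestamp and one value, where the message asks for equal lengths and at least 4 points. A minimal sketch of that condition follows. The function name and structure are assumptions based on the warning text, not the pilot's actual code.

```python
from typing import Sequence

# Taken from the "(must be same and length>=4)" text in the warning.
MIN_POINTS = 4


def table_is_usable(x: Sequence[float], y: Sequence[float],
                    min_points: int = MIN_POINTS) -> bool:
    """Return True if x and y have equal length and at least `min_points` entries."""
    return len(x) == len(y) and len(x) >= min_points


if __name__ == "__main__":
    # The values quoted in the warning above: a single sample each.
    print(table_is_usable([1731130164.0], [0.0]))
```

A job that gives up within seconds leaves only one monitoring sample, so these warnings look like a side effect of the early failure rather than its cause.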
ID: 51030
ktamail666

Joined: 11 Jul 06
Posts: 6
Credit: 2,915,386
RAC: 1
Message 51031 - Posted: 9 Nov 2024, 5:54:43 UTC - in response to Message 51029.  
Last modified: 9 Nov 2024, 6:01:23 UTC

I also got a bunch of tasks that finished with an error. I don't think this is a client-side error. However, yesterday there were a few good pairs that ran with 50 events.

pilotErrorDiags = ['Failed to execute payload:/bin/bash: Sim_tf.py: command not found\n', 'Payload metadata does not exist
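
To check many task logs for this signature, something like the following could help. It is a rough sketch: the pattern is built only from the error text quoted above, and the function name missing_payload_command is made up for illustration.

```python
import re
from typing import Optional

# Pattern derived from the pilotErrorDiags text quoted in this thread.
NOT_FOUND = re.compile(
    r"Failed to execute payload:(?P<shell>\S+): (?P<command>\S+): command not found"
)


def missing_payload_command(log_text: str) -> Optional[str]:
    """Return the command the shell could not find, or None if the log has no such error."""
    match = NOT_FOUND.search(log_text)
    return match.group("command") if match else None


if __name__ == "__main__":
    sample = "Failed to execute payload:/bin/bash: Sim_tf.py: command not found"
    print(missing_payload_command(sample))
```

Feeding it the stderr text of each invalid result would show whether they all fail on the same missing command.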
ID: 51031
Erich56

Joined: 18 Dec 15
Posts: 1841
Credit: 126,296,188
RAC: 124,680
Message 51032 - Posted: 9 Nov 2024, 7:35:35 UTC

ID: 51032
Dark Angel

Joined: 7 Aug 11
Posts: 105
Credit: 26,099,112
RAC: 1,161
Message 51033 - Posted: 9 Nov 2024, 8:54:14 UTC

I have over 450 invalids. Initially I blamed my own machines for conflicts or memory issues, but after poking around (uninstalling things, rebooting, removing and re-adding the project, etc.) and then seeing it happen on more than one machine, it seems it's the units themselves. Very frustrating.
https://lhcathome.cern.ch/lhcathome/results.php?userid=268818&offset=0&show_names=0&state=5&appid=
ID: 51033
Boone

Joined: 22 Sep 08
Posts: 2
Credit: 10,014,169
RAC: 2,430
Message 51034 - Posted: 9 Nov 2024, 12:40:59 UTC - in response to Message 51031.  

I also got a bunch of tasks that finished with an error. I don't think this is a client-side error. However, yesterday there were a few good pairs that ran with 50 events.

pilotErrorDiags = ['Failed to execute payload:/bin/bash: Sim_tf.py: command not found\n', 'Payload metadata does not exist

Same here. Therefore this might be on the server side (CVMFS), or the WUs themselves are not correct ('Payload metadata').
So I set it to No New Work until this is solved.
ID: 51034
0x8A63F77D

Joined: 12 Sep 11
Posts: 1
Credit: 1,059,352
RAC: 0
Message 51043 - Posted: 11 Nov 2024, 12:34:09 UTC

Same here: https://lhcathome.cern.ch/lhcathome/result.php?resultid=416524959
ID: 51043
Matthias Lehmkuhl

Joined: 15 Jul 05
Posts: 26
Credit: 2,428,108
RAC: 229
Message 51222 - Posted: 28 Nov 2024, 23:08:44 UTC

I also got one result with this error:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=417576672
I did a project reset a few days earlier, and this was the first time I got an ATLAS task, so the ATLAS vdi had only just been downloaded for this result.
Matthias

ID: 51222
ktamail666

Joined: 11 Jul 06
Posts: 6
Credit: 2,915,386
RAC: 1
Message 51225 - Posted: 30 Nov 2024, 9:52:39 UTC

As far as I can see, it works now. Since yesterday I've had 79 successful runs. I also checked the logs and they contain Sim_tf.py lines, so it's probably using it. However, the download speed varies between 50 kbps and 500 kbps, at least in the native version.
...
09:41:39 Py:Sim_tf INFO **** Setting-up configuration flags
09:42:10 Py:AutoConfigFlags INFO Obtaining metadata of auto-configuration by peeking into 'EVNT.41942093._000662.pool.root.1'
09:42:12 Py:MetaReader INFO Current mode used: lite
09:42:15 Py:PileUp INFO There are 400 events in this run.
...
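
The same check can be scripted. This is a small sketch that assumes the log format shown in the excerpt above ("Py:Sim_tf" lines and the "There are N events in this run." message); the function name payload_summary is made up for illustration.

```python
import re
from typing import Iterable, Optional, Tuple

# Matches the event-count line shown in the log excerpt above.
EVENTS = re.compile(r"There are (\d+) events in this run")


def payload_summary(lines: Iterable[str]) -> Tuple[bool, Optional[int]]:
    """Return (Sim_tf started?, event count or None) for a sequence of log lines."""
    started = False
    events = None
    for line in lines:
        # Any line logged by Py:Sim_tf means the transform was found and started.
        if "Py:Sim_tf" in line:
            started = True
        match = EVENTS.search(line)
        if match:
            events = int(match.group(1))
    return started, events


if __name__ == "__main__":
    log = [
        "09:41:39 Py:Sim_tf INFO **** Setting-up configuration flags",
        "09:42:15 Py:PileUp INFO There are 400 events in this run.",
    ]
    print(payload_summary(log))
```

A failing task should never get as far as the first Py:Sim_tf line, so the first value alone is enough to separate good runs from bad ones.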
ID: 51225
Harri Liljeroos

Joined: 28 Sep 04
Posts: 745
Credit: 51,965,552
RAC: 31,404
Message 51226 - Posted: 30 Nov 2024, 10:13:14 UTC

I started to get a few tasks with this error again today. https://lhcathome.cern.ch/lhcathome/result.php?resultid=417634622
Most of the tasks are working OK, though.
ID: 51226
Erich56

Joined: 18 Dec 15
Posts: 1841
Credit: 126,296,188
RAC: 124,680
Message 51230 - Posted: 30 Nov 2024, 19:01:46 UTC - in response to Message 51226.  

Started to get a few tasks with this error again today. https://lhcathome.cern.ch/lhcathome/result.php?resultid=417634622
Most of the tasks working OK though.
Same here.
ID: 51230

©2025 CERN