Failed to execute payload:/bin/bash: Sim_tf.py: command not found
Author | Message |
---|---|
Joined: 14 Sep 08 Posts: 52 Credit: 70,678,817 RAC: 99,388 |
I got a whole lot of invalid results due to this error across multiple machines starting today. The tasks show up as invalid rather than as errors. Example: https://lhcathome.cern.ch/lhcathome/result.php?resultid=416180255 My normal troubleshooting command shows cvmfs is OK on the host. Occasionally a WU actually crunches for a while instead of giving up right away. $ cvmfs_config probe Probing /cvmfs/atlas.cern.ch... OK Probing /cvmfs/atlas-condb.cern.ch... OK Probing /cvmfs/grid.cern.ch... OK Probing /cvmfs/cernvm-prod.cern.ch... OK Probing /cvmfs/sft.cern.ch... OK Probing /cvmfs/alice.cern.ch... OK Is this a problem with my setup, or is the latest batch of tasks bad? |
Joined: 3 Nov 12 Posts: 69 Credit: 156,312,027 RAC: 118,313 |
+1 https://lhcathome.cern.ch/lhcathome/result.php?resultid=416182268 [2024-11-09 06:29:49] 2024-11-09 05:29:35,541 | INFO | extracted standard info from prmon json [2024-11-09 06:29:49] 2024-11-09 05:29:35,541 | INFO | extracted standard memory fields from prmon json [2024-11-09 06:29:49] 2024-11-09 05:29:35,542 | WARNING | GPU info not found in prmon json: 'gpu' [2024-11-09 06:29:49] 2024-11-09 05:29:35,542 | WARNING | found no stored workdir sizes [2024-11-09 06:29:49] 2024-11-09 05:29:35,542 | INFO | will not add max space = 0 B to job metrics [2024-11-09 06:29:49] 2024-11-09 05:29:35,542 | WARNING | wrong length of table data, x=[1731130164.0], y=[0.0] (must be same and length>=4) What is prmon json? |
Joined: 11 Jul 06 Posts: 6 Credit: 2,915,386 RAC: 0 |
I also got a bunch of tasks that finished with errors. I don't think this is a client-side error. However, yesterday there were a few good pairs that ran with 50 events. pilotErrorDiags = ['Failed to execute payload:/bin/bash: Sim_tf.py: command not found\n', 'Payload metadata does not exist |
Joined: 18 Dec 15 Posts: 1863 Credit: 132,034,990 RAC: 111,349 |
|
Joined: 7 Aug 11 Posts: 105 Credit: 26,817,094 RAC: 22,336 |
I have over 450 invalids. Initially I blamed my own machines for conflicts or memory issues, but after poking around (uninstalling things, rebooting, removing and re-adding the project, etc.) and then seeing it happen on more than one machine, it seems it's the units themselves. Very frustrating. https://lhcathome.cern.ch/lhcathome/results.php?userid=268818&offset=0&show_names=0&state=5&appid= |
Joined: 22 Sep 08 Posts: 2 Credit: 10,593,882 RAC: 23,621 |
I also got a bunch of tasks that finished with errors. I don't think this is a client-side error. However, yesterday there were a few good pairs that ran with 50 events. Same here. So this might be on the server side (CVMFS), or the WUs themselves are not correct ('Payload metadata'). I have set it to No New Work until this is solved. |
Joined: 12 Sep 11 Posts: 1 Credit: 1,059,352 RAC: 0 |
Same here https://lhcathome.cern.ch/lhcathome/result.php?resultid=416524959 |
Joined: 15 Jul 05 Posts: 26 Credit: 2,440,469 RAC: 325 |
I also got one result with this error: https://lhcathome.cern.ch/lhcathome/result.php?resultid=417576672 I did a project reset some days before, and this was the first ATLAS task I got afterwards, so the ATLAS vdi had just been freshly downloaded for this result. Matthias |
Joined: 11 Jul 06 Posts: 6 Credit: 2,915,386 RAC: 0 |
As far as I can see, it works now. Since yesterday, I’ve had 79 successful runs. I also checked the logs, which contain Sim_tf.py lines, so it’s probably using it. However, the download speed varies between 50 kbps and 500 kbps, at least in the native version. ... 09:41:39 Py:Sim_tf INFO **** Setting-up configuration flags 09:42:10 Py:AutoConfigFlags INFO Obtaining metadata of auto-configuration by peeking into 'EVNT.41942093._000662.pool.root.1' 09:42:12 Py:MetaReader INFO Current mode used: lite 09:42:15 Py:PileUp INFO There are 400 events in this run. ... |
Joined: 28 Sep 04 Posts: 760 Credit: 54,077,868 RAC: 41,326 |
Started to get a few tasks with this error again today. https://lhcathome.cern.ch/lhcathome/result.php?resultid=417634622 Most of the tasks are working OK though. |
Joined: 18 Dec 15 Posts: 1863 Credit: 132,034,990 RAC: 111,349 |
Started to get a few tasks with this error again today. https://lhcathome.cern.ch/lhcathome/result.php?resultid=417634622 Same here. |
Joined: 4 Mar 20 Posts: 13 Credit: 5,768,722 RAC: 6,377 |
Got two of them today https://lhcathome.cern.ch/lhcathome/result.php?resultid=421567273 https://lhcathome.cern.ch/lhcathome/result.php?resultid=421547620 Until now these are the only two. Rest is OK. |
Joined: 28 Sep 04 Posts: 760 Credit: 54,077,868 RAC: 41,326 |
Over 500 validate errors during the past couple of days. |
Joined: 17 Jun 21 Posts: 14 Credit: 3,041,587 RAC: 25,195 |
Over 500 validate errors during the past couple of days. Yeah, I had to set my boxes to "No New Tasks" so they don't waste electricity on the bad WUs. |
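
For anyone who wants to run the two checks from this thread across several hosts, here is a minimal Python sketch. It only does string matching on text you already have: the `cvmfs_config probe` output in the format posted above, and a task log containing the error string from the thread title. The function names are made up for illustration, and the "Failed!" status in the usage example is an assumption about what a failing probe prints; the code treats anything other than "OK" as a failure.

```python
import re

# Error string taken from the thread title and the quoted pilotErrorDiags.
FAILURE_SIGNATURE = "Sim_tf.py: command not found"

# Matches entries such as "Probing /cvmfs/atlas.cern.ch... OK",
# the format shown in the first post's `cvmfs_config probe` output.
PROBE_ENTRY = re.compile(r"Probing\s+(/cvmfs/\S+?)\.\.\.\s+(\S+)")


def has_payload_failure(log_text: str) -> bool:
    """Return True if the task log contains the error discussed in this thread."""
    return FAILURE_SIGNATURE in log_text


def failed_repositories(probe_output: str) -> list[str]:
    """Return the repositories whose probe status is anything other than OK."""
    return [
        repo
        for repo, status in PROBE_ENTRY.findall(probe_output)
        if status != "OK"
    ]


if __name__ == "__main__":
    # Hypothetical sample input; the second status is assumed for illustration.
    sample = "Probing /cvmfs/atlas.cern.ch... OK Probing /cvmfs/grid.cern.ch... Failed!"
    print(failed_repositories(sample))
```

An empty list from `failed_repositories` together with a `True` from `has_payload_failure` matches what was reported above: cvmfs probes fine on the host, yet the task still fails.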
©2025 CERN