Another batch of faulty WUs?


Message boards : ATLAS application : Another batch of faulty WUs?

Dave Peachey
Joined: 9 May 09
Posts: 17
Credit: 752,075
RAC: 0
Message 30069 - Posted: 26 Apr 2017, 21:42:52 UTC
Last modified: 26 Apr 2017, 21:51:10 UTC

I'm starting to see a number of WUs terminating early and giving a "Validate error" in the results.

Examples include:
- Workunit 65970381
- Workunit 65971624
- Workunit 65971430
each of which has the common parameters:
- name includes text string ..Su7Ccp2YYBZmABFKDmABFKDm3INKDm..
- taskID = 10995533
and all of which are terminating early (anything from 10 to 30 minutes elapsed run-time).
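For anyone wanting to check their own results against these parameters, here is a minimal sketch in Python. It assumes you have copied task names and outcomes from your results page into (name, outcome) pairs by hand; that record layout, the function names, and the sample records are all hypothetical, and only the name substring comes from the list above.

```python
from collections import Counter

# Substring quoted in the post above; everything else here is hypothetical.
BATCH_MARKER = "Su7Ccp2YYBZmABFKDmABFKDm3INKDm"


def tally_batch_outcomes(tasks, marker=BATCH_MARKER):
    """Count outcomes for tasks whose name contains the batch marker.

    `tasks` is an iterable of (name, outcome) pairs, e.g. copied by hand
    from your own results page. The record layout is an assumption.
    """
    return Counter(outcome for name, outcome in tasks if marker in name)


def failure_rate(counts, failed_label="Validate error"):
    """Fraction of tallied tasks carrying the failure label (0.0 if none tallied)."""
    total = sum(counts.values())
    return counts[failed_label] / total if total else 0.0


if __name__ == "__main__":
    # Made-up sample records, for illustration only.
    sample = [
        ("prefix_" + BATCH_MARKER + "_0", "Validate error"),
        ("prefix_" + BATCH_MARKER + "_1", "Completed and validated"),
        ("other_batch_task_2", "Completed and validated"),
    ]
    counts = tally_batch_outcomes(sample)
    print(counts, failure_rate(counts))
```

A tally like this only separates "some tasks in the batch fail" from "every task in the batch fails"; it says nothing about why a given task stopped early.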

As this is a relatively new batch of WUs (created around 10:00 UTC today) and I have had few, if any, wingmen report results, I don't know whether this is "just me" or a symptom of another batch of faulty WUs.

Having said that, I have also had some successes with WUs bearing these parameters, which suggests the batch isn't completely faulty and that other factors may be involved (although my machine is generally stable, so I don't believe the fault lies in my hardware/software set-up).

Is anyone else seeing the same or similar behaviour with WUs having these parameters?

Dave

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 384
Credit: 2,997,809
RAC: 2,011
Message 30072 - Posted: 27 Apr 2017, 6:02:45 UTC - in response to Message 30069.

From the same batch, I had one successful: https://lhcathome.cern.ch/lhcathome/result.php?resultid=136702475

Erich56
Joined: 18 Dec 15
Posts: 383
Credit: 3,873,774
RAC: 7,567
Message 30076 - Posted: 27 Apr 2017, 9:29:05 UTC
Last modified: 27 Apr 2017, 9:58:19 UTC

I, too, had such a task this morning; it errored out after 17 minutes:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=136754666

HerveUAE
Joined: 18 Dec 16
Posts: 120
Credit: 6,749,027
RAC: 20,218
Message 30096 - Posted: 27 Apr 2017, 21:08:19 UTC

I had a few from the same batch that went through with no problem:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=136760061
https://lhcathome.cern.ch/lhcathome/result.php?resultid=136759824
https://lhcathome.cern.ch/lhcathome/result.php?resultid=136759670
____________
We are the product of random evolution.

maeax
Joined: 2 May 07
Posts: 232
Credit: 11,993,210
RAC: 14,363
Message 30443 - Posted: 20 May 2017, 13:00:06 UTC

This task has not run successfully for more than five users:

https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=66206901

maeax
Joined: 2 May 07
Posts: 232
Credit: 11,993,210
RAC: 14,363
Message 31169 - Posted: 29 Jun 2017, 5:49:45 UTC - in response to Message 30443.

This task has not run successfully for more than five users:

https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=72503000
