Message boards : Sixtrack Application : Bad SixTrack workunits?

Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 719
Credit: 5,988,486
RAC: 2,412
Message 32829 - Posted: 14 Oct 2017, 10:24:14 UTC
Last modified: 14 Oct 2017, 14:21:16 UTC

There are a lot of bad workunits around that were created last night.
Distribution of resends stopped after a while because the "Too many total results" limit was reached.

Examples of workunits failing on my machine and on my wingmen's machines too:

https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=76443395
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=76438320
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=76433859
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=76430695
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=76445807
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=76432782

What worries me is that they don't crash early, but only after consuming a fair amount of CPU time: for me, between 104 and 13,626 CPU seconds.
ID: 32829
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 550
Credit: 332,146,912
RAC: 562,445
Message 32835 - Posted: 15 Oct 2017, 11:41:18 UTC
Last modified: 15 Oct 2017, 11:41:29 UTC

I saw the same. I sent some logs to Eric & James.
ID: 32835
Yeti
Volunteer moderator

Joined: 2 Sep 04
Posts: 400
Credit: 84,342,627
RAC: 91,041
Message 32843 - Posted: 16 Oct 2017, 10:03:48 UTC

I'm seeing a lot of bad WUs; they all seem to start with dtwo_...

They had already consumed 108,000 seconds of run time, while their CPU time ranges from zero up to 2 minutes.

I will cancel them.


Supporting BOINC, a great concept !
ID: 32843
Alessio Mereghetti
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 29 Feb 16
Posts: 141
Credit: 1,727,652
RAC: 5,296
Message 32845 - Posted: 16 Oct 2017, 14:46:30 UTC - in response to Message 32843.  
Last modified: 16 Oct 2017, 15:01:09 UTC

Yes, I got some of them as well, e.g.:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=159075597
https://lhcathome.cern.ch/lhcathome/result.php?resultid=158994555

We are still investigating, since the cause of the issue is not clear at all. Apologies for the inconvenience.
ID: 32845
Yeti
Volunteer moderator

Joined: 2 Sep 04
Posts: 400
Credit: 84,342,627
RAC: 91,041
Message 32855 - Posted: 19 Oct 2017, 11:36:16 UTC

More bad WUs:

This one has 8 minutes of CPU time, but 10 hours of elapsed time.

It is this task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=159169102

But this one is a dones.... WU.


Supporting BOINC, a great concept !
ID: 32855
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 719
Credit: 5,988,486
RAC: 2,412
Message 32869 - Posted: 21 Oct 2017, 19:10:49 UTC - in response to Message 32845.  
Last modified: 21 Oct 2017, 19:11:32 UTC

On 16 Oct. Alessio Mereghetti wrote:
Yes, I got some of them as well, e.g.:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=159075597
https://lhcathome.cern.ch/lhcathome/result.php?resultid=158994555

We are still investigating since the issue is not clear at all. Apologies for the inconvenience.

No word from the admins, but maybe the bad tasks have been re-issued, or similar ones are being distributed.
At least I see Dtwo tasks again, but now "bis" is inserted into the name, like Dtwo0bis_hlbbo_2222..., and there have been no errors so far.
ID: 32869


