Message boards : Theory Application : Website showing list with "bad" sherpa tasks
Erich56

Joined: 18 Dec 15
Posts: 1721
Credit: 107,702,308
RAC: 101,847
Message 49485 - Posted: 9 Feb 2024, 19:12:13 UTC

I remember that a few years ago someone here in the forum posted a URL to a website listing the names of "bad" (faulty) sherpa tasks, so that one could check after downloading a task whether it will succeed or probably fail.
Unfortunately, I don't remember this URL and I can't find it anywhere.
Could someone please post it here?
ID: 49485
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2450
Credit: 232,580,038
RAC: 131,311
Message 49486 - Posted: 9 Feb 2024, 19:40:17 UTC - in response to Message 49485.  

"few years ago"

That list might not be useful any more, since there have been major changes recently.
Most important is the switch from cvm3 to cvm4, which provides a completely different base environment.
According to mcplots, the overall failure rate is 1.27 % for revision 2687.
ID: 49486
maeax

Joined: 2 May 07
Posts: 2158
Credit: 162,608,688
RAC: 123,664
Message 49489 - Posted: 10 Feb 2024, 0:10:27 UTC - in response to Message 49485.  
Last modified: 10 Feb 2024, 0:13:59 UTC

Erich56, keep in mind that the revision shown on this page can change.
At the moment it is rev=2687:
http://mcplots2cc7.cern.ch/production.php?view=runs&rev=2687&display=fail
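
For other revisions, the same link can presumably be built by swapping the rev value. A minimal Python sketch, assuming the view, rev and display parameters keep the meaning they have in the URL above (the helper name failed_runs_url is made up for illustration and is not part of mcplots):

```python
from urllib.parse import urlencode

# Base address taken from the link above.
BASE = "http://mcplots2cc7.cern.ch/production.php"


def failed_runs_url(rev: int) -> str:
    """Build the link to the list of failed runs for one revision.

    Assumption: other revisions accept the same query parameters
    as the rev=2687 link posted in this thread.
    """
    query = urlencode({"view": "runs", "rev": rev, "display": "fail"})
    return f"{BASE}?{query}"


if __name__ == "__main__":
    # Reproduces the link for the revision that is current in this thread.
    print(failed_runs_url(2687))
```

Whether an older or newer revision number returns anything useful depends on what the server still keeps, so treat the result as a starting point only.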
ID: 49489
Erich56

Joined: 18 Dec 15
Posts: 1721
Credit: 107,702,308
RAC: 101,847
Message 49492 - Posted: 10 Feb 2024, 6:37:21 UTC

On one of my hosts, a Sherpa task [Boinc pp winclusive 7000 10 - sherpa 2.2.9 default 6000 223] has been running for more than 2 days.
Console f2 says "lean back and enjoy" ... but no events have been shown since then. In console f3 the CPU is active at 99+ %.

Since this task appears in the list at the link in the posting above, I guess that I should kill the task, right?
ID: 49492
maeax

Joined: 2 May 07
Posts: 2158
Credit: 162,608,688
RAC: 123,664
Message 49493 - Posted: 10 Feb 2024, 7:02:24 UTC
Last modified: 10 Feb 2024, 7:08:38 UTC

Therefore no task of them was successful, yes.
Btw, no new Theory tasks are available.
Best time to find the network issues ;-))
ID: 49493
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2450
Credit: 232,580,038
RAC: 131,311
Message 49497 - Posted: 10 Feb 2024, 8:18:12 UTC

Although the task in question might indeed have got stuck, the "failed" list is not helpful for deciding whether a task should be killed or not.

That's because one important fact has not been mentioned:
"Therefore no task of them was successful so far..."

Especially when a new mcplots revision starts, there are always a couple of runspecs that fail or get lost before they report their first success.
To get removed from the "failed" list, a runspec needs to report at least 1 successful result.
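
As a rough illustration of that rule (this is not mcplots' actual code; the function name, the runspec labels and the sample results are all made up), a runspec stays on the list only as long as its success count is zero:

```python
def failed_list(results):
    """Return the runspecs that have not reported a single success yet.

    results: iterable of (runspec, succeeded) pairs, one per returned task.
    Hypothetical model of the rule described above, nothing more.
    """
    successes = {}
    for runspec, succeeded in results:
        # Count successes per runspec; failed or lost results add nothing.
        successes[runspec] = successes.get(runspec, 0) + (1 if succeeded else 0)
    return sorted(name for name, count in successes.items() if count == 0)


if __name__ == "__main__":
    # Made-up results shortly after a new revision has started.
    sample = [
        ("runspec-A", False),
        ("runspec-A", False),
        ("runspec-B", False),
        ("runspec-B", True),   # one success is enough to leave the list
        ("runspec-C", True),
    ]
    print(failed_list(sample))
```

In this model runspec-A is listed only because nothing has succeeded so far, which says little about whether the next task of that kind will finish.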
ID: 49497

©2024 CERN