Message boards : ATLAS application : Some Validate errors
maeax

Joined: 2 May 07
Posts: 986
Credit: 34,627,655
RAC: 19,114
Message 33235 - Posted: 8 Dec 2017, 8:30:21 UTC
Last modified: 8 Dec 2017, 9:10:36 UTC

ID: 33235
Harri Liljeroos

Joined: 28 Sep 04
Posts: 442
Credit: 24,183,579
RAC: 14,159
Message 33433 - Posted: 18 Dec 2017, 7:55:10 UTC

New tasks that were loaded and crunched today (18th) give validate errors for all hosts that have downloaded them. Here is one: https://lhcathome.cern.ch/lhcathome/result.php?resultid=169624095
ID: 33433
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 322
Credit: 10,943,067
RAC: 5,730
Message 33439 - Posted: 18 Dec 2017, 20:27:46 UTC - in response to Message 33433.  

This is the error that we have seen before: "No events to process: 4050 (skipEvents) >= 2000 (inputEvents of EVNT)"

It happens when the WU tries to process events which do not exist in the input file and is a bug in our ATLAS systems. I have changed the validation logic to pass these results so that the real error gets propagated upstream and so the WU does not get retried, since it will never succeed.
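As a rough illustration of the check behind that message: a job has nothing to do when skipEvents is at least the number of events in the input file, and a validator could let such results through instead of resending them. This is only a sketch. The function names, the regex and the "pass_no_retry" label are made up for this example and are not taken from the ATLAS or BOINC code.

```python
import re

# Wording of the message as quoted above; the regex itself is an assumption.
NO_EVENTS = re.compile(
    r"No events to process: (\d+) \(skipEvents\) >= (\d+) \(inputEvents of EVNT\)"
)


def parse_no_events(log_text):
    """Return (skip_events, input_events) if the log contains the message, else None."""
    match = NO_EVENTS.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def classify_result(log_text, output_ok):
    """Hypothetical validator decision, not the project's real logic."""
    if output_ok:
        return "valid"
    if parse_no_events(log_text) is not None:
        # The WU can never succeed, so accept the result rather than resend it.
        return "pass_no_retry"
    return "invalid"


message = "No events to process: 4050 (skipEvents) >= 2000 (inputEvents of EVNT)"
print(parse_no_events(message))
print(classify_result(message, output_ok=False))
```

With a rule like this, a failed result that carries the message is accepted once and not handed to another host, while any other failure is still marked invalid.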
ID: 33439
Harri Liljeroos

Joined: 28 Sep 04
Posts: 442
Credit: 24,183,579
RAC: 14,159
Message 33440 - Posted: 18 Dec 2017, 20:47:02 UTC - in response to Message 33439.  

Good. The next tasks went through without a problem.
ID: 33440
maeax

Joined: 2 May 07
Posts: 986
Credit: 34,627,655
RAC: 19,114
Message 34107 - Posted: 26 Jan 2018, 18:33:46 UTC

ID: 34107
maeax

Joined: 2 May 07
Posts: 986
Credit: 34,627,655
RAC: 19,114
Message 34261 - Posted: 3 Feb 2018, 19:40:16 UTC

Two ATLAS WUs have finished but were not validated, and they were not sent to another computer:
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10409041
ID: 34261
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester

Joined: 15 Jun 08
Posts: 1504
Credit: 82,979,928
RAC: 79,778
Message 34263 - Posted: 4 Feb 2018, 8:55:07 UTC

My last valid ATLAS task was reported on Friday morning.
Since then, all reported tasks have been either "validation pending" or "validation error", although their logs are OK.
:-(

That's not very nice.
Should be investigated.
ID: 34263
Erich56

Joined: 18 Dec 15
Posts: 1291
Credit: 23,316,312
RAC: 4,248
Message 34266 - Posted: 4 Feb 2018, 9:53:08 UTC - in response to Message 34263.  

Since then all reported tasks are either "validation pending" ...
This has obviously been the case for ALL sub-projects for the past 2 days. I crunched CMS, SixTrack and LHCb tasks, and all of them show "validation pending".
No idea what's going wrong there.
ID: 34266
Hona

Joined: 29 Sep 04
Posts: 5
Credit: 2,280,223
RAC: 1,270
Message 34267 - Posted: 4 Feb 2018, 10:22:54 UTC

I think the issue is the "transitioner" process, which currently shows a backlog of about 33 h.
I also have 11 tasks pending across all sub-projects since 2 Feb.
ID: 34267
Yeti
Volunteer moderator

Joined: 2 Sep 04
Posts: 406
Credit: 96,567,558
RAC: 372
Message 34268 - Posted: 4 Feb 2018, 11:10:08 UTC

David, can you explain why these tasks haven't been validated?

https://lhcathome.cern.ch/lhcathome/results.php?hostid=10487431&offset=0&show_names=0&state=5&appid=


Supporting BOINC, a great concept !
ID: 34268
Erich56

Joined: 18 Dec 15
Posts: 1291
Credit: 23,316,312
RAC: 4,248
Message 34273 - Posted: 4 Feb 2018, 17:35:36 UTC - in response to Message 34267.  

I think the issue is the "transitioner" process, which currently shows a backlog of about 33 h.
I also have 11 tasks pending across all sub-projects since 2 Feb.
Right now, the transitioner backlog is 34.57 hours. That said, at least all WUs from Friday and early Saturday should have been validated by this point. However, this is NOT the case.
Hence, I guess the problem is somewhere else :-(
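
To make that reasoning concrete, here is a small sketch of the arithmetic. It assumes that "backlog" means the transitioner has already handled everything older than "now minus backlog", which is a reading of the server status page and not something confirmed in this thread. The function name is made up, and the timestamp of this post stands in for "now".

```python
from datetime import datetime, timedelta, timezone


def backlog_cutoff(now, backlog_hours):
    """Oldest point in time the transitioner has not caught up with yet (assumed meaning)."""
    return now - timedelta(hours=backlog_hours)


# Time of this post, used as "now" for the example.
now = datetime(2018, 2, 4, 17, 35, 36, tzinfo=timezone.utc)
cutoff = backlog_cutoff(now, 34.57)

# A task reported on Friday morning is older than the cutoff,
# so under the assumption above it should already have been handled.
friday_task = datetime(2018, 2, 2, 9, 0, 0, tzinfo=timezone.utc)
print(cutoff.isoformat())
print(friday_task < cutoff)
```

The cutoff lands at about 07:01 UTC on Saturday 3 Feb, so anything reported on Friday falls before it. If those tasks are still pending, the backlog alone does not explain it.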
ID: 34273
maeax

Joined: 2 May 07
Posts: 986
Credit: 34,627,655
RAC: 19,114
Message 34289 - Posted: 5 Feb 2018, 12:11:22 UTC

So the ATLAS results from the last few days are gone!
Is it possible to get the Cobblestones for this wasted work?
Or is there a chance of finding a better solution?
Thank you Atlas-Team.
ID: 34289
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 322
Credit: 10,943,067
RAC: 5,730
Message 34291 - Posted: 5 Feb 2018, 13:15:57 UTC

The validate errors are due to the problems caused by the daemons getting stuck. Because WUs had to wait a long time for validation, the results files were deleted before the validation took place.

The deletion of results was stopped on Saturday morning, so any results uploaded since then will eventually be validated, unless they are past the deadline. Right now we are waiting for the backlog to clear before submitting any new WUs.
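
For anyone wondering how a deleted file turns into a validate error, here is a toy model of the race described above. It is not the project's actual daemon code. The `Result` class, both delete functions and the 24-hour limit are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Result:
    name: str
    validated: bool = False
    files_present: bool = True


def validate(result):
    """The validator needs the uploaded files; without them it can only fail."""
    if not result.files_present:
        return "validate error"
    result.validated = True
    return "valid"


def age_based_delete(result, age_hours, max_age_hours=24):
    """Cleanup by age only: it can remove files that are still waiting for validation."""
    if age_hours > max_age_hours:
        result.files_present = False


def guarded_delete(result):
    """Cleanup that waits until validation has happened."""
    if result.validated:
        result.files_present = False
        return True
    return False


stuck = Result("wu_a")
age_based_delete(stuck, age_hours=34)   # validator was delayed longer than the limit
print(validate(stuck))

safe = Result("wu_b")
guarded_delete(safe)                    # does nothing, result is not validated yet
print(validate(safe))
```

In the first case the files are gone before the delayed validator reaches them. In the second case the cleanup simply waits, which matches the effect of stopping the deletion until the backlog clears.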
ID: 34291
Erich56

Joined: 18 Dec 15
Posts: 1291
Credit: 23,316,312
RAC: 4,248
Message 34292 - Posted: 5 Feb 2018, 14:52:21 UTC - in response to Message 34291.  

The validate errors are due to the problems caused by the daemons getting stuck. Because WUs had to wait a long time for validation, the results files were deleted before the validation took place.

The deletion of results was stopped on Saturday morning, so any results uploaded since then will eventually be validated, unless they are past the deadline. Right now we are waiting for the backlog to clear before submitting any new WUs.
Two questions:
1) Is all this also true for tasks from sub-projects other than ATLAS?
2) Do I understand correctly that tasks uploaded between about noon on Friday and Saturday morning are lost (i.e. no credits)?
ID: 34292
maeax

Joined: 2 May 07
Posts: 986
Credit: 34,627,655
RAC: 19,114
Message 34300 - Posted: 7 Feb 2018, 11:00:44 UTC

Atlas-Team and IT-Team,
thank you, the ATLAS Cobblestones are there!
ID: 34300
©2020 CERN