Some Validate errors
Author | Message |
---|---|
Send message Joined: 28 Sep 04 Posts: 674 Credit: 43,168,378 RAC: 16,164 |
New tasks that were loaded and crunched today (18th) give validate errors for all hosts that have downloaded them. Here is one: https://lhcathome.cern.ch/lhcathome/result.php?resultid=169624095 |
Send message Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0 |
This is the error that we have seen before: "No events to process: 4050 (skipEvents) >= 2000 (inputEvents of EVNT)" It happens when the WU tries to process events which do not exist in the input file and is a bug in our ATLAS systems. I have changed the validation logic to pass these results so that the real error gets propagated upstream and so the WU does not get retried, since it will never succeed. |
Send message Joined: 28 Sep 04 Posts: 674 Credit: 43,168,378 RAC: 16,164 |
Good. The next tasks went through without a problem. |
Send message Joined: 2 May 07 Posts: 2071 Credit: 156,190,333 RAC: 103,992 |
Two ATLAS WUs are finished but not validated, and were not sent to a new computer: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10409041 |
Send message Joined: 15 Jun 08 Posts: 2386 Credit: 223,033,688 RAC: 136,941 |
My last valid ATLAS task was reported on Friday morning. Since then, all reported tasks are either "validation pending" or "validation error", although their logs are OK. :-( That's not very nice. It should be investigated. |
Send message Joined: 18 Dec 15 Posts: 1686 Credit: 100,478,007 RAC: 104,387 |
"Since then all reported tasks are either 'validation pending' ..." This has obviously been the case for ALL sub-projects for the last 2 days. I crunched CMS, SixTrack and LHCb; all of them show "validation pending". No idea what's going wrong there. |
Send message Joined: 29 Sep 04 Posts: 5 Credit: 3,043,759 RAC: 0 |
I think the issue is the "transitioner" process, which currently shows a backlog of about 33 h. I also have 11 tasks pending across all sub-projects since 2 Feb. |
Send message Joined: 2 Sep 04 Posts: 453 Credit: 193,369,412 RAC: 10,065 |
David, can you explain why these tasks haven't been validated? https://lhcathome.cern.ch/lhcathome/results.php?hostid=10487431&offset=0&show_names=0&state=5&appid= Supporting BOINC, a great concept! |
Send message Joined: 18 Dec 15 Posts: 1686 Credit: 100,478,007 RAC: 104,387 |
"I think the issue is the 'transitioner' process which shows actually a backlog of about 33 h." Right now, the transitioner backlog is 34.57 hours. That said, at least all WUs from Friday and early Saturday should have been validated by this point. However, this is NOT the case. Hence, I guess the problem is somewhere else :-( |
Send message Joined: 2 May 07 Posts: 2071 Credit: 156,190,333 RAC: 103,992 |
If the ATLAS results from the last few days are lost, is it possible to get the Cobblestones for this wasted work? Or is there a chance of finding a better solution? Thank you, ATLAS team. |
Send message Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0 |
The validate errors are due to the problems caused by the daemons getting stuck. Because WUs had to wait a long time for validation, the result files were deleted before the validation took place. The deletion of results was stopped on Saturday morning, so any results uploaded since then will eventually be validated, unless they are past the deadline. Right now we are waiting for the backlog to clear before submitting any new WUs. |
Send message Joined: 18 Dec 15 Posts: 1686 Credit: 100,478,007 RAC: 104,387 |
"The validate errors are due to the problems caused by the daemons getting stuck. Because WUs had to wait a long time for validation, the results files were deleted before the validation took place." Two questions: 1) Is all this also true for tasks from sub-projects other than ATLAS? 2) Do I understand correctly that tasks uploaded between about noon on Friday and Saturday morning are lost (i.e. no credits)? |
Send message Joined: 2 May 07 Posts: 2071 Credit: 156,190,333 RAC: 103,992 |
ATLAS team and IT team, thank you, the ATLAS Cobblestones are there! |
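
For anyone who wants to check their own task logs for the "No events to process" error quoted earlier in the thread, here is a minimal sketch in Python. It only assumes that the log contains the message in exactly the quoted form. The function names are made up for illustration and this is not the project's actual validator code.

```python
import re

# Matches the error line quoted in the thread, e.g.
# "No events to process: 4050 (skipEvents) >= 2000 (inputEvents of EVNT)"
NO_EVENTS_RE = re.compile(
    r"No events to process: (\d+) \(skipEvents\) >= (\d+) \(inputEvents of EVNT\)"
)


def find_no_events_error(log_text):
    """Return (skip_events, input_events) if the error is in the log, else None.

    Hypothetical helper: the real validator may detect this differently.
    """
    match = NO_EVENTS_RE.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def can_never_succeed(skip_events, input_events):
    """A task that skips at least as many events as the input holds has nothing to process."""
    return skip_events >= input_events


if __name__ == "__main__":
    sample = 'No events to process: 4050 (skipEvents) >= 2000 (inputEvents of EVNT)'
    found = find_no_events_error(sample)
    if found is not None and can_never_succeed(*found):
        print("Retrying this WU would not help:", found)
```

If the check matches, a retry on another host would hit the same condition, which is consistent with the decision described above to pass such results instead of resending the WU.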
©2024 CERN