Message boards : ATLAS application : Some Validate errors

maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,084,902
RAC: 104,657
Message 33235 - Posted: 8 Dec 2017, 8:30:21 UTC
Last modified: 8 Dec 2017, 9:10:36 UTC

ID: 33235
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,150,010
RAC: 16,024
Message 33433 - Posted: 18 Dec 2017, 7:55:10 UTC

New tasks that were downloaded and crunched today (the 18th) are giving validate errors on every host that received them. Here is one: https://lhcathome.cern.ch/lhcathome/result.php?resultid=169624095
ID: 33433
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 33439 - Posted: 18 Dec 2017, 20:27:46 UTC - in response to Message 33433.  

This is the error that we have seen before: "No events to process: 4050 (skipEvents) >= 2000 (inputEvents of EVNT)"

It happens when the WU tries to process events which do not exist in the input file and is a bug in our ATLAS systems. I have changed the validation logic to pass these results so that the real error gets propagated upstream and so the WU does not get retried, since it will never succeed.
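
The condition behind that message can be illustrated with a minimal Python sketch. It is not the ATLAS or BOINC code: the function names `has_events_to_process` and `should_pass_validation`, and the idea of matching on the log text, are assumptions made purely for illustration.

```python
# Text taken from the error message quoted above.
NO_EVENTS_MARKER = "No events to process"


def has_events_to_process(skip_events: int, input_events: int) -> bool:
    """Return True if at least one event remains after skipping."""
    return skip_events < input_events


def should_pass_validation(log_text: str) -> bool:
    """Hypothetical validator rule: pass results whose log shows the known
    'no events' failure, so that the WU is not retried."""
    return NO_EVENTS_MARKER in log_text


log = "No events to process: 4050 (skipEvents) >= 2000 (inputEvents of EVNT)"
print(has_events_to_process(4050, 2000))  # False: 4050 >= 2000
print(should_pass_validation(log))        # True
```

With skipEvents at 4050 and only 2000 events in the input file, nothing is left to process, so a retry on another host would hit the same failure.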
ID: 33439
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,150,010
RAC: 16,024
Message 33440 - Posted: 18 Dec 2017, 20:47:02 UTC - in response to Message 33439.  

Good. The next tasks went through without a problem.
ID: 33440
maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,084,902
RAC: 104,657
Message 34107 - Posted: 26 Jan 2018, 18:33:46 UTC

ID: 34107
maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,084,902
RAC: 104,657
Message 34261 - Posted: 3 Feb 2018, 19:40:16 UTC

Two ATLAS WUs have finished, but they were not validated and were not sent to a new computer:
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10409041
ID: 34261
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,899,033
RAC: 138,178
Message 34263 - Posted: 4 Feb 2018, 8:55:07 UTC

My last valid ATLAS task was reported on Friday morning.
Since then all reported tasks are either "validation pending" or "validation error" although their logs are OK.
:-(

That's not very nice.
Should be investigated.
ID: 34263
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,341,407
RAC: 101,788
Message 34266 - Posted: 4 Feb 2018, 9:53:08 UTC - in response to Message 34263.  

Since then all reported tasks are either "validation pending" ...
This has obviously been the case for ALL sub-projects for the past 2 days. I crunched CMS, Sixtrack and LHCb. All of them show "validation pending".
No idea what's going wrong there.
ID: 34266
Hona

Joined: 29 Sep 04
Posts: 5
Credit: 3,043,759
RAC: 0
Message 34267 - Posted: 4 Feb 2018, 10:22:54 UTC

I think the issue is the "transitioner" process, which currently shows a backlog of about 33 h.
I also have 11 tasks pending across all sub-projects since 2 Feb.
ID: 34267
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 34268 - Posted: 4 Feb 2018, 11:10:08 UTC

David, can you explain why these tasks haven't been validated?

https://lhcathome.cern.ch/lhcathome/results.php?hostid=10487431&offset=0&show_names=0&state=5&appid=


Supporting BOINC, a great concept!
ID: 34268
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,341,407
RAC: 101,788
Message 34273 - Posted: 4 Feb 2018, 17:35:36 UTC - in response to Message 34267.  

I think the issue is the "transitioner" process, which currently shows a backlog of about 33 h.
I also have 11 tasks pending across all sub-projects since 2 Feb.
Right now, the transitioner backlog is 34.57 hours. That said, at least all WUs from Friday and early Saturday should have been validated by now. However, this is NOT the case.
Hence, I guess the problem is somewhere else :-(
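
As a rough check of that reasoning, here is a small Python sketch that subtracts the reported backlog from the time of this post. The helper name `backlog_cutoff` is hypothetical, and the sketch assumes the transitioner works through results strictly in order of arrival, which nothing in this thread confirms.

```python
from datetime import datetime, timedelta, timezone


def backlog_cutoff(now: datetime, backlog_hours: float) -> datetime:
    """Return the time before which results should already be processed,
    assuming a strictly first-in, first-out backlog (an assumption)."""
    return now - timedelta(hours=backlog_hours)


# Time of this post and the backlog figure quoted in it.
now = datetime(2018, 2, 4, 17, 35, 36, tzinfo=timezone.utc)
cutoff = backlog_cutoff(now, 34.57)

print(cutoff.strftime("%Y-%m-%d %H:%M UTC"))  # 2018-02-03 07:01 UTC
```

Under that assumption, anything reported before early Saturday morning (3 Feb) would already have been handled if the backlog alone were the cause.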
ID: 34273
maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,084,902
RAC: 104,657
Message 34289 - Posted: 5 Feb 2018, 12:11:22 UTC

If the ATLAS results from the last few days are lost,
is it possible to get the Cobblestones for this wasted work?
Or is there a chance of finding a better solution?
Thank you, ATLAS team.
ID: 34289
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 34291 - Posted: 5 Feb 2018, 13:15:57 UTC

The validate errors are due to the problems caused by the daemons getting stuck. Because WUs had to wait a long time for validation, the results files were deleted before the validation took place.

The deletion of results was stopped on Saturday morning so any results uploaded since then will eventually be validated, unless they are past the deadline. Just now we are waiting for the backlog to clear before submitting any new WU.
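
The sequence described above can be modelled with a toy Python sketch. It is not BOINC's file deleter or validator: the `Result` class, the two `run_*` functions and the result names are all hypothetical, and the sketch only shows why deleting files before validation produces a validate error while pausing deletion avoids it.

```python
from dataclasses import dataclass


@dataclass
class Result:
    """Toy stand-in for an uploaded result; not a BOINC data structure."""
    name: str
    files_present: bool = True
    state: str = "validation pending"


def run_file_deleter(results, deletion_enabled):
    """Remove result files unless deletion has been switched off."""
    if not deletion_enabled:
        return
    for r in results:
        r.files_present = False


def run_validator(results):
    """Validate pending results; a result whose files are gone cannot pass."""
    for r in results:
        if r.state == "validation pending":
            r.state = "valid" if r.files_present else "validate error"


# Deleter runs while validation is still delayed: the files vanish first.
early = [Result("uploaded_friday")]
run_file_deleter(early, deletion_enabled=True)
run_validator(early)

# Deletion stopped: the files survive until the validator catches up.
late = [Result("uploaded_saturday")]
run_file_deleter(late, deletion_enabled=False)
run_validator(late)

print(early[0].state)  # validate error
print(late[0].state)   # valid
```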
ID: 34291
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,341,407
RAC: 101,788
Message 34292 - Posted: 5 Feb 2018, 14:52:21 UTC - in response to Message 34291.  

The validate errors are due to the problems caused by the daemons getting stuck. Because WUs had to wait a long time for validation, the results files were deleted before the validation took place.

The deletion of results was stopped on Saturday morning so any results uploaded since then will eventually be validated, unless they are past the deadline. Just now we are waiting for the backlog to clear before submitting any new WU.
Two questions:
1) Is all of this also true for tasks from sub-projects other than ATLAS?
2) Do I understand correctly that tasks uploaded between about noon on Friday and Saturday morning are lost (i.e. no credits)?
ID: 34292
maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,084,902
RAC: 104,657
Message 34300 - Posted: 7 Feb 2018, 11:00:44 UTC

ATLAS team and IT team,
thank you, the ATLAS Cobblestones are there!
ID: 34300


©2024 CERN