Message boards :
Sixtrack Application :
Inconclusive, valid/invalid results
Author | Message |
---|---|
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Thanks for the feedback. Very useful. More news soonest. Eric. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Working day and night on this. The problem seems to be specific to Ubuntu kernel version 4.8.0. (There may be other reasons for empty result files, though; they have been masked until now.) Eric. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Dear Volunteer, this morning, 26th June at 08:00 CEST (06:00 CST + 2), I BANNED 17 hosts from new work/tasks (2 had already been banned). These 17 Linux Ubuntu 4.8.0 systems have been consistently producing empty result files with Exit Code 59, hence the long delays in validation and the large number of Inconclusive results. I believe this will avoid further delays. I am now trying to identify the reason for the Exit Code 59 produced on these hosts. A simple problem, but very, very difficult to identify! (If any of these 17 host owners have the time and the inclination, I could send various tests while I set up my own Ubuntu system. I shall try to e-mail the individual owners.) I shall discuss with colleagues the suspension of WU submission until this backlog is cleared. It remains to be seen whether there are further sources of empty result files; I strongly suspect a Client/Server problem on Windows 10. Thanks again for your patience and tolerance. This is a pretty complex system involving many elements and factors, but not compared to the LHC operation itself! :-) Eric. |
Joined: 15 Jun 08 Posts: 2485 Credit: 247,257,148 RAC: 117,324 |
If this host is not already included in the banlist it may be a candidate: hostid=10486290 |
Joined: 13 Jul 05 Posts: 133 Credit: 162,641 RAC: 0 |
Is my host 10414945 OK? Lot of pending and inconclusive results here.... |
Joined: 15 Jun 08 Posts: 2485 Credit: 247,257,148 RAC: 117,324 |
"Is my host 10414945 OK? Lot of pending and inconclusive results here...." If I understand Eric's comments correctly, you may do the following: (1) check your invalid results; if there are 0 (or close to 0), your computer is most likely OK; (2) if you have a high number of inconclusive results, check your wingmen's computers; (3) if your wingmen's computers have a lot of invalid results, they are most likely candidates for the banlist. Your inconclusives will be resent to another computer and you will get credit once they are validated. Pending does not indicate an error. Be patient. |
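The rule of thumb in the reply above can be written down as a small check. This is a minimal sketch only: the `HostStats` structure, the function names, and the 10 % cut-off are illustrative assumptions, not anything defined by the project or by BOINC.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostStats:
    """Task counts as read off a host's task list (hypothetical structure)."""
    valid: int
    invalid: int
    inconclusive: int


def invalid_rate(stats: HostStats) -> float:
    """Share of invalid tasks among tasks the validator has decided on."""
    decided = stats.valid + stats.invalid
    return stats.invalid / decided if decided else 0.0


def triage(me: HostStats, wingmen: List[HostStats], threshold: float = 0.10) -> str:
    """Apply the rule of thumb; `threshold` is an arbitrary illustrative cut-off."""
    # Step 1: many invalid results on your own host point at your own host.
    if invalid_rate(me) > threshold:
        return "check own host"
    # Steps 2 and 3: inconclusives plus a wingman with many invalids
    # point at the wingman's host instead.
    if me.inconclusive and any(invalid_rate(w) > threshold for w in wingmen):
        return "wingman suspect"
    # Otherwise the results are simply waiting to be resent and validated.
    return "be patient"
```

Note that the check looks at the wingmen's invalid rate, not their inconclusive count, since a healthy host paired with a faulty one accumulates inconclusives as well.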
Joined: 13 Jul 05 Posts: 133 Credit: 162,641 RAC: 0 |
Thx for your help and advice! On checking the details of my 20 inconclusives, they are split amongst 7 different wingmen, all with high proportions of inconclusives. |
Joined: 15 Jun 08 Posts: 2485 Credit: 247,257,148 RAC: 117,324 |
"... amongst 7 different wingmen, all with high proportions of inconclusives." Of course, but the more interesting value would be the wingmen's invalid rate. |
Joined: 6 Jan 08 Posts: 3 Credit: 592,658 RAC: 0 |
|
Joined: 25 Jan 06 Posts: 2 Credit: 1,083,546 RAC: 0 |
"Hello, I apparently have this problem." To summarize the moderator's messages: (1) because some machines (Linux ones) have been systematically returning large numbers of wrong results, there is a long queue of inconclusive validations; (2) be patient, your work units will be resent to other machines for recomputation and validation; (3) the wait may be long, because these units are at the bottom of the server's send queue. In short: wait for the WU to be resent to another host; see messages 31064 and 31074. |
Joined: 7 May 17 Posts: 10 Credit: 6,952,848 RAC: 0 |
I have doubts that there are special hosts or special OSs causing the high rate of inconclusive results. Why? Because the rate of inconclusive results is >99.9 %, as far as I can see. My current sixtrack stats: all (9502); in progress (755); validation pending (1708); validation inconclusive (4527); valid (1256), of which only 3 after the validator changed, see below; invalid (1); error (1255). Comments: Before the validator change, I think I had only a handful of inconclusive results. Browsing through my >4500 current inconclusives, they all seem to be from after the validator change. The single invalid one is WU 69714861: 2x cancelled + 3x finished but with different results according to the validator (completed on June 3, June 10, and June 25, i.e. 2x with the old validator and 1x with the new validator). Therefore this invalid task is really more like an inconclusive one, because two users cancelled and it remains unknown which of the three submitted results was the right one. The errors include some user-aborted tasks, but are typically "finish file present too long" errors. Of the valid tasks, only 3 (three) have been validated by the new validator. All others had been validated before the new validator was brought online. (BTW, all of my boxes are Xeon E5 and Xeon E3, all but one with ECC RAM, and they had earned my trust in their results before. Some of them are purpose-built compute nodes for engineering applications, doing Distributed Computing in downtimes. --- Edit: These are Linux boxes, except one Windows box which shows exactly the same picture as the Linux boxes.) |
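As a quick sanity check on the figures quoted in the post above: the per-state counts do add up to the stated total, and the >99.9 % figure follows if it is read as the inconclusive share among results judged since the validator change (3 valid against 4527 inconclusive). That reading is an assumption, since the poster does not spell out the denominator, and the variable names below are purely illustrative.

```python
# Task counts quoted in the post above.
counts = {
    "in progress": 755,
    "validation pending": 1708,
    "validation inconclusive": 4527,
    "valid": 1256,
    "invalid": 1,
    "error": 1255,
}
total_reported = 9502
valid_after_change = 3  # valid results judged by the new validator

# The categories should account for every task.
total = sum(counts.values())

# Assumed denominator: results judged since the validator change.
# This treats all current inconclusives as post-change, as the post suggests.
judged = counts["validation inconclusive"] + valid_after_change
inconclusive_share = counts["validation inconclusive"] / judged

print(total == total_reported)       # True
print(f"{inconclusive_share:.2%}")   # 99.93%
```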
Joined: 1 Nov 12 Posts: 3 Credit: 562,848 RAC: 0 |
Same here! My last validated result was on 24 Jun, the date of the validator change.... The WUs start as Pending (some of them) before they move on to Validation inconclusive. Before, I had maybe 5 or 6, but today the number of inconclusive tasks has increased to 96! Going back to Theory simulation. |
Joined: 7 May 17 Posts: 10 Credit: 6,952,848 RAC: 0 |
PS: Spot checks through my >4500 (and rising) inconclusive results show that both I and the wingman completed with exit status 0. So far I have not seen a single WU with a non-zero exit code in my or the wingman's task. (IOW, I have nowhere seen the exit code 59 which is mentioned in message 31064.) |
Joined: 6 Jan 08 Posts: 3 Credit: 592,658 RAC: 0 |
"Hello, I apparently have this problem." OK, so I don't have to do anything in particular. Thanks for your reply. :) |
Joined: 1 May 07 Posts: 27 Credit: 2,334,924 RAC: 235 |
I have 2 computers working on Sixtrack. (1) is an old dual Xeon machine running Fedora Core 10 and has Validation inconclusive (20) out of 41 tasks. (2) is an i5 processor running Windows 10 and has Validation inconclusive (99) out of 189 tasks, with 56 still waiting to be processed. There are only 16 valid. It would seem that old Linux with old CPUs has the same issues as an i5 CPU with the latest Windows 10. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
HIGHLY IMPORTANT: It looks as if my attempted fix is creating too many problems, MEA CULPA. I expect we shall shortly withdraw it and return to the previous version of the sixtrack_validator. Sorry for the hassle; I shall try to answer as many of your posts to this thread as possible. The side effect seems to be that we are now NOT validating when we should, which is certainly my fault. We shall return to a state where we will validate invalid empty results, but in this case the attempted Cure is Worse than the Disease. (I tried.) Hope to try again soon. Eric. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
No, I am very sure not. I created another problem. Hope all will be well soon. Eric. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Yes, but I made a mistake. There is nothing you need to do. Eric |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Indeed you would NOT see the -59. I created another problem which we shall fix soonest. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Understood. Eric. |
©2024 CERN