many tasks still not validated after 13 days
Author | Message |
---|---|
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,878,564 RAC: 121,573 |
From my task list on the web page I can see quite a number of unvalidated tasks that were uploaded as far back as December 20 (the list ends at this date; most probably there are even older tasks waiting for validation). What does this mean? Will I ever earn credit for those, or will these old tasks be lost at some point? |
Send message Joined: 15 Jun 08 Posts: 2413 Credit: 226,472,363 RAC: 131,969 |
Be patient. It looks like there is nothing wrong with your results except that they need a confirmation by a wingman's computer (quorum of 2, requested by the project). Your first wingman gave up (for whatever reason) so the second task of the WU has to be rescheduled. Unfortunately those resends are sorted at the end of the currently very long RTS queue. Cheers |
Send message Joined: 10 Jan 12 Posts: 5 Credit: 1,175,296 RAC: 5,684 |
Thanks for explaining! |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,878,564 RAC: 121,573 |
"Be patient. ..." Thanks for explaining. I just checked again: many of the still unvalidated tasks were uploaded as far back as December 20, 2017 (i.e. 18 days ago). So we'll see what happens. I hope that all this work was not for nothing in the end :-( |
Send message Joined: 29 Feb 16 Posts: 157 Credit: 2,659,975 RAC: 0 |
Have you tried checking how many wingmen you have to wait for? For example, yesterday I crunched the fourth or even fifth replica of the same task, e.g. https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=82065656 That poor guy has been waiting since 18th Dec. This is unfortunately the result of a huge peak in WUs being submitted at about the same time, relatively long tasks (4-5h on my PC), and storage issues (with people abandoning tasks or results reported after the deadline...). Hope this helps! Cheers, |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,878,564 RAC: 121,573 |
"Have you tried checking how many wingmen you have to wait for?" Hm, honestly, I have no idea how to make this check. Please let me know; many thanks. |
Send message Joined: 15 Jun 08 Posts: 2413 Credit: 226,472,363 RAC: 131,969 |
"... I've no idea how I can make this check. Please let me know; many thanks." From your main account page, click on 'Tasks view' to get your task list. Then click on the ID in the workunit column. The resulting page shows the distribution of the entire workunit and additional information, e.g. how many results are necessary to get a workunit validated. |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,878,564 RAC: 121,573 |
"...From your main account page click on 'Tasks view' to get your task list...." Oh, many thanks, it's easy enough :-) Valuable information! |
Send message Joined: 15 Jun 08 Posts: 2413 Credit: 226,472,363 RAC: 131,969 |
"... many thanks ..." Always happy to help. :-) |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,878,564 RAC: 121,573 |
In my task list I just noticed a still unvalidated task (created on Dec. 23) which I uploaded on Jan. 13 - the details show that the two other crunchers had "Error while computing". Does this mean that this workunit will never get validated? |
Send message Joined: 15 Jun 08 Posts: 2413 Credit: 226,472,363 RAC: 131,969 |
"In my task list I just noticed a still unvalidated task (created on Dec. 23) which I uploaded on Jan. 13 - the details show that the two other crunchers had "Error while computing". Does this mean that this workunit will never get validated?" Be so kind as to post a link to the WU or at least the WUID. |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,878,564 RAC: 121,573 |
"Be so kind as to post a link to the WU or at least the WUID." https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=82213313 (computer 10388905: that's me) and yet another one, with a whole mix of remarks: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=82054142 (computer 10452404: that's me) |
Send message Joined: 15 Jun 08 Posts: 2413 Credit: 226,472,363 RAC: 131,969 |
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=82054142 This WU may be critical. So far your computer is the only one that has delivered a valid result, but that result has to be confirmed by a wingcomputer (min_quorum=2). As 3 wingcomputers failed (for different reasons), the server created a 5th (and last!) result after 2018-01-16 22:35:39 UTC. This result was added at the end of the RTS queue and has not yet been sent out. As soon as result #5 is reported, it either confirms your result and both computers get the credit, OR result #5 fails and the whole WU is treated as failed, which means no credit for any computer. https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=82213313 This WU is a bit less critical as there are only 4 results in the list. If result #4 fails or does not confirm your result, a 5th one will be created. In the end there's nothing you or the project admins can/will do. Just wait and see what happens. |
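For anyone who wants the rule above spelled out, here is a rough Python sketch of it. It is only a simplified model of what the post describes, not the project's actual server code: the function name `workunit_status`, the outcome labels and the `max_total=5` limit (taken from "5th (and last!) result") are assumptions for illustration, and it assumes that all successful results agree with each other.

```python
from collections import Counter


def workunit_status(outcomes, min_quorum=2, max_total=5):
    """Simplified model of the quorum rule described in the post above.

    outcomes: one entry per result of the workunit, each being
    "success", "error" or "pending" (hypothetical labels).
    Assumes every "success" result agrees with the others.
    """
    counts = Counter(outcomes)
    if counts["success"] >= min_quorum:
        return "validated"   # enough matching results: credit is granted
    if counts["pending"] > 0:
        return "waiting"     # a result is still queued or in progress
    if len(outcomes) < max_total:
        return "resend"      # the server may create another result
    return "failed"          # limit reached without a quorum: no credit


# WU 82054142 as described: one valid result, three failures, result #5 queued
print(workunit_status(["success", "error", "error", "error", "pending"]))
# If result #5 also fails, the limit is reached without a quorum
print(workunit_status(["success", "error", "error", "error", "error"]))
# WU 82213313 (assumed states): if result #4 fails too, a 5th can still be created
print(workunit_status(["success", "error", "error", "error"]))
```

The point of the sketch is just that the first WU has no resend left, while the second one still has one more chance.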
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,878,564 RAC: 121,573 |
"... At the end there's nothing you or the project admins can/will do." Thanks once more for your good explanation. I am rather new to Sixtrack (so far, I had only crunched VM tasks). Honestly, I am rather surprised by the complexity of these Sixtrack WUs. For me the question is whether it's necessary, for some reason, to make it that complicated, particularly making a positive validation of the result dependent on what several other crunchers are doing or not doing. This may mean that crunching a given task was for nothing in the end. I really can't see the rationale behind this kind of procedure. On one of my PCs I am still waiting for AVX tasks - so far I only got SSE2 tasks (I have read your recent explanation regarding this topic). Should this not work out pretty soon either, I guess I will abandon Sixtrack and resume crunching VM tasks. |
Send message Joined: 16 Sep 17 Posts: 100 Credit: 1,618,469 RAC: 0 |
"Particularly making a positive validation of the result dependent on what several other crunchers are doing or are not doing." BOINC will take care of it, don't worry. Just step away from the computer. It is estimated that in a worst-case scenario, results can take up to four weeks to be validated. The team knows this. Obviously, with the recent server issues, we are experiencing a worst-case scenario. The longer it takes to reach the quorum, the more likely it is that a "reliable" system will be given the task with priority, so the chances of validation improve as time goes on. Validating the results among several volunteers is very much necessary. I can see the downside, but I also don't want a random flipped bit to cause major issues. "On one of my PCs I am still waiting for AVX tasks - so far I only got SSE2 tasks" I do get AVX tasks and see no improvement in run time so far (although it is hard to tell, given the closed nature of the tasks). They're also still ~180 GFLOPS. "Should this not work out either pretty soon, I guess I will abandon Sixtrack and resume crunching VM tasks." Good luck successfully returning an ATLAS task at the moment. It is nearly impossible given the current server situation. I crunched four ATLAS tasks last weekend. Of those four, two are stuck with the common PID=-1 message. I'll stick with 50KB SixTrack instead of 140MB ATLAS until the servers have been upgraded. |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,878,564 RAC: 121,573 |
"Good luck successfully returning an ATLAS task at the moment. It is nearly impossible given the current server situation. I have crunched four ATLAS tasks last weekend. Of those four, two are stuck with the common PID=-1 message." I agree, uploading ATLAS tasks can take several days (that's why I wonder why such a high number of tasks is still being pumped into the mills, while it's clear that the infrastructure problems back at LHC still persist). However, so far I have experienced no such problems with CMS, LHCb and Theory. So these subprojects could be recommended for the time being, until the problems at LHC are finally solved (hopefully). |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,878,564 RAC: 121,573 |
I just found out the following interesting thing when looking up a finished task in my list for which validation is still pending: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=83937128 What catches my eye is that my computer (10388905) got it as SSE2, while the wingman's computer got it as AVX. Can anyone explain to me why? |
Send message Joined: 28 Sep 04 Posts: 675 Credit: 43,653,221 RAC: 15,903 |
"I just found out the following interesting thing when looking up a finished task in my list for which validation is still pending:" The tasks themselves are independent of the application optimization; they are just data files. But the applications that process them can differ (sse2/pni/avx, or windows/linux/mac, etc.). |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,878,564 RAC: 121,573 |
"The tasks are free from the application optimization, they are just data files. But the applications can be different [sse2/pni/avx or windows/linux/mac] etc." Okay, but then the question is: why is the task being run as AVX on the other cruncher's computer, and as SSE2 on mine, although mine also offers AVX? |
Send message Joined: 28 Sep 04 Posts: 675 Credit: 43,653,221 RAC: 15,903 |
Sorry, I don't have an answer to that. Have you checked the messages when BOINC starts up to see whether it actually recognizes the AVX extension on the processor? |
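If reading the startup messages by eye is tedious, a small Python sketch like the one below could do the check. It assumes you have copied the "Processor features: ..." line from the BOINC event log into a string; the sample line here is made up for illustration, and `has_feature` is a hypothetical helper, not part of BOINC.

```python
def has_feature(features_line, feature="avx"):
    """Return True if `feature` appears as a whole word in a features line.

    features_line: text such as "Processor features: fpu sse sse2 avx",
    copied from the BOINC startup messages (the exact format is assumed).
    """
    # Keep only the part after the first colon, if there is one
    flags = features_line.split(":", 1)[-1]
    # Compare whole tokens so that "avx" does not match "avx2"
    return feature.lower() in flags.lower().split()


# Made-up sample line, for illustration only
sample = "Processor features: fpu sse sse2 pni ssse3 sse4_1 sse4_2 avx"
print(has_feature(sample))          # is avx listed?
print(has_feature(sample, "avx2"))  # is avx2 listed?
```

If AVX is missing from the line your client prints, that may be a reason why the server sends the SSE2 application instead, but that is only a guess.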
©2024 CERN