Status and Plans, Sunday 4th November
Author | Message |
---|---|
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
First service continues to run well; the first intensity scan is nearing completion with well over a million results in 15 studies successfully returned. Just a couple of hundred thousand more! (Sadly no one study is complete but a couple are very close and I shall start post-processing and analysis soon. I am still reflecting on the thread "Number crunching; WU not being sent to another user". This is not easy, trying to get studies complete, but keeping the system busy. I am the "feeder" and since in the end I need all the studies I am rather prioritising keeping WUs available.) Just checked and we have over 80,000, yes eighty thousand, WUs active and this is a new (recent) record. Draft documentation of the User side is now available thanks to my colleague R. Demaria. If you are interested [url=SixDesk Doc]http://sixtrack-ng.web.cern.ch/sixtrack-ng/[/url] and I hope you can access it (otherwise I shall put a copy to LHC@home). Right now I hope to try new executables with new physics on our test server and I might shortly appeal for some volunteers to help (and also to run a few more 10 million turn jobs). I do NOT want to risk the production service while it is running so smoothly. Otherwise (At Last!) I shall start writing my paper on how to get identical results on ANY IEEE 754 hardware with ANY standard compiler at ANY level of Optimisation. Thanks to all. Eric. |
Joined: 27 Oct 07 Posts: 186 Credit: 3,297,640 RAC: 0 |
1) SixDesk Doc is accessible here, but you have to swap the parameters to url= over :P 2) I've just come across a wee problemette - WU 4413334. I think the middle one should have been set to 'invalid' after the third user reported - I'm not sure whether it will be properly marked off as completed like this. 3) Stick my name on the list of volunteers for the new app and the 10M turn jobs. Host 9990937 is set up for quick turnround. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Thanks Richard. The correct URL is (I hope): SixDesk Doc I shall look at the WU you mention at the office. I have also noticed a reduction in the average run time of the WUs. I suspect this is because the study w13cbb has the highest bunch charge and is therefore showing the onset of chaos and lost particles well before the million turns are completed. Eric. |
Joined: 30 Oct 11 Posts: 26 Credit: 4,955,767 RAC: 46 |
I think the problem is here. This seems to be a normal unit: minimum quorum 2, initial replication 2; while this is the unit in question: minimum quorum 2, initial replication 3. If a unit is INITIALLY sent to three pc's but only two are required for validation and ALL units are returned prior to the deadline, how does the Server side handle all three units? Yes, the numbers are MUCH different for the second pc than the other two, but shouldn't Boinc have granted credits based on the first and second pc's and NOT used the third one since the first two were NOT marked as invalid? OR does "inconclusive" mean the same thing to Boinc? A further question is what would have happened to the third unit if the second was not "inconclusive"? Would it have been aborted? What if the pc was half way thru crunching it, or even only had seconds left to finish? One would HOPE that the third unit would have been allowed to finish and be returned and ALSO granted credits, IF it returned a valid result. Especially since the user was NOT at fault for receiving it. |
Joined: 27 Oct 07 Posts: 186 Credit: 3,297,640 RAC: 0 |
[quote]initial replication 3[/quote] No, don't worry about that. It's a well-known terminological inexactitude (mistake) in the BOINC server code. Two tasks were sent out on 30 Oct 2012, an initial replication of two. When they failed to agree, a third instance was created and sent out on 3 Nov 2012, to make a current replication of 3. BOINC updates the number, but it doesn't update the word. |
Joined: 30 Oct 11 Posts: 26 Credit: 4,955,767 RAC: 46 |
[quote]initial replication 3[/quote] Sort of... YES, 2 were initially sent out, but the 1st unit errored out the same day and the 3rd unit was sent out 2 days AFTER the 2nd unit was returned to the Project. 2 units were sent out 30 Oct, 1st unit returned the same day, 1 unit returned 1 Nov. The 3rd unit was sent out 3 Nov and returned 4 Nov. This could be due to VERY slow Server responses or a Server 'glitch' that ended up causing the current situation. The replacement unit SHOULD have been sent out immediately after the 'inconclusive' unit was returned, NOT 3 days later. COULD this be a part of the problems of not sending the bad units to another user? Is the Server NOT recognizing the invalid or inconclusive units properly and therefore NOT resending the units? |
Joined: 27 Oct 07 Posts: 186 Credit: 3,297,640 RAC: 0 |
[quote]initial replication 3[/quote] Sorry, not true (if we're looking at the same workunit). Both of the first two tasks returned 'success' status, so no problem could possibly be detected until the second report was received at 1 Nov 2012 | 6:44:33 UTC, and the validator was able to detect the mismatch. After that, the third - tie-breaker - task 9891207 was created at 1 Nov 2012 | 6:44:48 UTC. 15 seconds for task creation isn't excessive: the delay between creation and distribution is a queue function, as we've discussed elsewhere. |
Joined: 25 Jan 11 Posts: 179 Credit: 83,858 RAC: 0 |
Good analysis, Richard. Appears that is exactly the way it went down. |
Joined: 30 Oct 11 Posts: 26 Credit: 4,955,767 RAC: 46 |
[quote]initial replication 3[/quote] Ahh I see, what I am seeing is the queue delay. Okay, that makes sense, THANKS! |
Joined: 9 Oct 10 Posts: 77 Credit: 3,671,357 RAC: 0 |
I'm ready for more huge WUs too :) Like I said before, I think it would be great to keep them alongside the regular simulations, but in a separate "application", so that donors can select it in their preferences according to what they want to get. |
Joined: 2 Sep 04 Posts: 22 Credit: 4,113,694 RAC: 41 |
Hello! Good work, Thank You! I'm ready to do some testing for the new science and executables, and I would also like some more of the LONG wu's ! If needed, I would even run 100 billion turn wu's! Good luck! Greetings from Hans Sveen Oslo, Norway |
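
For anyone following the quorum discussion above, here is a minimal sketch of the behaviour Richard describes: two tasks go out, and a third tie-breaker task is requested only once the first two results have both come back and disagree. It is a simplified Python model and not BOINC's server code; the names `WorkUnit`, `replication` and `report` are made up for illustration, and results are compared by simple equality.

```python
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    """Toy model of a work unit validated by quorum (hypothetical names)."""
    min_quorum: int = 2    # matching results needed to validate
    replication: int = 2   # tasks issued so far (the page labels this "initial replication")
    results: list = field(default_factory=list)

    def report(self, output: str) -> str:
        """Record one returned result and re-check the work unit."""
        self.results.append(output)
        return self.check()

    def check(self) -> str:
        # Nothing can be decided until at least min_quorum results are in.
        if len(self.results) < self.min_quorum:
            return "pending"
        # Validated as soon as min_quorum results agree with each other.
        for candidate in set(self.results):
            if self.results.count(candidate) >= self.min_quorum:
                return "validated"
        # Every issued task has reported and there is still no agreement:
        # request one more task as a tie-breaker.
        if len(self.results) >= self.replication:
            self.replication += 1
        return "inconclusive"


wu = WorkUnit()
print(wu.report("A"))   # first result only
print(wu.report("B"))   # mismatch with the first result
print(wu.replication)   # number of tasks now issued
print(wu.report("A"))   # tie-breaker agrees with the first result
```

In this model the counter goes from 2 to 3 at the moment the mismatch is detected, which is why a page that keeps the label "initial replication" would show 3 for a unit that started with two tasks. How long the new task then waits before being handed to a host is a separate queueing matter that the sketch does not model.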
©2025 CERN