21) Message boards : Team invites : JOIN The Final Front Ear (Message 24952)
Posted 10 Nov 2012 by mikey
Post:
22) Message boards : News : Status and Plans, Sunday 4th November (Message 24950)
Posted 9 Nov 2012 by mikey
Post:
initial replication 3

No, don't worry about that. It's a well-known terminological inexactitude (mistake) in the BOINC server code.

Two tasks were sent out on 30 Oct 2012, an initial replication of two.

When they failed to agree, a third instance was created and sent out on 3 Nov 2012, to make a current replication of 3.

BOINC updates the number, but it doesn't update the word.


Sort of...YES 2 were initially sent out, but the 1st unit errored out the same day and the 3rd unit was sent out 2 days AFTER the 2nd unit was returned to the Project. 2 units were sent out 30 Oct, 1st unit returned the same day, 1 unit returned 1 Nov. The 3rd unit was sent out 3 Nov and returned 4 Nov. This could be due to VERY slow Server responses or a Server 'glitch' that ended up causing the current situation. The replacement unit SHOULD have been sent out immediately after the 'inconclusive' unit was returned, NOT 3 days later.

COULD this be a part of the problems of not sending the bad units to another user? Is the Server NOT recognizing the invalid or inconclusive units properly and therefore NOT resending the units?

Sorry, not true (if we're looking at the same workunit). Both of the first two tasks returned 'success' status, so no problem could possibly be detected until the second report was received at 1 Nov 2012 | 6:44:33 UTC, and the validator was able to detect the mismatch.

After that, the third - tie-breaker - task 9891207 was created at 1 Nov 2012 | 6:44:48 UTC. 15 seconds for task creation isn't excessive: the delay between creation and distribution is a queue function, as we've discussed elsewhere.
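
To make that sequence concrete, here is a minimal sketch of the quorum logic as described above. It is not BOINC's actual validator: the function name `next_action` is hypothetical, and results are compared by plain equality, whereas a real project supplies its own comparison.

```python
from collections import Counter

def next_action(results, min_quorum=2):
    """Decide what to do with a workunit, given the outputs of the tasks
    that have reported 'success' so far (hypothetical helper, not BOINC code).

    Returns ('wait', None) until enough results are in,
    ('validate', value) once min_quorum results agree,
    or ('create_task', None) when they disagree and a tie-breaker is needed.
    """
    if len(results) < min_quorum:
        # Nothing can be detected until the quorum-th report arrives.
        return ("wait", None)
    value, count = Counter(results).most_common(1)[0]
    if count >= min_quorum:
        return ("validate", value)
    # Mismatch: the current replication grows by one.
    return ("create_task", None)

print(next_action(["a"]))            # first report only
print(next_action(["a", "b"]))       # second report reveals the mismatch
print(next_action(["a", "b", "a"]))  # tie-breaker agrees with the first
```

With one result the sketch waits, with two mismatched results it asks for a third task, and once the third agrees with one of the others the workunit can validate.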


Ahh I see, what I am seeing is the queue delay, okay that makes sense, THANKS!
23) Message boards : Number crunching : WU not being resent to another user (Message 24949)
Posted 9 Nov 2012 by mikey
Post:
It's just a very, very busy project. They maintain a 'high water mark' of around 300,000 tasks ready to send, and turn over about 60,000 tasks per hour - so the end of the queue is never more than five or six hours away.
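
The 'five or six hours' figure follows directly from the two numbers quoted. A quick check, using only those approximate figures (the function name is just for illustration):

```python
def hours_to_drain(ready_to_send, sent_per_hour):
    """Time for a task added at the back of the queue to reach the front,
    assuming a steady turnover rate."""
    return ready_to_send / sent_per_hour

print(hours_to_drain(300_000, 60_000))  # 5.0
```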

That sounds simple enough that it could be made to work here too. The trick is having the code to do it and setting the high/low water marks appropriately. For example, if Eric has a batch of 179,249 tasks, just to pick a nice not-round number, then he ought not to dump all 179,249 into the queue at once. There needs to be a high water mark of say 2,000 and a low water mark of say 1,000. The batch starts with the feeder, splitter, or whatever its name is, dumping 2,000 tasks into the queue. When there are only 1,000 tasks left in the queue the feeder/splitter/whatever dumps all the resends into the queue and follows those with enough tasks to fill the queue to the high water mark. Eventually all 179,249 tasks have been put into the queue with all the resends sprinkled in as well at the beginning of each top-up. There should be only a short tail which won't matter anyway because even if Eric creates another big batch the tasks in the tail will go into the queue first before tasks from the new batch.
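
The scheme described above can be sketched roughly as follows. This is only an illustration of the idea, not existing BOINC server code: the names `top_up` and `simulate` are made up, and the failure model (every tenth new task needs one resend) is an assumption chosen just to generate some resends.

```python
from collections import deque

HIGH_WATER = 2000  # figures taken from the post above
LOW_WATER = 1000

def top_up(queue, resends, fresh, high_water=HIGH_WATER):
    """Put all pending resends into the queue first, then add fresh
    tasks until the queue reaches the high water mark."""
    while resends:
        queue.append(resends.popleft())
    while fresh and len(queue) < high_water:
        queue.append(fresh.popleft())

def simulate(batch_size, fail_every=10):
    """Send a whole batch through the queue; return (send order, peak queue size)."""
    fresh = deque(("new", i) for i in range(batch_size))
    resends = deque()
    queue = deque()
    sent = []
    top_up(queue, resends, fresh)
    peak = len(queue)
    while queue:
        task = queue.popleft()
        sent.append(task)
        # Assumed failure model: every fail_every-th new task needs one resend.
        if task[0] == "new" and task[1] % fail_every == 0:
            resends.append(("resend", task[1]))
        # Refill only once the queue has drained to the low water mark.
        if len(queue) <= LOW_WATER and (fresh or resends):
            top_up(queue, resends, fresh)
            peak = max(peak, len(queue))
    return sent, peak

sent, peak = simulate(5000)
print(len(sent), peak)
```

In this sketch the queue never holds more than the high water mark, and each resend goes out at the next top-up rather than waiting behind the rest of the batch.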

Now the million dollar question is... Is there server code available that does that? Is that code already installed? If so, what are the names of the options/config items that make it happen?

The keywords to look for are 'Work generator' (as used at Einstein) or 'workunit generator'. 'Splitter' is a specialised version of a WU generator used at SETI: 'feeder' is a different animal altogether, and doesn't belong in this list.


I think you are the pro coder here, while the rest of us use terms that make sense to us but aren't technically accurate to you. I THINK what he is trying to ask is: why aren't the resends sent immediately to the available workunit cache, instead of going into 'some other cache' first and only then into the available units cache?
24) Message boards : Cafe LHC : ~~~Last Person To Post Wins~~~ (Message 24940)
Posted 6 Nov 2012 by mikey
Post:
Now time for another run of this popular pastime.

Simple rules:


Post in this thread, anything with legible words
Stay within the message board rules
Empty posts are invalid towards winning
You must wait for someone else to post before you can post again; any and all multiple posts are invalid towards winning
Have FUN



Rules subject to change at moderator's discretion without notice

The winner will be the last person that posted, when I have time to check the thread, after one of these events:


A specific date and/or time occurs
A specific number of posts have been made
I get tired of y'all nagging me who is the winner :)
An event I have chosen occurs (doesn't occur)



Sorry there are no physical prizes as I ate them all, only the satisfaction you played and won and your name added into a winners list. :)
25) Message boards : News : Status and Plans, Sunday 4th November (Message 24939)
Posted 6 Nov 2012 by mikey
Post:
initial replication 3

No, don't worry about that. It's a well-known terminological inexactitude (mistake) in the BOINC server code.

Two tasks were sent out on 30 Oct 2012, an initial replication of two.

When they failed to agree, a third instance was created and sent out on 3 Nov 2012, to make a current replication of 3.

BOINC updates the number, but it doesn't update the word.


Sort of...YES 2 were initially sent out, but the 1st unit errored out the same day and the 3rd unit was sent out 2 days AFTER the 2nd unit was returned to the Project. 2 units were sent out 30 Oct, 1st unit returned the same day, 1 unit returned 1 Nov. The 3rd unit was sent out 3 Nov and returned 4 Nov. This could be due to VERY slow Server responses or a Server 'glitch' that ended up causing the current situation. The replacement unit SHOULD have been sent out immediately after the 'inconclusive' unit was returned, NOT 3 days later.

COULD this be a part of the problems of not sending the bad units to another user? Is the Server NOT recognizing the invalid or inconclusive units properly and therefore NOT resending the units?
26) Message boards : News : Status and Plans, Sunday 4th November (Message 24929)
Posted 5 Nov 2012 by mikey
Post:

2) I've just come across a wee problemette - WU 4413334. I think the middle one should have been set to 'invalid' after the third user reported - I'm not sure whether it will be properly marked off as completed like this.


I think the problem is here, this seems to be a normal unit:
minimum quorum 2
initial replication 2

while this is the unit in question:
minimum quorum 2
initial replication 3

If a unit is INITIALLY sent to three PCs but only two are required for validation and ALL units are returned prior to the deadline, how does the Server side handle all three units? Yes, the numbers are MUCH different for the second PC than for the other two, but shouldn't BOINC have granted credits based on the first and second PCs and NOT used the third one, since the first two were NOT marked as invalid? OR does "inconclusive" mean the same thing to BOINC?

A further question is what would have happened to the third unit if the second was not "inconclusive"? Would it have been aborted? What if the pc was half way thru crunching it, or even only had seconds left to finish? One would HOPE that the third unit would have been allowed to finish and be returned and ALSO granted credits, IF it returned a valid result. Especially since the user was NOT at fault for receiving it.


©2024 CERN