Message boards : Number crunching : WU send oddity

Ulrich Metzner
Joined: 27 Sep 04
Posts: 36
Credit: 29,315
RAC: 0
Message 10370 - Posted: 22 Sep 2005, 13:13:59 UTC
Last modified: 22 Sep 2005, 13:32:31 UTC

Hi everyone,

I have some strange WUs in my results table:
http://lhcathome.cern.ch/workunit.php?wuid=603600
http://lhcathome.cern.ch/workunit.php?wuid=593592
Look at the result IDs. I ask myself why the others have still NOT been sent. If they haven't been sent to anybody, it's no wonder I get no credit and my results remain pending. Could someone involved please take a look at that?

[edit] corrected double negation ;)
greetz, Uli

madmac
Joined: 25 Aug 05
Posts: 42
Credit: 3,518
RAC: 0
Message 10372 - Posted: 22 Sep 2005, 13:25:08 UTC

To Ulrich: I have one just the same, and one that is even worse: a WU where only one result has been sent out, which is mine, already done and sent back. We will just have to wait for the other results to be sent out. They will be sent out, but I do not know when.

Gaspode the UnDressed
Joined: 1 Sep 04
Posts: 506
Credit: 118,619
RAC: 0
Message 10373 - Posted: 22 Sep 2005, 13:26:27 UTC - in response to Message 10370.  

<blockquote>Hi everyone,

I have some strange WUs in my results table:
http://lhcathome.cern.ch/workunit.php?wuid=603600
http://lhcathome.cern.ch/workunit.php?wuid=593592
Look at the result IDs. I ask myself why the others have still NOT been sent. If they haven't been sent to anybody, it's no wonder I get no credit and my results remain pending. Could someone involved please take a look at that?</blockquote>

No problem here. BOINC issues one result from every WU before it issues the second one, then the third, and so on. This allows for the possibility that the first three results will provide a quorum, so that the last two results need not be issued.

It has the side effect that it may be a while before credit is granted if you happen to receive the first result from a WU.
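To make the ordering concrete, here is a minimal sketch of the two issue strategies. It is an illustration only, not BOINC's actual scheduler code: the function names, the three example workunits and the simple list-based model are all assumptions made for this example. The quorum of three and issue of five are the LHC figures mentioned in this thread.

```python
# Illustrative model only: names and structure are hypothetical, not BOINC code.

def issue_order(workunits, replicas, round_robin):
    """Return the order in which (workunit, replica_number) results go out."""
    if round_robin:
        # First result of every WU, then the second of every WU, and so on.
        return [(wu, r) for r in range(1, replicas + 1) for wu in workunits]
    # All results of one WU before moving on to the next WU.
    return [(wu, r) for wu in workunits for r in range(1, replicas + 1)]


def issued_before_quorum(order, wu, quorum):
    """Count how many results have been issued, project-wide, by the time
    `quorum` results of `wu` have gone out. Returns None if never reached."""
    seen = 0
    for position, (name, _replica) in enumerate(order, start=1):
        if name == wu:
            seen += 1
            if seen == quorum:
                return position
    return None


workunits = ["A", "B", "C"]   # hypothetical batch of three WUs
round_robin = issue_order(workunits, replicas=5, round_robin=True)
all_at_once = issue_order(workunits, replicas=5, round_robin=False)

rr_wait = issued_before_quorum(round_robin, "A", quorum=3)
seq_wait = issued_before_quorum(all_at_once, "A", quorum=3)
```

In this toy model, workunit A has its third result issued after 7 results under round-robin but after only 3 when everything is issued at once. The gap grows with the size of the batch, which is why the holder of a first result can wait a long time for credit.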


Gaspode the UnDressed
http://www.littlevale.co.uk

itenginerd
Joined: 29 Aug 05
Posts: 42
Credit: 27,102
RAC: 0
Message 10375 - Posted: 22 Sep 2005, 15:42:02 UTC - in response to Message 10373.  

<blockquote>
No problem here. BOINC issues one result from every WU before it issues the second one, then the third, and so on. This allows for the possibility that the first three results will provide a quorum, so that the last two results need not be issued.
</blockquote>

Sorry, Mike, but you're wrong. LHC may do this, but it isn't BOINC (and I think all 4 of us know the difference). Here's a SETI WU that got sent out as 4 results in under a minute.

http://setiathome.berkeley.edu/workunit.php?wuid=27681052

I think it's a configurable part of the DB/Feeder/Scheduler routines on the project end. I see this behavior too, Uli. I chalk it up to the admins and how they want/need the project to run. Dunno why they do it like that, but they seem happy to keep doing it...

(j)
James

Gaspode the UnDressed
Posts: 506
Credit: 118,619
RAC: 0
Message 10377 - Posted: 22 Sep 2005, 16:29:33 UTC - in response to Message 10375.  

<blockquote>
Sorry, Mike, but you're wrong. LHC may do this, but it isn't BOINC (and I think all 4 of us know the difference). Here's a SETI WU that got sent out as 4 results in under a minute.
</blockquote>

So what you've said, then, is that I'm 'wrong' in a 'right' sort of way.

There's a key difference between LHC and SETI. LHC's workload is made up of relatively small batches of WUs. Batches are dependent on the results from previous batches. This allows the round-robin approach seen at LHC to work effectively.

SETI is made up of a continuous stream of data, and this data is NOT dependent on the output of previous WUs. Trying to use a round-robin approach would result in an endless wait for 2nd and subsequent WUs, so it makes sense to issue all results for a WU at once. Einstein works similarly.

Since it is BOINC handling the scheduling, it seems reasonable to attribute this behaviour there.

Sorry, IT Nerd, but you're not right!

;-)


Gaspode the UnDressed
http://www.littlevale.co.uk

itenginerd
Joined: 29 Aug 05
Posts: 42
Credit: 27,102
RAC: 0
Message 10378 - Posted: 22 Sep 2005, 17:01:45 UTC - in response to Message 10377.  

Still not with you, Mike (though, for the record, it wouldn't be the first time I was in my own little world 8P ). Let me put it to you this way: if project A (LHC) and project B (SETI) hand out work in two different ways, then it CAN'T be a BOINC behavior.

BOINC is the client only. Client just says 'I want 10 seconds of work'. It's up to the project to determine what the next result out the door is.

My whole point hasn't changed. BOINC doesn't hand out work. LHC hands out work. Einstein hands out work. PP@H hands out work.

Since it is the project, NOT BOINC, handling the scheduling, it doesn't seem reasonable to attribute the behavior there. That's the best way to put what I'm trying to say.

Am I making more sense now?

(j)
James

Gaspode the UnDressed
Posts: 506
Credit: 118,619
RAC: 0
Message 10379 - Posted: 22 Sep 2005, 17:29:29 UTC

>>Am I making more sense now?

Nope!

BOINC is a complete client/server system. There is a great deal of BOINC software running on the server: issuing work, tracking its return, validating it (although this is a project-specific component) and deleting it. There's also the web site, which is likewise handled by BOINC software, plus the management of user accounts and preferences and the collection of statistics. A couple of general-purpose services are provided by Apache (HTTP) and MySQL (database). It is the BOINC server-side scheduler that is responsible for when, and in which order, work is issued.

The BOINC client is responsible for client-side scheduling - requesting work, returning it and attempting to honour the client side resource share preferences. You are correct in that the BOINC client doesn't decide what to request, but BOINC is more than the client.




Gaspode the UnDressed
http://www.littlevale.co.uk

itenginerd
Joined: 29 Aug 05
Posts: 42
Credit: 27,102
RAC: 0
Message 10383 - Posted: 22 Sep 2005, 19:40:57 UTC - in response to Message 10379.  

OK, now you've got me scanning 'how to build your project' on the boinc site... 8)

Here's what I hear you saying:
BOINC is a complete system (which I'm going to grant you, because it's much truer than I thought, now that I read a bit more in depth). The BOINC server-side scheduler is responsible for when and in what order work is issued. SETI, LHC, and Einstein all use the same BOINC scheduler, since they are BOINC projects.

Here's the source of my problem. I've demonstrated that one BOINC powered project (SETI) does not hand out work in the same way as another BOINC powered project (LHC).

I can't reconcile the two sides of what you're saying. Either BOINC is the same across all projects, or it's a project-side decision as to how to hand out work. Is there another option I'm missing?

I'm just curious to see where we go with this now... 8)

(j)
James

Gaspode the UnDressed
Posts: 506
Credit: 118,619
RAC: 0
Message 10384 - Posted: 22 Sep 2005, 20:26:47 UTC

BOINC is the same across all projects, subject to variations in version number, and differences in project specific components.

But, as with most software, it can be set up differently to match project requirements. For example, Einstein used 7-day deadlines, and has changed to 14-day deadlines. CPDN uses deadlines on the order of 365 days. Some projects use a quorum of three, with an issue of three. LHC uses a quorum of three, with an issue of five. And, I surmise, LHC issues results round-robin, while SETI issues them in sequence.
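As a rough sketch of "same software, different settings", the figures quoted above can be written down as per-project records. This is a hypothetical model: the class and field names are invented for the example and are not BOINC's real configuration keys. Values not stated in this thread are left as None rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProjectSettings:
    """Hypothetical per-project settings; field names are illustrative only."""
    name: str
    quorum: Optional[int] = None          # matching results needed for credit
    initial_issue: Optional[int] = None   # results created per WU up front
    deadline_days: Optional[int] = None   # time allowed to return a result
    issue_style: Optional[str] = None     # order in which results go out

    def spare_results(self) -> Optional[int]:
        """Results issued beyond the quorum, or None if either figure is unknown."""
        if self.quorum is None or self.initial_issue is None:
            return None
        return self.initial_issue - self.quorum


# Only the values mentioned in the post above are filled in.
SETTINGS = [
    ProjectSettings("LHC", quorum=3, initial_issue=5,
                    issue_style="round-robin (surmised)"),
    ProjectSettings("Einstein", deadline_days=14),
    ProjectSettings("CPDN", deadline_days=365),  # "on the order of" a year
    ProjectSettings("SETI", issue_style="in sequence (surmised)"),
]

for project in SETTINGS:
    print(project.name, project.spare_results())
```

With a quorum of three and an issue of five, LHC has two spare results per WU. Those are the results that need never be sent if the first three agree.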



Gaspode the UnDressed
http://www.littlevale.co.uk

itenginerd
Joined: 29 Aug 05
Posts: 42
Credit: 27,102
RAC: 0
Message 10386 - Posted: 22 Sep 2005, 21:22:07 UTC

So now we get to the semantics game... 8)

So are we saying that the oddness that Uli and others have observed in the pattern of work being sent out is due to the way the LHC project has implemented the BOINC feeder/scheduler routines?

I can agree to that.

(j)
James

PS - if we agree on that, where is the next result for that WU of Uli's?

Ulrich Metzner
Joined: 27 Sep 04
Posts: 36
Credit: 29,315
RAC: 0
Message 10388 - Posted: 22 Sep 2005, 23:28:28 UTC

Although you had a very nice discussion here, it barely addresses my question. Maybe LHC has a different strategy for dispatching WUs to the clients, but that doesn't explain why these particular WUs are spread out this way when the majority of WUs are definitely NOT. Just take a look at your own results table. Normally a WU's results are sent out to different clients within a matter of minutes. Only if a result times out is another one dispatched 'out of time'.

So please stop this discussion, take a look at your own results, and pay proper attention.
greetz, Uli

Ulrich Metzner
Joined: 27 Sep 04
Posts: 36
Credit: 29,315
RAC: 0
Message 10391 - Posted: 23 Sep 2005, 8:49:47 UTC

Just found some more that are waiting to be sent to other participants :/

http://lhcathome.cern.ch/workunit.php?wuid=620448
http://lhcathome.cern.ch/workunit.php?wuid=617530
http://lhcathome.cern.ch/workunit.php?wuid=562276

There is a rising trend in my pending credits...
greetz, Uli

Paul D. Buck
Joined: 2 Sep 04
Posts: 545
Credit: 148,912
RAC: 0
Message 10395 - Posted: 23 Sep 2005, 14:36:38 UTC

Interesting discussion.

The BOINC SYSTEM is indeed a large system. The confusion comes in because of UCB's regrettably sloppy language and persistent inconsistency in its use of terms. Worse, they also have a regrettable lack of talent for naming things.

The BOINC Client Software is one component and is present on the participant's computer. This for many *IS* BOINC. But, as was pointed out, there is a complex of software on the SERVER side of the BOINC System.

Work is issued by the Scheduler, and as such, the issue policies can be configured in part through changes to the project's configuration file. Also, because the software in the BOINC System is available to the project, they can "hand tweak" it to do things the way the project desires. So a system common to all BOINC-powered projects can APPEAR to behave quite differently from one project to the next.

If you look in the Wiki, you can see in the development section information about the configuration file and see if it has a simple "switch" that the project can set to change the scheduler behavior.

Oh, and there are elements in the BOINC System that are deliberately non-deterministic which can give it that "random" look we all love when trying to understand how things work.

Anyway, I think part of the confusion in the discussion is that you are having trouble with the terms used. I have a rather rigid set of definitions in the Wiki and you can test your understanding against them. Or, of course, you can continue to fail to understand each other ... :)