Message boards : Number crunching : Fairer distribution of work (Flame Fest 2007)

dr_mabuse
Joined: 30 Dec 05
Posts: 57
Credit: 835,284
RAC: 0
Message 15122 - Posted: 18 Oct 2006, 11:07:11 UTC - in response to Message 15120.  

Curious to know if this current batch that went out today was under any sort of "program" for distribution. I'm just glad to have gotten some :)

Once again I got nothing. My BOINC program tries to download repeatedly, at intervals of 12 to 58 minutes. The work units must have been sold out in an even shorter time. This is very frustrating, and I think that more and more CERN supporters will sign out. Is this the right way to get people engaged? Remember that as soon as the LHC is running, an awful lot of computing power will be needed, and you should not discourage anyone you can rely on.

What would you think about changing the distribution procedure, since some promising ideas have been proposed here?

Greetings from Germany
Jochen
ID: 15122

River~~
Joined: 13 Jul 05
Posts: 456
Credit: 75,142
RAC: 0
Message 15123 - Posted: 18 Oct 2006, 11:35:16 UTC - in response to Message 15122.  

...My BOINC program tries to download repeatedly, at intervals of 12 to 58 minutes. The work units must have been sold out in an even shorter time...


The work went out between 0100 and 0300 on 10th, with there never being any significant number of results ready to send. As fast as the jobs were submitted to the scheduler they were handed out.

Look at Scarecrow's graphs - no results ready to send at 0205, yet if you look at the results in progress at that time they have already started to rise.

Scarecrow's stats collector went in and found zero for the same reason that your requests for work did - even though work was being fed into the system, it was being handed out as fast as it came in.

R~~
ID: 15123

darkclown
Joined: 30 Sep 06
Posts: 9
Credit: 5,298
RAC: 0
Message 15124 - Posted: 18 Oct 2006, 11:45:39 UTC - in response to Message 15123.  

...My BOINC program tries to download repeatedly, at intervals of 12 to 58 minutes. The work units must have been sold out in an even shorter time...


The work went out between 0100 and 0300 on 10th, with there never being any significant number of results ready to send. As fast as the jobs were submitted to the scheduler they were handed out.

Look at Scarecrow's graphs - no results ready to send at 0205, yet if you look at the results in progress at that time they have already started to rise.

Scarecrow's stats collector went in and found zero for the same reason that your requests for work did - even though work was being fed into the system, it was being handed out as fast as it came in.

R~~


Yea, my system happened to check at 0300 UTC, and got 7 WUs. Another one of my systems checked around that time, and found the server empty. Seems luck of the draw, which is seemingly what they want at this point. You can't please this many people with ANY method of distribution.

My guess is that it was another batch of 5000 WU? If so, almost 2000 have been returned, so I'd say they're getting crunched pretty darn quickly.
ID: 15124

River~~
Joined: 13 Jul 05
Posts: 456
Credit: 75,142
RAC: 0
Message 15125 - Posted: 18 Oct 2006, 12:51:47 UTC - in response to Message 15118.  

... Looking at the last months, I'd say :
There is basically no "fair" distribution of work possible, given the tiny amount of Work that pops up once a while.

Even if every User was to be handed one exclusive WorkUnit, it would still leave others without work...


Not so. Look at Scarecrow's graphs from mid September to now.

Around 21 Sept we had about 45,000 results. The following three, smaller, releases are almost certainly connected with that release (we usually see this, when the results get back the engineers ask for more to cover details that were missed).

These three were Oct 1st 12,000; 10th 10,000; 18th 4,200

Best estimate is that there are some 6,000 hosts chasing the results. So in the main release we could have had 7 or 8 each. At one WU per host we'd all have got some two times out of three. At one result per USER we'd all have got one on the 18th (there are fewer than 4,000 active users, as many of us have multiple hosts).

Personally, if my 11 boxes had got 70+ WU in September I would not be cross if the smaller distributions were a bit lumpy - that is why the idea of releasing them 5 at a time makes sense in these sparse conditions. With 5 at a time, one in three boxes would have got lucky on 1st and 10th, and 1 in 8 boxes on 18th.

That would have felt fairer to me than some people getting 43 WU and the vast majority getting nowt.
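
The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted above (the 6,000-host count is an estimate, and the batch size of 5 is the value floated in this thread); the function names are made up for illustration.

```python
# Figures quoted in the post above; the host count is the poster's estimate.
HOSTS = 6_000
RELEASES = {"21 Sep": 45_000, "1 Oct": 12_000, "10 Oct": 10_000, "18 Oct": 4_200}
BATCH = 5  # hypothetical per-host batch size discussed in the thread


def even_share(results: int, hosts: int = HOSTS) -> float:
    """Results per host if a release were split evenly across all hosts."""
    return results / hosts


def fraction_served(results: int, batch: int = BATCH, hosts: int = HOSTS) -> float:
    """Fraction of hosts that get a full batch when work goes out `batch` at a time."""
    return min(1.0, (results // batch) / hosts)


for day, results in RELEASES.items():
    print(f"{day}: {even_share(results):.1f} per host if split evenly, "
          f"{fraction_served(results):.0%} of hosts served at {BATCH} each")
```

Under these assumptions the main release works out to 7.5 results per host, the 1st and 10th October releases reach 40% and about 33% of hosts at 5 each, and the 18th reaches 14%, which is nearer one box in seven than one in eight.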

Ironically (while I always liked LHC), other Projects - established and emerging - are literally dying for more computing power as we speak, plus people can witness a direct line to the staff that actually realizes Community inputs where possible. Just a point to consider.


Agree totally and I hope that goes without saying. The stats that have appeared in the sigs in this thread show, I think, that most of us on all sides of this discussion are united on this.

For anyone who doesn't yet crunch a second project while waiting, there are tips on how to get LHC work while crunching elsewhere in this thread, the posts from Keck Komputers being particularly useful in my opinion.

But when work is available here I'd like to be able to take part so that my past contribution does not get diluted in the stats by people who randomly get work when I can't.

R~~
ID: 15125

River~~
Joined: 13 Jul 05
Posts: 456
Credit: 75,142
RAC: 0
Message 15126 - Posted: 18 Oct 2006, 12:54:49 UTC - in response to Message 15123.  

The work went out between 0100 and 0300 on 10th
should have read

The work went out between 0100 and 0300 on 18th Oct

sorry for typo, too late to edit. Times in UTC.
R~~
ID: 15126

caspr
Joined: 26 Apr 06
Posts: 89
Credit: 309,235
RAC: 0
Message 15131 - Posted: 18 Oct 2006, 19:29:23 UTC

I knew it was hard to get work around here... but damn! Haven't got a thing since the end of July! I keep banging on them for work... but nada! Why? I gotta give them another chance! I believe in the project, but not in the method of work distribution! It could be better. A lot better! I hope you are listening. At least other projects do listen to their crunchers. I wouldn't dream of telling you how to run your project, only how to keep your boxes happy! Just listen!! At least tell me to get lost!!
A clear conscience is usually the sign of a bad memory


ID: 15131

KAMasud
Joined: 7 Oct 06
Posts: 114
Credit: 23,192
RAC: 0
Message 15135 - Posted: 19 Oct 2006, 0:15:30 UTC - in response to Message 15131.  

Please add my voice to the complaints about the overall lack of response. I have had no WU for ages.

I knew it was hard to get work around here... but damn! Haven't got a thing since the end of July! I keep banging on them for work... but nada! Why? I gotta give them another chance! I believe in the project, but not in the method of work distribution! It could be better. A lot better! I hope you are listening. At least other projects do listen to their crunchers. I wouldn't dream of telling you how to run your project, only how to keep your boxes happy! Just listen!! At least tell me to get lost!!


ID: 15135

David Stites
Joined: 15 Jul 05
Posts: 18
Credit: 1,406,469
RAC: 0
Message 15143 - Posted: 19 Oct 2006, 22:06:31 UTC - in response to Message 15135.  

Please add my voice to the complaints about the overall lack of response. I have had no WU for ages.

I knew it was hard to get work around here... but damn! Haven't got a thing since the end of July! I keep banging on them for work... but nada! Why? I gotta give them another chance! I believe in the project, but not in the method of work distribution! It could be better. A lot better! I hope you are listening. At least other projects do listen to their crunchers. I wouldn't dream of telling you how to run your project, only how to keep your boxes happy! Just listen!! At least tell me to get lost!!


I just suspended LHC. I am tired of my computers wasting time trying to get work from a project that doesn't seem to need my help. I will check back every month or so and if they ever seem to need me again I will be happy to help.
--
David Stites
Mount Vernon, WA USA
ID: 15143

Trane Francks
Joined: 18 Sep 04
Posts: 71
Credit: 28,399
RAC: 0
Message 15145 - Posted: 20 Oct 2006, 10:48:42 UTC - in response to Message 15077.  

I have seen this happen, ie new work snuck in front of a running CPDN task, but not sure on how recent a client.


This certainly isn't a problem; it's how BOINC is designed to keep short-term and long-term debt in check. With CPDN taking hundreds of hours to crunch a single WU on my Athlon XP 2500+, it's a GOOD THING that BOINC looks at the deadline and puts WUs from other projects into the queue to maintain STD and LTD balance.

It's the right way to do things. BOINC's project scheduling takes into account how much time remains to crunch the work, the deadline and resource sharing. I have no problem with it whatsoever.
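
For anyone who wants a picture of the idea, here is a toy model. It is not BOINC's actual scheduler code, only a sketch under two assumptions: each project accrues "debt" in proportion to its resource share and pays it off with the CPU time it actually gets, and a task whose slack before the deadline is too small jumps the queue. All names (`Task`, `update_debts`, `pick_next`, `SLACK_MARGIN_HOURS`) are hypothetical.

```python
from dataclasses import dataclass

SLACK_MARGIN_HOURS = 24.0  # hypothetical safety margin before a deadline


@dataclass
class Task:
    project: str
    remaining_hours: float  # estimated CPU time still needed
    deadline_hours: float   # hours from now until the report deadline


def update_debts(debts, shares, ran_project, hours):
    """Credit every project its share of `hours`, then charge the one that ran."""
    total_share = sum(shares.values())
    for project in debts:
        debts[project] += hours * shares[project] / total_share
    debts[ran_project] -= hours
    return debts


def pick_next(tasks, debts):
    """Earliest deadline first if any task is at risk, else highest debt wins."""
    at_risk = [t for t in tasks
               if t.deadline_hours - t.remaining_hours < SLACK_MARGIN_HOURS]
    if at_risk:
        return min(at_risk, key=lambda t: t.deadline_hours)
    return max(tasks, key=lambda t: debts[t.project])


# Usage: equal shares, CPDN has just run for 10 hours, so LHC is now "owed" time.
shares = {"CPDN": 50, "LHC": 50}
debts = update_debts({"CPDN": 0.0, "LHC": 0.0}, shares, "CPDN", 10.0)
tasks = [Task("CPDN", 300.0, 2000.0), Task("LHC", 4.0, 120.0)]
print(pick_next(tasks, debts).project)
```

In this toy run the freshly downloaded LHC task goes ahead of the long CPDN task because LHC is owed CPU time, which is the behaviour described in the quoted post.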
ID: 15145

Trane Francks
Joined: 18 Sep 04
Posts: 71
Credit: 28,399
RAC: 0
Message 15146 - Posted: 20 Oct 2006, 10:53:16 UTC - in response to Message 15143.  

I just suspended LHC. I am tired of my computers wasting time trying to get work from a project that doesn't seem to need my help.


Not to sound as though I'm complaining - it's your decision - but how, exactly, are your computers wasting time by checking? The amount of CPU resources required to poll a server for work is minimal at most. The impact on a running project is nil.

And this brings us back to square one: why do people feel a need to constantly poke and prod BOINC to get work from projects in the first place? Connect, set it, forget it. The only time you should be messing with BOINC and the way it handles its projects is when you KNOW you have a problem application (such as how CPDN resets its % done to 0 for each of the 3 phases and throws off BOINC's ability to properly calculate computing effort required to meet the deadline).

Just stay connected and let the thing go. Check in once in a while to ensure that things are working properly, but, otherwise, just leave it alone.

That's my advice. YMMV and you guys can certainly feel free to disagree. It won't bother me a bit. :)

ID: 15146

caspr
Joined: 26 Apr 06
Posts: 89
Credit: 309,235
RAC: 0
Message 15160 - Posted: 23 Oct 2006, 10:23:03 UTC

ok dammit, its my b-day and I WANT TO CRUNCH SUMTHIN!!!




please?
A clear conscience is usually the sign of a bad memory


ID: 15160

darkclown
Joined: 30 Sep 06
Posts: 9
Credit: 5,298
RAC: 0
Message 15167 - Posted: 24 Oct 2006, 14:31:24 UTC - in response to Message 15160.  

ok dammit, its my b-day and I WANT TO CRUNCH SUMTHIN!!!




please?


There were some WUs released today. I got about 20 of them. They taste good.
ID: 15167

marvinvwinkle
Joined: 13 Jul 05
Posts: 5
Credit: 19,923
RAC: 0
Message 15168 - Posted: 24 Oct 2006, 15:26:34 UTC

So that's where they all went!
ID: 15168

PovAddict
Joined: 14 Jul 05
Posts: 275
Credit: 49,291
RAC: 0
Message 15170 - Posted: 24 Oct 2006, 15:31:51 UTC

Again, read the thread (and others) before asking for work!

By the way, I think we have users with 10-day caches grabbing everything. What about a daily quota of 5 units? :)
ID: 15170

genes
Joined: 29 Sep 04
Posts: 25
Credit: 77,910
RAC: 0
Message 15179 - Posted: 25 Oct 2006, 12:06:20 UTC

I got some! I got some! I got 3 on one of my machines. It just happened to be asking at the right time. I only keep a 0.1 day cache. It helps to have several machines asking, since that increases the chance that one of them asks while the work is actually there.
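
A quick way to see the effect of several machines, as a sketch only: it assumes each host has the same chance of hitting a release and that the hosts' requests are independent, neither of which is exactly true in practice. The 0.14 per-host chance is a made-up example value, and `chance_any_host_hits` is a hypothetical helper.

```python
def chance_any_host_hits(p_single: float, hosts: int) -> float:
    """Probability that at least one of `hosts` independent requests finds work."""
    return 1.0 - (1.0 - p_single) ** hosts


# Example: compare one machine with a few small farms.
for n in (1, 3, 11):
    print(f"{n} host(s): {chance_any_host_hits(0.14, n):.2f}")
```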

ID: 15179

Tomas
Joined: 1 Oct 04
Posts: 4
Credit: 391,126
RAC: 0
Message 15180 - Posted: 25 Oct 2006, 14:07:05 UTC - in response to Message 15170.  

Again, read the thread (and others) before asking for work!

By the way, I think we have users with 10-day caches grabbing everything. What about a daily quota of 5 units? :)

I prefer something like this:

One PC can download max. 5 (or so, depending on the time it will take to crunch) workunits at a time. After reporting a workunit it can download another unit but to the maximum of 5 workunits at a time.

This will help against "everything-grabber" but will not slow down the crunching of the workunits if one PC is out of work. :-D
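
The rule proposed here can be sketched as a toy scheduler. This is an illustration of the idea only, not the BOINC server code; the class name, method names and the cap of 5 are assumptions taken from the wording of the proposal.

```python
from collections import defaultdict

MAX_IN_PROGRESS = 5  # hypothetical per-host cap from the proposal above


class CappedScheduler:
    """Toy scheduler: a host may never hold more than MAX_IN_PROGRESS workunits."""

    def __init__(self, available: int):
        self.available = available
        self.in_progress = defaultdict(int)  # host id -> workunits currently held

    def request_work(self, host: str, wanted: int) -> int:
        """Grant as much as the cap, the request and the remaining stock allow."""
        room = MAX_IN_PROGRESS - self.in_progress[host]
        granted = max(0, min(wanted, room, self.available))
        self.available -= granted
        self.in_progress[host] += granted
        return granted

    def report_result(self, host: str) -> None:
        """A reported result frees one slot for that host."""
        if self.in_progress[host] > 0:
            self.in_progress[host] -= 1


# Usage: a host with a large cache asks for 40 but is held to the cap.
sched = CappedScheduler(available=12)
print(sched.request_work("big-cache-host", 40))
print(sched.request_work("small-host", 3))
sched.report_result("big-cache-host")
print(sched.request_work("big-cache-host", 40))
```

A fast host is never idle for long under this rule, because every reported result immediately frees a slot, while no single host can drain a small release.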
ID: 15180

Tomas
Joined: 1 Oct 04
Posts: 4
Credit: 391,126
RAC: 0
Message 15196 - Posted: 27 Oct 2006, 7:47:59 UTC - in response to Message 15180.  

Again, read the thread (and others) before asking for work!

By the way, I think we have users with 10-day caches grabbing everything. What about a daily quota of 5 units? :)

I prefer something like this:

One PC can download max. 5 (or so, depending on the time it will take to crunch) workunits at a time. After reporting a workunit it can download another unit but to the maximum of 5 workunits at a time.

This will help against "everything-grabber" but will not slow down the crunching of the workunits if one PC is out of work. :-D

Let me call it "maximum number of workunits per time":
- no grabbing of everything, better distribution of work
- no slowdown for the project if a PC is out of work
- the PC can meet the deadline because it does not have too many WUs
ID: 15196

Kaal
Joined: 7 Nov 05
Posts: 19
Credit: 248,179
RAC: 0
Message 15198 - Posted: 27 Oct 2006, 13:07:13 UTC
Last modified: 27 Oct 2006, 13:07:51 UTC

@Tomas
So you're suggesting something like this? :)
If so, I'd agree. Keck Komputers seems to have the right of it, but River's later suggestion of extending the poll time in that solution out to 20 minutes also holds merit to allow those hosts in the 4hr back off to have a better chance of getting a WU.
ID: 15198

Tomas
Joined: 1 Oct 04
Posts: 4
Credit: 391,126
RAC: 0
Message 15199 - Posted: 27 Oct 2006, 13:31:49 UTC - in response to Message 15198.  
Last modified: 27 Oct 2006, 13:33:02 UTC

@Tomas
So you're suggesting something like this? :)
If so, I'd agree. Keck Komputers seems to have the right of it, but River's later suggestion of extending the poll time in that solution out to 20 minutes also holds merit to allow those hosts in the 4hr back off to have a better chance of getting a WU.

Keck Komputers writes: "to send no more than 5 results per scheduler RPC" and "the host wait 10 minutes before getting more work", to slow down the download and give others the chance to download work. This would still allow a PC to grab a lot of work, more slowly than at present, but it would be possible.
I would prefer that there are only a fixed number (say 5) of workunits (of this project) on a PC at a time. Before the PC loads another workunit it must upload a workunit. This is more restrictive but will not slow down activities of fast PCs.
I hope you agree anyway :-)
ID: 15199

Kaal
Joined: 7 Nov 05
Posts: 19
Credit: 248,179
RAC: 0
Message 15200 - Posted: 27 Oct 2006, 13:44:26 UTC - in response to Message 15199.  

I would prefer that there are only a fixed number (say 5) of workunits (of this project) on a PC at a time. Before the PC loads another workunit it must upload a workunit. This is more restrictive but will not slow down activities of fast PCs.
I hope you agree anyway :-)

Hmmm, I think I might agree if I could see a simple implementation, but one of the objections to River's first proposal was that it might require software re-writes, as I think yours would.
John Keck's proposal merely requires 2 lines of the server-side config.xml to be changed, as I understand it.
Given that this is a temporary situation (given Garield) and that John's solution would adequately (perhaps not perfectly, but certainly adequately) fix the problem, improving morale whilst not damaging the response to science, I think that his has to be the way forward. :)
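
To make the two-line change concrete, here is a sketch of what it might look like and what it implies. The thread does not name the options, so the element names `max_wus_to_send` and `min_sendwork_interval` are an assumption about the relevant server-side settings; the values 5 and 600 seconds are the "5 results per scheduler RPC" and "10 minutes" from the proposal. The helper `max_results_per_hour` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: element names are assumed, values come from the thread.
CONFIG_FRAGMENT = """
<config>
  <max_wus_to_send>5</max_wus_to_send>
  <min_sendwork_interval>600</min_sendwork_interval>
</config>
"""


def max_results_per_hour(xml_text: str) -> float:
    """Upper bound on results one host could fetch per hour under these limits."""
    root = ET.fromstring(xml_text.strip())
    per_rpc = int(root.findtext("max_wus_to_send"))
    interval_s = int(root.findtext("min_sendwork_interval"))
    return per_rpc * 3600 / interval_s


print(max_results_per_hour(CONFIG_FRAGMENT))
```

With these values a single host could still collect up to 30 results an hour while work lasts; stretching the interval to 20 minutes, as River suggested, would halve that to 15.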
ID: 15200