Message boards : Number crunching : Surely a pointless request

Ian Thompson
Joined: 18 Sep 04
Posts: 35
Credit: 60,866
RAC: 0
Message 8314 - Posted: 5 Jul 2005, 18:21:09 UTC

Hi everyone

I have just been checking my logs and found this.

05/07/2005 18:14:02|LHC@home|Finished upload of wjun2A_v6s4hvpac_mqx__1__64.264_59.274__4_6__6__85_1_sixvf_boinc970_0_0
05/07/2005 18:14:02|LHC@home|Throughput 20453 bytes/sec
05/07/2005 18:14:04|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
05/07/2005 18:14:04|LHC@home|Requesting 0 seconds of work, returning 1 results
05/07/2005 18:14:05|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
05/07/2005 18:31:29|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
05/07/2005 18:31:29|LHC@home|Requesting 1 seconds of work, returning 0 results
05/07/2005 18:31:31|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
05/07/2005 18:31:32|LHC@home|Started download of wjun2A_v6s4hvpac_mqx__11__64.258_59.268__6_8__6__40_1_sixvf_boinc14850.zip

Surely you could set a minimum level of work to be requested? This request was a waste of time and resources.

Grenadier
Joined: 2 Sep 04
Posts: 39
Credit: 441,128
RAC: 0
Message 8315 - Posted: 5 Jul 2005, 19:35:08 UTC

The minimum is one work unit. The client doesn't really get just 1 second of work; it gets at least that many seconds of work.

Paul D. Buck
Joined: 2 Sep 04
Posts: 545
Credit: 148,912
RAC: 0
Message 8316 - Posted: 5 Jul 2005, 22:00:54 UTC
Last modified: 5 Jul 2005, 22:02:38 UTC

A one-second request is equivalent to asking for one work unit. And I updated the wiki to say so ...

Thierry Van Driessche
Joined: 1 Sep 04
Posts: 157
Credit: 82,604
RAC: 0
Message 8317 - Posted: 5 Jul 2005, 22:51:07 UTC - in response to Message 8316.  
Last modified: 5 Jul 2005, 22:51:48 UTC

> A one-second request is equivalent to asking for one work unit. And I updated
> the wiki to say so ...

I agree with this, Paul, and just saw something similar:

06/07/2005 00:49:06|LHC@home|Started upload of wjun2A_v6s4hvpac_mqx__5__64.26_59.27__8_10__6__55_1_sixvf_boinc6438_3_0
06/07/2005 00:49:09|LHC@home|Finished upload of wjun2A_v6s4hvpac_mqx__5__64.26_59.27__8_10__6__55_1_sixvf_boinc6438_3_0
06/07/2005 00:49:09|LHC@home|Throughput 25361 bytes/sec
06/07/2005 00:49:11|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
06/07/2005 00:49:11|LHC@home|Requesting 0 seconds of work, returning 1 results
06/07/2005 00:49:12|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
06/07/2005 00:49:51|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
06/07/2005 00:49:51|LHC@home|Requesting 15 seconds of work, returning 0 results
06/07/2005 00:49:52|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
06/07/2005 00:49:54|LHC@home|Started download of wjun2A_v6s4hvpac_mqx__5__64.25_59.26__10_12__6__20_1_sixvf_boinc5768.zip
06/07/2005 00:49:55|LHC@home|Finished download of wjun2A_v6s4hvpac_mqx__5__64.25_59.26__10_12__6__20_1_sixvf_boinc5768.zip
06/07/2005 00:49:55|LHC@home|Throughput 97079 bytes/sec
06/07/2005 00:49:55||request_reschedule_cpus: files downloaded

tannengruen
Joined: 27 Sep 04
Posts: 37
Credit: 4,496
RAC: 0
Message 8328 - Posted: 6 Jul 2005, 13:21:24 UTC

I think both of you are getting Ian wrong. I think he meant this when talking about a "waste of time and resources":

06/07/2005 00:49:11|LHC@home|Requesting 0 seconds of work, returning 1 results
06/07/2005 00:49:12|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded

Do you see? Requesting zero seconds of work is indeed quite a senseless task, all the more so because the scheduler is asked for more seconds right afterwards. So there are always two requests sent to the scheduler, which IS a waste of resources...

Ageless
Joined: 18 Sep 04
Posts: 143
Credit: 27,645
RAC: 0
Message 8329 - Posted: 6 Jul 2005, 13:40:11 UTC - in response to Message 8328.  

Since the scheduler doesn't take up any discernible cycles from the CPU, it doesn't take up any resources. What you are seeing is the new scheduler working correctly:

First, a finished unit is uploaded;
Then the scheduler automatically reports this work unit; it won't request work at the same time.

It then recalculates the debt for all the projects it is attached to;
If the long term debt is too low (negative), it won't ask for new work.
If the long term debt is zero or above, it will ask for new work.

Which is what it is doing in Thierry's example.
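
The sequence described above can be sketched roughly as follows. This is only an illustration of the behaviour as described in this post, not the BOINC client's actual code; the `Project` class and both function names are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Project:
    # Hypothetical stand-in for a project the client is attached to.
    name: str
    long_term_debt: float  # negative debt: no new work is requested
    results_to_report: int = 0


def report_request(project):
    """First contact: report finished results and request no work."""
    returned = project.results_to_report
    project.results_to_report = 0
    return f"Requesting 0 seconds of work, returning {returned} results"


def work_request(project, wanted_seconds):
    """Second contact: ask for work only if the long term debt is not negative."""
    if project.long_term_debt < 0:
        return None  # no work request is made
    return f"Requesting {wanted_seconds} seconds of work, returning 0 results"


lhc = Project("LHC@home", long_term_debt=0.0, results_to_report=1)
print(report_request(lhc))    # the report-only contact
print(work_request(lhc, 15))  # the later work request
```

The point of the split is that reporting a result and asking for work are two separate decisions, which is why two log entries appear.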
Jord


tannengruen
Joined: 27 Sep 04
Posts: 37
Credit: 4,496
RAC: 0
Message 8330 - Posted: 6 Jul 2005, 14:45:33 UTC
Last modified: 6 Jul 2005, 14:46:57 UTC

>If the long term debt is too low (negative), it won't ask for new work.

So what's the use of sending a request to the scheduler then?
And what's the benefit of this new solution? I can't see any...
I only see that there are two requests sent to the scheduler, most of the time only half a minute separated, where the old 4.19 client only sent one request.

Please have a look at the times:

05/07/2005 18:14:04|LHC@home|Requesting 0 seconds of work, returning 1 results
05/07/2005 18:31:29|LHC@home|Requesting 1 seconds of work, returning 0 results

06/07/2005 00:49:11|LHC@home|Requesting 0 seconds of work, returning 1 results
06/07/2005 00:49:51|LHC@home|Requesting 15 seconds of work, returning 0 results

And it's always like this with the 4.45 client; I see the same here.
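
For anyone who wants to measure the gap between the paired requests quoted above, here is a small sketch that parses the four log lines and prints the delay between each "report" request and the following "work" request. It assumes the `DD/MM/YYYY HH:MM:SS|project|message` layout seen in these logs; the function names are my own.

```python
from datetime import datetime

LOG = """\
05/07/2005 18:14:04|LHC@home|Requesting 0 seconds of work, returning 1 results
05/07/2005 18:31:29|LHC@home|Requesting 1 seconds of work, returning 0 results
06/07/2005 00:49:11|LHC@home|Requesting 0 seconds of work, returning 1 results
06/07/2005 00:49:51|LHC@home|Requesting 15 seconds of work, returning 0 results
"""


def parse_line(line):
    """Split one log line into (timestamp, project, message)."""
    stamp, project, message = line.split("|", 2)
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S"), project, message


def gaps_between_pairs(log_text):
    """Seconds between each report request and the work request after it."""
    entries = [parse_line(l) for l in log_text.splitlines() if l.strip()]
    gaps = []
    # Lines are taken two at a time: report request, then work request.
    for (t_report, _, _), (t_fetch, _, _) in zip(entries[0::2], entries[1::2]):
        gaps.append((t_fetch - t_report).total_seconds())
    return gaps


print(gaps_between_pairs(LOG))  # [1045.0, 40.0]
```

So the two pairs are 17 minutes 25 seconds and 40 seconds apart.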

Ageless
Joined: 18 Sep 04
Posts: 143
Credit: 27,645
RAC: 0
Message 8331 - Posted: 6 Jul 2005, 15:11:02 UTC - in response to Message 8330.  
Last modified: 6 Jul 2005, 15:11:52 UTC

> I only see that there are two requests sent to the scheduler, most of the time
> only half a minute separated, where the old 4.19 client only sent one
> request.

The use & benefit:
CC4.19 wouldn't report the unit automatically. You'd have to go to the Work tab, right-click on the unit and Update Manually to report it, or wait for the once-per-24-hours automatic scheduler contact to come by. If something happened in the meantime, you'd lose the unit.

CC4.45 does it on its own. As I stated above, the first scheduler request is only there to report the unit; no work is requested. The LTD (Long Term Debt) is then recalculated, and only if it's zero or above will new work be requested. Otherwise you'd see "Requesting 0 seconds of work, returning 0 results." Those are just status messages.


Jord


Ian Thompson
Joined: 18 Sep 04
Posts: 35
Credit: 60,866
RAC: 0
Message 8332 - Posted: 6 Jul 2005, 19:54:10 UTC
Last modified: 6 Jul 2005, 20:31:09 UTC

Hi All

I mean there seems to be a lot of thinking and requesting going on.

I accept that a request for work could be 1 second long, but it appeared to take a lot longer to decide. Why not just wait an hour before loading more work, and set a minimum cache level as well as a maximum?

I'm not sure the scheduler is working the way you expect.

I seem to have a few more curious log events.

05/07/2005 23:20:24|SETI@home|Finished upload of 29ja04ab.18903.13649.4826.40_1_0
05/07/2005 23:20:24|SETI@home|Throughput 75392 bytes/sec
05/07/2005 23:20:26|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
05/07/2005 23:20:26|SETI@home|Requesting 0 seconds of work, returning 1 results
05/07/2005 23:20:29|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
05/07/2005 23:20:32|SETI@home|Finished download of 15au03aa.8379.26353.786080.118
05/07/2005 23:20:32|SETI@home|Throughput 22914 bytes/sec
05/07/2005 23:20:32||request_reschedule_cpus: files downloaded
05/07/2005 23:20:32|SETI@home|Pausing result 18se03aa.24250.28097.204836.94_0 (removed from memory)
05/07/2005 23:20:32|LHC@home|Starting result wjun2A_v6s4hvpac_mqx__8__64.268_59.278__10_12__6__45_1_sixvf_boinc11281_1 using sixtrack version 4.67
05/07/2005 23:20:36||Suspending work fetch because computer is overcommitted.
05/07/2005 23:20:36||Using earliest-deadline-first scheduling because computer is overcommitted.
05/07/2005 23:49:18|The Lattice Project|Deferring communication with project for 1 hours, 16 minutes, and 19 seconds
06/07/2005 00:08:28||request_reschedule_cpus: process exited
06/07/2005 00:08:28|LHC@home|Computation for result wjun2A_v6s4hvpac_mqx__11__64.259_59.269__6_8__6__5_1_sixvf_boinc14911_2 finished
06/07/2005 00:08:28||Allowing work fetch again.
06/07/2005 00:08:28||Resuming round-robin CPU scheduling.
06/07/2005 00:08:28|SETI@home|Resuming result 18se03aa.24250.28097.204836.94_0 using setiathome version 4.18
06/07/2005 00:08:29||May run out of work in 2.00 days; requesting more
06/07/2005 00:08:29|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
06/07/2005 00:08:29|SETI@home|Requesting 17248 seconds of work, returning 0 results
06/07/2005 00:08:29|LHC@home|Started upload of wjun2A_v6s4hvpac_mqx__11__64.259_59.269__6_8__6__5_1_sixvf_boinc14911_2_0
06/07/2005 00:08:30|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
06/07/2005 00:08:31|SETI@home|Started download of 15au03aa.8379.26785.934666.124
06/07/2005 00:08:32|LHC@home|Finished upload of wjun2A_v6s4hvpac_mqx__11__64.259_59.269__6_8__6__5_1_sixvf_boinc14911_2_0
06/07/2005 00:08:32|LHC@home|Throughput 19849 bytes/sec
06/07/2005 00:08:34|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
06/07/2005 00:08:34|LHC@home|Requesting 0 seconds of work, returning 1 results
06/07/2005 00:08:36|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
06/07/2005 00:08:40|SETI@home|Finished download of 15au03aa.8379.26785.934666.124
06/07/2005 00:08:40|SETI@home|Throughput 41535 bytes/sec
06/07/2005 00:08:40||request_reschedule_cpus: files downloaded
06/07/2005 00:08:40|LHC@home|Pausing result wjun2A_v6s4hvpac_mqx__8__64.268_59.278__10_12__6__45_1_sixvf_boinc11281_1 (removed from memory)
06/07/2005 00:08:40|SETI@home|Starting result 18se03aa.603.30737.698572.235_1 using setiathome version 4.18
06/07/2005 00:08:41||request_reschedule_cpus: process exited
06/07/2005 00:08:48|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
06/07/2005 00:08:48|SETI@home|Requesting 14352 seconds of work, returning 0 results
06/07/2005 00:08:50|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
06/07/2005 00:08:50|SETI@home|Message from server: Not sending work - last RPC too recent: 19 sec
06/07/2005 00:08:50|SETI@home|No work from project
06/07/2005 00:08:51|SETI@home|Deferring communication with project for 10 minutes and 4 seconds
06/07/2005 00:08:51|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
06/07/2005 00:08:51|LHC@home|Requesting 15420 seconds of work, returning 0 results
06/07/2005 00:08:53|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
06/07/2005 00:08:54|LHC@home|Started download of wjun2A_v6s4hvpac_mqx__11__64.26_59.27__4_6__6__70_1_sixvf_boinc14975.zip


Note a communication to Lattice was sent in the middle of an extremely brief overcommitted cycle.
Quite why it thinks I am overcommitted is another question.

Better still, if 1 second of work is all I needed to request to collect a work unit, what will the scheduler think if I got 1000 hours of work back?


Pete49
Joined: 18 Sep 04
Posts: 35
Credit: 250,303
RAC: 0
Message 8336 - Posted: 6 Jul 2005, 21:18:24 UTC - in response to Message 8332.  

> I seem to have a few more curious log events.
>
> 05/07/2005 23:49:18|The Lattice Project|Deferring communication with project
> for 1 hours, 16 minutes, and 19 seconds
>
> Note a communication to Lattice was sent in the middle of an extremely brief
> overcommitted cycle.


That is not a scheduler request to Lattice. Rather, it is a locally generated line of information. Some time before that, a request WAS made to Lattice and no work was available. Communication was then deferred for an extended period of time. Every hour after that request, the Client will let you know how much time is left on the deferred request to Lattice. If you look one hour earlier you will find a 2:16:19 message and an hour later a 00:16:19 message.
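
The countdown arithmetic can be sketched like this. It works backwards from the one line quoted above (1:16:19 left at 23:49:18) and assumes the client simply prints the time remaining until a fixed expiry; the function name and that assumption are mine, not taken from the client source.

```python
from datetime import datetime, timedelta

FMT = "%d/%m/%Y %H:%M:%S"


def remaining_at(known_time, known_remaining, query_time):
    """Time left on a deferral at query_time, given one known countdown message."""
    expiry = known_time + known_remaining  # moment the deferral ends
    return max(expiry - query_time, timedelta(0))


# The message quoted above: 1 hours, 16 minutes, and 19 seconds left at 23:49:18.
seen_at = datetime.strptime("05/07/2005 23:49:18", FMT)
left = timedelta(hours=1, minutes=16, seconds=19)

print(remaining_at(seen_at, left, seen_at - timedelta(hours=1)))  # 2:16:19
print(remaining_at(seen_at, left, seen_at + timedelta(hours=1)))  # 0:16:19
```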



Ian Thompson
Joined: 18 Sep 04
Posts: 35
Credit: 60,866
RAC: 0
Message 8337 - Posted: 6 Jul 2005, 21:32:22 UTC
Last modified: 6 Jul 2005, 21:33:52 UTC

Hi All

Peter, you're right: one hour earlier it made a request.

05/07/2005 22:49:15|The Lattice Project|Sending scheduler request to http://aspartate.umiacs.umd.edu/lattice_public_cgi/cgi
05/07/2005 22:49:15|The Lattice Project|Requesting 345600 seconds of work, returning 0 results
05/07/2005 22:49:16|The Lattice Project|Scheduler request to http://aspartate.umiacs.umd.edu/lattice_public_cgi/cgi succeeded
05/07/2005 22:49:16|The Lattice Project|Message from server: No work available
05/07/2005 22:49:16|The Lattice Project|No work from project
05/07/2005 22:49:18|The Lattice Project|Deferring communication with project for 2 hours, 16 minutes, and 20 seconds
05/07/2005 23:20:07||request_reschedule_cpus: process exited
05/07/2005 23:20:07|SETI@home|Computation for result 29ja04ab.18903.13649.4826.40_1 finished
05/07/2005 23:20:07|SETI@home|Starting result 18se03aa.24250.28097.204836.94_0 using setiathome version 4.18
05/07/2005 23:20:08|SETI@home|Started upload of 29ja04ab.18903.13649.4826.40_1_0
05/07/2005 23:20:13|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
05/07/2005 23:20:13|SETI@home|Requesting 16047 seconds of work, returning 0 results
05/07/2005 23:20:14|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
05/07/2005 23:20:15|SETI@home|Started download of 15au03aa.8379.26353.786080.118
05/07/2005 23:20:24|SETI@home|Finished upload of 29ja04ab.18903.13649.4826.40_1_0
05/07/2005 23:20:24|SETI@home|Throughput 75392 bytes/sec
05/07/2005 23:20:26|SETI@home|Sending scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi
05/07/2005 23:20:26|SETI@home|Requesting 0 seconds of work, returning 1 results
05/07/2005 23:20:29|SETI@home|Scheduler request to http://setiboinc.ssl.berkeley.edu/sah_cgi/cgi succeeded
05/07/2005 23:20:32|SETI@home|Finished download of 15au03aa.8379.26353.786080.118
05/07/2005 23:20:32|SETI@home|Throughput 22914 bytes/sec


Why wasn't it honoured, deferring communications for 2 hours, 16 minutes and 20 seconds?


Pete49
Joined: 18 Sep 04
Posts: 35
Credit: 250,303
RAC: 0
Message 8338 - Posted: 6 Jul 2005, 21:41:53 UTC - in response to Message 8337.  

> Hi All
>
> Peter, you're right: one hour earlier it made a request.
>
> 05/07/2005 22:49:15|The Lattice Project|Sending scheduler request to
> http://aspartate.umiacs.umd.edu/lattice_public_cgi/cgi
> 05/07/2005 22:49:15|The Lattice Project|Requesting 345600 seconds of work,
> returning 0 results
> 05/07/2005 22:49:16|The Lattice Project|Scheduler request to
> http://aspartate.umiacs.umd.edu/lattice_public_cgi/cgi succeeded
> 05/07/2005 22:49:16|The Lattice Project|Message from server: No work
> available
> 05/07/2005 22:49:16|The Lattice Project|No work from project
> 05/07/2005 22:49:18|The Lattice Project|Deferring communication with project
> for 2 hours, 16 minutes, and 20 seconds

> Why wasn't it honoured, deferring communications for 2 hours, 16 minutes and
> 20 seconds?
>
Lattice is in Alpha Test. There is no work for anyone! I suspect a summer hiatus at the University explains the lack of work.

Someone better than I (Paul?) would have to explain how the BOINC Client decides just how long to defer a request. On my home cruncher, I have seen Lattice get deferred for upwards of 2 days! It's SOP, nothing to worry about.

Pete49
Joined: 18 Sep 04
Posts: 35
Credit: 250,303
RAC: 0
Message 8339 - Posted: 6 Jul 2005, 21:54:21 UTC - in response to Message 8330.  

> So what's the use of sending a request to the scheduler then?
> And what's the benefit of this new solution? I can't see any...
> I only see that there are two requests sent to the scheduler, most of the time
> only half a minute separated, where the old 4.19 client only sent one
> request.
>

Your messages deal with requests to LHC. LHC is OVERESTIMATING the time to completion by a factor of 3 to 4, depending on the local computer.

What's happening is this...

V4.45 defaults to returning (reporting) work immediately. That's the "request 0, return 1".

The new workunit begins to crunch and, within a couple of minutes, 13 hours to completion becomes 3 or 4 hours. Now your queue is short, so the scheduler requests the appropriate number of seconds to fill the queue again but has no completed work to return. That's the "request xxx seconds, returning 0".

This behavior is an artifact of LHC's lousy time estimate.
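
A rough sketch of that effect, under stated assumptions: the 8-hour queue target below is a made-up number for illustration (the real value depends on your preferences), and `seconds_to_request` is a hypothetical helper, not a BOINC function.

```python
HOUR = 3600


def seconds_to_request(target_queue_s, estimated_remaining_s):
    """Shortfall between the wanted queue length and the work already queued."""
    queued = sum(estimated_remaining_s)
    return max(0, target_queue_s - queued)


target = 8 * HOUR  # hypothetical queue target

# Right after download, the workunit is estimated at 13 hours: queue looks full.
print(seconds_to_request(target, [13 * HOUR]))   # 0

# A few minutes into crunching, the estimate has collapsed to 3.5 hours.
print(seconds_to_request(target, [3.5 * HOUR]))  # 16200.0
```

Nothing about the workunit changed between the two calls except the estimate, which is enough to trigger a second scheduler request.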



If LHC

John McLeod VII
Joined: 2 Sep 04
Posts: 165
Credit: 146,925
RAC: 0
Message 8340 - Posted: 7 Jul 2005, 2:22:52 UTC - in response to Message 8339.  

> This behavior is an artifact of LHC's lousy time estimate.

5.00 has code that slowly corrects for lousy time estimates. Every time that a WU takes less time than expected, a correction factor is reduced a little. (If the WU takes more time than expected, the correction factor is increased a great deal).
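
To illustrate the asymmetry, here is a minimal sketch. The 0.1 weight and the "jump straight up" rule are assumptions chosen for the example, not values taken from the 5.00 source, and `update_correction` is a hypothetical name.

```python
def update_correction(factor, estimated_s, actual_s, down_weight=0.1):
    """Return the new correction factor after one finished workunit.

    The corrected estimate is factor * estimated_s. If the workunit ran
    longer than that, the factor jumps up at once; if it ran shorter,
    the factor only drifts down by a small step.
    """
    ratio = actual_s / estimated_s
    if ratio > factor:
        return ratio  # took longer than expected: large, immediate increase
    return factor + down_weight * (ratio - factor)  # took less: small decrease


factor = 1.0
# A workunit estimated at 13 hours that actually takes 3.5 hours.
for _ in range(3):
    factor = update_correction(factor, 13 * 3600, 3.5 * 3600)
    print(round(factor, 4))
```

With these assumed numbers the factor creeps down over many workunits, while a single slow workunit would push it straight back up.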



Pete49
Joined: 18 Sep 04
Posts: 35
Credit: 250,303
RAC: 0
Message 8341 - Posted: 7 Jul 2005, 3:14:31 UTC - in response to Message 8340.  

> > This behavior is an artifact of LHC's lousy time estimate.
>
> 5.00 has code that slowly corrects for lousy time estimates. Every time that
> a WU takes less time than expected, a correction factor is reduced a little.
> (If the WU takes more time than expected, the correction factor is increased a
> great deal).
>

Unfortunately, V5.0 is a major version change that is supported by SETI Alpha only.


That feature is in pre-alpha V4.49. I've given it a try and was not happy with it.



Paul D. Buck
Joined: 2 Sep 04
Posts: 545
Credit: 148,912
RAC: 0
Message 8344 - Posted: 7 Jul 2005, 11:02:03 UTC

My experience with 4.45 is good enough that, as a sign of confidence, all my systems are running 4.45 (well, on OS X I can only run 4.43) ...

John McLeod VII
Joined: 2 Sep 04
Posts: 165
Credit: 146,925
RAC: 0
Message 8352 - Posted: 8 Jul 2005, 4:02:00 UTC

The problem of major version changes blocking connections was supposed to have been fixed. The fix did not work as expected, so 5.00 has been pulled and replaced with 4.70.


©2024 CERN