Message boards : Number crunching : Server not reporting "No Work from project"

Brian Silvers
Joined: 3 Jan 07
Posts: 124
Credit: 7,065
RAC: 0
Message 18296 - Posted: 18 Oct 2007, 23:52:26 UTC

Main page says no work. Scheduler contacts give the following:

10/18/2007 7:17:44 PM|lhcathome|Sending scheduler request: To fetch work
10/18/2007 7:17:44 PM|lhcathome|Requesting 2526 seconds of new work
10/18/2007 7:17:49 PM|lhcathome|Scheduler RPC succeeded [server version 505]
10/18/2007 7:17:49 PM|lhcathome|Deferring communication for 30 min 18 sec
10/18/2007 7:17:49 PM|lhcathome|Reason: requested by project
10/18/2007 7:48:09 PM|lhcathome|Sending scheduler request: To fetch work
10/18/2007 7:48:09 PM|lhcathome|Requesting 2550 seconds of new work
10/18/2007 7:48:14 PM|lhcathome|Scheduler RPC succeeded [server version 505]
10/18/2007 7:48:14 PM|lhcathome|Deferring communication for 30 min 18 sec
10/18/2007 7:48:14 PM|lhcathome|Reason: requested by project
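To compare deferrals across sessions it helps to turn the "Deferring communication for ..." text into a number of seconds. The following is a minimal sketch, assuming the `timestamp|project|message` layout seen in the log above; `deferral_seconds` is a hypothetical helper name, not part of BOINC.

```python
import re

# Seconds per unit, for the unit names that appear in the log lines in this thread.
_UNIT_SECONDS = {"hr": 3600, "min": 60, "sec": 1}


def deferral_seconds(log_line):
    """Return the deferral in seconds, or None if the line is not a deferral.

    Hypothetical helper: assumes lines look like "timestamp|project|message".
    """
    message = log_line.split("|", 2)[-1]
    prefix = "Deferring communication for "
    if not message.startswith(prefix):
        return None
    total = 0
    # Pick up every "<number> <unit>" pair, e.g. "30 min" and "18 sec".
    for value, unit in re.findall(r"(\d+) (hr|min|sec)", message[len(prefix):]):
        total += int(value) * _UNIT_SECONDS[unit]
    return total


line = "10/18/2007 7:17:49 PM|lhcathome|Deferring communication for 30 min 18 sec"
print(deferral_seconds(line))  # 1818
```

So the 30 min 18 sec deferral requested by the project works out to 1818 seconds.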

Brian Silvers
Joined: 3 Jan 07
Posts: 124
Credit: 7,065
RAC: 0
Message 18300 - Posted: 19 Oct 2007, 2:41:31 UTC

OK... It's getting a bit strange... You'll notice that within the same session there are two deferrals, with the "no work from project" one coming second. The next scheduler contact has only the one entry, "requested by project", and goes back to the 30-minute interval instead of the longer backoff...

10/18/2007 9:49:50 PM|lhcathome|Sending scheduler request: To fetch work
10/18/2007 9:49:50 PM|lhcathome|Requesting 2960 seconds of new work
10/18/2007 9:49:55 PM|lhcathome|Scheduler RPC succeeded [server version 505]
10/18/2007 9:49:55 PM|lhcathome|Deferring communication for 30 min 18 sec
10/18/2007 9:49:55 PM|lhcathome|Reason: requested by project
10/18/2007 9:49:55 PM|lhcathome|Deferring communication for 41 min 9 sec
10/18/2007 9:49:55 PM|lhcathome|Reason: no work from project
10/18/2007 10:31:06 PM|lhcathome|Sending scheduler request: To fetch work
10/18/2007 10:31:06 PM|lhcathome|Requesting 3001 seconds of new work
10/18/2007 10:31:11 PM|lhcathome|Scheduler RPC succeeded [server version 505]
10/18/2007 10:31:11 PM|lhcathome|Deferring communication for 30 min 18 sec
10/18/2007 10:31:11 PM|lhcathome|Reason: requested by project

PovAddict
Joined: 14 Jul 05
Posts: 275
Credit: 49,291
RAC: 0
Message 18301 - Posted: 19 Oct 2007, 3:26:07 UTC

The two deferral messages are a client bug. Upgrade to a newer version.

Brian Silvers
Joined: 3 Jan 07
Posts: 124
Credit: 7,065
RAC: 0
Message 18302 - Posted: 19 Oct 2007, 3:33:44 UTC - in response to Message 18301.  

The two deferral messages are a client bug. Upgrade to a newer version.


Hmmm... This only started as of today... Why would a client bug manifest itself after several months of operation? Do you have a Trac bug number and a way I can look at the bug log? I used to be able to browse the list without logging in, but I can't seem to find how I did that...

PovAddict
Joined: 14 Jul 05
Posts: 275
Credit: 49,291
RAC: 0
Message 18303 - Posted: 19 Oct 2007, 3:54:42 UTC - in response to Message 18302.  
Last modified: 19 Oct 2007, 4:08:28 UTC

Hmmm... This only started as of today... Why would a client bug manifest itself after several months of operation? Do you have a Trac bug number and a way I can look at the bug log?

I don't think it was reported on Trac. Devs probably found it themselves without any user reporting it.

Anyway, the latest version's logs look like this:
19/10/2007 00:45:07|boincsimap|Sending scheduler request: To fetch work.  Requesting 5556 seconds of work, reporting 0 completed tasks
19/10/2007 00:45:22|boincsimap|Scheduler request succeeded: got 1 new tasks
19/10/2007 00:45:32|boincsimap|Sending scheduler request: To fetch work.  Requesting 1151 seconds of work, reporting 0 completed tasks
19/10/2007 00:45:42|boincsimap|Scheduler request succeeded: got 1 new tasks
Yep, doesn't show deferral at all. Not a bug, it was removed on purpose. I already gave up on trying to get the change reverted. Complain yourself.

I used to be able to browse the list without logging in, but I can't seem to find how I did that...

And even though I have permission to do *whatever* on the wiki (I helped migrate the old documentation into the wiki, so I even have delete permissions) and *whatever* with the bug tickets, I suddenly lost even read permission for the source code and changesets. So I think somebody recently broke Trac permissions. Wouldn't surprise me if tickets are now blocked for everybody, or if only spammers can access the wiki :P


Brian Silvers
Joined: 3 Jan 07
Posts: 124
Credit: 7,065
RAC: 0
Message 18304 - Posted: 19 Oct 2007, 3:59:51 UTC - in response to Message 18303.  
Last modified: 19 Oct 2007, 4:06:01 UTC

Hmmm... This only started as of today... Why would a client bug manifest itself after several months of operation? Do you have a Trac bug number and a way I can look at the bug log?

I don't think it was reported on Trac. Devs probably found it themselves without any user reporting it.



Well, I decided to restart BOINC and see what happens. Next connect in 4 minutes. I'll edit this with the results and further comment...

Well, same thing... This is 5.8.16, so it's not super-old... I wasn't fond of what I was hearing about 5.10.x versions, so I hung back with 5.8.16... This is a minor nuisance...and I'd prefer to be able to see the deferral, so I'll just hang out for a while and see what happens...

Ageless
Joined: 18 Sep 04
Posts: 143
Credit: 27,645
RAC: 0
Message 18307 - Posted: 19 Oct 2007, 10:50:35 UTC - in response to Message 18303.  

I suddenly lost even read permission for the source code and changesets. So I think somebody recently broke Trac permissions.

Nah, it's super secret code that they are adding. Only visible on a need-to-know-basis. :-D
Jord

BOINC FAQ Service

Keith T.
Joined: 1 Mar 07
Posts: 47
Credit: 32,356
RAC: 0
Message 18312 - Posted: 19 Oct 2007, 13:45:23 UTC - in response to Message 18303.  

I used to be able to browse the list without logging in, but I can't seem to find how I did that...

And even though I have permission to do *whatever* on the wiki (I helped migrate the old documentation into the wiki, so I even have delete permissions) and *whatever* with the bug tickets, I suddenly lost even read permission for the source code and changesets. So I think somebody recently broke Trac permissions. Wouldn't surprise me if tickets are now blocked for everybody, or if only spammers can access the wiki :P


Matt Lebofsky, SETI@home server admin, also does work on the BOINC servers.

In his recent SAH Tech News he stated that he accidentally broke Trac a few days ago.


11 Oct 2007 23:28:33 UTC
I was going to get some programming done today but Dave needed php upgraded on the BOINC server, which was running Fedora Core 6. FC6 didn't have a sufficiently advanced php in its repositories, so this was as good a time as any to yum the system up to Fedora Core 7. This was slow, but worked like a charm.

Except I then realized the trac system (used for BOINC's web based public software development) was toasted due to the upgrade. It took over two hours of hair pulling, scouring log files, removing/reinstalling various software packages, poring through barely informative pages only found in Google's cache.. I don't really understand how what we ultimately did fixed the problem, but we seem to be out of the woods, more or less.

I hate to say it, but trac is written in python, and I've never had any positive experiences with this programming language. Every six months some random python program explodes as it is utterly sensitive to version upgrades, and tracing the problems is impossible as the code is difficult to read and scoured all over the system in vaguely named files. Others keep trying to convince me python is the bee's knees, but I just can't see it. I started out writing raw machine code on my Apple II+, so to me C is the pinnacle of programming languages (not C++). I'll shut up now before I further offend python programmers/developers.

- Matt


See http://setiathome.berkeley.edu/tech_news.php#134 and http://setiathome.berkeley.edu/forum_thread.php?id=42913 for more.

Keith.

PovAddict
Joined: 14 Jul 05
Posts: 275
Credit: 49,291
RAC: 0
Message 18313 - Posted: 19 Oct 2007, 15:10:29 UTC - in response to Message 18312.  

Matt Lebofsky, SETI@home server admin, also does work on the BOINC servers.

In his recent SAH Tech News he stated that he accidentally broke Trac a few days ago.

I saw that, and it was a few days ago. But the source permissions got broken just last night.

Viking69
Joined: 24 Jul 05
Posts: 56
Credit: 5,602,722
RAC: 4
Message 18391 - Posted: 27 Oct 2007, 20:27:42 UTC

One of my PCs, the Vista box using 5.10.20, is getting this message stream when left unattended and looking for work:

10/27/2007 1:05:23 PM|lhcathome|Message from server: Not sending work - last request too recent: 65 sec
10/27/2007 1:05:23 PM|lhcathome|Deferring communication for 1 min 0 sec
10/27/2007 1:05:23 PM|lhcathome|Reason: no work from project
10/27/2007 1:06:24 PM|lhcathome|Sending scheduler request: To fetch work
10/27/2007 1:06:24 PM|lhcathome|Requesting 1199 seconds of new work
10/27/2007 1:06:29 PM|lhcathome|Scheduler RPC succeeded [server version 505]
10/27/2007 1:06:29 PM|lhcathome|Message from server: Not sending work - last request too recent: 65 sec
10/27/2007 1:06:29 PM|lhcathome|Deferring communication for 1 min 44 sec
10/27/2007 1:06:29 PM|lhcathome|Reason: no work from project
10/27/2007 1:08:14 PM|lhcathome|Sending scheduler request: To fetch work
10/27/2007 1:08:14 PM|lhcathome|Requesting 1199 seconds of new work
10/27/2007 1:08:19 PM|lhcathome|Scheduler RPC succeeded [server version 505]
10/27/2007 1:08:19 PM|lhcathome|Message from server: Not sending work - last request too recent: 111 sec
10/27/2007 1:08:19 PM|lhcathome|Deferring communication for 1 min 0 sec
10/27/2007 1:08:19 PM|lhcathome|Reason: no work from project
10/27/2007 1:09:24 PM|lhcathome|Sending scheduler request: To fetch work
10/27/2007 1:09:24 PM|lhcathome|Requesting 1199 seconds of new work
10/27/2007 1:09:29 PM|lhcathome|Scheduler RPC succeeded [server version 505]
10/27/2007 1:09:29 PM|lhcathome|Message from server: Not sending work - last request too recent: 71 sec
10/27/2007 1:09:29 PM|lhcathome|Deferring communication for 5 min 16 sec
10/27/2007 1:09:29 PM|lhcathome|Reason: no work from project
10/27/2007 1:14:50 PM|lhcathome|Sending scheduler request: To fetch work
10/27/2007 1:14:50 PM|lhcathome|Requesting 1200 seconds of new work
10/27/2007 1:14:55 PM|lhcathome|Scheduler RPC succeeded [server version 505]
10/27/2007 1:14:55 PM|lhcathome|Message from server: Not sending work - last request too recent: 325 sec
10/27/2007 1:14:55 PM|lhcathome|Deferring communication for 19 min 10 sec
10/27/2007 1:14:55 PM|lhcathome|Reason: no work from project

I have detached and rebooted with the same result: "no work from project" because of too recent a request.

I am currently focusing on POEM and LHC on my home PCs, and this is the only one doing this. The other 3 PCs are XP (2) and 2k (1).
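The gaps between the server replies in the log above can be checked against the 30 min 18 sec (1818 s) deferral that the project hands out elsewhere in this thread. This is only a sketch: the timestamps are copied from the "Scheduler RPC succeeded" lines, and `gaps_in_seconds` is a made-up helper name.

```python
from datetime import datetime

# 30 min 18 sec, the deferral "requested by project" in the earlier logs.
MIN_INTERVAL = 30 * 60 + 18

# Timestamp layout used by the BOINC message log lines in this thread.
FMT = "%m/%d/%Y %I:%M:%S %p"

# Times of the "Scheduler RPC succeeded" lines from the log above.
rpc_times = [
    "10/27/2007 1:06:29 PM",
    "10/27/2007 1:08:19 PM",
    "10/27/2007 1:09:29 PM",
    "10/27/2007 1:14:55 PM",
]


def gaps_in_seconds(stamps):
    """Return the whole-second gaps between consecutive timestamps."""
    times = [datetime.strptime(s, FMT) for s in stamps]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


gaps = gaps_in_seconds(rpc_times)
print(gaps)                                 # [110, 70, 326]
print(all(g < MIN_INTERVAL for g in gaps))  # True
```

The computed gaps are within a second of the 111, 71 and 325 seconds the server reports, and every one of them is far below 1818 seconds.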

Brian Silvers
Joined: 3 Jan 07
Posts: 124
Credit: 7,065
RAC: 0
Message 18393 - Posted: 27 Oct 2007, 21:26:46 UTC - in response to Message 18391.  

Your issue is one where the server *IS* reporting no work, so it might be beneficial to start a new post instead of using this thread...

FYI, the current minimum time between scheduler contacts is a little over 30 minutes, so you need to simply set that box to no new tasks for 31 minutes and then try again...

Brian

Daxa
Joined: 29 Dec 06
Posts: 100
Credit: 184,937
RAC: 0
Message 18395 - Posted: 27 Oct 2007, 22:35:55 UTC

Great advice from Brian Silvers! It would be nice if BOINC manager would always set the next try for 30 minutes, 18 seconds... but that's a subject for another thread.

Thanks again, Brian. I'm using the "no new tasks" technique and setting my wrist watch for 31 minutes... so simple, yet so smart.



_______

"Three quarks for Muster Mark!"
. . . . . . . - James Joyce, Finnegans Wake . . . .


Daxa
Joined: 29 Dec 06
Posts: 100
Credit: 184,937
RAC: 0
Message 18396 - Posted: 27 Oct 2007, 22:48:14 UTC
Last modified: 27 Oct 2007, 22:49:08 UTC

...and yes, this issue is technically off topic (I just realized.)

If anyone has other questions about the "Server IS reporting No Work from Project saying that my last request was too recent" issue, start a new thread.

Cheers!


_______

"Three quarks for Muster Mark!"
. . . . . . . - James Joyce, Finnegans Wake . . . .


Brian Silvers
Joined: 3 Jan 07
Posts: 124
Credit: 7,065
RAC: 0
Message 18397 - Posted: 27 Oct 2007, 22:56:44 UTC - in response to Message 18395.  
Last modified: 27 Oct 2007, 23:03:28 UTC

It would be nice if BOINC manager would always set the next try for 30 minutes, 18 seconds... but that's a subject for another thread.

Thanks again, Brian. I'm using the "no new tasks" technique and setting my wrist watch for 31 minutes... so simple, yet so smart.


What delay gets set for the next communication attempt depends on various conditions that I don't know or understand. IMO, with the current work quota set so low, I don't see the need for the backoff anyway... I would think that the current 30-minute policy probably increased the load on the scheduler.

Edit: OK, I figure perhaps the 30-minute backoff was to try to get reporting done before forcing the 24-hour backoff. Dunno what to say. I'll check to see what happens with the 4 units I pick up over the next 2 hours, but I think the 30-minute rule doesn't get the final 2 results reported until tomorrow unless I hit the update button myself...

Brian

Brian Silvers
Joined: 3 Jan 07
Posts: 124
Credit: 7,065
RAC: 0
Message 18404 - Posted: 28 Oct 2007, 0:49:03 UTC - in response to Message 18397.  



Edit: OK, I figure perhaps the 30-minute backoff was to try to get reporting done before forcing the 24-hour backoff. Dunno what to say. I'll check to see what happens with the 4 units I pick up over the next 2 hours, but I think the 30-minute rule doesn't get the final 2 results reported until tomorrow unless I hit the update button myself...


Yep, that's exactly what happened. I had one short-running result, so it got reported on the contact that happened 30 minutes after the download of the first two results. The 2nd result finished about 40 minutes after it was first downloaded, so it went up in the report just a few minutes ago. However, here's what the scheduler did:

10/27/2007 8:41:44 PM|lhcathome|Sending scheduler request: To fetch work
10/27/2007 8:41:44 PM|lhcathome|Requesting 256151 seconds of new work, and reporting 1 completed tasks
10/27/2007 8:41:54 PM|lhcathome|Scheduler RPC succeeded [server version 505]
10/27/2007 8:41:54 PM|lhcathome|Message from server: No work sent
10/27/2007 8:41:54 PM|lhcathome|Message from server: (reached daily quota of 4 results)
10/27/2007 8:41:54 PM|lhcathome|Deferring communication for 22 hr 27 min 48 sec
10/27/2007 8:41:54 PM|lhcathome|Reason: requested by project

So, to report the remaining 2 results, I'll have to manually push the update button or wait until the next contact some 22 hours away...
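For reference, the next automatic contact implied by that deferral can be worked out directly. A small sketch, assuming the client simply adds the deferral to the local time of the server reply (plain wall-clock arithmetic with no time zone or DST handling); `next_contact` is a hypothetical helper.

```python
from datetime import datetime, timedelta

# Timestamp layout used by the BOINC message log lines in this thread.
FMT = "%m/%d/%Y %I:%M:%S %p"


def next_contact(reply_stamp, hours=0, minutes=0, seconds=0):
    """Return the local time at which the deferral runs out."""
    reply_time = datetime.strptime(reply_stamp, FMT)
    return reply_time + timedelta(hours=hours, minutes=minutes, seconds=seconds)


# "Deferring communication for 22 hr 27 min 48 sec", logged at 8:41:54 PM.
when = next_contact("10/27/2007 8:41:54 PM", hours=22, minutes=27, seconds=48)
print(when.strftime("%m/%d/%Y %H:%M:%S"))  # 10/28/2007 19:09:42
```

That puts the next scheduled contact at about 7:10 PM the following evening unless the update button is pressed first.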

Brian

Viking69
Joined: 24 Jul 05
Posts: 56
Credit: 5,602,722
RAC: 4
Message 18405 - Posted: 28 Oct 2007, 1:04:19 UTC - in response to Message 18404.  

OK, so I misread the post header. But still, the backoff times were not caused by reaching the daily quota; I had seen that before. The systems were unattended and I was not controlling the wait times between communication events; it was all down to the software and the server. If I go back far enough in the logs I will find an item similar to the one shown above. When I looked at the web site after the last message I saw that there was work, about 90,000 units at that time and rising as I refreshed.

Brian Silvers
Joined: 3 Jan 07
Posts: 124
Credit: 7,065
RAC: 0
Message 18407 - Posted: 28 Oct 2007, 1:21:13 UTC - in response to Message 18405.  

I understood your issue. I was merely pointing out that this thread's subject does not reflect your actual issue.

Anyway, to further explain what's going on in your case, the project has implemented a rule that says that no computer can contact the server more than once in any 30 minute and 18 second interval. If BOINC decides it is going to connect again in 1 minute, then I believe that your 30 minutes and 18 seconds are reset. In other words, if you were 20 minutes into the 30 minute wait and your system contacted the scheduler, you would then be told to wait 30 more minutes, not 10 more. This behavior is different from the 24-hour quota timer, which appears to keep track of specifically when the results were sent and adjusts accordingly.
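The reset behavior described above can be sketched as a toy model. This is an illustration of the belief stated in the post, not the actual scheduler code: `SchedulerModel` and its `contact` method are invented names, and the assumption that every contact (accepted or refused) restarts the clock is exactly the part that is unconfirmed.

```python
# 30 min 18 sec, the minimum interval quoted in this thread.
MIN_INTERVAL = 30 * 60 + 18


class SchedulerModel:
    """Toy model of a per-host minimum interval between work requests."""

    def __init__(self, min_interval=MIN_INTERVAL):
        self.min_interval = min_interval
        self.last_contact = None  # time of the previous contact, in seconds

    def contact(self, now):
        """Return True if a request at time `now` (seconds) may receive work."""
        elapsed = None if self.last_contact is None else now - self.last_contact
        # Assumption: every contact restarts the clock, even a refused one.
        self.last_contact = now
        return elapsed is None or elapsed >= self.min_interval


server = SchedulerModel()
print(server.contact(0))     # True: first contact
print(server.contact(1200))  # False: only 20 minutes in, and the clock restarts
print(server.contact(1818))  # False: just 618 s since the early contact
print(server.contact(3636))  # True: a full 1818 s since the previous contact
```

Under this model, a host that would have been fine at 1818 s is refused there because of the early contact at 1200 s, which is why holding off completely for 31 minutes works.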

I don't know if anything can be done about BOINC deciding to automatically connect again in less than the limit set by LHC. It is a different issue from the one I was originally reporting, which is why I'm suggesting that you create a new thread to discuss it.

Brian