Thread 'I think we should restrict work units'

Author	Message
Osku87 Send message Joined: 2 Nov 05 Posts: 21 Credit: 162,484 RAC: 681	Message 13878 - Posted: 4 Jun 2006, 14:10:38 UTC - in response to Message 13876. Thanks to your clear explanation, I raised my cach for .01 to 10 days. And yup, As soon as there was work to do, I was able to get a bunch of it to work on. Please, don't do that. Engouraging people to keep that kind of a cache without a proper reason on fast computers will bring only a problems. As we have seen servers at Cern (or where ever they are) won't stand the excessive amount of downloads what around 80 computers downloading WUs for ten days cause. So don't come here whining when server is down and you wan't download work. You know the reason exactly. ID: 13878 · Reply Quote

Philip Martin Kryder Send message Joined: 21 May 06 Posts: 73 Credit: 8,710 RAC: 0	Message 13882 - Posted: 4 Jun 2006, 20:25:45 UTC - in response to Message 13878. Thanks to your clear explanation, I raised my cach for .01 to 10 days. And yup, As soon as there was work to do, I was able to get a bunch of it to work on. Please, don't do that. Engouraging people to keep that kind of a cache without a proper reason on fast computers will bring only a problems. As we have seen servers at Cern (or where ever they are) won't stand the excessive amount of downloads what around 80 computers downloading WUs for ten days cause. So don't come here whining when server is down and you wan't download work. You know the reason exactly. So, you are saying that you have direct knowledge that the servers are so badly configured that they FAIL rather than throttle back connections to a level that they can handle? Can you tell us where and how you learned this? If your assertion is true, shouldn't it be handled more directly by reconfiguring the servers rather than expecting 65k crunchers to configure their machines in some special way? And by the way, I was never intending to be "...Engouraging [sic] people to keep that kind of a cache..." I think that YOU should set YOUR cache to .01 and leave it there. You should NEVER raise YOUR cache above .1. By the way, what do you mean by "...a proper reason???" ID: 13882 · Reply Quote

MB Atlanos Send message Joined: 14 Jul 05 Posts: 11 Credit: 81,274 RAC: 0	Message 13890 - Posted: 5 Jun 2006, 18:38:31 UTC - in response to Message 13882. Last modified: 5 Jun 2006, 18:53:45 UTC Please, don't do that. Engouraging people to keep that kind of a cache without a proper reason on fast computers will bring only a problems. As we have seen servers at Cern (or where ever they are) won't stand the excessive amount of downloads what around 80 computers downloading WUs for ten days cause. So don't come here whining when server is down and you wan't download work. You know the reason exactly. So, you are saying that you have direct knowledge that the servers are so badly configured that they FAIL rather than throttle back connections to a level that they can handle? Can you tell us where and how you learned this? If your assertion is true, shouldn't it be handled more directly by reconfiguring the servers rather than expecting 65k crunchers to configure their machines in some special way? And by the way, I was never intending to be "...Engouraging [sic] people to keep that kind of a cache..." I think that YOU should set YOUR cache to .01 and leave it there. You should NEVER raise YOUR cache above .1. By the way, what do you mean by "...a proper reason???" (End of Quote) First: calm down, don't take everything so personally, and stop shouting and making demands at other people who express their dislike of your behavior. Your reaction is also not very constructive. Please don't try to be sarcastic; it's not your best skill. ;) Second: search the forum. There are several reports that too many connections have kicked the DB-driven websites, and also the forum, to nirvana. IIRC 50+ became critical; we saw that a few days ago with the last batch of work. In case you haven't noticed: over the last few months the normal LHC user has seen very little activity from an admin, and currently there is no project admin at all. Look at the relevant forum posts and the answers from chrulle, your former admin. 
Oh, one more thing: please stop wasting space with your quoting; not everyone likes unnecessary scrolling. ;) Sidenote: before you go ballistic at this post, observe the emoticons. ID: 13890 · Reply Quote

Osku87 Send message Joined: 2 Nov 05 Posts: 21 Credit: 162,484 RAC: 681	Message 13891 - Posted: 5 Jun 2006, 19:46:16 UTC - in response to Message 13890. By the way, what do you mean by "...a proper reason???" Like being a modem user (or anyone else on a dial-up line billed by the minute). I'm sorry if someone took the tone of my message too seriously. It was meant to wake some guys up... ;) As MB_Atlanos said, there have been a couple of times when too much server load has turned the servers upside down. ID: 13891 · Reply Quote

ksba Send message Joined: 27 Sep 04 Posts: 40 Credit: 1,742,415 RAC: 0	Message 13897 - Posted: 7 Jun 2006, 22:25:35 UTC Hi, I'm not the fastest user, I'm user no. 2 :) I have about 100 PCs working on LHC. I have set LHC to 90% and Rosetta to 10%, and I set the cache to "0.1 Day". If there is fresh work, I get 1 WU on a normal PC, 2 WUs on HT machines and 4 on very fast dual-cores. If too many people fill their buffers and make the rest of us wait many, many days, it's a shame. If you have a flat rate and are always online, please set it as low as possible!! I'd like to get work on the second day too, and not only in the first 5 hours :) Do I need to set it to 5 days? (joke!) ID: 13897 · Reply Quote

David Lahr Send message Joined: 27 Dec 05 Posts: 7 Credit: 461,367 RAC: 0	Message 13898 - Posted: 8 Jun 2006, 5:26:35 UTC - in response to Message 13897. Well, I'm sold, consider my cache increased to maximum size! ID: 13898 · Reply Quote

John Hunt Send message Joined: 13 Jul 05 Posts: 133 Credit: 162,641 RAC: 0	Message 13910 - Posted: 9 Jun 2006, 22:05:33 UTC As I post this (23.00 hrs UK time) the front page of the LHC site shows that there are still 340 WUs out there somewhere, still unprocessed. This is days after I (and I'm sure many other people!) had returned the final WU unit from the last batch of WUs to be issued. So those 340 WUs are being held in the cache of some irresponsible crunchers who 'hoard' work. This must ultimately slow down the whole LHC project! Jeez - when will some people learn that we are working for LHC, not LHC working for our egos..... ID: 13910 · Reply Quote

Philip Martin Kryder Send message Joined: 21 May 06 Posts: 73 Credit: 8,710 RAC: 0	Message 13914 - Posted: 10 Jun 2006, 7:15:42 UTC - in response to Message 13910. As I post this (23.00 hrs UK time) the front page of the LHC site shows that there are still 340 WUs out there somewhere, still unprocessed. This is days after I (and I'm sure many other people!) had returned the final WU unit from the last batch of WUs to be issued. So those 340 WUs are being held in the cache of some irresponsible crunchers who 'hoard' work. This must ultimately slow down the whole LHC project! Jeez - when will some people learn that we are working for LHC, not LHC working for our egos..... how do you know that the "hoarders" didn't have a "proper" reason such as a slow modem or dial up line? ID: 13914 · Reply Quote

John Hunt Send message Joined: 13 Jul 05 Posts: 133 Credit: 162,641 RAC: 0	Message 13916 - Posted: 10 Jun 2006, 8:15:24 UTC - in response to Message 13862. Matt - I want to thank you for taking the time to post this and start this thread. Prior to your having done so, I was have difficulty getting work units to run for LHC. Thanks to your clear explanation, I raised my cach for .01 to 10 days. And yup, As soon as there was work to do, I was able to get a bunch of it to work on. Again, thanks for your help in showing us how to get the maximum number of work units to process. Phil 'Nuff said....... ID: 13916 · Reply Quote

Trog Dog Send message Joined: 25 Nov 05 Posts: 39 Credit: 41,119 RAC: 0	Message 13917 - Posted: 10 Jun 2006, 10:28:33 UTC - in response to Message 13914. how do you know that the "hoarders" didn't have a "proper" reason such as a slow modem or dial up line? or an older machine, or one that doesn't crunch 24/7? ID: 13917 · Reply Quote

Dronak Send message Joined: 19 May 06 Posts: 20 Credit: 297,111 RAC: 0	Message 13918 - Posted: 10 Jun 2006, 14:15:28 UTC - in response to Message 13917. how do you know that the "hoarders" didn't have a "proper" reason such as a slow modem or dial up line? or an older machine, or one that doesn't crunch 24/7? That's possible, sure, and I don't think people would complain much if people have a legitimate reason for using a large cache. One way to check this would be to find some work units that are still pending, then look at the computer's details. Some of the information there must give you an idea what the computer is like in terms of age, speed, percent of the time BOINC runs, network connectivity, etc. That should help you decide if the person is keeping a large cache for a legitimate reason or not. The on and off work status of LHC is new to me, so it's tough for me to understand exactly what's going on, why, and how to solve the problems. I do know that it's disappointing to see so many work units in progress, waiting to be done, while my computer has been sitting dry for over 1 week. I think many others share this feeling, and that's one of the reasons for the complaints. We suspect that not everybody has a proper reason for maintaining a huge cache of work, and the people who don't have a proper reason are just being greedy, taking work from others for themselves and preventing work from being done promptly. ID: 13918 · Reply Quote

Philip Martin Kryder Send message Joined: 21 May 06 Posts: 73 Credit: 8,710 RAC: 0	Message 13919 - Posted: 10 Jun 2006, 15:04:15 UTC - in response to Message 13918. While I can acknowledge and appreciate your feelings, I'm surprised to find them so common in a scientific study. Instead of casting wide nets of guilt by innuendo, based on intuition and emotion, why not gather the data that you describe? Some folks might even consider gathering data to support a hypothesis before posting. As for deciding, based on the attributes of the machine or network, whether any given cache size is "legitimate": who am I, or who is anyone, to decide what "legitimate" is? I reject the underlying premise that "greed is bad" while having a slow connection or slow computer somehow makes it "proper" to have a large cache. A person who was less "greedy" might buy a faster computer or a faster Internet connection or a second phone line. Would that then make them "good"? What if they bought a larger computer and then increased their cache? Would they then be "greedy" or would they be "good"? In summary: 1) LHC could control hoarding by limiting deadlines. 2) LHC could control server crashes by limiting max connections relative to known server capacity. 3) If the use of large caches is "improper" or "illegitimate", then this is the type of feedback that the BOINC folks need in order to establish the need for new and better controls and algorithms in BOINC. 4) There is lots of other great science to do on other projects when LHC runs dry. Crunch on! how do you know that the "hoarders" didn't have a "proper" reason such as a slow modem or dial up line? or an older machine, or one that doesn't crunch 24/7? That's possible, sure, and I don't think people would complain much if people have a legitimate reason for using a large cache. One way to check this would be to find some work units that are still pending, then look at the computer's details. 
Some of the information there must give you an idea what the computer is like in terms of age, speed, percent of the time BOINC runs, network connectivity, etc. That should help you decide if the person is keeping a large cache for a legitimate reason or not. The on and off work status of LHC is new to me, so it's tough for me to understand exactly what's going on, why, and how to solve the problems. I do know that it's disappointing to see so many work units in progress, waiting to be done, while my computer has been sitting dry for over 1 week. I think many others share this feeling, and that's one of the reasons for the complaints. We suspect that not everybody has a proper reason for maintaining a huge cache of work, and the people who don't have a proper reason are just being greedy, taking work from others for themselves and preventing work from being done promptly. ID: 13919 · Reply Quote

KWSN - A Shrubbery Send message Joined: 3 Jan 06 Posts: 14 Credit: 32,201 RAC: 0	Message 13921 - Posted: 10 Jun 2006, 16:53:38 UTC - in response to Message 13914. As I post this (23.00 hrs UK time) the front page of the LHC site shows that there are still 340 WUs out there somewhere, still unprocessed. This is days after I (and I'm sure many other people!) had returned the final WU unit from the last batch of WUs to be issued. So those 340 WUs are being held in the cache of some irresponsible crunchers who 'hoard' work. This must ultimately slow down the whole LHC project! Jeez - when will some people learn that we are working for LHC, not LHC working for our egos..... how do you know that the "hoarders" didn't have a "proper" reason such as a slow modem or dial up line? Proper reason or not, the WUs that are currently being processed are ones whose original deadline has already passed. Even the last WU sent from the recent batch would have timed out by now. Granted, things happen and sometimes it's not possible to return results, but I seriously doubt this is the case with the majority of the current backlog. I don't have an issue with people having enough work to keep them running, but manipulating the program to the point of work missing the deadline is excessive. ID: 13921 · Reply Quote

m.mitch Send message Joined: 4 Sep 05 Posts: 112 Credit: 2,318,981 RAC: 15	Message 13922 - Posted: 10 Jun 2006, 17:23:00 UTC Last modified: 10 Jun 2006, 17:23:49 UTC I don't know. I think it all may be a bit harsh. I fit into the category of un-returned results. It happens. BOINC restarted on one of my PCs and hung for a while, then came up empty. I shut it down and it seemed to restart okay. But later, when checking LHC, I found a number of unfinished WUs that should have timed out by now. It wouldn't be the only scenario, and it would take too many users for that problem to add up, but that's why LHC has so much redundancy. If it were a problem the project would be modified. I'm doing the best I can; sometimes I make mistakes. Click here to join the #1 Aussie Alliance on LHC. ID: 13922 · Reply Quote

Steve Cressman Send message Joined: 28 Sep 04 Posts: 47 Credit: 6,394 RAC: 0	Message 13931 - Posted: 11 Jun 2006, 5:42:11 UTC If we were not still waiting for the stragglers to get the work done we could possibly be already working on another batch. It is not right that there are still units to be done when others attached to the project have been idle for a week or more. I think a solution that the project could do is to decrease the daily quota. And I would bet that when the right number for the quota was found they would get all the results much faster than they do now. It might take a bit of trial and error for them to find the right amount to set it at, but it most definitely would be better than having the majority of the host computers sitting idle asking for work. When I say idle, I mean in regard to this project, because hopefully this is not the only project they are attached to. 98SE XP2500+ @ 2.1 GHz Boinc v5.8.8 ID: 13931 · Reply Quote

Trog Dog Send message Joined: 25 Nov 05 Posts: 39 Credit: 41,119 RAC: 0	Message 13936 - Posted: 11 Jun 2006, 7:36:46 UTC - in response to Message 13931. I think a solution that the project could do is to decrease the daily quota. I can see it now: all those super crunchers complaining that they have hit the daily limit (refer to the wailing and gnashing of teeth about the curse of 32 over at Einstein) and how it's slowing down the science having their machines sitting dry. It's not fair that their super machines are being penalised because they are too fast. Whichever way you cut it, slice it or dice it, the work is sporadic; someone somewhere will always run out of work before someone else. ID: 13936 · Reply Quote

bowlingguy300 Send message Joined: 1 Sep 04 Posts: 14 Credit: 3,857 RAC: 0	Message 13939 - Posted: 11 Jun 2006, 12:01:57 UTC Spreading the project around to EVERYONE would speed up the process so much that I don't understand why they allow this. A small number of people take all the work units, and we (the ones who don't have their machines turned up to receive more than they should) have to wait a week for them to finish all that they have. How can they say that this is faster than having all of us do the work? You can only process one work unit at a time per machine, and so many are sitting... waiting... without! So come on guys, please turn your stuff back to default and let US ALL DO THIS PROJECT! Stop the Greed! And Share! ID: 13939 · Reply Quote
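
A rough way to put numbers on the claim in the post above: the toy model below hands out one batch of work units to identical hosts, once with a small cache and once with a 10-day cache. Everything in it is an assumption made for illustration (the function name `days_to_finish`, the batch size, the host count, the per-host speed). It ignores deadlines, redundancy and hosts that are not always on, and it is not how the BOINC scheduler actually works.

```python
def days_to_finish(total_wus, rates, cache_days):
    """Toy model: days until every work unit (WU) in one batch is crunched.

    total_wus  -- size of the batch (hypothetical number)
    rates      -- WUs each host can crunch per day, in request order
    cache_days -- how many days of work every host tries to keep queued
    """
    if cache_days < 1 or any(rate < 1 for rate in rates):
        raise ValueError("cache_days and every rate must be at least 1")

    pool = total_wus                 # WUs still waiting on the server
    queues = [0] * len(rates)        # WUs sitting in each host's cache
    days = 0

    while pool > 0 or any(queues):
        # Each host tops its cache up to rate * cache_days, first come first served.
        for i, rate in enumerate(rates):
            want = rate * cache_days - queues[i]
            got = max(0, min(want, pool))
            queues[i] += got
            pool -= got
        # Each host then crunches up to one day's worth of its queued work.
        for i, rate in enumerate(rates):
            queues[i] -= min(rate, queues[i])
        days += 1

    return days


# Hypothetical batch: 1000 WUs, 100 identical hosts, 10 WUs per host per day.
hosts = [10] * 100
print("1-day cache: ", days_to_finish(1000, hosts, cache_days=1), "day(s)")
print("10-day cache:", days_to_finish(1000, hosts, cache_days=10), "day(s)")
```

With these made-up numbers the 1-day cache spreads the batch over all 100 hosts and it is done in 1 day. With the 10-day cache the first 10 hosts take the whole batch, the other 90 get nothing, and the batch takes 10 days. Real batches, hosts and cache settings are far less uniform, so treat this only as a picture of the argument, not as a measurement.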

1fast6 Send message Joined: 2 Jun 06 Posts: 5 Credit: 245,858 RAC: 0	Message 13940 - Posted: 11 Jun 2006, 14:03:37 UTC Last modified: 11 Jun 2006, 14:07:06 UTC Well, I joined the project over a week ago, just to watch all of my machines sit idle... very frustrating... I wonder if the admins would consider setting it up to continuously release work, so that the first 3 results returned that validate get full credit (I assume this project uses a triple-validation system for credit) and the stragglers get less (or is that a BOINC-controlled issue?)... It would seem that would get them the results they need quicker and give us all something to do, while rewarding those who get the work in quickest (and not rewarding those who hoard work)... it's painful to watch the servers wait for a timeout for less than 30 workunits to be returned... ID: 13940 · Reply Quote

Osku87 Send message Joined: 2 Nov 05 Posts: 21 Credit: 162,484 RAC: 681	Message 13943 - Posted: 11 Jun 2006, 17:56:20 UTC 1fast6: Haven't you read that the project's big idea for the moment is to NOT release work all the time? Does it make any sense to waste computer time on meaningless crunching? ID: 13943 · Reply Quote

Philip Martin Kryder Send message Joined: 21 May 06 Posts: 73 Credit: 8,710 RAC: 0	Message 13945 - Posted: 11 Jun 2006, 18:45:45 UTC - in response to Message 13940. ... it's painful to watch the servers wait for a timeout for less than 30 workunits to be returned... Do we really know that the servers are waiting for stragglers? Or could they be waiting for their own analysis to complete in order to build the next work units based on the work completed? ID: 13945 · Reply Quote