Message boards : Number crunching : I think we should restrict work units
Author | Message |
---|---|
Joined: 19 May 06 Posts: 20 Credit: 297,111 RAC: 0 |
"I notice that this is slowed down by a minority of users who set their caches to maximum. When the number of work units available hits zero, we still have to wait a week or more while the people who grab a maximum number of units empty their cache before the scientists can even begin the analyzing process." I'm sure MattDavis or someone will correct me if I'm wrong, but I thought the original post, quoted in part here, was saying that you *shouldn't* max out your cache. Doing that means you get a lot of work, true. But it also means that the work gets done slower, because you're sitting on work that other people with a lower cache (getting work as they complete it) could be doing. Leaving some computers dry is not the best way to get work done promptly. It slows down the process and makes everyone wait longer to get more work. Wasn't that the whole point behind the original post and subject of limiting work units? To make sure that everyone gets a fair share, not to have some people hogging work for themselves while others' computers get left dry? |
Joined: 21 May 06 Posts: 73 Credit: 8,710 RAC: 0 |
"I notice that this is slowed down by a minority of users who set their caches to maximum. When the number of work units available hits zero, we still have to wait a week or more while the people who grab a maximum number of units empty their cache before the scientists can even begin the analyzing process." Hmm - you mean that there may have been unintended consequences from starting this thread? Even so, I'm thankful for the idea. |
Joined: 2 Nov 05 Posts: 21 Credit: 105,075 RAC: 0 |
"Thanks to your clear explanation, I raised my cache from .01 to 10 days." Please, don't do that. Encouraging people to keep that kind of a cache on fast computers, without a proper reason, will only bring problems. As we have seen, the servers at CERN (or wherever they are) won't stand the excessive number of downloads that around 80 computers each fetching ten days' worth of WUs cause. So don't come here whining when the server is down and you can't download work. You'll know exactly what the reason is. |
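To put rough numbers on that 80-computer claim - a back-of-the-envelope sketch only, with the per-host crunch rate and per-WU download size invented purely for illustration, since nobody in the thread gives real figures:

```python
# Rough size of the download burst when ~80 hosts each pull ten days of
# work at once. All figures below are guesses for illustration.

hosts = 80
cache_days = 10
wus_per_host_day = 8          # assumed crunch rate per host
wu_download_mb = 0.5          # assumed input size per WU

wus_requested = hosts * cache_days * wus_per_host_day
burst_mb = wus_requested * wu_download_mb
print(f"{wus_requested} WUs requested in one burst, ~{burst_mb:.0f} MB of downloads")
# -> 6400 WUs requested in one burst, ~3200 MB of downloads
```

Even with conservative guesses, the requests arrive as one burst when a batch is released, which would fit the reports elsewhere in this thread of the DB-driven pages going down.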
Joined: 21 May 06 Posts: 73 Credit: 8,710 RAC: 0 |
"Thanks to your clear explanation, I raised my cache from .01 to 10 days." So, you are saying that you have direct knowledge that the servers are so badly configured that they FAIL rather than throttle back connections to a level that they can handle? Can you tell us where and how you learned this? If your assertion is true, shouldn't it be handled more directly, by reconfiguring the servers, rather than by expecting 65k crunchers to configure their machines in some special way? And by the way, I was never intending to be "...Engouraging [sic] people to keep that kind of a cache..." I think that YOU should set YOUR cache to .01 and leave it there. You should NEVER raise YOUR cache above .1. By the way, what do you mean by "...a proper reason"??? |
Joined: 14 Jul 05 Posts: 11 Credit: 81,274 RAC: 0 |
First: calm down, don't take everything too personally, and stop shouting at or making demands of other people who express their dislike of your behavior. Your reaction also is not very constructive - please don't try to be sarcastic, it's not your best skill. ;) Second: search the forum; there are several reports that too many connections have kicked the DB-driven websites, and the forum too, into nirvana. IIRC 50+ became critical; we saw that a few days ago with the last batch of work. In case you haven't noticed: in the last few months the average LHC user has not seen very much activity from an admin, and currently there are no project admins at all. Look at the appropriate forum posts and the answers from chrulle, your former admin. Oh, one more thing: please stop wasting space in your quotes; not everyone likes unnecessary scrolling. ;) Sidenote: before you go ballistic at this post - observe the emoticons. |
Joined: 2 Nov 05 Posts: 21 Credit: 105,075 RAC: 0 |
"By the way, what do you mean by "...a proper reason"???" Like being a modem user (or anyone else on a dial-up line with a per-minute fee). I'm sorry if someone took the tone of my message too seriously. It was meant to wake some guys up... ;) As MB_Atlanos said, there have been a couple of times when too much server load has turned the servers upside down. |
Joined: 27 Sep 04 Posts: 40 Credit: 1,742,415 RAC: 0 |
Hi, I'm not the fastest user, I'm user no. 2 :) I have about 100 PCs working on LHC. I have set LHC to 90% and Rosetta to 10%, and I set the cache to "0.1 days". If there is fresh work, I get 1 WU on a normal PC, 2 WUs on HT machines, and 4 on very fast dual-cores. If too many people fill their buffers up and make the rest of us wait many, many days, it's a shame. If you have a flat rate and are always online, please set it as low as possible!! I'd like to get work on the second day too, and not only in the first 5 hours :) Do I need to set it to 5 days? (joke!) |
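For a feel of why a 0.1-day cache behaves like that, here is a toy estimate only - the real BOINC work-fetch logic is more involved, and the 4-hour WU runtime is an assumed figure:

```python
# Toy estimate: a client roughly asks for enough work to fill
# (cache days) x (resource share) x (number of cores).
# Not the actual BOINC work-fetch algorithm.

def wus_requested(cache_days, share, cores, wu_hours=4.0):
    seconds = cache_days * 86400 * share * cores
    return seconds / (wu_hours * 3600)  # convert seconds of work to WU count

print(f"0.1-day cache, 90% share, 1 core : ~{wus_requested(0.1, 0.9, 1):.1f} WUs")
print(f"0.1-day cache, 90% share, 2 cores: ~{wus_requested(0.1, 0.9, 2):.1f} WUs")
print(f"10-day cache,  90% share, 2 cores: ~{wus_requested(10, 0.9, 2):.0f} WUs")
```

Under these assumptions a 0.1-day cache asks for roughly one WU per core, while a 10-day cache on the same machine asks for over a hundred.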
Joined: 27 Dec 05 Posts: 7 Credit: 461,367 RAC: 0 |
Well, I'm sold, consider my cache increased to maximum size! |
Joined: 13 Jul 05 Posts: 133 Credit: 162,641 RAC: 0 |
As I post this (23.00 hrs UK time) the front page of the LHC site shows that there are still 340 WUs out there somewhere, still unprocessed. This is days after I (and, I'm sure, many other people!) returned the final WU from the last batch to be issued. So those 340 WUs are being held in the cache of some irresponsible crunchers who 'hoard' work. This must ultimately slow down the whole LHC project! Jeez - when will some people learn that we are working for LHC, not LHC working for our egos... |
Joined: 21 May 06 Posts: 73 Credit: 8,710 RAC: 0 |
"As I post this (23.00 hrs UK time) the front page of the LHC site shows that there are still 340 WUs out there somewhere, still unprocessed. This is days after I (and I'm sure many other people!) had returned the final WU from the last batch of WUs to be issued." How do you know that the "hoarders" didn't have a "proper" reason, such as a slow modem or dial-up line? |
Joined: 13 Jul 05 Posts: 133 Credit: 162,641 RAC: 0 |
Matt - I want to thank you for taking the time to post this and start this thread. 'Nuff said....... |
Joined: 25 Nov 05 Posts: 39 Credit: 41,119 RAC: 0 |
or an older machine, or one that doesn't crunch 24/7? |
Joined: 19 May 06 Posts: 20 Credit: 297,111 RAC: 0 |
That's possible, sure, and I don't think people would complain much if people have a legitimate reason for using a large cache. One way to check would be to find some work units that are still pending, then look at the computer's details. Some of the information there should give you an idea what the computer is like in terms of age, speed, the percentage of time BOINC runs, network connectivity, etc. That should help you decide whether the person is keeping a large cache for a legitimate reason or not. The on-and-off work status of LHC is new to me, so it's tough for me to understand exactly what's going on, why, and how to solve the problems. I do know that it's disappointing to see so many work units in progress, waiting to be done, while my computer has been sitting dry for over a week. I think many others share this feeling, and that's one of the reasons for the complaints. We suspect that not everybody has a proper reason for maintaining a huge cache of work, and that the people who don't have a proper reason are just being greedy, taking work from others for themselves and preventing work from being done promptly. |
Joined: 21 May 06 Posts: 73 Credit: 8,710 RAC: 0 |
While I can acknowledge and appreciate your feelings, I'm surprised to find them so common in a scientific study. Instead of casting wide nets of guilt by innuendo, based on intuition and emotion, why not gather the data that you describe? Some folks might even consider gathering data to support a hypothesis **before** posting. As for deciding, based on the attributes of the machine or network, whether any given cache size is "legitimate": who am I, or who is anyone, to decide what "legitimate" is? I reject the underlying premise that somehow "greed is bad" but having a slow connection or a slow computer somehow makes a large cache "proper". A person who was less "greedy" might buy a faster computer, a faster Internet connection, or a second phone line. Would that then make them "good"? What if they bought a larger computer and then increased their cache - would they then be "greedy", or would they be "good"? In summary: 1) LHC could control hoarding by limiting deadlines (see the sketch after this post). 2) LHC could control server crashes by limiting max connections relative to known server capacity. 3) If the use of large caches is "improper" or "illegitimate", then this is the type of feedback that the BOINC folks need in order to establish the need for new and better controls and algorithms in BOINC. 4) There is lots of other great science to do on other projects when LHC runs dry. crunch on! |
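On point 1, a shorter deadline really does cap how long stragglers can hold a batch. A toy model, with all numbers invented (the thread doesn't give LHC's actual deadline or reissue behaviour):

```python
# Toy model: how the WU deadline caps a batch's turnaround time.
# Assumes a hoarded WU is returned when the hoard drains or, past the
# deadline, is reissued to a fast host at the cost of one extra round trip.
# Not the real LHC@home scheduler; numbers are for illustration only.

def batch_turnaround(deadline_days, hoarder_cache_days, fast_return_days=1.0):
    hoard_return = min(hoarder_cache_days, deadline_days + fast_return_days)
    return max(fast_return_days, hoard_return)

for deadline in (14, 7, 3):
    t = batch_turnaround(deadline_days=deadline, hoarder_cache_days=10)
    print(f"deadline {deadline:>2} days -> batch done in ~{t:.1f} days")
```

The trade-off, as later posts in this thread note, is that tight deadlines and quotas also penalise slow-but-honest hosts.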
Joined: 3 Jan 06 Posts: 14 Credit: 32,201 RAC: 0 |
"As I post this (23.00 hrs UK time) the front page of the LHC site shows that there are still 340 WUs out there somewhere, still unprocessed. This is days after I (and I'm sure many other people!) had returned the final WU from the last batch of WUs to be issued." Proper reason or not, the WUs that are currently outstanding are ones that have exceeded their original deadline. Even the last WU sent from the recent batch would have timed out by now. Granted, things happen and sometimes it's not possible to return results, but I seriously doubt this is the case with the majority of the current backlog. I don't have an issue with people having enough work to keep them running, but manipulating the program to the point of work missing the deadline is excessive. |
Joined: 4 Sep 05 Posts: 112 Credit: 2,068,660 RAC: 379 |
I don't know; I think it all may be a bit harsh. I fit into the category of un-returned results. It happens. BOINC restarted on one of my PCs, hung for a while, then came up empty. I shut it down and it seemed to restart okay, but later, when checking LHC, I found a number of unfinished WUs that should have timed out by now. It wouldn't be the only scenario, and it would take too many users for that problem to add up, but that's why LHC has so much redundancy. If it were a problem, the project would be modified. I'm doing the best I can; sometimes I make mistakes. Click here to join the #1 Aussie Alliance on LHC. |
Joined: 28 Sep 04 Posts: 47 Credit: 6,394 RAC: 0 |
If we were not still waiting for the stragglers to get the work done, we could possibly already be working on another batch. It is not right that there are still units to be done while others attached to the project have been idle for a week or more. One solution the project could adopt is to decrease the daily quota, and I would bet that when the right number for the quota was found, they would get all the results back much faster than they do now. It might take a bit of trial and error to find the right amount to set it at, but it most definitely would be better than having the majority of the host computers sitting idle, asking for work (a toy simulation of the idea follows below). When I say idle, I mean in regard to this project, because hopefully this is not the only project they are attached to. 98SE XP2500+ @ 2.1 GHz Boinc v5.8.8 |
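Here is that trial-and-error idea made concrete - a toy simulation, not the real BOINC scheduler, with an invented batch size and host mix:

```python
# Toy model of the daily-quota suggestion. A batch is handed out once:
# "hoarders" grab as many WUs as the quota allows, the rest is spread over
# normal hosts, and the batch is done when the last queued WU is returned.
# Every number is invented; real BOINC dispatch is more complex.

def batch_days(batch, hoarders, hoarder_rate, normals, normal_rate, quota):
    grabbed = min(batch, hoarders * quota)   # WUs locked up in hoarder queues
    rest = batch - grabbed                   # WUs left for everyone else
    hoarder_days = (grabbed / hoarders) / hoarder_rate if hoarders else 0.0
    normal_days = (rest / normals) / normal_rate
    return max(hoarder_days, normal_days)

for quota in (1000, 100, 20):
    d = batch_days(batch=10_000, hoarders=5, hoarder_rate=10,
                   normals=500, normal_rate=5, quota=quota)
    print(f"quota {quota:>4}/day -> batch finished in ~{d:.1f} days")
```

In this made-up scenario an unlimited quota lets five hosts stretch the batch out for months, while too small a quota starts to slow the normal hosts down - which is exactly the "right number" the post says would have to be found by trial and error.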
Joined: 25 Nov 05 Posts: 39 Credit: 41,119 RAC: 0 |
I can see it now: all those super-crunchers complaining that they have hit the daily limit (refer to the wailing and gnashing of teeth about the curse of 32 over at Einstein) and how it's slowing down the science to have their machines sitting dry. It's not fair that their super machines are being penalised because they are too fast. Whichever way you cut it, slice it or dice it, the work is sporadic; someone somewhere will always run out of work before someone else. |
Joined: 1 Sep 04 Posts: 14 Credit: 3,857 RAC: 0 |
Spreading the work around to EVERYONE would speed up the process so much that I don't understand why they allow this. A small number of people take all the work units, and we (the ones who don't have our machines turned up to receive more than they should) have to wait a week for them to finish all that they have. How can they say that this is faster than having all of us do the work? You can only process one work unit at a time per machine, and so many machines are sitting... waiting... without! So come on, guys, please turn your settings back to default and let US ALL DO THIS PROJECT! Stop the greed, and share! |
Joined: 2 Jun 06 Posts: 5 Credit: 245,858 RAC: 0 |
Well, I joined the project over a week ago, just to watch all of my machines sit idle... very frustrating... I wonder if the admins would consider setting it up to release work continuously, where the first 3 returned results that validate get full credit (I assume this project uses a triple-validation credit system) and the stragglers get less (or is that a BOINC-controlled issue)? It would seem that would get them the results they need quicker and give us all something to do, while rewarding those who get the work in quickest (and not rewarding those who hoard work). It's painful to watch the servers wait for a timeout on fewer than 30 work units... |
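What that proposed credit rule might look like - a sketch only; BOINC does not actually grant credit this way, and the result records below are made up:

```python
# Sketch of the rule proposed above: full credit to the first `quorum`
# valid returns, reduced credit to later (straggler) returns, nothing for
# invalid results. Purely illustrative; not BOINC's real credit system.

def grant_credit(results, full_credit, quorum=3, straggler_factor=0.5):
    """results: list of (host, return_day, valid) tuples."""
    valid = sorted((r for r in results if r[2]), key=lambda r: r[1])
    grants = {}
    for i, (host, day, _) in enumerate(valid):
        grants[host] = full_credit if i < quorum else full_credit * straggler_factor
    return grants

results = [("fast-1", 0.5, True), ("fast-2", 0.7, True), ("mid-1", 1.9, True),
           ("hoarder", 9.0, True), ("flaky", 1.0, False)]
print(grant_credit(results, full_credit=40.0))
# -> first three valid returns get 40.0, the late return gets 20.0,
#    the invalid result gets nothing
```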