61) Message boards : Number crunching : Newbie Needs Help (Message 11207)
Posted 5 Nov 2005 by Profile Gary Roberts
Post:
You have your computers hidden (a setting in your preferences on the website) so it is difficult for anyone to offer advice or give you help since we can't see what is going on.

If downloading is stuck, go to the "Transfers" tab in BOINC Manager and select the stuck transfer that you should be able to see there and then hit "Retry Now".

62) Message boards : Number crunching : Scheduler ignoring LHC (Message 11206)
Posted 5 Nov 2005 by Profile Gary Roberts
Post:
... I have noticed that LHC is never scheduled for execution unless I intervene.


Your "intervention" is one of the reasons why LHC is not being scheduled. You are trying to force BOINC to break your own rules and BOINC is resisting because it's committed to following the rules. If you want LHC to have a bigger slice of your resources then simply change the rules and give it a bigger resource share. If your intention is to share your resources equally between 5 projects then please leave BOINC alone and it will do exactly that for you.

I usually try rebooting (XP), resetting the LHC project in BOINC (5.2.6). What happens is if I turn my computer off for a while and I get close to the LHC WU deadline and start up again, round-robin scheduling is turned off and earliest-first starts. LHC then is scheduled and gets most of the CPU time. As soon as it reverts back to round-robin, it stops again.


What you are doing is crazy. Let's analyze it bit by bit.

1. Rebooting. BOINC has certain numbers. BOINC shuts down. BOINC restarts. BOINC has exactly the same numbers. Do you really expect BOINC to do anything different?

2. Resetting LHC. BOINC trashes all existing LHC work. BOINC gets new LHC work. Other projects are still "owed" so other projects will still run. All you have achieved is the trashing of perfectly good work. Resetting should be regarded as a "last resort" option when there is a "real" problem.

3. Leave computer off until close to LHC deadline. Boy, that's a good one :). A recipe for how to really screw things up!! You've just wasted perfectly good crunch time when LHC's "debt" could have been partly "repaid" to other projects and then forced BOINC to run LHC and accumulate even more debt with even longer to wait before LHC could run again. You need to go read about work scheduling and the concept of debt in the Wiki. You could start here.

4. ... back to round-robin, work stops. Well of course it does because LHC now has an even larger negative LTD than it did before. It's now going to be even longer before it is allowed to start again unless it has to go into panic mode again, which it probably will.
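
To make the "debt" idea in points 1 to 4 concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions, not BOINC's actual scheduler code: every project is credited CPU time in proportion to its resource share, the project that actually ran is debited the time it used, and round-robin simply picks whichever project is owed the most. The project names other than LHC and Predictor, the 10-hour panic run and the one-hour time slice are made up for illustration.

```python
# Toy model of long-term debt (LTD). Illustrative only, NOT BOINC's real code.

def update_debts(debts, shares, ran, seconds):
    """Credit every project its share of `seconds`; debit the one that ran."""
    total_share = sum(shares.values())
    new_debts = {}
    for name, debt in debts.items():
        entitled = seconds * shares[name] / total_share
        used = seconds if name == ran else 0
        new_debts[name] = debt + entitled - used
    return new_debts


def pick_next(debts):
    """Round-robin stand-in: run the project that is owed the most time."""
    return max(debts, key=debts.get)


def slices_until_runs(debts, shares, project, slice_seconds, limit=10_000):
    """Count time slices given to other projects before `project` runs again."""
    count = 0
    while pick_next(debts) != project and count < limit:
        debts = update_debts(debts, shares, pick_next(debts), slice_seconds)
        count += 1
    return count


# Five projects with equal shares (names partly hypothetical).
shares = {"LHC": 20, "Predictor": 20, "ProjectC": 20, "ProjectD": 20, "ProjectE": 20}
debts = {name: 0.0 for name in shares}

# Deadline panic: LHC is forced to run for 10 hours straight.
after_panic = update_debts(debts, shares, "LHC", 10 * 3600)

# Back to round-robin in one-hour slices: how long until LHC runs again?
waited = slices_until_runs(after_panic, shares, "LHC", 3600)
print(after_panic["LHC"] / 3600, "hours of debt;", waited, "slices before LHC runs")
```

In this toy setup, forcing LHC to run for 10 hours leaves it 8 hours in the red, and the other four projects then run for 40 one-hour slices before LHC gets another turn. Real BOINC is more involved, but the direction of the effect is the same.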

How do you solve this?? Well for starters, leave BOINC alone as the others have suggested. However, the best thing you can do is reduce your "connect to network" preference setting to 0.1 days or less for the moment and allow BOINC to drain all your excessive caches. BOINC will always stabilize more quickly if it's not trying all the time to deal with excessive work. As work completes, you can "update" particular projects to report the completed work if you wish but BOINC will handle everything on its own if you allow it. Once it has managed the crisis, BOINC will settle into normal round-robin scheduling according to your resource shares and that is the only place you should make changes if you wish to give a particular project more or less work. When things return to normal (which might take a week or two) you could then start gradually increasing your "connect to network" preference so as to start keeping a bit more work on hand. Firstly go from 0.1 to 0.3 for two days and see how that looks. Then try 0.6 for a few more days and see if BOINC is able to comfortably maintain round-robin scheduling without having to resort to EDF. If BOINC has to start invoking EDF then you have probably gone too far with your "connect to network" setting.

While you are waiting patiently for all the above to happen, start reading all the accumulated wisdom in the Wiki so you better understand what is going on.

Good luck!!



Things were going fine until a large work download of Predictor.


A sure sign that your "Connect to Network" interval is way too large.

Another point. You have ended up with two computer IDs for your machine, probably as a result of your "resetting". At some stage you should merge these using the function at the bottom of the old computer's page on the website.

Your newer computer ID shows 16 results in progress, another sign that your cache is too large. Do you have the same list of results under the work tab of BOINC Manager as is visible on the website?

63) Message boards : Number crunching : When will LHC upgrade so we can use 5.2.2 ? (Message 10835)
Posted 22 Oct 2005 by Profile Gary Roberts
Post:
<blockquote>I just wish the #2 guy in Australia would upgrade to 5.x.x. He's going to overtake me in a few days..bwaaaa...bwaaaa :(

Live long and crunch.
</blockquote>

Hey Paul, my ears have been burning all afternoon and I just knew there was someone somewhere casting all sorts of evil spells and incantations, whilst trying to impede my progress :). Now that I've found you, I think I'll go change my 60/40 EAH/LHC resource share to 80/20 the other way just to bring the axe down a little less gently :).

Actually mate, I'll tell ya what I'll do :). I'll cruise along for a bit until I'm right on your tailgate and then I'll lower the LHC share to 30 and then to 20. Should be just enough to still keep slowly overtaking. I might have to bump it up to 25 to keep right on your hammer :). Should be a real gas!!! :). Then, just when you are about to be first to 100K, I'll put the foot down and pip you on the post - just like a Melbourne Cup finish!! :).

Seriously, I was interested in what you had to say about your 5 year old son. I've got a 2 year old grandson who is really sharp and a real delight to play with. I reckon I'm going to have a ball inciting his interest in the wonders of the cosmos too, just like you and your son.

Good on you Paul, and take care of that little "Gas Giant". The country needs plenty more of 'em.

64) Message boards : Number crunching : Claimed credit changed? (Message 10612)
Posted 7 Oct 2005 by Profile Gary Roberts
Post:
The file deleter is pretty aggressive I think. Results get uploaded, a quorum is formed and it doesn't seem to take that long for the results to then be removed from the online database. I guess you would really notice it if yours was the last result in. It'd probably disappear quite quickly.
65) Message boards : Number crunching : Reason for not getting work: won't finish in time... (Message 10591)
Posted 5 Oct 2005 by Profile Gary Roberts
Post:
<blockquote>
3 pieces of information would be of use in helping people identify what you're seeing:
</blockquote>

Actually if you click on his name and view his computers you get all three items for free :).

It would be nice however if Kamukwam came back and told us whether he finally understood what MikeW was telling him. I thought it was clear enough the first time but even clearer in the second attempt. I'd like to ask the user if he could actually leave the machine on a bit more. It's a bit hard for multiple projects to grab much of a slice of the time if the numbers are as MikeW calculated. Maybe his machines have been off for quite a while and now that they are running again BOINC is just doing its normal thing of settling into a stable operating pattern.

This thought actually got my attention and having now looked more closely, the above would seem to be the case because his #2 ranked computer was attached back in Feb but now has a results list going back only to Oct 02 and his #1 ranked computer, attached Aug 24, only goes back to Sep 28. The earlier data said LHC had a 25% share so he must be running close to 24/7 now (at least on #1) to have as many results as he does in his results list. It's a shame that LHC doesn't seem to show the version of BOINC when you examine a result in the results list.

I would reckon that the answer to his original query is that BOINC is trying to stabilize and is preventing the downloading of work in order to assist the process. All he needs to do is be patient. He has 2 results in progress on each machine so everything looks cool!!

Edit: It's a pity you can't PM people sometimes when you reckon you actually might have worked out the answer to their question :). I hope he comes back and can decipher the explanation - particularly MikeW's second one which is very clear.
66) Message boards : Number crunching : only 10 k WU's left (Message 9482)
Posted 19 Aug 2005 by Profile Gary Roberts
Post:
<blockquote>
... right now we still have 24846 jobs in flight.

</blockquote>

Wouldn't that fact be largely due to the tendency of some users to set up excessively large caches? I imagine this is counterproductive for your requirements where you would like everything back in as soon as possible.

All the "project friendly" crunchers who are trying to give quick turn around are now out of work. Why not send those 24846 jobs to all those who would like work right now? I imagine you could get most of the outstanding work crunched and back very quickly. Bugger those who want to sit on the work for as long as possible.

I'm not criticising or trying to be disruptive. I'm happy to support the project and would like to see it get the best outcome possible.
67) Message boards : Number crunching : only 10 k WU's left (Message 9477)
Posted 19 Aug 2005 by Profile Gary Roberts
Post:
Thanks for the information. A couple of small questions:-

1. Would it be possible to put a very small note in the front page news (a day or two before it actually happens) just to say something like, "When the current work runs out that will be all for the moment. We believe there will be more work soon"? That would save a lot of unnecessary speculation and having to hunt through messages looking for possible answers.

2. There are a lot of new users (myself included) who thought that having left the account creation open signified that there might be a more continuous flow of work. You seem now to have more users than you really need if the work keeps running out this quickly. Isn't it better to close off account creation so that the users you have can get a more continuous supply?

3. The feast or famine nature of this project is a little hard to get used to. I know that the scientists need to analyse results returned to put together the next lot of work. I guess I'm just wondering about getting some sort of overlap going so that we can crunch a second lot while they are analysing the first lot?

I'm not complaining, I'm just musing in public :).
68) Message boards : Number crunching : Related to the new php code (Message 9312)
Posted 10 Aug 2005 by Profile Gary Roberts
Post:
<blockquote>Might be useful for Chrulle, a test thread I made on Break the forums forums, so you can see what the different greens do on lighter grey and darker grey backgrounds. </blockquote>

If you have a good look at this link you will see how much easier it is to read many of the different green shades when the background is lighter. If you look at the dark background of the computer results list and compare the readability of the titles on a lighter background compared with the results themselves you get the same message. Yes, the titles are in bold and that helps but there is also a benefit from the lighter background. If the results list itself could have an even lighter background than that of the titles then I think basically any green shade would be more readable and soothing on the eyes than peering into the murk of the current dark background.

Just a personal opinion, of course.
69) Message boards : Number crunching : Related to the new php code (Message 9267)
Posted 9 Aug 2005 by Profile Gary Roberts
Post:
The brighter lime green you had when Fuzzy complained was the clearest one that I've seen. The current one is a bit dark. It doesn't really bother me though.
70) Message boards : Number crunching : only 10 k WU's left (Message 9138)
Posted 3 Aug 2005 by Profile Gary Roberts
Post:
> I personally have run at 0.1 days and at 3.0 days, and feel better running at
> 3.0 days (thank you very much).....

Amazing who you run into in all sorts of places making all sorts of statements :).

I would have thought that Ye Olde Grande Wiki Master would have seen the merit in not further inciting the masses into choosing settings that would put even greater strain on said Grande Wiki Master to further produce explanatory documentation to extricate said masses from the morass that will develop as a direct result of unrealistic resource share choices combined with overeager cache size settings. By the way, the preceding sentence was designed to be deliberately obfuscatory in order to match the mood of said masses as they try to fathom the ever increasingly complicated documentation needed to guide them out of the said morass. That is, of course, provided that the said masses manage to find their way into the relevant section(s) of said Wiki in the first place :).

Somewhat more seriously, I've noticed a seemingly increasing trend for many people to want to support several projects with a resource share of something like 85/5/5/5 and then compound that with a 5 day cache setting just to make sure that they don't really need to run their backup projects if at all possible. Then they gripe like mad when one or more backup projects hog the crunch time for vastly excessive times just to partake of the deadline dance. And then they get even more upset when the main project takes over and creates a deadline drought where there are no work units allowed to be downloaded for the backup projects for weeks at a time.

And to cap it all off they then do crazy things like indiscriminate resetting and aborting left right and centre just to try and tame their self created monster. They set an entirely inappropriate mishmash of resource share and cache size and then they tear it all down when it behaves ever so predictably.
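
To put rough numbers on the 85/5/5/5 example, here is a small back-of-the-envelope calculation in Python. It assumes strict sharing of CPU time by resource share on a machine that is on 24 hours a day; the 6-hour work unit and the 4-day deadline are invented figures, and real BOINC scheduling is more complicated than this.

```python
def days_to_finish(cpu_hours, share, total_share, hours_on_per_day=24):
    """Wall-clock days a task needs if its project only gets its resource share."""
    return cpu_hours * total_share / (share * hours_on_per_day)


def fits_deadline(cpu_hours, share, total_share, deadline_days, hours_on_per_day=24):
    """True if plain round-robin would finish the task before its deadline."""
    needed = days_to_finish(cpu_hours, share, total_share, hours_on_per_day)
    return needed <= deadline_days


shares = [85, 5, 5, 5]
total = sum(shares)

# A hypothetical 6 CPU-hour work unit from a backup project with a share of 5.
backup_days = days_to_finish(6, 5, total)
# The same work unit on the main project with a share of 85.
main_days = days_to_finish(6, 85, total)

print(f"backup project: {backup_days:.1f} days, main project: {main_days:.2f} days")
print("backup task fits a 4-day deadline:", fits_deadline(6, 5, total, 4))
```

With these made-up figures the backup task needs 5 days of wall-clock time at a 5% share, so it cannot meet a 4-day deadline under plain round-robin and something like the deadline dance has to kick in. A multi-day cache only adds to the queue in front of it.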

So for smart people like yourself, Mr Ye Olde Grande Wiki Master, who know exactly how to handle the deadline dance, please keep your personal cache choice to be exactly that -- personal :). Please tell said masses that a cache size of 3 days or more is a mortal sin whereas 0.5 days is just the bee's knees and will take them directly to Nirvana or whatever other form of higher plane they want to be on :).

Disclaimer and EULA
All characters in this drama are entirely fictional and any resemblance to any person, real or imaginary, is entirely coincidental. No plants, animals, insects, birds, whales or dolphins were hurt, killed or even offended in any of the stunts performed. You are hereby expressly forbidden from taking umbrage at any statement contained herein even if you can prove that said statement is directly defamatory to yourself. You are expressly forbidden from reading any of the foregoing until you have agreed to this EULA. Your reading of any character, word, symbol or phrase contained herein automatically binds you to acceptance of this EULA.
71) Message boards : Number crunching : only 10 k WU's left (Message 9099)
Posted 2 Aug 2005 by Profile Gary Roberts
Post:
There are about 3.6K new work units now available so the outage was short lived.

I trust the number will keep increasing for a while....
72) Message boards : Number crunching : sign up now! (Message 8801)
Posted 22 Jul 2005 by Profile Gary Roberts
Post:
> We have reached more than 8000 users and have turned off the account creation for the time being.
>

And now account creation is on again for another 2000. That seems to be a very good sign ....

Look at all those old familiar faces turning up in droves. With Seti/BOINC running out of WUs, the sudden largesse here is amazingly timely.



