Posts by Krunchin-Keith [USA]

1) Message boards : Team invites : USA is open for membership...join the #1 team!!! all welcome (Message 23791) Posted 24 Dec 2011 by Krunchin-Keith [USA] Post: Team USA wishes everyone a Merry Christmas and a Happy New Year. Be safe this holiday season. We have grown from 33rd to 28th with the help of new members over the past few months. Thank you to all of them and to any current or past members who made this possible. Join us at our new website
2) Message boards : Number crunching : no more work? (Message 23776) Posted 18 Dec 2011 by Krunchin-Keith [USA] Post: Don't really know. With Christmas and New Year's approaching, I really would not expect much until after that, as the people there need time off for family too.
3) Message boards : Number crunching : Faulty Computers or Modified BOINC ?? Huge Credits (Message 23753) Posted 30 Nov 2011 by Krunchin-Keith [USA] Post: Well, the whole problem is that when the project started it was using CreditNew with the lower of the two users' claims. Several users complained about getting too little credit and wanted the method changed to average credit. Now that the method has been changed, this has allowed cheaters in. You can't keep everybody happy either way.
4) Message boards : Number crunching : no more work? (Message 23711) Posted 19 Nov 2011 by Krunchin-Keith [USA] Post: Ok, but that still doesn't explain why I've seen zero tasks for the last 5 days. Prior to this the longest was three days, but I upped the number of days buffered to two and not seen an outage longer than two days. Now it's 5. Just wondering. http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=9939655 Remember that normal tasks have a 7 day deadline. There are 12,000 plus hosts attached, but that does not say how many CPU cores there are (1, 2, 4, 6, 8, 12, 24) per host. It does not take long, sometimes only minutes, for less than half of those hosts to snatch up 5,000 or 12,000 tasks at one task per core. However, it can sometimes take up to 7 days for all of those to clear out; then some will be over deadline and reissued, only to be snatched up by the next waiting hosts; a new small batch appears, then gets snatched up. To you it might look like the same 5,000 in progress, but it can be a different 5,000 or a mix of the two. This project has batches of work, and the results need to be analyzed by a human before they issue more work. Also, Eric has been on vacation, if you read his status thread. There can be periods of no new work issued here, and that is normal. Actually, since the restart of the project back at CERN, it has had more work than normal. There have been small batches issued in the last 5 days; it just happens that some of the other 11,999 hosts got the work before yours made another request for work.
5) Message boards : Number crunching : Server status page shows 0 tasks ready but work is issued (Message 23704) Posted 17 Nov 2011 by Krunchin-Keith [USA] Post: The server page shows a zero at "Tasks ready to send" but I got some WU's to crunch a few minutes ago. Yes, that page is a cached page. Resends can be generated, and probably were, after the page was cached, and you probably got some of those.
6) Message boards : Number crunching : Result server available ? (Message 23702) Posted 17 Nov 2011 by Krunchin-Keith [USA] Post: BOINCstats is updating fine. It reads the data file here once a day, so remember there can be up to a 24-hour delay before you see your total change. After you complete a task, it has to be uploaded, and then it also depends on the wingman completing the task before a valid match is granted and credit issued. This can often take 7 days if you get paired with someone with a slow or overloaded system. It can take longer if the wingman errors and the task has to be issued to a third host. To see results here, go to the "my account" link at the bottom of the page; there is a tasks choice, click view next to it, which will then show you a list of your tasks and their status. At the top is another filter so you can limit it to see error, valid, pending, in progress and so on.
7) Message boards : Number crunching : Profile not updating (Message 23660) Posted 6 Nov 2011 by Krunchin-Keith [USA] Post: Yes, there is no one available to make changes at the moment. Limited staff (?) See thread
8) Message boards : Number crunching : Resource share (Message 23653) Posted 5 Nov 2011 by Krunchin-Keith [USA] Post: Hi fellow BOINC'ers. I am quite new to this BOINC'ing, and i got a very general question about the BOINC manager. In the "Projects" tab, there is something called "resource share", which to me looks like its a way to set how important a project is. Right now I am crunching for LHC and einstein, but i only have einstein so my computer wont be idle in case there isnt any jobs. is it possible to choose, lets say, 25% for einstein and 75% for LHC? or have i totally misunderstood what "resource share" means? Regards Gundersen You got it. Enter 25 on einstein and 75 here. BOINC will do the rest. The numbers do not have to add up to 100. 1 at einstein and 3 here would do the same, although some people find adding to 100 easier. You can also enter a 0 for einstein; then it will only run when lhc has no work, otherwise you will have all lhc work. 0 makes it a backup project. With 25-75 you will get some einstein and lhc at the same time, 1 hour for einstein for every 3 hours for lhc, if both have work.
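A rough sketch of the arithmetic behind that reply, showing why 25/75 and 1/3 come out the same. The helper name `share_fractions` is made up for illustration and is not part of BOINC; the real client schedules work over time rather than dividing it up this literally.

```python
def share_fractions(shares):
    """Turn resource shares into fractions of processing time.

    `shares` maps a project name to its resource share. Only the ratio
    between the numbers matters, not their sum.
    """
    total = sum(shares.values())
    if total == 0:
        # Every project has share 0: nothing to divide up here.
        return {name: 0.0 for name in shares}
    return {name: value / total for name, value in shares.items()}


percent_style = share_fractions({"einstein": 25, "lhc": 75})
ratio_style = share_fractions({"einstein": 1, "lhc": 3})

print(percent_style)
print(ratio_style)
print(percent_style == ratio_style)
```

A share of 0 is deliberately not special-cased beyond the all-zero check: in this sketch a zero-share project simply gets a fraction of 0, which matches the "backup project" description only loosely.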
9) Message boards : Number crunching : no more work? (Message 23625) Posted 1 Nov 2011 by Krunchin-Keith [USA] Post: The current turn around limit is 129600 seconds (1.5 days) This is what the scientists want, which would not be a problem if the mechanism functioned correctly. And considering over 5,000 tasks disappeared in less than a day, there are ample computers to handle resends. Over 5,000 in less than a day is impressive. If anybody wants to get in on the resends all they need to do is decrease their cache and get their turnaround time below 1.5 days, easily done. It was more like less than 14 hours. Considering that, and that we now have over 7,000 users and 11,000 hosts (with recent credit), I think there is enough. What we need to do first is get the entire mechanism functioning properly before trying to adjust the time / requirements for the "reliable hosts" any further. Consider too the 1/cpu limit and the short compute time, 8 hours for average work, if it is started promptly that is. The resends are going to be from several sources (abort, detach, inconclusive, timeout), but the longest is the timeout because a host didn't start it within the 7 days; most likely the 2nd result is already done, so waiting any longer for the third to complete is what the scientists want to avoid. We have allowed another 3.5 days deadline, but are hoping that most of the "reliable hosts" will return it faster. So under normal circumstances most work should be completed within 8 days, reducing the time batches go on. As it is now, some are older than that because of the 10 day delay between the timeout and the resend, which is not good; that makes some of the tasks over 18 days old from when they started. I believe once this delay problem is solved, the "no more work" issue will decrease too. There will not have to be a hold up to submit new work. The old work will still clear out first, resends mixed in along the way.
10) Message boards : Number crunching : no more work? (Message 23621) Posted 1 Nov 2011 by Krunchin-Keith [USA] Post: Time: UTC+1 01.11.2011 02:58:58 LHC@home 1.0 Sending scheduler request: To fetch work. 01.11.2011 02:58:58 LHC@home 1.0 Requesting new tasks 01.11.2011 02:58:59 LHC@home 1.0 Scheduler request completed: got 0 new tasks 01.11.2011 02:58:59 LHC@home 1.0 Message from server: No work sent greetz littleBouncer That usually means the server queue is empty. Remember the server status page is cached, it can be empty within minutes of showing lots of work, especially now that there are many more users and active hosts. When Sixtrack is out of work (as it is now) it says: Tue 01 Nov 2011 09:46:10 AM MDT \| LHC@home 1.0 \| (Project has no jobs available) That proves there was work when littleBouncer requested work. The reason he didn't get any is because his turnaround time is too high. The same has happened to several posters in this thread. It has nothing to do with Linux vs. Windows because my Linux box gets tasks that have been sent to Windows hosts. They do not use homogeneous redundancy here, Sixtrack tasks can and do go to either OS. littleBouncer's turnaround time is 2.1 days on one of his hosts and 1.82 on his other host. IMHO, the turnaround time requirement seems a little low. Oops, sorry, I was up till 3am last night and had to come in early to work today, my brain is fried. The current turn around limit is 129600 seconds (1.5 days) This is what the scientists want, which would not be a problem if the mechanism functioned correctly. And considering over 5,000 tasks disappeared in less than a day, there are ample computers to handle resends. Normally these should be sent out earlier as needed and not all be saved up, so this problem at the end would not be noticed. What would happen if things worked correctly is that the quicker hosts would get resends instead of normal-time work, and normal work would be sent to the slower hosts, so more hosts get work.
I think new work was also held up so the queue could empty out all the backlogged tasks that have been waiting over 10 days to resend (they are now at least 17 days old from when they started) because the scheduler mechanism is not working. We think it is an older scheduler, and that the way resends are handled was changed at some point after that version, so the options in the docs are for a newer scheduler and the one in use does not recognize them, so it malfunctions. Igor has plans to do an update when he can find time in his schedule. There was some reason this was being held off, but since T4T did a successful one, that reason may no longer apply and an update can proceed, time permitting.
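To make the numbers in that post concrete, here is a minimal sketch of the turnaround check as described: a host only counts as fast enough for resends if its average turnaround is below the 129600 second limit. The function name is made up, the 0.9 day host is a hypothetical example, and the real scheduler's test may differ in detail.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TURNAROUND_LIMIT_SECONDS = 129600  # limit quoted in the post


def qualifies_for_resends(avg_turnaround_days, limit_seconds=TURNAROUND_LIMIT_SECONDS):
    """True if the host's average turnaround is below the limit."""
    return avg_turnaround_days * SECONDS_PER_DAY < limit_seconds


limit_days = TURNAROUND_LIMIT_SECONDS / SECONDS_PER_DAY
print(f"limit = {limit_days} days")

# 2.1 and 1.82 are the two hosts mentioned in the post; 0.9 is hypothetical.
for turnaround in (2.1, 1.82, 0.9):
    print(turnaround, "days ->", qualifies_for_resends(turnaround))
```

Both of the hosts from the post land above the limit, which fits the explanation that they were passed over for the resends that were left in the queue.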
11) Message boards : Number crunching : no more work? (Message 23618) Posted 1 Nov 2011 by Krunchin-Keith [USA] Post: Why my computers get no work, even there was no invalid result returned, and the server status is 5'006 ready to send? Even when I suspend all other projects, I get no work..... greetz littleBouncer What messages are you getting? It is hard to answer without more details. It could be something like, there is only linux work left and you have windows, or the other way around. Have you reached your quota for the day? What client version are you running, and are you pressing update or waiting for the client to naturally request work? It makes a difference. oops, sorry but all infos you can see on computers-page (they are not anonymous)^^ , OS = Windows 7 only this message , when I start BM...: Not sure why you linked to my computers ? Time: UTC+1 01.11.2011 02:58:58 LHC@home 1.0 Sending scheduler request: To fetch work. 01.11.2011 02:58:58 LHC@home 1.0 Requesting new tasks 01.11.2011 02:58:59 LHC@home 1.0 Scheduler request completed: got 0 new tasks 01.11.2011 02:58:59 LHC@home 1.0 Message from server: No work sent greetz littleBouncer That usually means the server queue is empty. Remember the server status page is cached, it can be empty within minutes of showing lots of work, especially now that there are many more users and active hosts.
12) Message boards : Number crunching : no more work? (Message 23614) Posted 31 Oct 2011 by Krunchin-Keith [USA] Post: Why my computers get no work, even there was no invalid result returned, and the server status is 5'006 ready to send? Even when I suspend all other projects, I get no work..... greetz littleBouncer What messages are you getting? It is hard to answer without more details. It could be something like, there is only linux work left and you have windows, or the other way around. Have you reached your quota for the day? What client version are you running, and are you pressing update or waiting for the client to naturally request work? It makes a difference.
13) Message boards : Number crunching : no more work? (Message 23612) Posted 31 Oct 2011 by Krunchin-Keith [USA] Post: I keep getting: Sun 30 Oct 2011 10:04:21 PM CDT LHC@home 1.0 Message from server: (won't finish in time) BOINC runs 98.9% of time, computation enabled 99.6% of that I have an AMD Phenom II x4 965 Liquid cooled - 8 Gig of ram Running Linux on SATA II raid arrays... What do you mean it won't finish in time??? It doesn't get too much faster than that The problem is your computer has only done 1 task and it has not been validated yet. You have a "0 turn around time". I think until that task validates this will not be increased, but I'm not sure on that. You'll either have to wait it out, or you might try a project reset, but I do not know what effect that will have (I make no guarantee it will fix anything or not mess anything up).
14) Message boards : Number crunching : Long delays in jobs (Message 23600) Posted 30 Oct 2011 by Krunchin-Keith [USA] Post: I just found 2 WU's on my machine that have a shorter deadline, both are WU's that were not reported back in time, they were supposed to be resent on the 20 and 21 October, I think. WU 352551 & 363394. The WU's I reported earlier are still not resent, I think they will be at the end of the queue, but how did these (WU 352551 & 363394) WU's get resent if the resend policy is not working correctly? It would be most helpful if you (or anyone posting) could at least give a link. Sometimes it can take considerable time to click through your account and computers to find the example. I've sometimes given up because users have too many computers to click through looking for a result. If you check the WU, yes, it was resent after a timeout, but look at the dates. There was a 10 day delay from the timeout to the resend, meaning it just got stuck in the queue. Resends are supposed to be accelerated and given a higher priority; they should go out before other work pending at lower priority, and it should take only minutes, not 10 days. Adding 10 extra days is not accelerating retries as described in the BOINC docs, which is what we are trying to accomplish. --- As for the update, don't worry about breaking 10 things; 10 things are already broken and not fixed yet.
15) Message boards : Number crunching : Long delays in jobs (Message 23595) Posted 29 Oct 2011 by Krunchin-Keith [USA] Post: I thought that errored or inconclusive WU's would be distributed again with higher priority and shorter deadlines to trusted hosts? I have 2 WU's that are inconclusive but are yet unsent? WU 471878 & 449758. They are supposed to be. There is something wrong. From the settings it should be working. I too see "unsent" tasks that are now days old, which means they are not getting into the queue properly or are not being marked correctly. This certainly is not acceleration when the resends just sit there. I've noted this to Igor. At this point my suspicion is that some server component is out of date and the settings noted in the BOINC docs are for a newer version. We've now determined that the server software is over 1 year old. Be patient, this will all get sorted out eventually.
16) Message boards : Number crunching : no more work? (Message 23575) Posted 22 Oct 2011 by Krunchin-Keith [USA] Post: Grab 'em while you can! Tasks ready to send as of 20 Oct 2011 16:52:16 UTC: 42,621 I see your 42K and raise you 102K, the pot is now 144K. I've noticed no one complaining in the last two days, so I had to check. So for now there is NO "no more work".
17) Message boards : Number crunching : Long delays in jobs (Message 23570) Posted 20 Oct 2011 by Krunchin-Keith [USA] Post: This discussion is getting off topic. - Reminder to all, stick to the topic of the thread, if you want to discuss another problem or matter, start a new thread. Sorry to all for being short on this, but I am extremely busy with a personal tragedy and I do not have time to play games or waste time hiding all the off topic posts at this time.
18) Message boards : Number crunching : Long delays in jobs (Message 23524) Posted 15 Oct 2011 by Krunchin-Keith [USA] Post: Looking for a WU with "validation inconclusive" I found this host. Looks like it became suddenly unstable. Within the next hour the host most likely crashed all ~450 WUs immediately after the download (Maximum daily WU quota per CPU = 80, 6-core CPU> 460). Does anybody know a possibility to identify this kind of rapid WU transfer automatically to stop it earlier (server side)? No, nothing automatic except the quota system, which is designed to do exactly that. The only way to reduce this situation is to lower the quota for everybody. But then when a batch of short run tasks is issued, it would be easy to meet the quota for a day in minutes and not earn much credit the rest of the day, thus the current quota of 80. If the host is producing errors, then that host's quota is lowered automatically towards 1, until it produces valid results; then it can earn a higher quota back, up to the max set by the project. The only way to stop this kind of thing is for users to stop hiding their hosts; then someone could send them a PM saying, hey, check your host. Maybe they will also see they are not earning much credit and look into it. PS: the latest check shows all tasks have been issued. Now let us see how long it takes to return those 24,000 still processing.
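A toy model of the quota behaviour described in that reply: errors push a host's daily quota down towards 1, and valid results let it climb back up to the project maximum. The step sizes here (halve on error, double on valid) are purely an assumption for illustration, and `update_quota` and `replay` are made-up names; the post does not say how the real server adjusts the number.

```python
MAX_DAILY_QUOTA = 80  # per-CPU maximum quoted in the post


def update_quota(quota, result_valid, max_quota=MAX_DAILY_QUOTA):
    """Return the host's new daily quota after one reported result.

    Assumption: halve on an error, double on a valid result.
    The quota never drops below 1 and never exceeds max_quota.
    """
    if result_valid:
        return min(max_quota, quota * 2)
    return max(1, quota // 2)


def replay(results, quota=MAX_DAILY_QUOTA):
    """Apply a sequence of results (True = valid, False = error)."""
    for ok in results:
        quota = update_quota(quota, ok)
    return quota


after_errors = replay([False] * 10)                  # a host that keeps crashing tasks
recovered = replay([True] * 7, quota=after_errors)   # then starts returning valid work

print(after_errors, recovered)
```

Whatever the real step sizes are, the point from the post still holds: the quota only bites after errors have been reported, so a host can burn through a full day's allowance before it is throttled.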
19) Message boards : Number crunching : Long delays in jobs (Message 23518) Posted 15 Oct 2011 by Krunchin-Keith [USA] Post: Yes, I see they are moving now. But the mystery is still why so long a delay. One of the eight I got as a resend started as a 530.08 on the 4th, so it took 10 days to get resent. This is not good and certainly is not 'accelerated'. But looking at the deadline of the resends, they are shortened by 0.5, which is the value we used, so the 7 days gets shortened to 3.5 days. So part of the mechanism is working. If that part works, then the priority should also be increased. I went back to see what hosts they were sent to; 4 of them disappeared. Oh, my one host already finished them and returned them while I was writing, but it has the other four and is working on those. All 8 went to one host. I was hoping to see some on others so I could look at the turnaround time on those; the one host has less than a day average, so it at least falls within the parameters we set. My one host doing those now got plenty of other tasks from here in the last four days; it would have had plenty of time to do these resends had they been sent first, but it was given other work. This still leaves the question of why the scheduler waited; if the tasks had a higher priority they should have moved first.
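The deadline arithmetic in that post is small enough to spell out. This is only a sketch of the calculation as described (normal deadline times a factor of 0.5); the constant and function names are invented for the example and are not the server's actual option names.

```python
from datetime import timedelta

NORMAL_DEADLINE = timedelta(days=7)  # normal task deadline from the posts
RESEND_DEADLINE_FACTOR = 0.5         # the "0.5" value mentioned in the post


def resend_deadline(normal_deadline, factor=RESEND_DEADLINE_FACTOR):
    """Deadline given to a resend: the normal deadline scaled by the factor."""
    return normal_deadline * factor


shortened = resend_deadline(NORMAL_DEADLINE)
print(shortened)  # 3 days, 12:00:00
```

The shortened deadline is the part the post says was working; the separate priority boost that should move resends to the front of the queue is not modelled here.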
20) Message boards : Number crunching : SixTrack and LHC@home status (Message 23517) Posted 15 Oct 2011 by Krunchin-Keith [USA] Post: Hello, Do you have an estimated date for the GPU support for the LHC project? the gain in computation time will be phenomenal. Thx Read the first post in this thread.


LHC@home