Message boards : Number crunching : Server problems
Joined: 27 Jul 04 Posts: 182 Credit: 1,880 RAC: 0
Hi everyone, I think the server has had a power outage and that the DB has crashed. At least that's what it looks like to me. I will contact the CERN people to tell them. I might be able to guide them through a quick fix. But for the time being you should probably choose "no new work", as it is very doubtful any work issued in the meantime will be registered in the DB (i.e. it is lost), at least until someone can take the scheduler offline. Cheers, Chrulle
Chrulle Research Assistant & Ex-LHC@home developer Niels Bohr Institute
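For anyone who prefers the command line to the BOINC Manager, here is a minimal sketch of setting "no new work" for the project by shelling out to boinccmd. It assumes a recent client where boinccmd supports the "nomorework" and "allowmorework" operations; the 2006-era tool was named differently, so treat the exact invocation as illustrative rather than what the project prescribes.

```python
# Sketch: toggle "no new tasks" for LHC@home via boinccmd (assumes a
# modern BOINC client with boinccmd on the PATH).
import subprocess

PROJECT_URL = "http://lhcathome.cern.ch/"  # project URL from this thread

def set_no_new_work(project_url):
    # Equivalent to ticking "No new tasks" for the project in the Manager.
    subprocess.run(["boinccmd", "--project", project_url, "nomorework"],
                   check=True)

def allow_new_work(project_url):
    # Undo it once the scheduler and database are healthy again.
    subprocess.run(["boinccmd", "--project", project_url, "allowmorework"],
                   check=True)

if __name__ == "__main__":
    set_no_new_work(PROJECT_URL)
```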
Joined: 10 Feb 06 Posts: 3 Credit: 76,833 RAC: 0
Hi Chrulle! Thanks for this useful post!! Just one question: what will happen to our science results and pending credit? Is everything gone?? This has happened before, too; from the news page: 26.1.2006 10:02 UTC, "Parts of the database had crashed due to a power failure in the computer centre. They are now back up and running." Is there a way to recover all the results (and credit)? Or will we collect another bunch of "pending results"? Thanks again!
Joined: 14 Jul 05 Posts: 41 Credit: 1,788,341 RAC: 0
Yes, thank you Chrulle ... it's always nice when there's a quick reply for all the happy crunchers!! I'm not so concerned about the credit myself (ok, kidding), but more so about the overall sporadic and chaotic state of the project since you've gone. We seem to lose interest with each incident, which is a shame because this project is actually very cool and productive. Just my 2.277561 cents ...
Joined: 2 Sep 04 Posts: 309 Credit: 715,258 RAC: 0
Yes, thank you Chrulle ... it's always nice when there's a quick reply for all the happy crunchers!! Yeah, I can't agree more. If the project sponsors don't give a crap then neither will we. This is the 2nd time this year this has happened, and at the time we were willing to forgive. But once bitten... Time for a UPS. Got a spare 2k? It will buy a beauty!
Joined: 26 Sep 05 Posts: 85 Credit: 421,130 RAC: 0
Chrulle certainly did put a lot into keeping this project up when he was here, and even a bit after he had left. Now the project people need to look into either getting someone, or doing something, to keep this up themselves. And yes, it would be a shame if all the WUs we had crunched from this batch got sent the way of the bit bucket or something...
Joined: 27 Jul 04 Posts: 182 Credit: 1,880 RAC: 0
And he still is. At least until I get a job. ;-) I have made a quick fix. Please check whether the system works again. Some credit and jobs have probably been lost, since I do not have the time to trawl through the backups.
Chrulle Research Assistant & Ex-LHC@home developer Niels Bohr Institute
Joined: 14 Jul 05 Posts: 41 Credit: 1,788,341 RAC: 0
You da man!! Thanks ... my pending credit is/has been the same, but I can now see the individual WU results (even the pending ones, which all had at least 3 turn-ins)!!
Joined: 26 Sep 05 Posts: 85 Credit: 421,130 RAC: 0
Thanks for the fix... My granted credit has gone up, though I still have a lot of pending. Some WUs issued extra instances even though a quorum of 3 had already been attained (probably because the WUs weren't showing), and the number of WUs still being crunched seems to have gone up. But it does look like things are back in business and returning to a state of normalcy :)
Edit: I did notice a few glitches, however. On these WUs, all crunchers were granted credit, but mine is still showing "status pending":
http://lhcathome.cern.ch/result.php?resultid=6912464 http://lhcathome.cern.ch/workunit.php?wuid=1340533
http://lhcathome.cern.ch/result.php?resultid=6917176 http://lhcathome.cern.ch/workunit.php?wuid=1341476
http://lhcathome.cern.ch/result.php?resultid=6887056 http://lhcathome.cern.ch/workunit.php?wuid=1335498
http://lhcathome.cern.ch/result.php?resultid=6874966 http://lhcathome.cern.ch/workunit.php?wuid=1333094
http://lhcathome.cern.ch/result.php?resultid=6891698 http://lhcathome.cern.ch/workunit.php?wuid=1336415
http://lhcathome.cern.ch/result.php?resultid=6893817 http://lhcathome.cern.ch/workunit.php?wuid=1336812
http://lhcathome.cern.ch/result.php?resultid=6918033 http://lhcathome.cern.ch/workunit.php?wuid=1341647
http://lhcathome.cern.ch/result.php?resultid=6864012 http://lhcathome.cern.ch/workunit.php?wuid=1330903
http://lhcathome.cern.ch/result.php?resultid=6900101 http://lhcathome.cern.ch/workunit.php?wuid=1338070
http://lhcathome.cern.ch/result.php?resultid=6879542 http://lhcathome.cern.ch/workunit.php?wuid=1333993
http://lhcathome.cern.ch/result.php?resultid=6898881 http://lhcathome.cern.ch/workunit.php?wuid=1337824
The other WUs in pending are pending for everyone, but these have granted credit to all the other users while mine is still pending. Not sure whether the validator will go back and grant it to me now that the database is repaired, or not. Those are the result and WU pairs...
Joined: 13 Jul 05 Posts: 143 Credit: 263,300 RAC: 0
Chrulle -- Thank you for taking the time to watch over the project and keep us informed. If I've lived this long, I've gotta be that old
Joined: 24 Oct 04 Posts: 79 Credit: 257,762 RAC: 0
And he still is. At least until I get a job. ;-) Love the disclaimer, Chrulle, Ex-admin... and TY for all your attention and caring here... But this still raises the red flag for us crunchers that still needs to be addressed, i.e. no active admin... I will have more faith and continue to crunch as best as I have done over the last year and a half if I (we) get communication, feedback and responses as we have in the past... Is Ben Segal still around? Maybe an e-mail to him on the current unpleasant conditions? Perhaps a short update from a current, not ex (hehe), admin would be appropriate about now on how to keep any enthusiasm for LHC :) [EDIT] Oh, and of course there are still massive (relatively speaking) pending credit issues... quorum reached but not yet validated, but again TY Chrulle for un-log-jamming things for us [EDIT]
Joined: 14 Jul 05 Posts: 2 Credit: 142,986 RAC: 0
I'm sorry for asking about the problem again ... but where are the pending credits ... ??? ;-)
Joined: 14 Jul 05 Posts: 41 Credit: 1,788,341 RAC: 0
You'd have to look at your account statistics ... you can get there from the home page and log in with your email/password. It'll show your enlistment date, total credit, RAC, and then pending credit. If you look at each computer, you can then look at the results turned in and see how many have been returned, quorum met, granted credit, etc. Once you get to a specific result, click on the Work Unit ID for all those details.
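For anyone who would rather not click through every results page by hand, here is a rough Python sketch of tallying pending results for an account. The results.php URL pattern, the column layout, and the word "pending" in the status cell are assumptions about the site's pages rather than verified details, so adjust them to match whatever the real page shows.

```python
# Sketch: count pending results and sum their claimed credit by scraping
# the (assumed) public results listing of a BOINC project web site.
import requests
from bs4 import BeautifulSoup

BASE = "http://lhcathome.cern.ch"   # project URL from this thread
USER_ID = 12345                     # hypothetical account id

def pending_summary(user_id, pages=5, per_page=20):
    count, claimed = 0, 0.0
    for page in range(pages):
        url = f"{BASE}/results.php?userid={user_id}&offset={page * per_page}"
        html = requests.get(url, timeout=30).text
        soup = BeautifulSoup(html, "html.parser")
        for row in soup.find_all("tr"):
            cells = [td.get_text(strip=True) for td in row.find_all("td")]
            if len(cells) < 2:
                continue
            # Assumed layout: the status text appears somewhere in the row
            # and the claimed credit sits in the second-to-last cell.
            if any("pending" in c.lower() for c in cells):
                count += 1
                try:
                    claimed += float(cells[-2].replace(",", ""))
                except ValueError:
                    pass
    return count, claimed

if __name__ == "__main__":
    n, credit = pending_summary(USER_ID)
    print(f"{n} pending results, roughly {credit:.2f} claimed credit")
```

Run it with your own numeric user ID; it only reads pages that were publicly viewable on most BOINC projects of that era, so no login should be needed if that assumption holds here.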
Joined: 24 Oct 04 Posts: 79 Credit: 257,762 RAC: 0
I assume after 24 hours of no response that there are no answers to my questions... or the answers are apparent by the lack of response. I assume we are on the back burner because the scientists have already gotten the answers they required for the installation of the magnets. I suspect as we get closer to the initialization of the LHC there will be a flurry of new studies. However, in the meantime I am SADLY suspending LHC on most of my computers and await some response (hopefully favorable) to the concerns raised here.
Joined: 14 Jul 05 Posts: 41 Credit: 1,788,341 RAC: 0
As disappointing as it seems, there is still light at the end of the tunnel (no pun intended! :) ) The good news is ... other projects like Stardust@Home have seen first hand what our participation can achieve. LHC@Home had better start looking at the vast (and free to them) computing power that is within their reach. For now, I'll just wait 'til the next batch of WUs and crunch for another 3 days. http://planetary.org/programs/projects/stardustathome/update_051806.html
Joined: 2 Sep 04 Posts: 22 Credit: 4,047,548 RAC: 3
Hello! It seems the database/server problems are solved; at least I have received some WUs this last hour! I guess these are WUs that weren't finished on time or came back with errors. Hope this will clear out some of the pending credits! Have a nice crunching weekend! Hans Sveen Oslo, Norway
Joined: 28 Sep 04 Posts: 47 Credit: 6,394 RAC: 0
The server has a problem that I don't quite understand, because it does not happen with any of the other projects that I am attached to. Every time my BOINC client asks for more work from LHC, a new host is spawned. Every day I have a long list of hosts that I need to merge. If this is happening to other people, that could explain some of the other problems the server is having. 98SE XP2500+ @ 2.1 GHz Boinc v5.8.8
Joined: 25 Nov 05 Posts: 39 Credit: 41,119 RAC: 0
The server has a problem that I don't quite understand, because it does not happen with any of the other projects that I am attached to. Every time my BOINC client asks for more work from LHC, a new host is spawned. Every day I have a long list of hosts that I need to merge. If this is happening to other people, that could explain some of the other problems the server is having. Same here; are you using BAM at all, and is it only affecting your Linux boxes?
Joined: 14 Jul 05 Posts: 41 Credit: 1,788,341 RAC: 0
It happened for me on both WinXP and Win2k, with various different hardware platforms?! I actually merged ALL of them into my laptop and have been fine since.
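If anyone wants to spot the duplicate host entries before merging them by hand, here is a rough sketch under some assumptions: you save your "Computers on this account" page (hosts_user.php) to a local HTML file while logged in, each host is one table row, and the host ID, domain name, and CPU model sit in the first three columns. None of that is verified against the actual LHC@home page layout, so adjust the column indices to whatever the saved page actually contains.

```python
# Sketch: group saved host entries by (domain name, CPU) and report the
# groups with more than one entry as likely merge candidates.
from collections import defaultdict
from bs4 import BeautifulSoup

def merge_candidates(saved_hosts_page):
    with open(saved_hosts_page, encoding="utf-8") as f:
        soup = BeautifulSoup(f.read(), "html.parser")

    groups = defaultdict(list)
    for row in soup.find_all("tr"):
        cells = [td.get_text(strip=True) for td in row.find_all("td")]
        if len(cells) < 3:
            continue
        host_id, name, cpu = cells[0], cells[1], cells[2]  # assumed columns
        groups[(name, cpu)].append(host_id)

    # Hosts sharing a domain name and CPU are probably the same machine
    # that the scheduler re-registered, hence candidates for merging.
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

if __name__ == "__main__":
    for (name, cpu), ids in merge_candidates("hosts_user.html").items():
        print(f"{name} ({cpu}): host ids {', '.join(ids)}")
```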