Message boards : Number crunching : Chrulle, any news on database recovery?

The Gas Giant
Joined: 2 Sep 04
Posts: 309
Credit: 715,258
RAC: 0
Message 12587 - Posted: 31 Jan 2006, 2:36:35 UTC

Chrulle,

Any news on whether you were able to recover any of the WUs that were sent in during the 16-hour DB crash? I know I've got 34 WUs stuck in the ether.

Live long and crunch.

Paul
(S@H1 8888)
BOINC/SAH BETA
ID: 12587
[B@H] Ray

Joined: 13 Jul 05
Posts: 82
Credit: 6,336
RAC: 0
Message 12613 - Posted: 1 Feb 2006, 23:38:00 UTC

Looks like we lost them. Mine finally ran out of time and were marked as not turned in. Now most have been removed from the results page. How many did you turn in during that time? I only had 4, but even that was a waste of computer time.

Pizza@Home - Rays Place - Rays place Forums
ID: 12613
The Gas Giant
Joined: 2 Sep 04
Posts: 309
Credit: 715,258
RAC: 0
Message 12614 - Posted: 2 Feb 2006, 2:59:14 UTC
Last modified: 2 Feb 2006, 3:00:32 UTC

I lost 34 (I made a list of them before they were wiped off the system)! Oh for a UPS and automated shutdown; this would all have been avoided.

It serves me right in a way: I noticed the problem but still let my computer upload and report the results, as I didn't think it possible that a facility like CERN would not have UPSs and automated shutdown systems on its BOINC servers. More fool me, really. But I did see my pending credit increasing during this time and thought all uploaded results had been received and logged.

Paul.
ID: 12614
[B@H] Ray

Joined: 13 Jul 05
Posts: 82
Credit: 6,336
RAC: 0
Message 12615 - Posted: 2 Feb 2006, 4:02:31 UTC - in response to Message 12614.  

I did the same thing and let them upload. Usually systems will not accept them, and after a few tries BOINC sets a long wait before the next retry.
I imagine there are a couple of hundred people out there who did the same. Two of mine had 3 copies not returned, probably for the same reason as ours.
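
For anyone wondering what "a long time before the next retry" looks like, here is a minimal sketch of generic exponential backoff. It is not BOINC's actual code: the function name `retry_delays` and the base, cap and jitter values are made up for illustration.

```python
import random


def retry_delays(attempts, base=60.0, cap=4 * 3600.0, jitter=0.0, rng=None):
    """Return the wait in seconds before each retry.

    The nominal wait doubles on every failed attempt (base * 2**n) and is
    limited to `cap`. All default values are hypothetical.
    """
    rng = rng or random.Random()
    delays = []
    for n in range(attempts):
        delay = min(cap, base * 2 ** n)
        # Optional jitter (a fraction, e.g. 0.1 for +/-10%) spreads clients
        # out so they do not all hit the server at the same moment. It is
        # applied after the cap, so a jittered wait may slightly exceed it.
        delay *= 1.0 + rng.uniform(-jitter, jitter)
        delays.append(delay)
    return delays


if __name__ == "__main__":
    for n, wait in enumerate(retry_delays(10), start=1):
        print(f"retry {n}: wait {wait / 60:.0f} min")
```

The point is only that the gap between attempts grows quickly, so a client that fails a few times stops hammering a server that is down.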
Ray

Pizza@Home - Rays Place - Rays place Forums
ID: 12615
River~~

Joined: 13 Jul 05
Posts: 456
Credit: 75,142
RAC: 0
Message 12625 - Posted: 3 Feb 2006, 10:17:54 UTC - in response to Message 12614.  
Last modified: 3 Feb 2006, 10:18:43 UTC

... I didn't think it would be possible that a facility like CERN would not have UPS's and automated shutdown systems on their BOINC servers. ...



Data is protected by taking backups, so there is a known maximum loss from any power outage - you can't lose more than the work since the last backup.

The cost of a UPS has to be weighed against the value of the data that is saved when it operates. We've been told a power outage like this last happened at CERN 20 years ago. The cost of replacing 16 hours of computer work, even if it had all been done in house, would likely have been less than the cost of providing a UPS for 20 years, in which case CERN made the right choice in taking the risk.
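
The comparison above can be put into a few lines of arithmetic. This is only a sketch: the one-outage-in-20-years rate comes from what we were told, but the money figures and the function names are invented for illustration, not CERN's numbers.

```python
def expected_annual_loss(outages_per_year, cost_per_outage):
    """Average yearly cost of going without a UPS."""
    return outages_per_year * cost_per_outage


def ups_worth_it(ups_cost_per_year, outages_per_year, cost_per_outage):
    """True if the UPS costs less per year than the losses it would prevent."""
    return expected_annual_loss(outages_per_year, cost_per_outage) > ups_cost_per_year


def break_even_outage_cost(ups_cost_per_year, outages_per_year):
    """Cost per outage above which the UPS starts to pay for itself."""
    return ups_cost_per_year / outages_per_year


if __name__ == "__main__":
    outages_per_year = 1 / 20    # one outage in 20 years, as reported
    ups_cost_per_year = 5_000    # hypothetical: purchase, batteries, upkeep
    cost_per_outage = 20_000     # hypothetical: value of 16 hours of lost work

    loss = expected_annual_loss(outages_per_year, cost_per_outage)
    print(f"expected loss per year without UPS: {loss:,.0f}")
    print(f"UPS worth it? {ups_worth_it(ups_cost_per_year, outages_per_year, cost_per_outage)}")
    print(f"break-even cost per outage: {break_even_outage_cost(ups_cost_per_year, outages_per_year):,.0f}")
```

With a rare outage, the lost work has to be worth many times the yearly UPS cost before the UPS pays for itself.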

Time was when power outages would break the hardware (disk head crashes, etc.), and in those days UPS budgets took into account the cost of replacing hardware and the opportunity cost of post-power-restore downtime while the engineers replaced broken components. Nowadays all but the biggest hard drives will power down without hardware damage, and so the hardware protection aspect of UPS no longer counts.

The third reason for having a UPS is safety: it is more vital to have a UPS on the emergency lighting than on the computers, for example. I am quite sure (unless CERN is very different nowadays) the emergency lights would have been on all over that IT computer building even though all but the most essential computers would have been off.

River~~
ID: 12625
Chrulle
Joined: 27 Jul 04
Posts: 182
Credit: 1,880
RAC: 0
Message 12628 - Posted: 3 Feb 2006, 12:53:35 UTC
Last modified: 3 Feb 2006, 12:54:58 UTC

Backup power is also much more important down in the ring than for the computers. If there is a power glitch down there, the cooling will stop, and then the power cables, which look like half-width ATA cables and carry ~10,000 A, will stop being superconducting. Then they melt very fast, heating other superconducting material near them and creating a chain reaction. Then the magnets will fail and we will lose the beam; it is believed that the beam can burn through 30 meters of copper. So if we get a problem down there, we lose the work of more than 10,000 people for 10 years, and all the equipment. Here we just lose a day's worth of computer time.

It shouldn't even have been that much. I checked that everything was running and it was. I just forgot to check all the tables in the database.


Chrulle
Research Assistant & Ex-LHC@home developer
Niels Bohr Institute
ID: 12628
Aaron Finney

Joined: 14 Jul 05
Posts: 60
Credit: 140,661
RAC: 0
Message 12639 - Posted: 5 Feb 2006, 1:04:44 UTC - in response to Message 12628.  

Yeah, but copper's pretty "soft", having a melting point just BARELY above gold's, which as we know is pretty easy to melt.

What about molybdenum or tungsten?
ID: 12639
Gaspode the UnDressed

Joined: 1 Sep 04
Posts: 506
Credit: 118,619
RAC: 0
Message 12640 - Posted: 5 Feb 2006, 7:24:55 UTC - in response to Message 12639.  

Copper also has one of the highest values for thermal conductivity. Copper's thermal conductivity is more than twice that of tungsten, and almost three times that of molybdenum. Security safes are made from it because it conducts heat away so fast that thermal cutting tools are almost ineffective. Note that I said security safes, NOT fire safes!

Burning through 30 metres of copper is quite a feat. I'd bet the same beam would burn through more tungsten or molybdenum.


Gaspode the UnDressed
http://www.littlevale.co.uk
ID: 12640
River~~

Joined: 13 Jul 05
Posts: 456
Credit: 75,142
RAC: 0
Message 12641 - Posted: 5 Feb 2006, 9:24:52 UTC - in response to Message 12640.  


Copper also has one of the highest values for thermal conductivity. ...

Burning through 30 metres of copper is quite a feat. I'd bet the same beam would burn through more tungsten or molybdenum.


Thermal conductivity works the other way round when the heat is dumped in very fast.

If a magnet goes, you get the full stored energy of the beam dumped in the time it takes light to get round the full ring of the LHC (assuming just one magnet goes and all the rest keep the particles on track). OK, it is a big ring, but that is still not very long.

The heat will not be conducted away during the time it takes for the energy to be dumped; but then, once the high temperatures have been attained, the conductivity works to spread the damage.
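
To put rough numbers on "not very long", here is a back-of-envelope sketch. It assumes a ring circumference of about 26.7 km and a room-temperature thermal diffusivity for copper of about 1.11e-4 m^2/s; both are approximate values brought in for illustration, and the function names are made up. The distance heat can spread in a time t is estimated with the usual sqrt(diffusivity * t) scaling, which is an order-of-magnitude estimate and not a simulation.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0   # m/s
LHC_CIRCUMFERENCE_M = 26_659.0   # m, approximate ring circumference (assumption)
COPPER_DIFFUSIVITY = 1.11e-4     # m^2/s, approximate room-temperature value (assumption)


def revolution_time(circumference_m=LHC_CIRCUMFERENCE_M, speed=SPEED_OF_LIGHT):
    """Time for light to travel once round the ring, in seconds."""
    return circumference_m / speed


def diffusion_length(diffusivity, seconds):
    """Rough distance heat can diffuse in the given time: sqrt(diffusivity * t)."""
    return math.sqrt(diffusivity * seconds)


if __name__ == "__main__":
    t = revolution_time()
    reach = diffusion_length(COPPER_DIFFUSIVITY, t)
    print(f"one turn of the ring: {t * 1e6:.0f} microseconds")
    print(f"heat spreads roughly {reach * 1e3:.2f} mm in copper in that time")
```

Under these assumptions one turn takes on the order of a tenth of a millisecond, and in that time heat in copper moves only on the order of a tenth of a millimetre, so the energy stays essentially where it lands.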

Siting a security safe next to one of the LHC magnets and tangential to the bend line at the magnet would therefore not be such a secure idea...

River~~
ID: 12641



©2024 CERN