Message boards : Number crunching : power glitch causes signal 11 error?

jay
Joined: 10 Aug 07
Posts: 51
Credit: 533,175
RAC: 0
Message 25468 - Posted: 7 Mar 2013, 15:16:26 UTC
Last modified: 7 Mar 2013, 15:17:41 UTC


I had a power outage and I *thought* my UPS would handle it.
But after the restart, several LHC tasks and a WCG task errored out with signal 11 (segmentation fault).
I had done a suspend-all and shut down before the battery reached 50%.

Is this because the glitch caused a bad checkpoint file?

PCs on a different UPS (both Linux and Windows Vista) were not affected.

I realize the WUs will get recycled to the next user, but
I wonder if there is anything else I can do?

Other details:
Filesystems: ext2
/var/lib and swap mounted on an SSD
/ and /home on an HDD

(edit) Perhaps I should have posted this to the BOINC forum.

Joined: 21 Jun 10
Posts: 33
Credit: 3,191,470
RAC: 615
Message 25469 - Posted: 7 Mar 2013, 19:42:26 UTC

Yep, that's a known problem with Linux.

I've had my Ubuntu machines get that error when there was an internet connection problem. Windows machines would continue to work just fine.

The BOINC people know about it and the WCG team knows about it.

AFAIK, we are still waiting for a fix.
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 852
Credit: 1,619,050
RAC: 0
Message 25473 - Posted: 8 Mar 2013, 16:31:50 UTC - in response to Message 25468.  

Thanks for the feedback, jay; LHC (SixTrack) should restart OK!
However, I am wondering a bit about Checkpoint/Restart for
other reasons... I'll keep you posted. There is nothing you can do.


©2020 CERN