CMS Tasks Failing
Author | Message |
---|---|
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,943,725 RAC: 21,026 |
thank you, Ivan, for the information. It's always valuable :-) |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
|
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
|
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
There is another intervention going on at the moment. It doesn't appear to be affecting us -- yet... They are still trying to do a database transfer. Our queue is starting to drain, but luckily I had it doubled to 2,000 last night, so we have 2-3 hours before we run out. Best set your machines to No New Tasks until we see how this plays out. |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
|
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
|
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,943,725 RAC: 21,026 |
Since this morning, all my CMS tasks have been failing after about 12-18 minutes with final state 207 (0x000000CF) EXIT_NO_SUB_TASKS, which probably means that there are no jobs available :-( |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
Oops, I overslept. More jobs should be in the queue soon. (Bad news yesterday, one of my fellow expeditioners from Mawson Base in 1980 had a stroke and died at the weekend; apparently his daughter married Australia's current wicket-keeper which explains why the Aussie team was wearing black arm-bands at the start of the Boxing Day test in Melbourne today). [Edit] Oh, [naughty word!], the WMAgent appears to have gone down too. The only one on my contact list who isn't on holidays is in Chicago, so it might be a couple of hours yet before he responds. [/Edit] |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,943,725 RAC: 21,026 |
Quote: "Bad news yesterday, one of my fellow expeditioners from Mawson Base in 1980 had a stroke and died at the weekend" Ivan, I am sorry for this :-( |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
|
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,943,725 RAC: 21,026 |
Quote: "... the WMAgent appears to have gone down too. The only one on my contact list who isn't on holidays is in Chicago, so it might be a couple of hours yet before he responds." Ivan, any idea when CMS will be working again? |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
No, sorry, no response from anyone on holidays. Please set No New Tasks or switch to backup projects. Of course, I'll let you know when there are jobs again, but I'm just about to go to sleep for an extended period... See you in 15 or 18 hours! (I have to refill a cryomagnet's outer heat-shield with liquid nitrogen tomorrow...). |
Send message Joined: 24 Oct 04 Posts: 1176 Credit: 54,887,670 RAC: 5,761 |
Quote: "No, sorry, no response from anyone on holidays. Please set No New Tasks or switch to backup projects. Of course, I'll let you know when there are jobs again, but I'm just about to go to sleep for an extended period... See you in 15 or 18 hours! (I have to refill a cryomagnet's outer heat-shield with liquid nitrogen tomorrow...)." https://tinyurl.com/snow-is-cold-enough-for-me |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
Quote: "No, sorry, no response from anyone on holidays. Please set No New Tasks or switch to backup projects. Of course, I'll let you know when there are jobs again, but I'm just about to go to sleep for an extended period... See you in 15 or 18 hours! (I have to refill a cryomagnet's outer heat-shield with liquid nitrogen tomorrow...)." Tja, our magnet's just at LHe boiling point, but it runs at 4 Tesla. It runs in persistent mode, which means we haven't had to apply any current in all my 15+ years at the lab. We have to fill with LHe twice a year, and top up the LN2 outer chamber every week. Sorry I forgot to mention yesterday that we have jobs again, but I guess most people would have noticed. |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,943,725 RAC: 21,026 |
After CMS tasks had run flawlessly for a while, some of them started to error out again after a short time, with stderr reporting the following: 2017-12-31 14:03:23 (12584): Guest Log: [ERROR] Could not connect to Condor server on port 9618 This type of problem has come up many times in the past. Does anyone have any idea why? |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
|
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,943,725 RAC: 21,026 |
Quote: "I'll try to look into it next week; don't feel shy about reminding me if you don't hear anything from me!" Thanks, Ivan, for your help :-) Happy New Year! |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,943,725 RAC: 21,026 |
Good morning, everybody, and a Happy New Year! Unfortunately, the year starts with another CMS problem: either there are no jobs available, or the WMAgent is down again :-( |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
|
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
|
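
For anyone who wants to check the "Could not connect to Condor server on port 9618" error from their own machine, here is a minimal sketch of a plain TCP reachability test. It is only an illustration: the thread never names the server, so the host `condor.example.org` below is a placeholder, and the helper names `can_connect` and `exit_code_hex` are made up for this example. A successful connect only shows that the port is reachable from the host, not that the task inside the VM will succeed. The second helper simply reproduces the hex form of the exit code quoted earlier (207).

```python
import socket

# Port taken from the error message quoted in the thread.
CONDOR_PORT = 9618


def can_connect(host: str, port: int = CONDOR_PORT, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def exit_code_hex(code: int) -> str:
    """Format an exit code the way the task's final state prints it, e.g. 0x000000CF."""
    return f"0x{code:08X}"


if __name__ == "__main__":
    # Placeholder host: replace with the server your own task log refers to.
    host = "condor.example.org"
    status = "reachable" if can_connect(host) else "not reachable"
    print(f"{host}:{CONDOR_PORT} is {status}")
    print("EXIT_NO_SUB_TASKS:", 207, "=", exit_code_hex(207))
```

If the port is reachable from the host but tasks still fail, the problem is more likely on the project side (no jobs queued, or the WMAgent down), as discussed in the posts above.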
©2024 CERN