no new WUs available
Author | Message |
---|---|
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
for some time now, the Project Status page shows "0" for CMS WUs (and other projects as well). Major problem somewhere? |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
there seem to have been no new WUs available for the past few hours - is this only a short-term problem, or something else? |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
thanks for the quick information, Ivan :-) |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
good morning, Ivan - again, no tasks available (a few hours earlier, there were tasks but no jobs). Please fill the queue :-) |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
"good morning, Ivan" Hi; I was a bit surprised, as the batch estimate was another 30 hours or more last night. I quickly dispatched a new batch, but when I looked at WMStats there were no data. It seems at least one WMAgent component has failed (AnalyticsDataCollector). The maintainers have been notified. Sorry for the inconvenience. The new batch has arrived in the system, but we may not see any action until WMAgent is tickled back into life. |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
OK, WMAgent has been reset, and the job queue is filling again. Some of the monitors lag a bit (esp. Server Status and Job Activities) so it might not be obvious yet. Looks like the monitor that stops task creation when there are no jobs is doing its job. OK, it stopped again just now, and the queue is draining fast. Please set No New Tasks or otherwise prepare for lack of CMS jobs. I've mailed CERN; they may be delving deeper into the failure before restarting. More when I get into work in ~30 mins. [EDIT] OK, don't panic! WMAgent has been restarted and the queue is starting to fill again. As you were! Now I can be a bit more leisurely about my stroll down Church Road. :-) [/EDIT] |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
If I remember correctly, they put in place a new release of the WMAgent a few months ago or so. It seems it didn't exactly make things better :-) |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
"If I remember correctly, they put in place a new release of the WMAgent a few months ago or so." Actually, that was in a different VM; this is in a new VM running CERN CentOS 7, but the way it's configured it can't handle the number of jobs we are running (MySQL crashes). I'll be reverting to the old SLC6 VM until a more capable VM can be provisioned. |
Send message Joined: 30 May 08 Posts: 93 Credit: 5,160,246 RAC: 0 |
Looks like the CMS queue is running dry... Server Status shows 7 unsent tasks and I returned 3 results that ended like this: 2017-11-30 00:34:19 (23855): Guest Log: [ERROR] No jobs were available to run. |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
other subprojects (like LHCb) are also failing. Looks as if the WMAgent is down again. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 456 |
The Server shows ZERO tasks for the moment! |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
"The Server shows ZERO tasks for the moment!" That's strange, because we have 2,000 in the queue. There are a couple of other little incongruities -- I'm starting to suspect that the new WMAgent channel we brought up last week is only serving the CERN VMs, and not Volunteer machines. I'll contact the team... |
©2024 CERN