"No jobs were available to run" since this morning.
Author | Message |
---|---|
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,880,115 RAC: 121,603 |
WMAgent down again? stderr says "2017-06-16 05:43:39 (7528): VM Completion Message: No jobs were available to run" |
Send message Joined: 29 Aug 05 Posts: 1006 Credit: 6,272,230 RAC: 352 |
WMAgent down again? Yes: agent: vocms0159.cern.ch (1.1.2.patch2); agent last updated: 2017/6/16 (Fri) 08:06:34 UTC : 0 h 4 m; data last updated: N/A; status: Components or Thread down; team: testbed-vocms0159. I've messaged Alan and Laurence. |
Send message Joined: 20 Jun 14 Posts: 374 Credit: 238,712 RAC: 0 |
The automatic brake worked, but I will reduce the buffer. |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,880,115 RAC: 121,603 |
The automatic brake worked ... Okay, but this would only eliminate the symptoms, not the cause. The cause, as is mostly the case, is the WMAgent. What's wrong with the WMAgent that it fails every other week? |
Send message Joined: 29 Aug 05 Posts: 1006 Credit: 6,272,230 RAC: 352 |
The automatic brake worked ... It's a complex system. In fact, this same problem has apparently been affecting the production systems as well. There's a new release being prepared; it will probably be deployed in a week or so. |
Send message Joined: 29 Aug 05 Posts: 1006 Credit: 6,272,230 RAC: 352 |
|
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,880,115 RAC: 121,603 |
For some understanding of the complexity of WMAgent, take a look at this wiki. Thanks, Ivan, for providing the link. So, let's keep our fingers crossed that the new release will be more stable :-) |
Send message Joined: 29 Aug 05 Posts: 1006 Credit: 6,272,230 RAC: 352 |
|
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,880,115 RAC: 121,603 |
I see another WMAgent problem. It really seems to be time to implement the new release of the WMAgent :-) |
Send message Joined: 29 Aug 05 Posts: 1006 Credit: 6,272,230 RAC: 352 |
I see another WMAgent problem. I guess we'll find out soon enough. :-/ [Edit] Ah, it wasn't a WMAgent problem per se, but a side-effect of a network problem. See https://cern.service-now.com/service-portal/view-outage.do?from=CSP-Service-Status-Board&n=OTG0038195 if it's a public URL. [/Edit] |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,880,115 RAC: 121,603 |
Right now, more than 11,700 "unsent" tasks are shown on the Project Status page; however, when trying to fetch work, BOINC says "no tasks available" on all my hosts. Why is that? |
Send message Joined: 29 Aug 05 Posts: 1006 Credit: 6,272,230 RAC: 352 |
|
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,880,115 RAC: 121,603 |
From what it looks like, there may be a major server problem: now ATLAS tasks cannot be downloaded either (although 16,500 are shown as "unsent") :-) |
Send message Joined: 29 Aug 05 Posts: 1006 Credit: 6,272,230 RAC: 352 |
|
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,880,115 RAC: 121,603 |
I received CMS jobs as well as ATLAS jobs. However, one CMS job errored out with "computation error" after 12 minutes. STDERR says: "2017-06-19 17:13:02 (6060): VM Heartbeat file specified, but missing." and "2017-06-19 17:13:02 (6060): VM Heartbeat file specified, but missing file system status. (errno = '2')". Any idea what the reason for this could be? |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,880,115 RAC: 121,603 |
Ivan wrote last week: There's a new release (of the WMAgent) being prepared, it will probably be deployed in a week or so. Ivan, is this taking place right now, and is it the reason why no tasks (CMS and others as well) can be downloaded? |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,880,115 RAC: 121,603 |
Two of my hosts just downloaded new CMS tasks. So all seems to run well again, whatever the reason for the disruption was. |
Send message Joined: 29 Aug 05 Posts: 1006 Credit: 6,272,230 RAC: 352 |
Ivan wrote last week: No, not as far as I'm aware. I know that Laurence is tinkering with his cluster, but I doubt that would affect all projects. I'll be sure to give you as much warning as I can when the WMAgent update is scheduled. |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,880,115 RAC: 121,603 |
I'll be sure to give you as much warning as I can when the WMAgent update is scheduled. Thanks a lot, Ivan |
Send message Joined: 2 May 07 Posts: 2101 Credit: 159,817,517 RAC: 132,770 |
The counter for CMS tasks also went to zero. (500 at the moment) |
©2024 CERN