"No jobs were available to run" since this morning.
Author | Message |
---|---|
Send message Joined: 29 Aug 05 Posts: 1005 Credit: 6,269,877 RAC: 404 |
[quote]The counter for CMS tasks also went to ZERO. (500 at the moment)[/quote] Sorry, where are you seeing that? Everything looks pretty normal to me. In fact we have an up-tick at the moment, possibly picking up overspill from another app that's not sending tasks. [Edit] Strange, we seem to be picking up tasks at the expense of LHCb, although they have a lot in the queue. ATLAS was down to zero tasks available, but some are coming through now. [/Edit] |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,125,977 RAC: 122,409 |
[quote]... ATLAS was down to zero tasks available, but some are coming through now.[/quote] Ivan, where do you see ATLAS tasks coming through? |
Send message Joined: 2 May 07 Posts: 2090 Credit: 158,922,687 RAC: 124,848 |
[quote]Sorry, where are you seeing that?[/quote] On the Server Status page at LHCatHome. ATLAS give unresolved back which are running. Since Friday there has been no ATLAS in the pipe. Hopefully the Server Status page is not showing the real picture. Edit: CGI testing ran the whole day on Friday; see the -dev forum under News. |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,125,977 RAC: 122,409 |
[quote]Sorry, where are you seeing that?[/quote] That's exactly where I have been looking all the time. And since Friday, under "unsent" I see only zero. Plus, whenever the BOINC Manager tries to download ATLAS tasks, it says "no tasks available". |
Send message Joined: 29 Aug 05 Posts: 1005 Credit: 6,269,877 RAC: 404 |
|
Send message Joined: 2 May 07 Posts: 2090 Credit: 158,922,687 RAC: 124,848 |
CMS has dropped to 281 tasks at the moment. This may not be enough to last until Monday morning. |
Send message Joined: 29 Aug 05 Posts: 1005 Credit: 6,269,877 RAC: 404 |
[quote]CMS has dropped to 281 tasks at the moment. This may not be enough to last until Monday morning.[/quote] We're surviving. The queue of jobs on the Condor server is down from its normal 700 but that's not totally unusual. We are running more jobs than normal at present, which may be putting some pressure on the pipeline. I think we are a fair way from being critical yet. |
Send message Joined: 29 Aug 05 Posts: 1005 Credit: 6,269,877 RAC: 404 |
|
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,541,788 RAC: 120,716 |
[quote]Best set No New Tasks as soon as practicable.[/quote] Done. Thank you Ivan. |
Send message Joined: 29 Aug 05 Posts: 1005 Credit: 6,269,877 RAC: 404 |
|
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,125,977 RAC: 122,409 |
[quote]Alan tells me he has an appointment tomorrow morning, so the intervention will take place after lunch (CERN time). Hopefully we'll have jobs again later in the afternoon.[/quote] How does it look, Ivan? New software implemented successfully? So far, no new tasks ready for download yet. |
Send message Joined: 29 Aug 05 Posts: 1005 Credit: 6,269,877 RAC: 404 |
[quote]Alan tells me he has an appointment tomorrow morning, so the intervention will take place after lunch (CERN time). Hopefully we'll have jobs again later in the afternoon.[/quote] Yes, Alan finished the update about an hour ago. I submitted a new batch 45 minutes ago and the queue on the Condor server is back up to its usual 700. You can turn on the taps again. |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,125,977 RAC: 122,409 |
All 3 of my hosts downloaded new tasks, so it seems to work fine :-) |
Send message Joined: 29 Aug 05 Posts: 1005 Credit: 6,269,877 RAC: 404 |
Yes, although we're ramping up a bit more slowly than I expected. This is possibly partly because Laurence has cut his cluster down to just 10 cores while he tries some experiments -- he had considerably more than that before. Also, some people may not have seen the news announcements and their machines have entered quota back-off. And, as there was some evidence a week or so back, some hosts may have started running other apps as backfill and we need to wait for those tasks to finish before they come back to CMS. |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,125,977 RAC: 122,409 |
[quote]Yes, although we're ramping up a bit more slowly than I expected.[/quote] Which can be seen clearly here: https://lhcathomedev.cern.ch/lhcathome-dev/cms_job.php. But I guess it will be back to normal soon. |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,125,977 RAC: 122,409 |
Hm, for some reason the number of running jobs is stagnating at slightly above 400, as seen here: https://lhcathomedev.cern.ch/lhcathome-dev/cms_job.php. What might be the reason? |
Send message Joined: 29 Aug 05 Posts: 1005 Credit: 6,269,877 RAC: 404 |
[quote]Hm, for some reason the number of running jobs is stagnating at slightly above 400, as seen here:[/quote] For a start, Laurence has been doing tests with his cluster, so that's a few hundred cores lost. Also, we don't have as many active users as normal. This is possibly due to people heeding my warning that a drought was coming and not yet restarting; people who didn't see my warning and whose machines are still in a quota back-off; and people whose hosts were set to switch to other apps when CMS had no jobs, and have yet to fully switch back. I'm not panicking yet. |
Send message Joined: 18 Dec 15 Posts: 1688 Credit: 103,125,977 RAC: 122,409 |
for a few hours, I have not been able to download CMS tasks ("no tasks available") - BTW, no ATLAS tasks either. Anything wrong with the newly installed WMAgent? |
Send message Joined: 14 Jan 10 Posts: 1274 Credit: 8,480,870 RAC: 2,011 |
[quote]for a few hours, I have not been able to download CMS tasks ("no tasks available") - BTW, no ATLAS tasks either.[/quote] See my post in the ATLAS application thread - https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4331&postid=31176 |
©2024 CERN