no new WUs available
Author | Message |
---|---|
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
hm, although the Project Status Page still shows zero tasks available, I have received some about half an hour ago. So there are tasks around again. However, they are still waiting in the queue here, so I cannot tell whether there will be jobs for these tasks. Is the information from the Project Status Page lagging behind to some extent? |
Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
Quote: "hm, although the Project Status Page still shows zero tasks available, I have received some about half an hour ago." Yes, it's not updated continuously; the time of the snapshot is at the bottom of the page. It's currently saying 1049 GMT, which is an hour out of date. I seem to be picking up jobs, but my computers are a bit mixed up because of the long SETI@Home drought that has just ended. |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
Thanks for the information, Ivan. I never noticed the timestamp at the bottom of the Project Status Page (shame on me) :-( Right now it says "11:53:51 UTC" and shows 114 tasks available. So everything is okay again :-) |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
the number of tasks available is dropping again. Right now (13:01 hrs UTC) it's at 33. |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
The number of available tasks has been back at zero for a couple of hours. I guess there won't be a solution to the problem before tomorrow (at the earliest)? |
Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
Quote: "The number of available tasks has been back at zero for a couple of hours." I've submitted a small batch of jobs to the "old" WMAgent, and they are just starting to run. These should go out to volunteer machines, so let's see if the situation changes. |
Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
Looks like the WMAgent failed about 0830 this morning, and the job queue drained 90 minutes later. So, we are out of jobs at the moment -- I'd actually submitted a new batch of jobs before I'd tracked it down to the WMAgent. :-/ So best to set No New Tasks until CERN reacts to my e-mail and kicks vocms0267 into life again. Just to mention at this point that we might see more outages over the next month as various year's-end maintenance programmes take place, as well as the whole CERN site shutting down for about two weeks for holidays. Good(?) news -- it snowed in London overnight! That doesn't happen often. |
Joined: 24 Oct 04 Posts: 1176 Credit: 54,887,670 RAC: 5,761 |
2017-12-10 03:34:29 (5236): Guest Log: [INFO] CMS application starting. Check log files.<br>2017-12-10 03:35:25 (5236): Guest Log: [DEBUG] HTCondor ping<br>2017-12-10 03:35:32 (5236): Guest Log: [DEBUG] 0<br>2017-12-10 03:47:35 (5236): Guest Log: [ERROR] Condor exited after 729s without running a job.<br>2017-12-10 03:47:35 (5236): Guest Log: [INFO] Shutting Down.<br>2017-12-10 03:47:35 (5236): VM Completion File Detected.<br>2017-12-10 03:47:35 (5236): VM Completion Message: Condor exited after 729s without running a job.<br>4am here, and CMS is doing this over and over, both here and over at -dev... and no, I do not get up early to do this... that server needs me to throw a few snowballs at it.<br>Volunteer Mad Scientist For Life |
Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
Yes, sorry for your frustration. I didn't get a reply from Alan, who is usually quite quick to respond, so I opened a ticket with CERN IT. I don't really expect a response from them until some time tomorrow, unfortunately. Luckily I've still got S@H and Einstein to keep my server ticking over and generate a bit of heat. It kept snowing on-and-off today, so it was a very slushy shopping trip this afternoon. |
Joined: 30 May 08 Posts: 93 Credit: 5,160,246 RAC: 0 |
Quote: "...so I opened a ticket with CERN IT. I don't really expect a response from them until some time tomorrow, unfortunately." Sounds like a good time to open the spigot and run a few more Theory jobs... :D |
Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
Quote: "...so I opened a ticket with CERN IT. I don't really expect a response from them until some time tomorrow, unfortunately." Yeah, back-up projects time. Stay warm, everyone (they're forecasting -12 C for areas of the UK for the next two nights!). |
Joined: 24 Oct 04 Posts: 1176 Credit: 54,887,670 RAC: 5,761 |
Quote: "Yes, sorry for your frustration. I didn't get a reply from Alan, who is usually quite quick to respond, so I opened a ticket with CERN IT. I don't really expect a response from them until some time tomorrow, unfortunately." Funny, I thought of your snow today while watching an NFL game, the Colts @ Buffalo, played in deep snow all over the stadium; the field was nothing but snow (it ended up an overtime game, and then the players were making *snow angels* and disappearing in the snow). Sort of cold here but still sunshine, and I just switched all mine back to Theory tasks here and over at -dev and fired up another Einstein GPU machine that has been taking a break all year and running these VB tasks. I like it when a crunching day ends and I have a long list of Valids. Volunteer Mad Scientist For Life |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
Why is it that, out of all the LHC sub-projects, CMS is failing most frequently (at least, that's my impression)? Particularly since the installation of the new release of the WMAgent some months ago, there have been problems every so often. It's really too bad :-( |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
Ivan, what I don't understand is: why can CMS tasks still be downloaded (and consequently be started on the volunteer's PC, until they fail after a while), and why are there constantly close to 200 unsent tasks shown on the project status page, if it has been clear for a day now that there will be no jobs coming in? I may be mistaken, but I seem to remember reading somewhere here, several months ago, that some automated steps had been established which stop the creation of new CMS tasks some time after no jobs are available. Or am I wrong? |
Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
Quote: "Ivan, what I don't understand is: why can CMS tasks still be downloaded (and consequently be started on the volunteer's PC, until they fail after a while), and why are there constantly close to 200 unsent tasks shown on the project status page, if it has been clear for a day now that there will be no jobs coming in?" No, that was happening, at least here on LHC@home. It could be that Laurence needs to tweak the script to take account of the new WMAgent we had installed two weeks ago, or maybe it's something to do with the new website set-up. I'll make enquiries. |
Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
|
Joined: 30 May 08 Posts: 93 Credit: 5,160,246 RAC: 0 |
CMS jobs have spiked back up and I just got a task. Are we back for good? |
Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
Quote: "CMS jobs have spiked back up and I just got a task. Are we back for good?" No, not really. I submitted a small batch of jobs with the old WMAgent and they made it through to the Condor server just as we were about to manually disable task creation. There's no sign of the new agent coming back to life, so I guess I'll submit a bigger batch to keep us going until tomorrow. |
Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
Quote: "CMS jobs have spiked back up and I just got a task. Are we back for good?" OK, we finally got in touch with the WMAgent expert (turns out he was on holiday, and so is our other expert!), and the new agent is back in operation, so I can start submitting larger job batches again. Just as well, too, as our number of running jobs has increased -- we nearly drained the queue this morning, as I was expecting last night's batch to last 24 hours! We also have another WMAgent expert "on our books" now, so hopefully we won't have this long an outage again. However, as I said before, it's the winter holiday season, so remedies may be slow to come in some circumstances. |
Joined: 18 Dec 15 Posts: 1821 Credit: 118,941,196 RAC: 21,920 |
According to the Project Status Page, CMS has run out of tasks :-( |
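
For anyone who wants to spot this failure in their own task output: the guest-log excerpt posted earlier in the thread ends with `Condor exited after 729s without running a job`, which is what a CMS task reports when the queue has no jobs. Below is a minimal sketch, assuming log lines keep exactly the format shown in that excerpt. The function name `find_no_job_exits` and both regular expressions are illustrative assumptions, not part of BOINC or the CMS application.

```python
import re
from datetime import datetime

# Assumed line format, copied from the excerpt in this thread:
# "2017-12-10 03:47:35 (5236): Guest Log: [ERROR] Condor exited after 729s without running a job."
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \(\d+\): "
    r"Guest Log: \[(?P<level>[A-Z]+)\] (?P<msg>.*)$"
)
NO_JOB_RE = re.compile(r"Condor exited after (\d+)s without running a job")


def find_no_job_exits(log_text):
    """Return (timestamp, seconds_waited) for each guest-log ERROR line
    reporting that Condor exited without running a job."""
    hits = []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m or m.group("level") != "ERROR":
            continue  # skip non-guest-log lines and non-error levels
        found = NO_JOB_RE.search(m.group("msg"))
        if found:
            ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S")
            hits.append((ts, int(found.group(1))))
    return hits


# The log lines quoted in the thread, used here as sample input.
SAMPLE = """\
2017-12-10 03:34:29 (5236): Guest Log: [INFO] CMS application starting. Check log files.
2017-12-10 03:35:25 (5236): Guest Log: [DEBUG] HTCondor ping
2017-12-10 03:35:32 (5236): Guest Log: [DEBUG] 0
2017-12-10 03:47:35 (5236): Guest Log: [ERROR] Condor exited after 729s without running a job.
2017-12-10 03:47:35 (5236): Guest Log: [INFO] Shutting Down.
2017-12-10 03:47:35 (5236): VM Completion File Detected.
2017-12-10 03:47:35 (5236): VM Completion Message: Condor exited after 729s without running a job.
"""

if __name__ == "__main__":
    for ts, waited in find_no_job_exits(SAMPLE):
        print(f"{ts}: VM gave up after {waited}s with no job")
```

Only the `Guest Log: [ERROR]` line is counted, so the repeated `VM Completion Message` line does not produce a duplicate hit. If several recent tasks show this pattern, setting No New Tasks until the queue refills, as suggested above, avoids the repeated failures.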
©2024 CERN