CMS@Home downtime pencilled in for Thursday
Author | Message |
---|---|
Send message Joined: 29 Aug 05 Posts: 1065 Credit: 8,134,418 RAC: 13,358 |
CERN want to update the WMAgent we use to submit CMS@Home jobs. This is tentatively scheduled for Thursday. I'll let the queue drain, and possibly kill off some of the smaller batches I've been submitting to get me finer timing control. I expect that we can go another 24 hours before I start draining the queue, but keep an eye on the running jobs graph and be prepared to set No New Tasks when you see it dipping -- or do it beforehand if you really want to protect your daily job quotas. More news as details become firmer. |
Send message Joined: 15 Jun 08 Posts: 2563 Credit: 257,114,404 RAC: 112,903 |
Wouldn't it be more efficient to also stop the WU generation and perhaps decrease the number of unsent WUs beforehand? I'm sure a lot of users are still on holiday. |
Send message Joined: 29 Aug 05 Posts: 1065 Credit: 8,134,418 RAC: 13,358 |
Perhaps, but I don't think I can do that myself. My main line of thought is that Laurence's "opportunistic" cluster of machines isn't controlled by BOINC, so if people stop their boxes in a considered manner then his servers will drain the queue with minimum damage to volunteers' PCs. Plus I have deliberately submitted smaller batches lately -- I can abort un-started batches at will to tailor the queue drainage to Alan's timetable. |
Send message Joined: 27 Sep 08 Posts: 853 Credit: 696,387,302 RAC: 129,979 |
I agree with computezrmle, but it seems like it's difficult for you and the other CERN staff. Maybe Laurence can create some easy mechanism for you? |
Send message Joined: 29 Aug 05 Posts: 1065 Credit: 8,134,418 RAC: 13,358 |
"Wouldn't it be more efficient to also stop the WU generation and perhaps decrease the number of unsent WUs beforehand?" Pardon me for only just thinking this through, but the main problem in shutting down the WMAgent system is to make sure that the queues are drained as far as possible before shutting down -- limiting WU creation would actually hinder this. |
Send message Joined: 27 Sep 08 Posts: 853 Credit: 696,387,302 RAC: 129,979 |
Thanks Ivan, I set NNT this morning as per your advice. |
Send message Joined: 15 Jun 08 Posts: 2563 Credit: 257,114,404 RAC: 112,903 |
The first WUs after the update started without errors. Thank you. |
Send message Joined: 15 Jun 08 Posts: 2563 Credit: 257,114,404 RAC: 112,903 |
Task upload also works. The current tasks have significantly shorter runtimes. Roughly 65% compared to older tasks (on both of my hosts). |
Send message Joined: 29 Aug 05 Posts: 1065 Credit: 8,134,418 RAC: 13,358 |
"Task upload also works." That's a little strange, I'm reasonably sure I didn't change the number of events per job, just the jobs per batch. I'll take a look. [Added] I've had two tasks fail with a heartbeat failure, the rest are still running, so no data yet. [/Added] [And again] Job failure rate so far is 2.7%, somewhat less than recent values. [/And again] |
©2025 CERN