no new WUs available
Author | Message |
---|---|
Joined: 29 Aug 05 Posts: 1090 Credit: 9,319,923 RAC: 1,550 |
|
Joined: 24 Oct 04 Posts: 1203 Credit: 69,447,006 RAC: 68,090 |
Looks like we are out of work and getting close to the dreaded weekend: https://lhcathome.cern.ch/lhcathome/server_status.php |
Joined: 24 Oct 04 Posts: 1203 Credit: 69,447,006 RAC: 68,090 |
I have seen a few hundred of these today with CMS: https://lhcathome.cern.ch/lhcathome/result.php?resultid=419899668 https://lhcathome.cern.ch/lhcathome/result.php?resultid=419899118 https://lhcathome.cern.ch/lhcathome/results.php?hostid=9930008 |
Joined: 18 Dec 15 Posts: 1874 Credit: 137,484,834 RAC: 51,528 |
after CMS ran out of jobs, the automatic stop of tasks download took place. |
Joined: 12 Jul 11 Posts: 103 Credit: 1,217,156 RAC: 249 |
Hi, it seems there are some CMS tasks available at the moment, but my Intel Mac can't get any. Is that normal? Thanks |
Joined: 12 Aug 06 Posts: 443 Credit: 12,416,000 RAC: 2,347 |
"Apologies for the short blackout, I've been having the most terrible weekend. It started with a full glass of wine spilling right into my keyboard, and hasn't got better since... (more details suppressed to save those of a sensitive disposition!)" Sorry for the late post, but if you're still having trouble, you can buy phone signal boosters and make your own little cell! I'm planning on trying one for the North Pennines hills in England, where I've bought land to caravan holiday on. Big aerial (well, a foot or two) up a tree, and the transmitter is either a little one in the caravan, or you can make a bigger cell to cover a large area with a bigger one. |
Joined: 12 Aug 06 Posts: 443 Credit: 12,416,000 RAC: 2,347 |
"after CMS ran out of jobs, the automatic stop of tasks download took place." I don't follow. There's no tasks so it automatically stops you downloading the tasks which aren't there? |
Joined: 28 Sep 04 Posts: 763 Credit: 56,498,034 RAC: 29,213 |
"'after CMS ran out of jobs, the automatic stop of tasks download took place.' I don't follow. There's no tasks so it automatically stops you downloading the tasks which aren't there?" The CMS jobs and BOINC tasks are two different things. BOINC handles the download of tasks, which launch the VMs. The VMs then contact CERN and download the jobs that are then crunched. One BOINC task will handle several sets of jobs (depending on the speed of the host). If there are jobs available when the VM requests them, the VM will keep crunching them until a minimum of 12 hours has passed. If 12 hours have passed and the current set of jobs is finished, the task ends normally. The crunching has a maximum time limit of 18 hours, after which the BOINC task is finished anyway. [edit] If there aren't any jobs available when the VM requests them, the task ends there. This could happen right at the start of the VM, in which case the task would end in about 20 minutes. The BOINC server should notice this and stop generating new tasks. |
Joined: 12 Aug 06 Posts: 443 Credit: 12,416,000 RAC: 2,347 |
"The CMS jobs and BOINC tasks are two different things. BOINC handles the download of tasks, which launch the VMs. The VMs then contact CERN and download the jobs that are then crunched. One BOINC task will handle several sets of jobs (depending on the speed of the host). If there are jobs available when the VM requests them, the VM will keep crunching them until a minimum of 12 hours has passed. If 12 hours have passed and the current set of jobs is finished, the task ends normally. The crunching has a maximum time limit of 18 hours, after which the BOINC task is finished anyway." Is it not possible to avoid this? Not sure if it's CMS, probably more Theory. The VMs for Theory download common files every time. So someone running many Theory tasks is hammering your servers for the same file over and over (and also overloading the user's internet connection, which causes impatient VMs to not bother continuing, sitting with no CPU usage for 10 days). If this file were stored using BOINC, each machine would only request the big common file once. Yes, I know users can run Squid, but: a) Most people don't know how, can't be bothered, or have never heard of it. b) People such as myself tried it for 2 months until it broke, and now I can't get it to run again even with a fresh install on another computer. It's so unintuitive and needs a GUI. c) Why should we have to? |
Joined: 28 Sep 04 Posts: 763 Credit: 56,498,034 RAC: 29,213 |
Sorry, I don't know enough about the inner workings of LHC tasks to comment on that. Anyway, that's the way at least CMS is configured right now. |
Joined: 12 Aug 06 Posts: 443 Credit: 12,416,000 RAC: 2,347 |
"the VM will keep crunching them until a minimum of 12 hours has passed" Explains why my recent ones have finished very early. Sporadic job availability? |
Joined: 28 Sep 04 Posts: 763 Credit: 56,498,034 RAC: 29,213 |
"'the VM will keep crunching them until a minimum of 12 hours has passed' Explains why my recent ones have finished very early. Sporadic job availability?" That's most likely, or some communication failure when it was requesting new jobs. |
Joined: 12 Aug 06 Posts: 443 Credit: 12,416,000 RAC: 2,347 |
Looks like we have ATLAS and CMS back up and running. No Theory though, which means two of my 10 machines can't run. Not enough RAM for the other subprojects. |
Joined: 12 Jul 11 Posts: 103 Credit: 1,217,156 RAC: 249 |
Hi, any idea about this? |
Joined: 29 Aug 05 Posts: 1090 Credit: 9,319,923 RAC: 1,550 |
Hi, we need to do an upgrade to the WMAgent, but we can't wait for the current workflow to finish. So we'll need to actively terminate it. I'm suggesting that it be done tomorrow, so you should set your machines to "No New Tasks" now to give current tasks time to finish off. Sorry for the inconvenience. |
Joined: 29 Aug 05 Posts: 1090 Credit: 9,319,923 RAC: 1,550 |
|
Joined: 18 Dec 15 Posts: 1874 Credit: 137,484,834 RAC: 51,528 |
the queue has run dry :-( |
Joined: 29 Aug 05 Posts: 1090 Credit: 9,319,923 RAC: 1,550 |
|
Joined: 29 Aug 05 Posts: 1090 Credit: 9,319,923 RAC: 1,550 |
Heads up! A change to our submission machine yesterday has been preventing access to the central CA certificate stores at CERN, so I'm not able to submit new jobs at the moment. Unfortunately this will need intervention from CERN IT, so it's unlikely to be resolved before Monday. There is less than an hour's worth of tasks in the queue, so we'll run out Real Soon Now (© Jerry Pournelle). Please set No New Tasks or take other measures to look after your machines. |
Joined: 29 Aug 05 Posts: 1090 Credit: 9,319,923 RAC: 1,550 |
Heads up! Problem fixed and a new workflow submitted. |
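
For anyone trying to picture the task lifecycle described earlier in the thread (keep fetching jobs until at least 12 hours have passed, stop at 18 hours regardless, and exit early if no jobs are available), here is a minimal Python sketch. It is only an illustration under stated assumptions: `run_task`, the `jobs_available` callback and the fixed `job_hours` per set of jobs are hypothetical names and simplifications, not part of BOINC or the CMS software. Only the 12 h, 18 h and roughly 20 min figures come from the posts above.

```python
from dataclasses import dataclass
from typing import Callable

MIN_RUNTIME_H = 12.0   # thread: the VM keeps fetching jobs until at least 12 h have passed
MAX_RUNTIME_H = 18.0   # thread: after 18 h the BOINC task is finished anyway
STARTUP_H = 20 / 60    # thread: a task that finds no jobs at VM start ends in about 20 min


@dataclass
class TaskResult:
    hours_run: float
    jobs_done: int
    reason: str


def run_task(jobs_available: Callable[[float], bool], job_hours: float = 2.0) -> TaskResult:
    """Simulate one BOINC task; job_hours is an assumed fixed duration per set of jobs."""
    elapsed = STARTUP_H
    jobs_done = 0
    while True:
        # The VM asks for work; if nothing is available, the task ends there.
        if not jobs_available(elapsed):
            return TaskResult(elapsed, jobs_done, "no jobs available")
        # A set of jobs that would run past the hard limit is cut off at 18 h.
        if elapsed + job_hours > MAX_RUNTIME_H:
            return TaskResult(MAX_RUNTIME_H, jobs_done, "time limit")
        elapsed += job_hours
        jobs_done += 1
        # Once 12 h have passed and the current set is finished, the task ends normally.
        if elapsed >= MIN_RUNTIME_H:
            return TaskResult(elapsed, jobs_done, "finished normally")


if __name__ == "__main__":
    print(run_task(lambda t: True))    # steady supply of jobs
    print(run_task(lambda t: False))   # queue already dry when the VM starts
    print(run_task(lambda t: t < 5))   # jobs dry up after about 5 hours
```

In this toy model the third case ends well before the 12-hour mark, which matches the "sporadic job availability" explanation given above for tasks that finish very early.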
©2025 CERN