Message boards : CMS Application : no new WUs available

ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1090
Credit: 9,319,923
RAC: 1,550
Message 51538 - Posted: 13 Feb 2025, 14:16:25 UTC - in response to Message 51537.  

OK, things have filtered down the queues, and jobs are available again.

Thanks for your patience.
ID: 51538
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1203
Credit: 69,447,006
RAC: 68,090
Message 51607 - Posted: 28 Feb 2025, 6:18:06 UTC

Looks like we are out of work and getting close to the dreaded weekend

https://lhcathome.cern.ch/lhcathome/server_status.php
ID: 51607
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1203
Credit: 69,447,006
RAC: 68,090
Message 51616 - Posted: 28 Feb 2025, 21:50:14 UTC
Last modified: 28 Feb 2025, 22:32:26 UTC

ID: 51616
Erich56
Joined: 18 Dec 15
Posts: 1874
Credit: 137,484,834
RAC: 51,528
Message 51746 - Posted: 23 Mar 2025, 17:04:04 UTC - in response to Message 51616.  

after CMS ran out of jobs, the automatic stop of tasks download took place.
ID: 51746
[AF>Le_Pommier] Jerome_C2005
Joined: 12 Jul 11
Posts: 103
Credit: 1,217,156
RAC: 249
Message 51776 - Posted: 30 Mar 2025, 11:04:53 UTC

Hi

It seems there are some CMS tasks available at the moment, but my Intel Mac can't get any. Is that normal?

Thanks
ID: 51776
Mr P Hucker
Joined: 12 Aug 06
Posts: 443
Credit: 12,416,000
RAC: 2,347
Message 51777 - Posted: 30 Mar 2025, 19:32:41 UTC - in response to Message 51006.  
Last modified: 30 Mar 2025, 19:34:24 UTC

Apologies for the short blackout, I've been having the most terrible weekend. It started with a full glass of wine spilling right into my keyboard, and hasn't got better since... (more details suppressed to save those of a sensitive disposition!)
I'm also hampered by my cellphone supplier switching off 3G connectivity, so I can only connect to 4G at just one bar on the power meter, barely managing a 200-300 kbps data rate.
Finally managed to log in to CERN, after trying all afternoon/night, to see that jobs are flowing. I'll try to submit a new batch of jobs in the next few hours.
Sorry for the late post, but if you're still having trouble, you can buy phone signal boosters - make your own little cell! I'm planning on trying one in the North Pennines hills in England, where I've bought land for caravan holidays. Big aerial (well, a foot or two) up a tree, with either a little transmitter in the caravan or a bigger one to make a larger cell covering a wide area.
ID: 51777
Mr P Hucker
Joined: 12 Aug 06
Posts: 443
Credit: 12,416,000
RAC: 2,347
Message 51778 - Posted: 30 Mar 2025, 19:36:19 UTC - in response to Message 51746.  

after CMS ran out of jobs, the automatic stop of tasks download took place.
I don't follow. There's no tasks so it automatically stops you downloading the tasks which aren't there?
ID: 51778
Harri Liljeroos
Joined: 28 Sep 04
Posts: 763
Credit: 56,498,034
RAC: 29,213
Message 51779 - Posted: 30 Mar 2025, 21:23:25 UTC - in response to Message 51778.  
Last modified: 30 Mar 2025, 21:36:29 UTC

after CMS ran out of jobs, the automatic stop of tasks download took place.
I don't follow. There's no tasks so it automatically stops you downloading the tasks which aren't there?

CMS jobs and BOINC tasks are two different things. BOINC handles the download of tasks, which launch the VMs. The VMs then contact CERN and download the jobs that are then crunched. One BOINC task will handle several sets of jobs (depending on the speed of the host). If there are jobs available when the VM requests them, the VM will keep crunching them until at least 12 hours have passed. Once 12 hours have passed and the current set of jobs is finished, the task ends normally. Crunching has a maximum time limit of 18 hours, at which point the BOINC task is finished anyway.
[edit] If there aren't any jobs available when the VM requests them, the task ends there. This could happen right at the start of the VM, and the task would end in about 20 minutes. The BOINC server should notice this and stop generating new tasks.
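
For anyone who finds it easier to read as code, the rules described above can be sketched roughly like this. It is only an illustration of the description in this post, not the actual wrapper or VM logic; the function name and its parameters are made up for the example, and only the 12 h and 18 h figures come from the text.

```python
# Illustrative sketch only: names are hypothetical, not taken from the
# real CMS VM scripts. Only the 12 h / 18 h limits come from the post.

MIN_RUNTIME_H = 12.0  # VM keeps asking for jobs until at least this long
MAX_RUNTIME_H = 18.0  # hard limit: the BOINC task is finished anyway


def task_should_end(elapsed_h: float,
                    job_set_finished: bool,
                    jobs_available: bool) -> bool:
    """Decide whether the BOINC task ends at this point.

    elapsed_h        -- hours since the VM started
    job_set_finished -- True if no set of jobs is currently being crunched
                        (also True right at VM start, before the first set)
    jobs_available   -- True if the queue hands out jobs when asked
    """
    if elapsed_h >= MAX_RUNTIME_H:
        return True   # maximum time limit reached
    if not job_set_finished:
        return False  # still crunching the current set of jobs
    if elapsed_h >= MIN_RUNTIME_H:
        return True   # minimum reached and current set done: normal end
    # Before the 12 h mark the VM asks for more jobs; an empty queue ends it.
    return not jobs_available


if __name__ == "__main__":
    # Empty queue right after VM start: the task ends early.
    print(task_should_end(0.3, True, False))
    # Five hours in, a set just finished, more jobs available: keep going.
    print(task_should_end(5.0, True, True))
```

The early-exit branch at the bottom is the case that makes tasks finish after about 20 minutes when the queue is dry.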
ID: 51779
Mr P Hucker
Joined: 12 Aug 06
Posts: 443
Credit: 12,416,000
RAC: 2,347
Message 51780 - Posted: 31 Mar 2025, 3:07:07 UTC - in response to Message 51779.  
Last modified: 31 Mar 2025, 3:08:32 UTC

CMS jobs and BOINC tasks are two different things. BOINC handles the download of tasks, which launch the VMs. The VMs then contact CERN and download the jobs that are then crunched. One BOINC task will handle several sets of jobs (depending on the speed of the host). If there are jobs available when the VM requests them, the VM will keep crunching them until at least 12 hours have passed. Once 12 hours have passed and the current set of jobs is finished, the task ends normally. Crunching has a maximum time limit of 18 hours, at which point the BOINC task is finished anyway.
[edit] If there aren't any jobs available when the VM requests them, the task ends there. This could happen right at the start of the VM, and the task would end in about 20 minutes. The BOINC server should notice this and stop generating new tasks.
Is it not possible to avoid this? Not sure if it's CMS, probably more Theory. The Theory VMs download common files every time. So someone running many Theory tasks is hammering your servers for the same file over and over (and also overloading the user's internet connection, which causes impatient VMs to give up and sit with no CPU usage for 10 days). If this file were distributed through BOINC, each machine would only request the big common file once.

Yes I know users can run squid, but:
a) Most people don't know how or can't be bothered or have never heard of it.
b) People such as myself tried it for 2 months until it broke and now I can't get it to run again even with a fresh install on another computer - it's so unintuitive and needs a GUI.
c) Why should we have to?
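
To put rough numbers on the repeated-download point above, here is a back-of-envelope sketch. The file size and task count are pure assumptions picked for illustration (the real size of the common file isn't stated anywhere in this thread), and the function name is made up.

```python
# Back-of-envelope sketch. The 200 MB file size and the 50 tasks are
# assumed values for illustration only, not measurements from LHC@home.

def total_download_mb(tasks: int, common_file_mb: float, cached: bool) -> float:
    """Total MB fetched for the common file across a number of tasks.

    cached=False -- every VM fetches the file again (one download per task)
    cached=True  -- the file is fetched once per machine and reused
    """
    if tasks <= 0:
        return 0.0
    return common_file_mb if cached else tasks * common_file_mb


if __name__ == "__main__":
    tasks = 50        # assumed number of tasks run on one machine
    file_mb = 200.0   # assumed size of the common file
    print("per-task download:", total_download_mb(tasks, file_mb, cached=False), "MB")
    print("download once:    ", total_download_mb(tasks, file_mb, cached=True), "MB")
```

Whatever the real numbers are, the uncached total grows with the number of tasks while the cached one stays flat, which is the whole argument.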
ID: 51780
Harri Liljeroos
Joined: 28 Sep 04
Posts: 763
Credit: 56,498,034
RAC: 29,213
Message 51782 - Posted: 31 Mar 2025, 6:11:38 UTC - in response to Message 51780.  

Sorry, I don't know enough about the inner workings of LHC tasks to comment on that. Anyway, that's how CMS, at least, is configured right now.
ID: 51782
Mr P Hucker
Joined: 12 Aug 06
Posts: 443
Credit: 12,416,000
RAC: 2,347
Message 51783 - Posted: 31 Mar 2025, 8:26:05 UTC - in response to Message 51779.  

the VM will keep crunching them until at least 12 hours have passed
Explains why my recent ones have finished very early. Sporadic job availability?
ID: 51783
Harri Liljeroos
Joined: 28 Sep 04
Posts: 763
Credit: 56,498,034
RAC: 29,213
Message 51789 - Posted: 31 Mar 2025, 10:08:13 UTC - in response to Message 51783.  

the VM will keep crunching them until at least 12 hours have passed
Explains why my recent ones have finished very early. Sporadic job availability?

That's most likely, or some communication failure when it was requesting new jobs.
ID: 51789
Mr P Hucker
Joined: 12 Aug 06
Posts: 443
Credit: 12,416,000
RAC: 2,347
Message 51791 - Posted: 31 Mar 2025, 14:54:04 UTC - in response to Message 51789.  

Looks like we have ATLAS and CMS back up and running. No Theory though, which means two of my 10 machines can't run. Not enough RAM for the other subprojects.
ID: 51791
[AF>Le_Pommier] Jerome_C2005
Joined: 12 Jul 11
Posts: 103
Credit: 1,217,156
RAC: 249
Message 51792 - Posted: 31 Mar 2025, 15:09:23 UTC - in response to Message 51776.  

Hi

It seems there are some CMS tasks available at the moment, but my Intel Mac can't get any. Is that normal?

Thanks

Any idea about this?
ID: 51792
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1090
Credit: 9,319,923
RAC: 1,550
Message 51874 - Posted: 7 May 2025, 10:38:28 UTC

Hi, we need to do an upgrade to the WMAgent, but we can't wait for the current workflow to finish. So we'll need to actively terminate it. I'm suggesting that it be done tomorrow, so you should set your machines to "No New Tasks" now to give current tasks time to finish off. Sorry for the inconvenience.
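
For those who manage hosts from the command line rather than the BOINC Manager, "No New Tasks" can also be set with boinccmd. A minimal sketch follows, assuming boinccmd is on the PATH and that the project URL below matches the one your client is attached with (check with `boinccmd --get_project_status`); the helper function name is made up for the example.

```python
# Minimal sketch: builds (and optionally runs) the boinccmd call that sets
# "No New Tasks" for one project. The URL is an assumption; it must match
# the master URL your client is actually attached to.
import subprocess

PROJECT_URL = "https://lhcathome.cern.ch/lhcathome/"  # assumed master URL


def no_new_tasks_command(project_url: str = PROJECT_URL,
                         allow: bool = False) -> list[str]:
    """Return the boinccmd argument list.

    allow=False -> 'nomorework'    (set No New Tasks)
    allow=True  -> 'allowmorework' (allow new tasks again)
    """
    operation = "allowmorework" if allow else "nomorework"
    return ["boinccmd", "--project", project_url, operation]


def run(command: list[str]) -> int:
    """Run the command and return its exit code (needs a running client)."""
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Only print the command here; call run(...) yourself once the URL is checked.
    print(" ".join(no_new_tasks_command()))
```

Running tasks are left to finish; the client just stops asking for more work until `allowmorework` is sent.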
ID: 51874
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1090
Credit: 9,319,923
RAC: 1,550
Message 51875 - Posted: 8 May 2025, 14:32:28 UTC

Intervention is over and I've submitted a new batch of jobs. They should be available soon.
ID: 51875
Erich56
Joined: 18 Dec 15
Posts: 1874
Credit: 137,484,834
RAC: 51,528
Message 51934 - Posted: 6 Jun 2025, 7:25:31 UTC

the queue has run dry :-(
ID: 51934
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1090
Credit: 9,319,923
RAC: 1,550
Message 51957 - Posted: 14 Jun 2025, 21:31:03 UTC - in response to Message 51934.  

the queue has run dry :-(

Sorry 'bout that. Since I was forced to retire I don't always wake up in time to check the queues. :-(
ID: 51957
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1090
Credit: 9,319,923
RAC: 1,550
Message 51958 - Posted: 14 Jun 2025, 21:39:21 UTC

Heads up!

A change to our submission machine yesterday has been preventing access to the central CA certificate stores at CERN, so I'm not able to submit new jobs at the moment.

Unfortunately this will need intervention from CERN IT, so it's unlikely to be resolved before Monday. There is less than an hour's worth of tasks in the queue, so we'll run out Real Soon Now (© Jerry Pournelle). Please set No New Tasks or take other measures to look after your machines.
ID: 51958
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1090
Credit: 9,319,923
RAC: 1,550
Message 51961 - Posted: 16 Jun 2025, 13:13:26 UTC - in response to Message 51958.  

Heads up!

A change to our submission machine yesterday has been preventing access to the central CA certificate stores at CERN, so I'm not able to submit new jobs at the moment.

Unfortunately this will need intervention from CERN IT, so it's unlikely to be resolved before Monday. There is less than an hour's worth of tasks in the queue, so we'll run out Real Soon Now (© Jerry Pournelle). Please set No New Tasks or take other measures to look after your machines.

Problem fixed and a new workflow submitted.
ID: 51961


©2025 CERN