Message boards : CMS Application : no new WUs available

Erich56

Joined: 18 Dec 15
Posts: 1693
Credit: 104,799,646
RAC: 78,593
Message 48764 - Posted: 7 Oct 2023, 5:29:22 UTC

the queue has run dry again
ID: 48764
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1130
Credit: 49,813,277
RAC: 6,973
Message 48765 - Posted: 7 Oct 2023, 6:46:52 UTC - in response to Message 48764.  

the queue has run dry again


That figures... I had to do the Windows 10 updates on 4 of mine, and when I finally got done I tried to get another Atlas... gone. Then I figured I could just get back to CMS... gone.

By the time I try to get Theory some Threadrippers will eat all of those too.
ID: 48765
maeax

Joined: 2 May 07
Posts: 2125
Credit: 159,928,317
RAC: 43,780
Message 48766 - Posted: 7 Oct 2023, 6:56:28 UTC - in response to Message 48765.  

By the time I try to get Theory some Threadrippers will eat all of those too.

Total number of generated events: 6014.9 billion
The Theory Team have an eye on it.
ID: 48766
Harri Liljeroos
Joined: 28 Sep 04
Posts: 680
Credit: 43,816,201
RAC: 13,553
Message 48778 - Posted: 13 Oct 2023, 10:41:37 UTC

The queue is empty again. Friday the 13th and weekend coming...
ID: 48778
Erich56

Joined: 18 Dec 15
Posts: 1693
Credit: 104,799,646
RAC: 78,593
Message 48779 - Posted: 13 Oct 2023, 10:59:04 UTC - in response to Message 48778.  

The queue is empty again. Friday the 13th and weekend coming...

This constant on and off lately has become quite troublesome :-(
ID: 48779
Erich56

Joined: 18 Dec 15
Posts: 1693
Credit: 104,799,646
RAC: 78,593
Message 48985 - Posted: 6 Dec 2023, 17:56:07 UTC

Since this afternoon, no jobs are being provided for the tasks, which can still be downloaded. Obviously, the automatic stop of task delivery when no jobs are available does not work :-(
ID: 48985
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1009
Credit: 6,283,501
RAC: 799
Message 48988 - Posted: 7 Dec 2023, 8:58:30 UTC

Sorry, there have been some disruptions that we can't control. At the moment I have several workflows stalled in the Agent for reasons that I have yet to ascertain.
ID: 48988
Erich56

Joined: 18 Dec 15
Posts: 1693
Credit: 104,799,646
RAC: 78,593
Message 48989 - Posted: 7 Dec 2023, 9:22:06 UTC - in response to Message 48988.  

Sorry, there have been some disruptions that we can't control. At the moment I have several workflows stalled in the Agent for reasons that I have yet to ascertain.

hello Ivan, nice to see you back :-) Hope you are fully okay now, healthwise!
Thanks for your efforts to make CMS run again (CMS is definitely my favorite subproject)!
ID: 48989
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1009
Credit: 6,283,501
RAC: 799
Message 48990 - Posted: 7 Dec 2023, 14:40:52 UTC - in response to Message 48989.  

Sorry, there have been some disruptions that we can't control. At the moment I have several workflows stalled in the Agent for reasons that I have yet to ascertain.
hello Ivan, nice to see you back :-) Hope you are fully okay now, healthwise!
Thanks for your efforts to make CMS run again (CMS is definitely my favorite subproject)!

Hi, good to be back. I'm now undergoing a "phased" transition back to my duties and will be officially back to my "contracted hours" (i.e. 50%) from January.
As for today's problems, I'm suspecting that a certificate[1] has expired, and I'm trying to track down someone to check it.

[1] CN=Robot: WmCore Service Account
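
A minimal sketch of how a suspected expiry could be checked once the certificate's notAfter timestamp is known. It is not the procedure used by the CMS team; the function name and the dates are hypothetical, and it only assumes the timestamp is in the text form that OpenSSL prints.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the number of days from `now` until `not_after`.

    `not_after` is expected in the OpenSSL text format, for example
    "Dec  7 12:00:00 2023 GMT". A negative result means the
    certificate has already expired.
    """
    # ssl.cert_time_to_seconds converts the text timestamp to epoch seconds (UTC).
    expiry_seconds = ssl.cert_time_to_seconds(not_after)
    return (expiry_seconds - now.timestamp()) / 86400.0


if __name__ == "__main__":
    # Hypothetical dates, for illustration only.
    now = datetime(2023, 12, 7, 12, 0, 0, tzinfo=timezone.utc)
    for label, not_after in [
        ("expired example", "Dec  6 12:00:00 2023 GMT"),
        ("valid example", "Jan  6 12:00:00 2024 GMT"),
    ]:
        remaining = days_until_expiry(not_after, now)
        status = "EXPIRED" if remaining < 0 else "ok"
        print(f"{label}: {remaining:.1f} days remaining ({status})")
```

The notAfter value itself would have to come from whoever can read the service certificate, which is exactly the person being tracked down here.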
ID: 48990
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2433
Credit: 228,025,997
RAC: 125,276
Message 48991 - Posted: 7 Dec 2023, 15:07:00 UTC - in response to Message 48990.  

... good to be back. I'm now undergoing a "phased" transition back to my duties and will be officially back ...

+1 +1 +1
ID: 48991
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1009
Credit: 6,283,501
RAC: 799
Message 48992 - Posted: 7 Dec 2023, 15:34:34 UTC - in response to Message 48990.  

As for today's problems, I'm suspecting that a certificate[1] has expired, and I'm trying to track down someone to check it.

[1] CN=Robot: WmCore Service Account

OK, we have some jobs running again, but not my usual workflows as yet. These will almost certainly have different performance profiles, as we are trying to get a different calculation running. Let us know how they perform.
ID: 48992
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2433
Credit: 228,025,997
RAC: 125,276
Message 48993 - Posted: 7 Dec 2023, 16:07:22 UTC - in response to Message 48992.  

Got this task:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=402719666

After some downloads and the usual benchmark runs, CPU usage dropped to effectively 0%.
No "cmsRun" process in the top console.
No attempt to contact the WMAgent service.
Nonetheless, glidein reported "0" = success.
ID: 48993
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1289
Credit: 8,528,821
RAC: 2,748
Message 48994 - Posted: 8 Dec 2023, 12:05:23 UTC

Now I got a real sub-task from Ivan's flow: ireid_TC_SLC7_IDR_CMS_Home_231206_131958_9405
ID: 48994
Erich56

Joined: 18 Dec 15
Posts: 1693
Credit: 104,799,646
RAC: 78,593
Message 48997 - Posted: 8 Dec 2023, 13:53:44 UTC

I have re-started CMS on some of my machines - everything seems to work fine.
It seems to me that the new series ("CMS_141....") consumes less memory.
ID: 48997
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2433
Credit: 228,025,997
RAC: 125,276
Message 49003 - Posted: 10 Dec 2023, 21:25:30 UTC - in response to Message 48992.  

OK, we have some jobs running again, but not my usual workflows as yet. These will almost certainly have different performance profiles, as we are trying to get a different calculation running. Let us know how they perform.

They run much longer than the standard CMS tasks.
It looks like they process more than 5 times the number of events.

As a result, the runtimes come too close to the hard 18 h task limit, which causes a couple of them to be shut down by the BOINC watchdog.
In that case BOINC marks the task as valid, but it doesn't return scientific results.
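
The arithmetic behind this can be sketched roughly as follows. This is only an illustration: it assumes runtime scales linearly with the number of events, and the baseline runtime and overhead figures are hypothetical, not measured values. Only the 18 h limit and the factor of 5 come from the observations above.

```python
HARD_LIMIT_HOURS = 18.0  # hard task limit mentioned above


def estimated_runtime_hours(baseline_job_hours: float,
                            event_factor: float,
                            overhead_hours: float = 0.0) -> float:
    """Estimate the runtime, assuming it grows linearly with the event count.

    baseline_job_hours: runtime of a standard job (hypothetical input).
    event_factor: how many times more events the new job processes.
    overhead_hours: fixed startup cost such as downloads and benchmarks.
    """
    return overhead_hours + baseline_job_hours * event_factor


def fits_under_limit(baseline_job_hours: float,
                     event_factor: float,
                     overhead_hours: float = 0.0,
                     limit_hours: float = HARD_LIMIT_HOURS) -> bool:
    """True if the estimated runtime stays strictly below the limit."""
    estimate = estimated_runtime_hours(
        baseline_job_hours, event_factor, overhead_hours
    )
    return estimate < limit_hours


if __name__ == "__main__":
    # Hypothetical baselines: a 3 h and a 4 h standard job, 0.5 h overhead.
    for baseline in (3.0, 4.0):
        estimate = estimated_runtime_hours(baseline, 5.0, 0.5)
        verdict = "fits" if fits_under_limit(baseline, 5.0, 0.5) else "exceeds limit"
        print(f"baseline {baseline} h -> about {estimate} h ({verdict})")
```

Under these assumptions, even a modest difference in host speed decides whether a job finishes or is cut off by the watchdog.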
ID: 49003
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2433
Credit: 228,025,997
RAC: 125,276
Message 49038 - Posted: 14 Dec 2023, 6:34:08 UTC

Looks like the backend queue again doesn't send CMS subtasks.
But the project server doesn't notice it and continues generating empty envelope tasks.
ID: 49038
Erich56

Joined: 18 Dec 15
Posts: 1693
Credit: 104,799,646
RAC: 78,593
Message 49039 - Posted: 14 Dec 2023, 7:58:57 UTC - in response to Message 49038.  

Looks like the backend queue again doesn't send CMS subtasks.
But the project server doesn't notice it and continues generating empty envelope tasks.

There was the same problem last week.
Ivan, could you please look into this, so that the generation of empty envelope tasks stops once no subtasks are available? This worked well some time ago, so the mechanism obviously broke at some point and has not been repaired so far.
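
The kind of gate being asked for could look roughly like the sketch below. It is purely illustrative and does not describe how the LHC@home server actually works; the function, its parameters, and the cap of 100 are all hypothetical.

```python
def envelope_tasks_to_create(queued_subtasks: int,
                             running_envelopes: int,
                             max_new: int = 100) -> int:
    """Decide how many new envelope tasks to generate in one cycle.

    queued_subtasks: subtasks currently waiting in the backend queue.
    running_envelopes: envelope tasks already out on volunteer hosts.
    max_new: upper bound on tasks created per cycle (hypothetical).
    """
    # Core rule: an empty backend queue means no new envelopes at all.
    if queued_subtasks <= 0:
        return 0
    # Otherwise only cover the part of the queue not already served.
    shortfall = queued_subtasks - running_envelopes
    return max(0, min(shortfall, max_new))


if __name__ == "__main__":
    # Hypothetical queue states: (queued subtasks, running envelopes).
    for queued, running in [(0, 50), (30, 10), (500, 0), (10, 40)]:
        count = envelope_tasks_to_create(queued, running)
        print(f"queued={queued} running={running} -> create {count}")
```

The important part is the first check: when the queue is empty, the gate returns zero instead of letting empty envelopes go out.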
ID: 49039
Erich56

Joined: 18 Dec 15
Posts: 1693
Credit: 104,799,646
RAC: 78,593
Message 49074 - Posted: 27 Dec 2023, 9:00:04 UTC

no tasks available; this time, the automatic stop mechanism for submitting tasks if no subtasks are available seemed to work well.
ID: 49074
Erich56

Joined: 18 Dec 15
Posts: 1693
Credit: 104,799,646
RAC: 78,593
Message 49079 - Posted: 29 Dec 2023, 7:36:35 UTC - in response to Message 49074.  

no tasks available; this time, the automatic stop mechanism for submitting tasks if no subtasks are available seemed to work well.

Some time later, new tasks could be downloaded.

Today, again no new tasks ... :-(
ID: 49079
maeax

Joined: 2 May 07
Posts: 2125
Credit: 159,928,317
RAC: 43,780
Message 49159 - Posted: 9 Jan 2024, 8:05:32 UTC

Win11pro, Boinc 7.24.1
The flow for a new CMS task (only one is running here) needs one hour.
A problem with the scheduler? https://lhcathome.cern.ch/lhcathome/results.php?hostid=10795955
ID: 49159


©2024 CERN