Message boards : CMS Application : no new WUs available

ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34018 - Posted: 21 Jan 2018, 22:16:09 UTC - in response to Message 33997.  
Last modified: 21 Jan 2018, 22:35:40 UTC

Ivan, are you aware that no CMS tasks are available?

Erich; sorry, I've just noticed that myself (fell asleep in a nice warm kitchen... Not sure how I missed your message earlier tho'). I'll send out some emails, other projects are also affected if the Server Status Page is to be believed.
ID: 34018
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34019 - Posted: 21 Jan 2018, 22:27:46 UTC - in response to Message 34018.  

It's the task creator that's the problem as far as I can tell. The job queue is alive and well, and the WMAgent shows no problem. I've emailed the general production crew but maybe it's exactly the wrong time of the week to expect a response. :-(
ID: 34019
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34021 - Posted: 22 Jan 2018, 8:23:08 UTC

There are tasks available again -- I just got one on my home PC.
ID: 34021
Erich56

Joined: 18 Dec 15
Posts: 1514
Credit: 43,979,523
RAC: 44,531
Message 34044 - Posted: 23 Jan 2018, 9:19:39 UTC

Ivan, there are new tasks available (according to the Project Status Page), and obviously there are also jobs available.

Still, as seen here:
https://lhcathomedev.cern.ch/lhcathome-dev/cms_job.php

the number of "running jobs" has been falling markedly within the past hours. Why so? To me, this does not look like a "natural" fluctuation.
ID: 34044
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34047 - Posted: 23 Jan 2018, 13:23:19 UTC - in response to Message 34044.  

Ivan, there are new tasks available (according to the Project Status Page), and obviously there are also jobs available.

Still, as seen here:
https://lhcathomedev.cern.ch/lhcathome-dev/cms_job.php

the number of "running jobs" has been falling markedly within the past hours. Why so? To me, this does not look like a "natural" fluctuation.

Yes, I asked about that and it is believed to be linked to some Condor maintenance -- see this thread. I seem to still be getting tasks and jobs; I suspect that it's Laurence's machines that are running out. There is no increase in failure rate in WMStats. Unfortunately the graphical application I use to track running and queued jobs has been down since yesterday, so I only have WMStats and the CMS Job page to monitor progress. Oh, the proxy we use has been showing no activity since 0900, too.
ID: 34047
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34048 - Posted: 23 Jan 2018, 15:18:55 UTC - in response to Message 34047.  

We seem to be slowly picking up jobs again.
ID: 34048
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 1006
Credit: 47,363,658
RAC: 6,574
Message 34054 - Posted: 23 Jan 2018, 19:51:03 UTC - in response to Message 34048.  

We seem to be slowly picking up jobs again.


YOW I forgot this is a 982.17MB vdi ........this will take a while but as long as they run Valids after I get them running on this 8-core that is ok.........especially since I won't be home for a rare few hours today while the d/l is happening.........can't let Ivan run ALL the CMS tasks
Volunteer Mad Scientist For Life
ID: 34054
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34055 - Posted: 23 Jan 2018, 20:10:10 UTC - in response to Message 34054.  

We seem to be slowly picking up jobs again.


YOW I forgot this is a 982.17MB vdi ........this will take a while but as long as they run Valids after I get them running on this 8-core that is ok.........especially since I won't be home for a rare few hours today while the d/l is happening.........can't let Ivan run ALL the CMS tasks

Are you running any sort of local caching proxy? That could greatly reduce your downloads as most things would only need to be fetched once. computezrmle sent me a recipe on how to set it up but now I have a 40 Mbps link at home I don't need it (I have 1 Gbps at work, but we're throttled to 30 Mbps and some machines are still on a 100 Mbps switch).
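
For scale, here is a rough back-of-the-envelope sketch of how long the 982.17 MB image from the quoted post would take at the link speeds mentioned above. It assumes 1 MB = 10^6 bytes, a fully available link and no protocol overhead, so real downloads will be slower; `download_seconds` is only an illustrative helper name, not part of BOINC or any project tool.

```python
VDI_SIZE_MB = 982.17  # image size quoted in the post above


def download_seconds(size_mb: float, link_mbps: float) -> float:
    """Ideal transfer time in seconds.

    size_mb is in megabytes (assumed 10^6 bytes), link_mbps in megabits/s,
    so the size is multiplied by 8 to convert bytes to bits.
    """
    if link_mbps <= 0:
        raise ValueError("link speed must be positive")
    return size_mb * 8 / link_mbps


if __name__ == "__main__":
    # Link speeds taken from the post: throttled work link, home link, old switch.
    for mbps in (30, 40, 100):
        minutes = download_seconds(VDI_SIZE_MB, mbps) / 60
        print(f"{mbps:>3} Mbps: about {minutes:.1f} min")
```

Under those assumptions the image needs only a few minutes on any of these links, which is why a cache matters far more on a slow or metered connection, where it would only have to be fetched once.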
ID: 34055
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 1006
Credit: 47,363,658
RAC: 6,574
Message 34056 - Posted: 23 Jan 2018, 20:43:50 UTC - in response to Message 34055.  
Last modified: 23 Jan 2018, 20:48:19 UTC

Yeah I tried that before but it doesn't help me and my satellite dish isp.....now if they would just send me a disc with the files in the mail I could just install in seconds

Read lots of squid info and also comments on both sides of that subject.
So I just d/l the vdi and then getting tasks and running them are fast and easy.
In fact it would be easier if I copied the vdi to disc here and then install it on the other pc's here.

Edit: just checked and it was only one hour to get to 50% so that is faster than usual during this time of day (I do have high-speed between 2am and 8am but I save that to make sure the VB tasks will get past HTCondor ping)
ID: 34056
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34084 - Posted: 25 Jan 2018, 21:47:23 UTC

Digits cruciate! There are now some merge jobs running, where there had been none for a couple of days. These only run on Laurence's CERN VMs because the bandwidth required is not compatible with home volunteers. So, it looks like at least a few (8 at the moment, to be exact) of these machines have come to life. I hope that means that someone has tickled them back to awareness -- Laurence has been on paternity leave :-) for the last week or so, so it's been difficult to contact him :-(.
ID: 34084
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 1006
Credit: 47,363,658
RAC: 6,574
Message 34085 - Posted: 26 Jan 2018, 2:24:22 UTC - in response to Message 34084.  

Digits cruciate! There are now some merge jobs running, where there had been none for a couple of days. These only run on Laurence's CERN VMs because the bandwidth required is not compatible with home volunteers. So, it looks like at least a few (8 at the moment, to be exact) of these machines have come to life. I hope that means that someone has tickled them back to awareness -- Laurence has been on paternity leave :-) for the last week or so, so it's been difficult to contact him :-(.

Ivan, it sounds to me like Laurence is just trying to make us all feel old
Last time I created another human was 1982 (ok I decided I better not comment on that disaster)
(ok I still have all my hair and it is not all gray yet).....but I do have all 50 cores running now!
Volunteer Mad Scientist For Life
ID: 34085
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34097 - Posted: 26 Jan 2018, 12:25:50 UTC - in response to Message 34085.  

Well, all merge jobs from the last batch have run, and we are up to around 1250 jobs running at the moment, so something has definitely sprung back to life.
ID: 34097
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1992
Credit: 143,246,946
RAC: 93,220
Message 34098 - Posted: 26 Jan 2018, 12:32:50 UTC - in response to Message 34097.  

... so something has definitely sprung back to life.

It seems that this "something" returns lots of errors:
https://lhcathome.cern.ch/lhcathome/cms_job.php
:-?
ID: 34098
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34100 - Posted: 26 Jan 2018, 14:43:47 UTC - in response to Message 34098.  

... so something has definitely sprung back to life.

It seems that this "something" returns lots of errors:
https://lhcathome.cern.ch/lhcathome/cms_job.php
:-?

Yes, that tends to happen after an outage. Hopefully it will settle down soon. Remember that Condor tries to run each job three times before giving up so we're probably not losing anything. I'll keep an eye on it.
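
A minimal sketch of why a burst of errors need not mean lost work, assuming the simple "try up to three times" behaviour described above. This is plain Python for illustration only, not HTCondor's actual retry mechanism or configuration; `run_with_retries` and `make_flaky_job` are hypothetical names.

```python
from typing import Callable, Optional

MAX_ATTEMPTS = 3  # from the post: each job is tried three times before giving up


def run_with_retries(job: Callable[[], bool],
                     max_attempts: int = MAX_ATTEMPTS) -> Optional[int]:
    """Run job until it succeeds or the attempts run out.

    Returns the 1-based number of the attempt that succeeded,
    or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if job():
            return attempt
    return None


def make_flaky_job(failures_before_success: int) -> Callable[[], bool]:
    """Hypothetical job that fails a fixed number of times, then succeeds."""
    calls = {"count": 0}

    def job() -> bool:
        calls["count"] += 1
        return calls["count"] > failures_before_success

    return job


if __name__ == "__main__":
    # Two failures after an outage are absorbed; a third failure is not.
    print(run_with_retries(make_flaky_job(2)))
    print(run_with_retries(make_flaky_job(3)))
```

In this toy model a job that fails twice still completes on its third attempt, so each of those failures shows up as an error on the job page without any result being lost; only a job that fails all three times is given up on.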
ID: 34100
Erich56

Joined: 18 Dec 15
Posts: 1514
Credit: 43,979,523
RAC: 44,531
Message 34172 - Posted: 31 Jan 2018, 8:15:47 UTC

good morning, Ivan

BOINC says "no CMS tasks available" :-(
ID: 34172
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34175 - Posted: 31 Jan 2018, 9:35:19 UTC - in response to Message 34172.  
Last modified: 31 Jan 2018, 9:38:10 UTC

good morning, Ivan

BOINC says "no CMS tasks available" :-(

I'll take a look in ~30 mins when I get in to work -- running a bit late this morning...

[Edit] One of my monitoring tools that's been dead for a week has just sprung back to life -- that should make tracking the queues easier! [/Edit]
ID: 34175
Erich56

Joined: 18 Dec 15
Posts: 1514
Credit: 43,979,523
RAC: 44,531
Message 34196 - Posted: 31 Jan 2018, 19:04:13 UTC

Ivan, could you find out yet why there are no tasks available for CMS?
ID: 34196
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 484
Message 34207 - Posted: 1 Feb 2018, 9:09:12 UTC - in response to Message 34196.  
Last modified: 1 Feb 2018, 9:12:28 UTC

Ivan, could you find out yet why there are no tasks available for CMS?

Are you not getting any? I've been getting them again since 1433 yesterday.

[Edit] I see you got some at 2003 last night. PEBKAC? :-) [/Edit]
ID: 34207
Erich56

Joined: 18 Dec 15
Posts: 1514
Credit: 43,979,523
RAC: 44,531
Message 34222 - Posted: 1 Feb 2018, 19:03:12 UTC - in response to Message 34207.  

Are you not getting any?
I got tasks during the day, but now again it says "no tasks available for CMS simulation", although the Project Status Page shows about 190 unsent.
ID: 34222
Erich56

Joined: 18 Dec 15
Posts: 1514
Credit: 43,979,523
RAC: 44,531
Message 34223 - Posted: 1 Feb 2018, 19:09:17 UTC - in response to Message 34222.  
Last modified: 1 Feb 2018, 19:15:49 UTC

Are you not getting any?
I got tasks during the day, but now again it says "no tasks available for CMS simulation", although the Project Status Page shows about 190 unsent.
Edit: just to find out, I tried to download a Theory task and an LHCb task; for both, plenty of unsent tasks should be available according to the Project Status Page.
However, no success either. So something seems to be wrong somewhere.

EDIT: now, all of a sudden, new CMS tasks were downloaded :-)
ID: 34223


©2022 CERN