Message boards : CMS Application : Condor Problems?

computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2401
Credit: 225,503,531
RAC: 125,044
Message 28727 - Posted: 30 Jan 2017, 10:03:09 UTC

The good news is:
I got 1 WU on each of my hosts (as configured) and they started as expected.

At the moment the project servers or the network seems to be saturated.
It took more than 25 minutes (instead of the normal 3-8) for cvmfs to check all network files and hand over control to cmsRun.
ID: 28727
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 373
Credit: 238,712
RAC: 0
Message 28728 - Posted: 30 Jan 2017, 10:45:57 UTC - in response to Message 28727.  

The squid proxy monitoring does seem to show a spike after the system was unblocked.
ID: 28728
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1005
Credit: 6,269,877
RAC: 404
Message 29015 - Posted: 2 Mar 2017, 12:59:40 UTC

I've just noticed that something has gone wrong with job allocation. Perhaps best to set No New Tasks until it's sorted.
ID: 29015
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1005
Credit: 6,269,877
RAC: 404
Message 29022 - Posted: 2 Mar 2017, 22:56:50 UTC - in response to Message 29015.  

Some jobs are flowing now, so you can try (cautiously) allowing new jobs again.
ID: 29022
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2401
Credit: 225,503,531
RAC: 125,044
Message 29053 - Posted: 3 Mar 2017, 15:42:48 UTC

This WU didn't get a job:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=122623337


And this one had an unusually short runtime:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=122385861

Probably also because there was no follow-up job.
ID: 29053
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1005
Credit: 6,269,877
RAC: 404
Message 29096 - Posted: 7 Mar 2017, 10:41:54 UTC - in response to Message 29053.  

Small outage -- looks like my window onto WMAgent status was telling me lies and we ran out of jobs. New batch submitted so hopefully up again soon. Sorry 'bout that...
ID: 29096
