Message boards : CMS Application : no new WUs available

ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 998
Credit: 6,264,307
RAC: 71
Message 34225 - Posted: 1 Feb 2018, 19:29:52 UTC - in response to Message 34223.  

Are you not getting any?
I got tasks during the day, but now it again says "no tasks available for CMS simulation", although the Project Status Page shows about 190 unsent.
Edit: just to check, I tried to download a Theory task and an LHCb task; according to the Project Status Page, plenty of unsent tasks should be available for both.
However, no success either. So something seems to be wrong somewhere.

EDIT: now, all of a sudden, new CMS tasks were downloaded :-)

Strange, tho' I note that there has been a drop-off in running jobs from about 1715.
You don't have a time-of-day-controlled firewall by any chance? :-0!
ID: 34225
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,469,604
RAC: 104,060
Message 34229 - Posted: 2 Feb 2018, 8:58:41 UTC

The Project Status Page once more shows "0" unsent tasks for CMS, as well as for Theory and LHCb.
Any idea what's going on?
ID: 34229
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 998
Credit: 6,264,307
RAC: 71
Message 34231 - Posted: 2 Feb 2018, 10:19:16 UTC - in response to Message 34229.  

The Project Status Page once more shows "0" unsent tasks for CMS, as well as for Theory and LHCb.
Any idea what's going on?

No, not yet. There doesn't seem to be any change in the number of running jobs. Investigating...
ID: 34231
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,469,604
RAC: 104,060
Message 34232 - Posted: 2 Feb 2018, 11:17:29 UTC

Right now, the Project Status Page shows 200 "unsent" CMS tasks (and about the same number for LHCb and Theory).
However, when trying to download any, BOINC says "no tasks available".
ID: 34232
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,469,604
RAC: 104,060
Message 34236 - Posted: 2 Feb 2018, 15:39:09 UTC

I guess it is now clear what the problem is. Nils wrote in the Sixtrack thread a short time ago:

"...Sadly yesterday and today the problem is much worse than usual as the response-time of our database is degraded and BOINC has problems pulling out tasks from the DB. At times BOINC clients will not be able to fetch any task from LHC@home, not even for the VM applications..."
ID: 34236
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 998
Credit: 6,264,307
RAC: 71
Message 34237 - Posted: 2 Feb 2018, 15:49:10 UTC - in response to Message 34236.  

Yes, that seems to be part of the problem. Right now I'm getting "Server error: feeder not running", and the server status hasn't been updated recently.
ID: 34237
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,469,604
RAC: 104,060
Message 34238 - Posted: 2 Feb 2018, 17:14:13 UTC - in response to Message 34237.  

Yes, that seems to be part of the problem. ...
It's really too bad that since about mid-December, when the huge ATLAS disaster started (followed by all the Sixtrack problems), the whole LHC situation has been getting worse and worse.

Until the day before yesterday, CMS, LHCb and Theory were NOT affected. Since then, ALL sub-projects have been having major problems :-(
ID: 34238
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 998
Credit: 6,264,307
RAC: 71
Message 34239 - Posted: 2 Feb 2018, 19:43:57 UTC - in response to Message 34238.  

I got some tasks at 1909, and the server status page is showing some tasks available. So far the number of running jobs hasn't dropped too far, but it remains to be seen if we can get back to the levels of last weekend. I need to submit a new batch of jobs tomorrow; before they take effect we might get a shot at some smaller test batches that are waiting to run, hopefully with some of them going to a new Tier-3 site we are setting up (it will be Laurence's farm eventually, but currently I believe just one VM is involved). If we get that to work we will be a lot closer to integrating into the CMS Production team, at which point my need to monitor continually will drop away (just in time for retirement?).
ID: 34239
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,469,604
RAC: 104,060
Message 34240 - Posted: 3 Feb 2018, 6:20:24 UTC

Early yesterday evening, new CMS tasks could be downloaded :-)

However, I notice on my task list (website) that for all CMS tasks finished and uploaded since about noon yesterday, the "credit" column says "pending".
I have never seen this with CMS before. Normally, the credit shows up a short time after upload.
What's wrong?
ID: 34240
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 223,020,469
RAC: 136,235
Message 34241 - Posted: 3 Feb 2018, 8:12:42 UTC - in response to Message 34240.  

... What's wrong?

Nothing, as long as your tasks are in the validation queue.

See:
https://lhcathome.cern.ch/lhcathome/server_status.php
Task data as of 3 Feb 2018, 7:58:10 UTC
Workunits waiting for validation	298
ID: 34241
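
For anyone who wants to track the figure quoted above over time, here is a minimal Python sketch that pulls the "Workunits waiting for validation" count out of a copy of the status text. It assumes the text looks like the excerpt in the post (label, whitespace, number); the live page is HTML and may be laid out differently, so treat the pattern as a starting point. The function name `waiting_for_validation` and the constant `SAMPLE_STATUS` are made up for this illustration.

```python
import re
from typing import Optional

# Sample text copied from the post above. In this excerpt the label and the
# number are separated by a tab; the real page layout is an assumption.
SAMPLE_STATUS = (
    "Task data as of 3 Feb 2018, 7:58:10 UTC\n"
    "Workunits waiting for validation\t298\n"
)


def waiting_for_validation(status_text: str) -> Optional[int]:
    """Return the 'Workunits waiting for validation' count, or None if absent."""
    match = re.search(r"Workunits waiting for validation\s+([\d,]+)", status_text)
    if match is None:
        return None
    # Drop thousands separators such as "1,234" before converting.
    return int(match.group(1).replace(",", ""))


if __name__ == "__main__":
    print(waiting_for_validation(SAMPLE_STATUS))
```

Note that this only reads what the status page reports. As the following posts discuss, that number can lag behind what individual task lists show.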
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 223,020,469
RAC: 136,235
Message 34242 - Posted: 3 Feb 2018, 8:18:20 UTC - in response to Message 34239.  

... just in time for retirement ...

:'(
ID: 34242
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,469,604
RAC: 104,060
Message 34243 - Posted: 3 Feb 2018, 8:39:23 UTC - in response to Message 34241.  

See:
https://lhcathome.cern.ch/lhcathome/server_status.php
Task data as of 3 Feb 2018, 7:58:10 UTC
Workunits waiting for validation	298
It's hard to imagine that only 298 tasks are waiting for validation. Consider how many Sixtrack tasks are constantly being uploaded and then wait for validation for days or even weeks; the figure "298" is definitely wrong.
ID: 34243
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,167,395
RAC: 16,137
Message 34244 - Posted: 3 Feb 2018, 9:00:18 UTC

I agree with Erich56: the server status now shows 327 pending, and I have about 650 pending validations on my 3 hosts.
ID: 34244
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 223,020,469
RAC: 136,235
Message 34245 - Posted: 3 Feb 2018, 9:13:21 UTC - in response to Message 34243.  

See:
https://lhcathome.cern.ch/lhcathome/server_status.php
Task data as of 3 Feb 2018, 7:58:10 UTC
Workunits waiting for validation	298
It's hard to imagine that only 298 tasks are waiting for validation. Consider how many Sixtrack tasks are constantly being uploaded and then wait for validation for days or even weeks; the figure "298" is definitely wrong.

Well, in the end it's just a status flag in the server's database that is not yet set for whatever reason (I guess due to the high load you mentioned). It's nothing you should worry about: the number is not very large, and there is nothing you can do on your side to speed up the process.
ID: 34245
Hona
Joined: 29 Sep 04
Posts: 5
Credit: 3,043,759
RAC: 0
Message 34246 - Posted: 3 Feb 2018, 9:41:41 UTC

ATM the SSP shows a transitioner backlog of about 16 h.
So I think that is the reason for the wrong WU status shown on the task list.
I also see these false status flags on my task list for Sixtrack and ATLAS.
ID: 34246
maeax
Joined: 2 May 07
Posts: 2071
Credit: 156,189,584
RAC: 104,309
Message 34247 - Posted: 3 Feb 2018, 10:17:41 UTC

The good news is:
Sixtrack tasks are decreasing towards ZERO:
At the moment: 177k.
So, if this is real and no new Sixtrack tasks enter the pipeline, we will see better performance for the VM projects tomorrow :-))
ID: 34247
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 998
Credit: 6,264,307
RAC: 71
Message 34249 - Posted: 3 Feb 2018, 11:40:33 UTC - in response to Message 34247.  

The good news is:
Sixtrack tasks are decreasing towards ZERO:
At the moment: 177k.
So, if this is real and no new Sixtrack tasks enter the pipeline, we will see better performance for the VM projects tomorrow :-))

The bad news is:
The WMAgent is having problems.
My monitor shows only about another 40 minutes before the Condor job queue is empty.
Messages sent, I hope someone at CERN is checking their e-mails.
ID: 34249
maeax
Joined: 2 May 07
Posts: 2071
Credit: 156,189,584
RAC: 104,309
Message 34250 - Posted: 3 Feb 2018, 12:09:32 UTC - in response to Message 34249.  

Ivan,
thank you, your work seems to be 24/7 ;-).
ID: 34250
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 998
Credit: 6,264,307
RAC: 71
Message 34251 - Posted: 3 Feb 2018, 12:31:17 UTC - in response to Message 34249.  

The job queue is dry. Set No New Tasks to avoid BOINC errors.
ID: 34251
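
"No New Tasks" can be set from the BOINC Manager, but on headless hosts it may be easier to script. The sketch below builds the matching `boinccmd` call (`--project URL nomorework`, or `allowmorework` to undo it). The project URL is inferred from the server status link quoted earlier in this thread and should be checked against what your own client shows; the helper names are made up for this example, and depending on your setup `boinccmd` may also need `--host` and `--passwd`.

```python
import subprocess
from typing import List

# Assumption: master URL inferred from the server_status.php link in this
# thread. Check it against the project URL your BOINC client reports.
PROJECT_URL = "https://lhcathome.cern.ch/lhcathome/"


def no_new_tasks_command(project_url: str = PROJECT_URL, allow: bool = False) -> List[str]:
    """Build the boinccmd argument list that stops (or re-allows) new work."""
    operation = "allowmorework" if allow else "nomorework"
    return ["boinccmd", "--project", project_url, operation]


def set_no_new_tasks(project_url: str = PROJECT_URL, allow: bool = False) -> int:
    """Run boinccmd and return its exit code (requires a running BOINC client)."""
    completed = subprocess.run(no_new_tasks_command(project_url, allow), check=False)
    return completed.returncode


if __name__ == "__main__":
    # Only print the command here; call set_no_new_tasks() to actually run it.
    print(" ".join(no_new_tasks_command()))
```

Once the job queue is refilled, calling `set_no_new_tasks(allow=True)` would let the client fetch work again.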
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 223,020,469
RAC: 136,235
Message 34252 - Posted: 3 Feb 2018, 12:50:49 UTC - in response to Message 34251.  

The job queue is dry.

Confirmed.
Thanks Ivan.
ID: 34252

©2024 CERN