Message boards : Theory Application : [ERROR] No jobs were available to run.

Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1117
Credit: 49,723,876
RAC: 13,891
Message 36962 - Posted: 6 Oct 2018, 0:47:56 UTC

I see we have hundreds of these again today with Theory and CMS tasks.

I have over 20 so far. I checked another member who runs lots of these too and saw the same thing, so I have to go suspend all of mine so they don't all end up doing this.
ID: 36962
bronco
Joined: 13 Apr 18
Posts: 443
Credit: 8,438,885
RAC: 0
Message 36963 - Posted: 6 Oct 2018, 1:08:08 UTC - in response to Message 36962.  

You may as well suspend your LHCb tasks too because they're not gonna get any sub-tasks either unless they hooked up with Condor before this latest problem started.
ID: 36963
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1117
Credit: 49,723,876
RAC: 13,891
Message 36964 - Posted: 6 Oct 2018, 1:17:13 UTC - in response to Message 36963.  

Yeah, I would have done that, but the only LHCb tasks I have running are the beta version (version 1.07) and they are still working.
ID: 36964
Erich56
Joined: 18 Dec 15
Posts: 1687
Credit: 103,100,237
RAC: 127,243
Message 36965 - Posted: 6 Oct 2018, 5:58:56 UTC

I, too, had a few Theory tasks without jobs last night. Now it seems to work again.
With LHCb no problem so far.

As for CMS: yesterday, this sub-project was removed from the list on the Project Status Page. So, obviously, it's dead for the time being (which is too bad).
ID: 36965
Erich56
Joined: 18 Dec 15
Posts: 1687
Credit: 103,100,237
RAC: 127,243
Message 36970 - Posted: 7 Oct 2018, 4:48:47 UTC - in response to Message 36965.  

Last night, same thing: there were a few Theory tasks not getting jobs.

Why is it that we always temporarily run out of jobs during the night hours?
ID: 36970
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1117
Credit: 49,723,876
RAC: 13,891
Message 37027 - Posted: 15 Oct 2018, 0:01:00 UTC

And again... 8 in a row so far.
ID: 37027
Erich56
Joined: 18 Dec 15
Posts: 1687
Credit: 103,100,237
RAC: 127,243
Message 37043 - Posted: 15 Oct 2018, 19:04:15 UTC

I had a few again yesterday afternoon and this afternoon.
ID: 37043
maeax
Joined: 2 May 07
Posts: 2090
Credit: 158,859,024
RAC: 126,466
Message 37046 - Posted: 16 Oct 2018, 11:22:51 UTC - in response to Message 37043.  

No new tasks since 11:00 UTC.
ID: 37046
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1117
Credit: 49,723,876
RAC: 13,891
Message 37299 - Posted: 10 Nov 2018, 18:45:58 UTC

No new tasks again, and the ones we do have are once again failing with: [ERROR] Condor exited after 4751s without running a job.

They had been running well for almost a week, but they tend to do this on a Saturday.
So far I have 20 in a row doing this and another 84 to run.
ID: 37299
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1117
Credit: 49,723,876
RAC: 13,891
Message 37334 - Posted: 14 Nov 2018, 0:20:50 UTC

ID: 37334
Erich56
Joined: 18 Dec 15
Posts: 1687
Credit: 103,100,237
RAC: 127,243
Message 37335 - Posted: 14 Nov 2018, 5:48:43 UTC - in response to Message 37334.  

https://lhcathome.cern.ch/lhcathome/results.php?hostid=9930008&offset=0&show_names=0&state=6&appid=13
Again today we have thousands of these.
Again, I really can't understand why no-one at LHC takes care of these recurring problems :-(
ID: 37335
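For anyone who wants to repeat this check on their own host, here is a minimal Python sketch that rebuilds the kind of results link posted above. The parameter names (hostid, offset, show_names, state, appid) are copied from that link. The function name is hypothetical, and reading state=6 as "errored tasks" and state=4 as "valid tasks" is an assumption based on how the links are used in this thread, not on documented server behaviour.

```python
from urllib.parse import urlencode

# Base address taken from the link posted in this thread.
RESULTS_PAGE = "https://lhcathome.cern.ch/lhcathome/results.php"

# Assumed meanings, inferred from the thread rather than from official docs.
STATE_VALID = 4
STATE_ERROR = 6


def build_results_url(host_id: int, state: int, app_id: int, offset: int = 0) -> str:
    """Return a results-page URL filtered by host, task state and application."""
    query = urlencode(
        {
            "hostid": host_id,
            "offset": offset,
            "show_names": 0,
            "state": state,
            "appid": app_id,
        }
    )
    return f"{RESULTS_PAGE}?{query}"


if __name__ == "__main__":
    # Host 9930008 and appid 13 are the values from the posted link.
    print(build_results_url(9930008, STATE_ERROR, 13))
    print(build_results_url(9930008, STATE_VALID, 13))
```

Printing both links makes it easy to look at the failed and the valid tasks for the same host.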
Henry Nebrensky
Joined: 13 Jul 05
Posts: 167
Credit: 14,938,551
RAC: 191
Message 37336 - Posted: 14 Nov 2018, 9:21:43 UTC - in response to Message 37335.  
Last modified: 14 Nov 2018, 9:27:23 UTC

Erich56 wrote: "https://lhcathome.cern.ch/lhcathome/results.php?hostid=9930008&offset=0&show_names=0&state=6&appid=13 Again today we have thousands of these."
Isn't selecting only failed tasks cheating a bit? https://lhcathome.cern.ch/lhcathome/results.php?hostid=9930008&offset=0&show_names=0&state=4&appid=13 shows that there are some valid subtasks out there.

Erich56 wrote: "Again, I really can't understand why no-one at LHC takes care of these recurring problems :-("
OK: so what's your suggestion for when a project has only a small amount of work available? Should it just give up on BOINC and run the work privately? I suppose they could drastically reduce the number of pilots so that each is more likely to actually get a sub-task - but there'd still be whining here about the lack of WUs instead.

Edit: although telling us what's going on wouldn't do any harm!
ID: 37336
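The pilot trade-off described above can be put into rough numbers. The Python sketch below is only an illustration under a heavily simplified assumption: each pilot (BOINC task) asks for exactly one sub-task when it starts, and the queue holds a fixed number of jobs at that moment. The function name and all figures are hypothetical, not project data.

```python
def fraction_with_work(queued_jobs: int, pilots: int) -> float:
    """Share of pilots that receive a sub-task.

    Simplifying assumption: every pilot requests exactly one job,
    and no new jobs are queued while the pilots are starting.
    """
    if pilots <= 0:
        raise ValueError("pilots must be positive")
    if queued_jobs < 0:
        raise ValueError("queued_jobs must not be negative")
    # No pilot can get more than one job, so the share is capped at 1.
    return min(1.0, queued_jobs / pilots)


if __name__ == "__main__":
    # Made-up figures: the same 200 queued jobs, many pilots versus few.
    print(fraction_with_work(200, 1000))
    print(fraction_with_work(200, 250))
```

With these made-up figures, 1000 pilots competing for 200 queued jobs would leave most of them without work, while 250 pilots would mostly find a job. The amount of work done is the same in both cases; only the number of empty tasks changes.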
Erich56
Joined: 18 Dec 15
Posts: 1687
Credit: 103,100,237
RAC: 127,243
Message 37338 - Posted: 14 Nov 2018, 9:44:14 UTC - in response to Message 37336.  

Henry Nebrensky wrote: "Edit: although telling us what's going on wouldn't do any harm!"
Yes, at least this could be done, and it would be nice for us crunchers.
ID: 37338
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1117
Credit: 49,723,876
RAC: 13,891
Message 37340 - Posted: 14 Nov 2018, 11:40:04 UTC

Henry Nebrensky, nothing you said makes any sense and it doesn't even belong here.
ID: 37340
Erich56
Joined: 18 Dec 15
Posts: 1687
Credit: 103,100,237
RAC: 127,243
Message 37349 - Posted: 15 Nov 2018, 11:20:11 UTC - in response to Message 37335.  

Erich56 wrote: "https://lhcathome.cern.ch/lhcathome/results.php?hostid=9930008&offset=0&show_names=0&state=6&appid=13 Again today we have thousands of these. Again, I really can't understand why no-one at LHC takes care of these recurring problems :-("
This morning, same thing: all tasks without jobs :-(

And it is rather annoying by now. Why do the LHC@Home people not solve this never-ending problem?
ID: 37349
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1117
Credit: 49,723,876
RAC: 13,891
Message 37352 - Posted: 15 Nov 2018, 18:16:47 UTC - in response to Message 37349.  

Yes, same thing here, Erich.

And as always I checked the stats of other members who do lots of these, and they had the same problem, so that tells us it is at the CERN server end again.
But after a couple of tries I reloaded all my hosts again for another run.

But I think once in a while all we need to do is mention it here and they usually take care of the problem.
ID: 37352
Erich56
Joined: 18 Dec 15
Posts: 1687
Credit: 103,100,237
RAC: 127,243
Message 37358 - Posted: 16 Nov 2018, 6:47:11 UTC - in response to Message 37352.  

Magic Quantum Mechanic wrote: "Yes same thing here Erich ... But I think once in a while all we need to do is mention it here and they usually take care of the problem."
So far they haven't, though. From what I could see this morning: tons of failed tasks due to lack of jobs.
These tasks run for about 20 minutes, then they fail - what a waste :-(

Something is running very wrong over there, and obviously they don't have the experts to get that fixed.
ID: 37358
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 373
Credit: 238,712
RAC: 0
Message 37359 - Posted: 16 Nov 2018, 8:45:13 UTC - in response to Message 37358.  

The number of queued jobs has been increased. This should resolve any issues with no sub tasks.
ID: 37359
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1117
Credit: 49,723,876
RAC: 13,891
Message 37365 - Posted: 16 Nov 2018, 19:12:27 UTC - in response to Message 37359.  
Last modified: 16 Nov 2018, 19:22:15 UTC

Thanks Laurence

So far today 23 Valids and many more running with no problems.

( Erich I am sending you a pm about this )
ID: 37365
maeax
Joined: 2 May 07
Posts: 2090
Credit: 158,859,024
RAC: 126,466
Message 37366 - Posted: 17 Nov 2018, 9:52:32 UTC - in response to Message 37358.  

Erich56 wrote: "Something is running very wrong over there, and obviously they don't have the experts to get that fixed."

Erich,
please show more respect for the CERN IT and project teams!
ID: 37366


©2024 CERN