Message boards : CMS Application : EXIT_NO_SUB_TASKS
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1982
Credit: 142,944,074
RAC: 99,435
Message 44382 - Posted: 23 Feb 2021, 16:00:17 UTC - in response to Message 44381.  

Don't know if this helps to identify the problem.

ATM I don't have any CMS tasks but this morning I had some still running from last night.
I noticed that those tasks got fresh jobs after they finished the one before.
At the same time all fresh tasks failed with EXIT_NO_SUB_TASKS.

Might be that the Condor server doesn't accept new clients.
ID: 44382 · Report as offensive     Reply Quote
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 720
Message 44384 - Posted: 24 Feb 2021, 11:19:37 UTC
Last modified: 24 Feb 2021, 11:32:06 UTC

OK, finally found a condor problem. The negotiator service had to be manually restarted; it's still unclear what caused the failure.
Jobs are running again.
[Edit] There was a fault introduced into a configuration file. It has been corrected and should not recur. [/Edit]
ID: 44384 · Report as offensive     Reply Quote
bozz4science

Joined: 3 May 20
Posts: 8
Credit: 494,285
RAC: 56
Message 44385 - Posted: 24 Feb 2021, 23:26:13 UTC
Last modified: 24 Feb 2021, 23:26:47 UTC

I just checked my work results, and it seems that I have 70 faulty CMS tasks that all errored out after ~1000 sec. So can I assume this is connected to the said condor problem?

If so, why weren't jobs cancelled then?

Still, happy to see that the issue has been resolved in the meantime and CMS WUs are crunching fine again.
ID: 44385 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1982
Credit: 142,944,074
RAC: 99,435
Message 44387 - Posted: 25 Feb 2021, 7:25:01 UTC - in response to Message 44385.  

So can I assume this is connected to the said condor problem?

Most likely yes, but since your computers are hidden the logfiles can't be checked for other reasons.


If so, why weren't jobs cancelled then?

See here:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5090&postid=44381
ID: 44387 · Report as offensive     Reply Quote
bozz4science

Joined: 3 May 20
Posts: 8
Credit: 494,285
RAC: 56
Message 44392 - Posted: 26 Feb 2021, 1:18:52 UTC - in response to Message 44387.  

Thanks for the pointer!

Didn't realize mine were hidden. I usually never do this. Now they should be visible. I'll keep running CMS tasks in the meantime :)
ID: 44392 · Report as offensive     Reply Quote
NOGOOD

Joined: 18 Nov 17
Posts: 118
Credit: 40,690,401
RAC: 13,371
Message 44510 - Posted: 18 Mar 2021, 7:15:25 UTC

EXIT_NO_SUB_TASKS again
ID: 44510 · Report as offensive     Reply Quote
Ruud van der Kroef

Joined: 16 Aug 05
Posts: 5
Credit: 2,795,425
RAC: 0
Message 44513 - Posted: 18 Mar 2021, 8:46:16 UTC

Same here.

Question: can I just put the tasks in my queue on hold, and wait for better times?
ID: 44513 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1982
Credit: 142,944,074
RAC: 99,435
Message 44514 - Posted: 18 Mar 2021, 9:51:18 UTC - in response to Message 44513.  

Yes, but be aware that if even one task from a project is put on hold, your BOINC client will not download any new tasks from that project.
ID: 44514 · Report as offensive     Reply Quote
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 720
Message 44515 - Posted: 18 Mar 2021, 13:35:04 UTC - in response to Message 44510.  

EXIT_NO_SUB_TASKS again

Sorry, I misjudged when the last batch of jobs would end.
ID: 44515 · Report as offensive     Reply Quote
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 727
Credit: 479,178,114
RAC: 268,413
Message 44601 - Posted: 29 Mar 2021, 18:06:41 UTC

Looks like it's that time again.

Is linking the back end to BOINC job generation really so much effort compared with manually filling the queue and the server load when tasks run out?
ID: 44601 · Report as offensive     Reply Quote
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 720
Message 44767 - Posted: 19 Apr 2021, 10:43:13 UTC

There's another WMAgent update in the pipeline. I'm letting the current job queue drain. Jobs will start to become scarce some time Wednesday if things continue as they are. Fingers crossed we'll be back up again by Thursday night (perhaps earlier...)
ID: 44767 · Report as offensive     Reply Quote
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 720
Message 44789 - Posted: 22 Apr 2021, 14:16:32 UTC - in response to Message 44767.  

OK, we are up again after the upgrade, and jobs are getting out the door.
ID: 44789 · Report as offensive     Reply Quote
NOGOOD

Joined: 18 Nov 17
Posts: 118
Credit: 40,690,401
RAC: 13,371
Message 44854 - Posted: 2 May 2021, 13:48:36 UTC

Looks like EXIT_NO_SUB_TASKS again.
ID: 44854 · Report as offensive     Reply Quote
Erich56

Joined: 18 Dec 15
Posts: 1511
Credit: 43,801,435
RAC: 43,066
Message 44855 - Posted: 2 May 2021, 15:28:01 UTC - in response to Message 44854.  

Looks like EXIT_NO_SUB_TASKS again.
Yes, and the download of new tasks has stopped automatically, which makes sense.
So let's hope that CMS will soon be running again.
ID: 44855 · Report as offensive     Reply Quote
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 720
Message 44856 - Posted: 2 May 2021, 17:35:45 UTC

(I must not have hit the right "post" button earlier...)

Yeah, sorry, it's a holiday weekend here and I forgot to check on the queue status earlier today. Then I got distracted by the Portuguese F1 GP!
ID: 44856 · Report as offensive     Reply Quote
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 1006
Credit: 47,312,304
RAC: 2,791
Message 44857 - Posted: 2 May 2021, 19:22:23 UTC - in response to Message 44856.  

(I must not have hit the right "post" button earlier...)
Yeah, sorry, it's a holiday weekend here and I forgot to check on the queue status earlier today. Then I got distracted by the Portuguese F1 GP!


Not quite the same as the Kentucky Derby (glad I didn't make a bet)
I have just been running the CMS here one at a time lately and so far all have been Valid.
But I did have back-to-back time wasters over at -dev (same version as far as I know).
I'm back to running one again, so I think it was just the usual slow internet speed, with my host trying to start all 3 CMS tasks I have running at the same time.
( I hope you get that F1 GP in HDTV Ivan)
ID: 44857 · Report as offensive     Reply Quote
Erich56

Joined: 18 Dec 15
Posts: 1511
Credit: 43,801,435
RAC: 43,066
Message 44858 - Posted: 2 May 2021, 19:50:40 UTC - in response to Message 44857.  

Now, though, all downloaded tasks error out after a few minutes with:

2021-05-02 20:17:03 (1516): VM Completion Message: Could not connect to Condor server on port 9618

For example, see: https://lhcathome.cern.ch/lhcathome/result.php?resultid=315934632
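
A quick way to tell a local network problem from a server-side outage is to test whether a TCP connection to port 9618 can be opened at all. The following is only a minimal sketch: the function name can_connect is made up for illustration, and the thread does not give the Condor server's host name, so you would have to supply it yourself from your own task log. It also runs on the host rather than inside the VM, and a successful connection only shows that the port is reachable, not that the Condor services behind it are healthy.

```python
import socket

CONDOR_PORT = 9618  # port named in the VM completion message above


def can_connect(host: str, port: int = CONDOR_PORT, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    Hypothetical helper for illustration; it is not part of BOINC or HTCondor.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Usage (the host name is an assumption; take the real one from your task log):
#   can_connect("your-condor-server.example")
```

If this returns False from several unrelated networks at once, the problem is more likely on the server side than on your own connection.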
ID: 44858 · Report as offensive     Reply Quote
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 1006
Credit: 47,312,304
RAC: 2,791
Message 44859 - Posted: 2 May 2021, 20:12:40 UTC - in response to Message 44858.  
Last modified: 2 May 2021, 20:29:01 UTC

Looks like the condor is taking Sunday off again.
I have one here that is at 20% so far (3 hrs 42 mins), and this host doesn't have enough RAM to try a new one. My other hosts with this CMS version are -dev hosts, so I would have to mess it all up by downloading this version over here for hours. When one of the 2 CMS tasks I have running is finished, I will give this one a try again (I have one of these running on this host from both here and -dev at the same time).
But both of them started 25 mins apart on this one and are actually running.
Maybe Ivan will get this back to normal on Monday.
http://localhost:50606/logs/running.log
ID: 44859 · Report as offensive     Reply Quote
MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 1006
Credit: 47,312,304
RAC: 2,791
Message 44861 - Posted: 3 May 2021, 7:03:48 UTC - in response to Message 44859.  
Last modified: 3 May 2021, 7:04:21 UTC

Well, I finally finished the 2 I had running (one from here and one from over at -dev) and started 2 more of the same 2 hours ago. They are running as they should, so if there was a problem, maybe it was fixed.
(It has been Monday here for 3 minutes, so that means it is the start of a new day in Geneva and London.)
ID: 44861 · Report as offensive     Reply Quote
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 879
Credit: 5,850,679
RAC: 720
Message 44862 - Posted: 3 May 2021, 12:38:01 UTC - in response to Message 44857.  

( I hope you get that F1 GP in HDTV Ivan)

I'm not that much of a fan, besides my broadband is lousy... Tuned my TV to Radio 5 Sports Extra for the sound, and my laptop to bbc.co.uk/sports for the comments and leaderboard. As an ex-racer myself, I do wish that the motorcycling GPs were available free-to-air here, I haven't seen one in 5 or 6 years (last time I visited my brother in Oz).
ID: 44862 · Report as offensive     Reply Quote


©2022 CERN