Message boards : LHCb Application : No more job

PHILIPPE

Joined: 24 Jul 16
Posts: 88
Credit: 239,917
RAC: 0
Message 29937 - Posted: 16 Apr 2017, 9:01:39 UTC

It appears that no jobs are available for LHCb.
And the project status page doesn't show this...

"206 (0x000000CE) EXIT_INIT_FAILURE"

2017-04-16 10:49:00 (8004): Guest Log: [ERROR] Condor exited after 607s without running a job.


I think it's better to switch to other subprojects until it is fixed.
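
For anyone who wants to spot this situation in their own task logs, here is a minimal sketch. It assumes the message always has the exact wording of the line quoted above; the function name is made up for illustration, and other wordings are not handled.

```python
import re

# Pattern modelled on the single log line quoted in this thread;
# other wordings of the message are not handled.
IDLE_EXIT = re.compile(
    r"Guest Log: \[ERROR\] Condor exited after (\d+)s without running a job"
)


def idle_exit_seconds(log_text):
    """Return the idle time in seconds for each matching log line."""
    return [int(m.group(1)) for m in IDLE_EXIT.finditer(log_text)]


sample = (
    "2017-04-16 10:49:00 (8004): Guest Log: [ERROR] Condor exited "
    "after 607s without running a job.\n"
)
print(idle_exit_seconds(sample))
```

An empty result only means this particular message was not found; it says nothing about other failure modes.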
ID: 29937
PHILIPPE

Joined: 24 Jul 16
Posts: 88
Credit: 239,917
RAC: 0
Message 29947 - Posted: 17 Apr 2017, 8:23:22 UTC - in response to Message 29937.  

I found two ways to see the real activity of LHCb:

1) LHCb Job Activities:

You have to compare the two diagrams to tell whether the running jobs are failing or not.

2) Top participants by application:

This shortcut shows the list of the top 20 participants on LHCb, sorted by RAC.
Select a user, then a host, and you can look at the results that host gets when running LHCb WUs.
From the behavior of one of these hosts, which runs plenty of LHCb jobs according to its recent average credit, you can estimate how likely it is that jobs are available.
----------------------------------------------------------------------------------------------------------------------------------------------------------------
I hope this gives you a better basis for deciding whether it is worth running LHCb jobs at a particular moment.
I think it's important to optimize the use of volunteers' computers (for them, for the project, and for the planet...) ୧[ ˵ ͡ᵔ ͜ʟ ͡ᵔ ˵ ]୨
ID: 29947
Cinzia

Joined: 3 Mar 16
Posts: 5
Credit: 157,749
RAC: 0
Message 30036 - Posted: 25 Apr 2017, 7:02:44 UTC

Dear Philippe,

We have waiting jobs again. Sorry for the break.

Cheers
Cinzia
ID: 30036
PHILIPPE

Joined: 24 Jul 16
Posts: 88
Credit: 239,917
RAC: 0
Message 30047 - Posted: 25 Apr 2017, 17:24:32 UTC - in response to Message 30036.  

Dear Cinzia,

Glad to hear news about this sub-project again.
I hope it will return to normal status as soon as possible.

Regards.
ID: 30047
Erich56

Joined: 18 Dec 15
Posts: 1260
Credit: 22,998,247
RAC: 2,880
Message 30152 - Posted: 2 May 2017, 10:26:21 UTC

Any idea when new Jobs will be available?
I was so eager to try LHCb - no chance so far :-(
ID: 30152
Harri Liljeroos
Joined: 28 Sep 04
Posts: 423
Credit: 22,582,261
RAC: 5,699
Message 30153 - Posted: 2 May 2017, 12:23:36 UTC - in response to Message 30152.  

I think there are jobs available. I just finished one 10 minutes after your post (runtime about 44 000 seconds). 6 more have been downloaded by BOINC, but I can't see how they are doing because I am not currently at that host.
ID: 30153
Luca Tomassetti

Joined: 26 Apr 17
Posts: 7
Credit: 22,463
RAC: 0
Message 30158 - Posted: 2 May 2017, 13:32:35 UTC - in response to Message 30152.  

Dear Erich56 (and all),

you should be able to crunch LHCb jobs by now.
Just started three a couple of minutes ago...

Please let me know if the issue persists.
(no need to tick the test job checkbox now)

Cheers,
Luca
ID: 30158
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1428
Credit: 73,041,244
RAC: 106,143
Message 30159 - Posted: 2 May 2017, 13:52:10 UTC - in response to Message 30158.  

After a successful WU a couple of days ago, I had 4 in a row last weekend that failed with the typical "206 (0x000000CE) EXIT_INIT_FAILURE", as if there were no jobs available for the running VM.

See:
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10459491&offset=0&show_names=0&state=0&appid=12
ID: 30159
Erich56

Joined: 18 Dec 15
Posts: 1260
Credit: 22,998,247
RAC: 2,880
Message 30163 - Posted: 2 May 2017, 15:54:38 UTC - in response to Message 30158.  

Dear Erich56 (and all),

you should be able to crunch LHCb jobs by now.
Just started three a couple of minutes ago...

Please let me know if the issue persists.
(no need to tick the test job checkbox now)

Cheers,
Luca


Right now all my hosts are in the midst of crunching CMS, so I don't want to interrupt them at the moment.
However, from what is shown here:

http://lhcathomedev.cern.ch/lhcathome-dev/lhcb_job.php

it looks as if there are no jobs currently.
ID: 30163
Luca Tomassetti

Joined: 26 Apr 17
Posts: 7
Credit: 22,463
RAC: 0
Message 30203 - Posted: 4 May 2017, 12:26:42 UTC - in response to Message 30163.  

Dear all,

The issue with the 206 error should be fixed.
Jobs have always been available, but in some cases the batch system was preventing their submission to the VMs.
Please try to run LHCb jobs now!
ID: 30203
Erich56

Joined: 18 Dec 15
Posts: 1260
Credit: 22,998,247
RAC: 2,880
Message 30241 - Posted: 6 May 2017, 7:04:31 UTC

According to this:

http://lhcathomedev.cern.ch/lhcathome-dev/lhcb_job.php

no jobs have been available since yesterday. When will there be new ones for download and crunching?
ID: 30241
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 589
Credit: 371,136,775
RAC: 19,577
Message 30242 - Posted: 6 May 2017, 9:03:20 UTC

Not all of mine are idle; some are at 2%, others at 98%.
ID: 30242
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1428
Credit: 73,041,244
RAC: 106,143
Message 30772 - Posted: 14 Jun 2017, 4:48:23 UTC

Since yesterday there are no jobs available for LHCb WUs.
https://lhcathomedev.cern.ch/lhcathome-dev/lhcb_job.php

Nonetheless, the project server still fills up the WU queue.
https://lhcathome.cern.ch/lhcathome/server_status.php

It seems that Laurence's emergency brake doesn't work, does it?
ID: 30772
Laurence
Project administrator
Project developer

Joined: 20 Jun 14
Posts: 336
Credit: 237,918
RAC: 0
Message 30774 - Posted: 14 Jun 2017, 8:00:08 UTC - in response to Message 30772.  

LHCb uses pilot jobs. I have just stopped submitting these, and we should see the emergency brake kick in. We need to add an additional query for the real LHCb jobs.
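
As a rough illustration of the check described above (not the project's actual code): a brake that only looks at pilot jobs keeps creating WUs when no real LHCb jobs exist, so the sketch below requires both queues to be non-empty. The function name and both counters are hypothetical, and how the counts would be obtained is left out.

```python
def should_create_workunits(pilot_jobs_waiting, payload_jobs_waiting):
    """Hypothetical emergency brake.

    Returns True only when pilot jobs AND real (payload) LHCb jobs are
    both waiting; otherwise new WUs would start VMs that sit idle.
    """
    return pilot_jobs_waiting > 0 and payload_jobs_waiting > 0


# Pilots are queued but there is nothing for them to run: brake engaged.
print(should_create_workunits(pilot_jobs_waiting=50, payload_jobs_waiting=0))
# Both queues have work: keep creating WUs.
print(should_create_workunits(pilot_jobs_waiting=50, payload_jobs_waiting=120))
```

Checking only the first argument would correspond to the situation reported in this thread, where the WU queue kept filling although no jobs were available.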
ID: 30774
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1428
Credit: 73,041,244
RAC: 106,143
Message 30775 - Posted: 14 Jun 2017, 8:21:19 UTC - in response to Message 30774.  

Thank you.

I forgot to mention that the last job
https://lhcathome.cern.ch/lhcathome/result.php?resultid=145391330
did not shut down automatically after 25 idle minutes.
I had to cancel it manually.
ID: 30775
Laurence
Project administrator
Project developer

Joined: 20 Jun 14
Posts: 336
Credit: 237,918
RAC: 0
Message 30777 - Posted: 14 Jun 2017, 9:32:07 UTC - in response to Message 30775.  

Message from LHCb

There are some issues with the setup; the LHCb offline people are investigating.
ID: 30777
Laurence
Project administrator
Project developer

Joined: 20 Jun 14
Posts: 336
Credit: 237,918
RAC: 0
Message 30780 - Posted: 14 Jun 2017, 15:17:09 UTC - in response to Message 30777.  

LHCb tasks have been started again.
ID: 30780
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1428
Credit: 73,041,244
RAC: 106,143
Message 31068 - Posted: 26 Jun 2017, 7:03:34 UTC

ID: 31068
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1428
Credit: 73,041,244
RAC: 106,143
Message 31200 - Posted: 30 Jun 2017, 14:47:01 UTC - in response to Message 31068.  

Again no jobs?
See: https://lhcathomedev.cern.ch/lhcathome-dev/lhcb_job.php

Again.
Same reason as last Friday?
ID: 31200
Erich56

Joined: 18 Dec 15
Posts: 1260
Credit: 22,998,247
RAC: 2,880
Message 31243 - Posted: 3 Jul 2017, 5:43:15 UTC

LHCb still down? What's the reason?
ID: 31243