Message boards : LHCb Application : Computation error 206


bamamath

Joined: 21 Jan 06
Posts: 6
Credit: 2,269,010
RAC: 0
Message 29870 - Posted: 8 Apr 2017, 0:54:31 UTC

Most of the LHCb jobs that run on my machine end with the error in the subject line (computation error 206). I upgraded to the latest VirtualBox, so I have no further clue about what is going haywire. Any thoughts on what I can do to fix this apparent problem would be appreciated.
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 942
Credit: 6,295,475
RAC: 1,091
Message 29874 - Posted: 8 Apr 2017, 7:10:59 UTC - in response to Message 29870.  

Obviously there are no LHCb-jobs at the moment.

Exit status 206 (0x000000CE) EXIT_INIT_FAILURE

and in the stderr output

VM Completion Message: Condor exited after 620s without running a job.
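
A minimal sketch of how one might check a saved task result for this condition. It only relies on the exit status and the completion message quoted above; the function names are hypothetical and not part of BOINC or the LHCb application.

```python
import re

# Exit status quoted in the thread for EXIT_INIT_FAILURE (0x000000CE).
EXIT_INIT_FAILURE = 206

# Pattern for the completion message quoted in the stderr output.
_IDLE_EXIT = re.compile(r"Condor exited after (\d+)s without running a job")


def idle_condor_seconds(stderr_text):
    """Return how many seconds Condor waited before exiting idle, or None."""
    match = _IDLE_EXIT.search(stderr_text)
    return int(match.group(1)) if match else None


def looks_like_no_jobs(exit_status, stderr_text):
    """True when the task ended with 206 and Condor never received a job."""
    return (
        exit_status == EXIT_INIT_FAILURE
        and idle_condor_seconds(stderr_text) is not None
    )
```

A result matching both checks suggests that no jobs were available, rather than a fault in the local VirtualBox setup.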
PHILIPPE

Joined: 24 Jul 16
Posts: 88
Credit: 239,917
RAC: 0
Message 29875 - Posted: 8 Apr 2017, 7:18:58 UTC - in response to Message 29870.  
Last modified: 8 Apr 2017, 7:19:23 UTC

It seems that the problem comes from the server side.
I have switched to CMS until it is fixed.
Apparently, even the WLCG performance test cluster ran into the same trouble.
When trouble occurs, it would be useful to have information on this page, using a colour code on the name of the affected application.
Example: a traffic light, with the name in:
Green when everything is working.
Yellow when the trouble can be solved quickly (no WUs, or a server upgrade).
Red when the failure is permanent and the admins are looking for the person who has the solution (outage or overload).

That way, crunchers would know which choice to make so as not to waste their computer time, and could select a sub-project with a chance of being useful to the community.

Is it possible?
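
The traffic-light idea could be sketched roughly as below. This is only an illustration of the proposal: the condition names and the `usable_subprojects` helper are hypothetical, taken from the examples in the post, and do not correspond to any existing server feature.

```python
from enum import Enum


class Light(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


# Hypothetical condition names, based on the examples given in the post.
_CONDITION_TO_LIGHT = {
    "ok": Light.GREEN,
    "no_work_units": Light.YELLOW,
    "server_upgrade": Light.YELLOW,
    "outage": Light.RED,
    "overload": Light.RED,
}


def light_for(condition):
    """Map a reported condition to its traffic-light colour."""
    return _CONDITION_TO_LIGHT[condition]


def usable_subprojects(status_by_app):
    """Return the sub-projects currently shown as green, sorted by name."""
    return sorted(
        app for app, cond in status_by_app.items()
        if light_for(cond) is Light.GREEN
    )
```

For example, `usable_subprojects({"LHCb": "outage", "CMS": "ok"})` would leave only CMS for a cruncher to select.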
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 597
Credit: 371,397,369
RAC: 28,643
Message 29878 - Posted: 8 Apr 2017, 9:07:54 UTC

I have my computer set to receive tasks from all sub-projects. Typically it receives only LHCb tasks.

It would be great if the system knew there was an issue and gave me tasks from other sub-projects.
bamamath

Joined: 21 Jan 06
Posts: 6
Credit: 2,269,010
RAC: 0
Message 29879 - Posted: 8 Apr 2017, 11:36:47 UTC - in response to Message 29870.  

I have deselected LHCb tasks in my LHC@home preferences and updated the project. I received one ATLAS 8-core, one CMS, and two Theory Simulation tasks. Hope this works for others. Happy computing.
Profile Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester

Joined: 15 Jul 05
Posts: 221
Credit: 4,950,597
RAC: 4,643
Message 29891 - Posted: 10 Apr 2017, 9:08:12 UTC

The HTCondorCE machine that sends LHCb simulations to the VM had been sick since Friday, leading to these timeouts from the application within the VM.

LHCb simulations should be operational again now. Sorry for the trouble.
PHILIPPE

Joined: 24 Jul 16
Posts: 88
Credit: 239,917
RAC: 0
Message 29895 - Posted: 10 Apr 2017, 20:15:42 UTC - in response to Message 29891.  
Last modified: 10 Apr 2017, 20:16:14 UTC

Is there a daily task quota in LHCb? (-1 for tasks with errors)
If so, it would be better to reset all the hosts that had lots of errored WUs during the past weekend, so that the big hosts can come back and run LHCb tasks more quickly.
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 597
Credit: 371,397,369
RAC: 28,643
Message 29896 - Posted: 10 Apr 2017, 20:54:23 UTC

I think not. On my computers it just keeps getting tasks, waiting 10 min for them to fail, then getting another set.

It would be better if there were one, since it would then kick you over to another set of tasks; LHCb tends to hog the computer if given the choice of all projects.
PHILIPPE

Joined: 24 Jul 16
Posts: 88
Credit: 239,917
RAC: 0
Message 29905 - Posted: 11 Apr 2017, 16:43:55 UTC - in response to Message 29896.  
Last modified: 11 Apr 2017, 16:54:03 UTC

I came back to LHCb this morning and had some trouble on my side: a VM was stuck in VirtualBox Manager, and I saw the same behaviour you described in your post (10 min of work and then a failure).
But after deleting this VM in VirtualBox Manager, the next task seems to run normally (9 h spent and still running).
I think it is better to have the Extension Pack installed in the same version as VirtualBox, so that you can see the unreachable VM in VirtualBox; cleaning up the environment frees the link between BOINC and VirtualBox.
Now I am crossing my fingers, but it is working as expected.
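
For those who prefer the command line to VirtualBox Manager, the cleanup can be sketched as follows. This assumes the usual `"name" {uuid}` line format printed by `VBoxManage list vms`, and that a broken VM is listed under the name `<inaccessible>`; both the helper names and that marker are assumptions to check against your own output. The sketch only builds the commands and does not run them.

```python
import re

# Each line of `VBoxManage list vms` is expected to look like: "name" {uuid}
_VM_LINE = re.compile(r'^"(?P<name>.*)"\s+\{(?P<uuid>[0-9a-fA-F-]+)\}$')


def parse_vm_list(output):
    """Parse `VBoxManage list vms` output into (name, uuid) pairs."""
    vms = []
    for line in output.splitlines():
        match = _VM_LINE.match(line.strip())
        if match:
            vms.append((match.group("name"), match.group("uuid")))
    return vms


def unregister_command(uuid):
    """Build the command that unregisters a VM and deletes its files."""
    return ["VBoxManage", "unregistervm", uuid, "--delete"]


def stuck_vm_commands(output, marker="<inaccessible>"):
    """Build cleanup commands for every VM listed under the marker name."""
    return [
        unregister_command(uuid)
        for name, uuid in parse_vm_list(output)
        if name == marker
    ]
```

Each returned list can be inspected first and then passed to `subprocess.run` once you are sure it targets the stuck VM and no BOINC task is using it.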
---------------------------------------------------------------------------------
I believe that LHCb tasks hog the computer because they are not yet in production like the other sub-projects.
Maybe the priority given to test applications when "run test application" is checked has to be modified?
For instance, 75 % to test applications and 25 % to normal applications, so that the balancing between sub-projects could still be effective when LHCb is out of order...
---------------------------------------------------------------------------------
What are the difficulties that prevent LHCb from being like the other sub-projects?



©2020 CERN