Message boards : Theory Application : Error: process exited with code 194 (0xc2, -62)

Trotador
Joined: 14 May 15
Posts: 17
Credit: 11,150,705
RAC: 1
Message 37957 - Posted: 9 Feb 2019, 9:52:24 UTC

Hi, many units are failing with this error message on Ubuntu 18.01 with VirtualBox (5.2.18r124319). I've seen it on other Linux hosts, but I could not find a discussion of it in the forums; sorry if I've missed it. Any hints? The failures usually happen almost at the end of the run, so many processing hours are wasted. Often all running units crash at the same time with this error.

https://lhcathome.cern.ch/lhcathome/results.php?hostid=10564024&offset=0&show_names=0&state=6&appid=


thanks!
ID: 37957

MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 965
Credit: 40,925,070
RAC: 2,593
Message 37958 - Posted: 9 Feb 2019, 10:28:41 UTC - in response to Message 37957.  

https://lhcathome.cern.ch/lhcathome/hosts_user.php?userid=352874

You do have a lot of processors (do you mean on one computer or all of them?). From just a quick look, I see lots of tasks marked as aborted by user.
ID: 37958

bronco
Joined: 13 Apr 18
Posts: 443
Credit: 8,438,885
RAC: 0
Message 37959 - Posted: 9 Feb 2019, 11:31:35 UTC - in response to Message 37958.  

It might say "aborted by user" in the tasks list, but in the stderr report they mostly seem to say "194 aborted by client".
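
For reference, the three numbers in the thread title, 194 (0xc2, -62), are the same one-byte exit status written as unsigned decimal, hex, and signed decimal. A minimal Python sketch of that conversion follows; the function name `describe_exit_code` is made up for illustration and is not part of BOINC.

```python
def describe_exit_code(code: int) -> str:
    """Format a one-byte exit status as 'unsigned (hex, signed)'."""
    unsigned = code & 0xFF  # keep only the low 8 bits
    # Reinterpret the same byte as a signed 8-bit value (two's complement).
    signed = unsigned - 256 if unsigned >= 128 else unsigned
    return f"{unsigned} ({hex(unsigned)}, {signed})"


print(describe_exit_code(194))  # prints: 194 (0xc2, -62)
```

So the "-62" is not a second error, just another reading of the same byte.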
ID: 37959

Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 598
Credit: 378,146,179
RAC: 31,977
Message 37963 - Posted: 9 Feb 2019, 11:41:19 UTC

Looks like VirtualBox crashed while the task was running, and then the task was aborted by the client.

I would try reinstalling VirtualBox and see if that helps.
ID: 37963

Trotador
Joined: 14 May 15
Posts: 17
Credit: 11,150,705
RAC: 1
Message 37965 - Posted: 9 Feb 2019, 12:41:04 UTC

Hi,

Yes, I've aborted tasks, but the ones I'm referring to are the tasks that finished as "Error while computing" with the error in the thread title.

I've included the link to the computer most affected.

thanks!
ID: 37965

MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 965
Credit: 40,925,070
RAC: 2,593
Message 37966 - Posted: 9 Feb 2019, 18:11:16 UTC - in response to Message 37965.  
Last modified: 9 Feb 2019, 18:36:49 UTC

I posted the link to all your computers for you because the one you posted is not one any of us can use here.
You have to go to the particular task you are talking about and post the URL to the task where the stderr is below it.

It will look like this:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=215630542

A URL ending with "=" will not show us anything. And as I said, you have LOTS of cores in your list, and members here prefer not having to search until we find what you are talking about.
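
As a rough illustration of the difference between the two kinds of link, here is a small Python sketch that checks whether a URL points at a single task page. It assumes, based only on the two URLs in this thread, that a task page is `result.php` with a numeric `resultid`; the helper name `points_to_single_task` is hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def points_to_single_task(url: str) -> bool:
    """Return True if the URL looks like a single-task page with a numeric resultid."""
    parsed = urlparse(url)
    # parse_qs drops parameters with empty values such as a trailing "appid=".
    query = parse_qs(parsed.query)
    result_id = query.get("resultid", [""])[0]
    return parsed.path.endswith("/result.php") and result_id.isdigit()


task_url = "https://lhcathome.cern.ch/lhcathome/result.php?resultid=215630542"
list_url = ("https://lhcathome.cern.ch/lhcathome/results.php"
            "?hostid=10564024&offset=0&show_names=0&state=6&appid=")

print(points_to_single_task(task_url))  # True
print(points_to_single_task(list_url))  # False
```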

Btw if you do need a new version of VB you can get that HERE

On the first task I looked at, I see this: "Required extension pack not installed, remote desktop not enabled." Some say they never install the Extension Pack, but I always do since it takes about 60 seconds and it is required, so why not?
Then you could even watch the VM console from the BOINC Manager, and you can also watch the log for each task in the VB Manager to catch problems as they happen.

I would install the 5.2.26 update and the Extension Pack, reboot, and try again; the tasks will probably work.
ID: 37966

Trotador
Joined: 14 May 15
Posts: 17
Credit: 11,150,705
RAC: 1
Message 38013 - Posted: 14 Feb 2019, 20:34:41 UTC - in response to Message 37966.  

I followed your advice and so far so good: no new 194 errors.

Thanks
ID: 38013

Trotador
Joined: 14 May 15
Posts: 17
Credit: 11,150,705
RAC: 1
Message 38025 - Posted: 17 Feb 2019, 20:21:40 UTC - in response to Message 38013.  

Nah, the errors are still happening.
ID: 38025

Jim1348
Joined: 15 Nov 14
Posts: 461
Credit: 12,902,175
RAC: 12,769
Message 38027 - Posted: 18 Feb 2019, 14:06:53 UTC
Last modified: 18 Feb 2019, 15:03:46 UTC

I have gotten a couple of these too recently, on a very reliable i7-3770 running Ubuntu 16.04.5 and VB 5.1.38. They were running on two cores each. Note that they failed at the same time.
https://lhcathome.cern.ch/lhcathome/result.php?resultid=217131643
https://lhcathome.cern.ch/lhcathome/result.php?resultid=217137889

But the curious part is that a Cosmology "camb_boinc2docker" failed at the same time, and it was running on five cores. So effectively nine cores were in use on an eight core machine.
http://www.cosmologyathome.org/result.php?resultid=6116532

I have seen that happen before, though usually without error. An extra Theory task will start up, and then after a few seconds one of them will exit. In fact, I have seen two extra Theory tasks start up, so they are trying to run on eleven cores for a few seconds.

So they apparently tripped each other up this time.

NB: I reserve one core (via "use at most 90% of the processors") to support a GTX 980 on Folding, so normally seven cores should be in use.
ID: 38027

Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 252
Credit: 11,225,577
RAC: 2
Message 38034 - Posted: 19 Feb 2019, 19:57:51 UTC - in response to Message 38027.  

BOINC might be trying to round that up to 100%, as 90% does not correspond to a whole number of cores on an 8-core machine. To make sure it leaves 1 core free, set it to 7/8 = 87.5%.
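
The arithmetic behind that suggestion can be sketched in Python. This only shows what each rounding rule would give; it makes no claim about which rule the BOINC client actually applies, and the function name `cores_for_percentage` is made up for illustration.

```python
import math


def cores_for_percentage(total_cores: int, percent: float) -> dict:
    """Show the exact core count for a percentage, and both ways of rounding it."""
    exact = total_cores * percent / 100
    return {
        "exact": exact,
        "rounded_down": max(1, math.floor(exact)),
        "rounded_up": min(total_cores, math.ceil(exact)),
    }


print(cores_for_percentage(8, 90))    # exact 7.2: 7 if rounded down, 8 if rounded up
print(cores_for_percentage(8, 87.5))  # exact 7.0: 7 either way
```

With 87.5% the result is a whole number of cores, so the outcome no longer depends on how the client rounds.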
ID: 38034

Jim1348
Joined: 15 Nov 14
Posts: 461
Credit: 12,902,175
RAC: 12,769
Message 38035 - Posted: 19 Feb 2019, 20:11:44 UTC - in response to Message 38034.  
Last modified: 19 Feb 2019, 20:12:58 UTC

Thanks, but I have used that for years on all my machines with no problems for single-thread work units. But I should have remembered that BOINC does not always deal well with multi-thread work units, so running them from multiple projects is just asking for trouble.

I have "fixed" the problem (more or less) by just using an app_config.xml file to limit Theory to running one task at a time. That will prevent the multiple startups at least, and I may not see the problem again.
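
The exact file is not shown in the thread, so the following is only a sketch of what such an app_config.xml might contain, generated from Python. The `<app>`, `<name>` and `<max_concurrent>` elements are standard app_config.xml options; the application name "Theory" is an assumption and has to match the short app name the client reports for the project. The helper `build_app_config` is hypothetical.

```python
from xml.sax.saxutils import escape

# Layout of a BOINC app_config.xml that caps concurrent tasks for one app.
APP_CONFIG_TEMPLATE = """<app_config>
  <app>
    <name>{name}</name>
    <max_concurrent>{limit}</max_concurrent>
  </app>
</app_config>
"""


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return app_config.xml text limiting app_name to max_concurrent tasks."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return APP_CONFIG_TEMPLATE.format(name=escape(app_name), limit=max_concurrent)


# "Theory" is an assumed app name; check the name your client uses.
print(build_app_config("Theory", 1))
```

The printed text would be saved as app_config.xml in the project's folder inside the BOINC data directory, and the client told to re-read its config files (or restarted) for the limit to take effect.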
ID: 38035

©2020 CERN