Error: process exited with code 194 (0xc2, -62)
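The three numbers in the thread title look like one value written three ways. A minimal sketch of that arithmetic, assuming -62 is simply the low byte of 194 read as a signed 8-bit integer (the helper name `describe_exit_code` is made up for illustration and is not part of BOINC):

```python
def describe_exit_code(code: int) -> tuple[str, int]:
    """Return the hex form and the signed 8-bit reading of an exit code."""
    low_byte = code & 0xFF  # keep only the lowest 8 bits
    # Values 128..255 wrap around to negative numbers as a signed byte.
    signed = low_byte - 256 if low_byte >= 128 else low_byte
    return hex(low_byte), signed


if __name__ == "__main__":
    hex_form, signed_form = describe_exit_code(194)
    print(f"194 -> {hex_form}, {signed_form}")
```

So "194", "0xc2" and "-62" in a task's stderr most likely all refer to the same exit status.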
Author | Message |
---|---|
Send message Joined: 14 May 15 Posts: 17 Credit: 11,627,311 RAC: 0 |
Hi, many units are failing with this error message on Ubuntu 18.01 with VirtualBox (5.2.18r124319). I've seen it on other Linux hosts, but I did not find a discussion of it in the forums. Sorry if I've missed it. Any hints? The failures tend to happen almost at the end of the process, so many processing hours are wasted. Often all running units crash at the same time with this error. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10564024&offset=0&show_names=0&state=6&appid= Thanks! |
Send message Joined: 24 Oct 04 Posts: 1127 Credit: 49,750,905 RAC: 9,376 |
[quote] https://lhcathome.cern.ch/lhcathome/hosts_user.php?userid=352874 You do have a lot of processors, and from just a quick look (do you mean on one computer or all?) I see lots of tasks saying they were aborted by user. |
Send message Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
[quote] It might say "aborted by user" in the task list, but in the stderr report they mostly seem to say "194 aborted by client". |
Send message Joined: 27 Sep 08 Posts: 807 Credit: 652,392,960 RAC: 282,047 |
Looks like VirtualBox crashed while the task was running, and then the task was aborted by the client. I would try reinstalling VirtualBox and see if that helps. |
Send message Joined: 14 May 15 Posts: 17 Credit: 11,627,311 RAC: 0 |
Hi, yes, I've aborted some tasks, but the ones I'm referring to are the tasks that finished as "Error while Computing" with the error in the thread title. I've included the link to the computer that is most affected. Thanks! |
Send message Joined: 24 Oct 04 Posts: 1127 Credit: 49,750,905 RAC: 9,376 |
Hi, I posted the link to all your computers for you because the one you posted is not one any of us can use here. You have to go to the particular task you are talking about and post the URL of the task page, where the stderr output is shown below it. It will look like this: https://lhcathome.cern.ch/lhcathome/result.php?resultid=215630542 A URL ending with a = will not show us anything. And as I said, you have LOTS of cores in your list, and members here prefer not having to search until we find what you are talking about. Btw, if you do need a new version of VB you can get that HERE. On the first task I looked at I see this: "Required extension pack not installed, remote desktop not enabled." Now, some say they never install the Extension Pack, but I always do since it takes about 60 seconds and it is required, so why not? Then you could even watch the VM console from the BOINC Manager, and you can also watch the log for each task in the VB Manager to spot problems as they happen. I would install the 5.2.26 version update and the Ext. Pack, reboot, and try again; the tasks will probably work. |
Send message Joined: 14 May 15 Posts: 17 Credit: 11,627,311 RAC: 0 |
[quote] I followed your advice and so far so good, no new 194 errors. Thanks |
Send message Joined: 14 May 15 Posts: 17 Credit: 11,627,311 RAC: 0 |
[quote] Nah, the errors keep happening. |
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
I have gotten a couple of these too recently, on a very reliable i7-3770 running Ubuntu 16.04.5 and VB 5.1.38. They were running on two cores each. Note that they failed at the same time. https://lhcathome.cern.ch/lhcathome/result.php?resultid=217131643 https://lhcathome.cern.ch/lhcathome/result.php?resultid=217137889 But the curious part is that a Cosmology "camb_boinc2docker" task failed at the same time, and it was running on five cores. So effectively nine cores were in use on an eight-core machine. http://www.cosmologyathome.org/result.php?resultid=6116532 I have seen that happen before, though usually without error. An extra Theory task will start up, and then after a few seconds one of them will exit. In fact, I have seen two extra Theory tasks start up, so they are trying to run eleven cores for a few seconds. So they apparently tripped each other up this time. NB: I reserve one core (via "use at most 90% of the processors") to support a GTX 980 on Folding, so normally seven cores should be in use. |
Send message Joined: 29 Sep 04 Posts: 281 Credit: 11,859,285 RAC: 0 |
[quote] BOINC might be trying to round that up to 100%, since 90% of 8 cores is not a whole number of cores. To make sure it leaves 1 core free, set it to 7/8 = 87.5%. |
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
[quote] Thanks, but I have used that setting for years on all my machines with no problems for single-thread work units. But I should have remembered that BOINC does not always deal well with multi-thread work units, so running them from multiple projects is just asking for trouble. I have "fixed" the problem (more or less) by just using an app_config.xml file to limit Theory to running one at a time. That will prevent the multiple startups at least, and I may not see the problem again. |
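For anyone wanting to try the same two ideas from the last replies, here is a rough sketch. It makes some assumptions: the thread never confirms how the client rounds the CPU percentage, so the first helper just shows both possibilities. The app name "Theory" is a guess at the project's short app name and should be checked against your own client. Both function names are made up for illustration. The `<app_config>`, `<app>`, `<name>` and `<max_concurrent>` elements are the standard BOINC app_config.xml ones.

```python
import math
import xml.etree.ElementTree as ET


def usable_cores(total_cores: int, percent: float) -> tuple[int, int]:
    """Return the core count rounded down and rounded up for a CPU percentage."""
    exact = total_cores * percent / 100
    return math.floor(exact), math.ceil(exact)


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Build app_config.xml text limiting one app to max_concurrent tasks."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name  # assumed short app name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 90% of 8 cores is 7.2, so the result depends on the rounding direction.
    print("90%   of 8 cores:", usable_cores(8, 90))
    # 87.5% of 8 cores is exactly 7, so rounding cannot matter.
    print("87.5% of 8 cores:", usable_cores(8, 87.5))
    print(build_app_config("Theory", 1))
```

The printed XML would go into a file named app_config.xml in the project's folder under the BOINC data directory, after which the client has to re-read its config files (or be restarted) for the limit to take effect.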