Message boards : Number crunching : VM Applications Errors

Profile Laurence
Project administrator
Project developer

Send message
Joined: 20 Jun 14
Posts: 372
Credit: 238,712
RAC: 0
Message 30685 - Posted: 7 Jun 2017, 13:37:09 UTC
Last modified: 26 Sep 2017, 8:18:50 UTC

Here is a list of the common VM application errors (exit status codes), with their causes and solutions. For all other errors, first try a project reset. If there is still a problem, post a message with a link to the failed task.

  • EXIT_ABORTED_BY_CLIENT (194)
    This exit status is returned when the VM heartbeat is not detected, which mainly happens when the VM fails to boot. First try a project reset, and then try re-installing a recent version of VirtualBox. It can also occur when hardware virtualization is not enabled in the BIOS.
  • STATUS_ACCESS_VIOLATION (-1073741819)
    This occurs when using an old VirtualBox version with Windows 10. Upgrade VirtualBox to a more recent version.
  • EXIT_INIT_FAILURE (206)
    This error happens when the application is starting. There are three main reasons that are reported in the stderr_out log as the VM Completion Message:

    • Could not ping HTCondor
      This is typically a network related issue
    • The x509 proxy creation failed
      This is typically a network related issue
    • Condor exited after 107582s without running a job
      This usually occurs when the VM was stopped (not suspended) before the first job finished and was restarted after 18 hours. The subsequent task should run fine.


  • ERR_NO_NETWORK_CONNECTION (-203)
    This occurs when the VM does not have a network connection. Since a task was downloaded, the host machine itself evidently has a connection. Check the BOINC preferences relating to the use of the network.
  • ERR_NETOPEN (-152)
    This occurs when port 3125 or 9618 is closed to outgoing network traffic. Check your firewall settings.
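
For anyone who wants to look these codes up from a script, here is a minimal sketch of the list above as a lookup table. The codes and advice are copied from the list; the table layout and the names `EXIT_CODES`, `to_signed32` and `explain` are purely illustrative and are not part of BOINC or the VirtualBox wrapper. It also shows why -1073741819 is sometimes displayed as 0xC0000005 on Windows: both are the same 32-bit value.

```python
# Exit codes and advice transcribed from the list above.
# The structure and names here are illustrative assumptions only.
EXIT_CODES = {
    194: ("EXIT_ABORTED_BY_CLIENT",
          "VM heartbeat not detected; reset the project, reinstall a recent "
          "VirtualBox, and check that hardware virtualization is enabled."),
    -1073741819: ("STATUS_ACCESS_VIOLATION",
                  "Old VirtualBox on Windows 10; upgrade VirtualBox."),
    206: ("EXIT_INIT_FAILURE",
          "Failure at start-up; read the VM Completion Message in the log."),
    -203: ("ERR_NO_NETWORK_CONNECTION",
           "VM has no network; check the BOINC network preferences."),
    -152: ("ERR_NETOPEN",
           "Port 3125 or 9618 is closed to outgoing traffic; check the firewall."),
}


def to_signed32(value):
    """Interpret an integer as a signed 32-bit value (0xC0000005 -> -1073741819)."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value


def explain(exit_status):
    """Return a one-line description for an exit status, signed or unsigned."""
    code = to_signed32(exit_status)
    name, advice = EXIT_CODES.get(
        code, ("UNKNOWN", "Try a project reset, then post a link to the task."))
    return f"{name} ({code}): {advice}"


if __name__ == "__main__":
    for status in (194, 0xC0000005, -152, 12345):
        print(explain(status))
```

Unknown codes fall back to the general advice from the first paragraph of this post.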

ID: 30685 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 15 Jun 08
Posts: 2386
Credit: 222,936,497
RAC: 137,523
Message 30686 - Posted: 7 Jun 2017, 13:56:04 UTC - in response to Message 30685.  

This one is missing:
EXIT_INIT_FAILURE (206) usually occurs when the task queue is empty.
ID: 30686 · Report as offensive     Reply Quote
Profile Magic Quantum Mechanic
Avatar

Send message
Joined: 24 Oct 04
Posts: 1114
Credit: 49,501,728
RAC: 4,157
Message 30687 - Posted: 7 Jun 2017, 14:12:26 UTC

(I sent you a pm)

BUT you have to remember that not all of us live in western Europe and not all of us have the fastest DSL on the planet. That is just the way it goes; we have no choice. I pay $100 per month for mine, but many times the only invalids I ever get are because of the DSL and NOT my computers or the versions of VB or BOINC.

I have been running thousands of these since March 1st 2011, and the DSL speed at any given time is my only problem. Since I don't own CenturyLink, there is nothing that can be done.
ID: 30687 · Report as offensive     Reply Quote
Profile Laurence
Project administrator
Project developer

Send message
Joined: 20 Jun 14
Posts: 372
Credit: 238,712
RAC: 0
Message 30688 - Posted: 7 Jun 2017, 14:14:35 UTC - in response to Message 30686.  

It should be EXIT_NO_SUB_TASKS (207) when there are no jobs. Will investigate ...
ID: 30688 · Report as offensive     Reply Quote
Erich56

Send message
Joined: 18 Dec 15
Posts: 1686
Credit: 100,376,395
RAC: 102,054
Message 30690 - Posted: 7 Jun 2017, 16:23:53 UTC - in response to Message 30688.  

This one is missing:
EXIT_INIT_FAILURE (206) usually occurs when the task queue is empty.

It should be EXIT_NO_SUB_TASKS (207) when there are no jobs. Will investigate ...

Here, it is also EXIT_INIT_FAILURE (206) whenever no tasks are available. Occurs after about 10 minutes.
ID: 30690 · Report as offensive     Reply Quote
PFLIEGER Guy

Send message
Joined: 22 Jun 17
Posts: 12
Credit: 272,799
RAC: 0
Message 30959 - Posted: 23 Jun 2017, 5:03:45 UTC

The 5-CPU WUs couldn't be computed because the application exceeded the available memory. I have 8 GB of RAM and this is not enough.
A lot of work units report "Hypervisor failed".
This has been happening since the beginning of Formula BOINC on 22/06/2017.

Guy PFLIEGER
Masevaux-Niederbruck
ID: 30959 · Report as offensive     Reply Quote
PFLIEGER Guy

Send message
Joined: 22 Jun 17
Posts: 12
Credit: 272,799
RAC: 0
Message 30961 - Posted: 23 Jun 2017, 6:03:08 UTC - in response to Message 30959.  

The last 5-CPU task couldn't run longer than 1:27. Before that, the 5-CPU tasks could run for only 0:06 to 0:08. It is a little better, but not enough to run the 5-CPU tasks to completion.
I hope the hypervisor slept well last night, because today it has to solve these problems.
I have 8 GB of memory. I don't know the minimum amount of memory needed to run the 5-CPU WUs.
Drink a lot, and when the weather is warm, slow down your processors with the BOINC Manager toolbox to protect your computers and to cool the system better. If you have it, use the AEGIS 2 thermometer to monitor the cooling of the processors.

Guy PFLIEGER
Masevaux-Niederbruck
ID: 30961 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 15 Jun 08
Posts: 2386
Credit: 222,936,497
RAC: 137,523
Message 30964 - Posted: 23 Jun 2017, 7:03:57 UTC - in response to Message 30961.  

In short: your computer is overloaded by 5-core WUs.

You can limit the number of cores per WU to 2 via the project's website (your personal preferences page) and use an app_config.xml that includes the following RAM setting:
<cmdline>--memory_size_mb 4600</cmdline>
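
As a rough sketch of where that line sits inside a complete file, the following script builds a minimal app_config.xml. The `app_version`, `app_name`, `plan_class`, `avg_ncpus` and `cmdline` elements are standard BOINC app_config.xml elements; the application name `ATLAS` and the plan class `vbox64_mt_mcore_atlas` are assumptions used as placeholders here, so check the names your own client reports before using them. The function name `build_app_config` is illustrative.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, plan_class, ncpus, memory_mb):
    """Build a minimal app_config.xml string with a vbox memory setting."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = str(ncpus)
    # The RAM setting quoted in the post above.
    ET.SubElement(version, "cmdline").text = f"--memory_size_mb {memory_mb}"
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # "ATLAS" and "vbox64_mt_mcore_atlas" are assumed placeholder names.
    print(build_app_config("ATLAS", "vbox64_mt_mcore_atlas", 2, 4600))
```

The output would be saved as app_config.xml in the project's directory inside the BOINC data directory, followed by re-reading the config files from the BOINC Manager or restarting the client.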
ID: 30964 · Report as offensive     Reply Quote
nikogianna

Send message
Joined: 30 Jan 17
Posts: 7
Credit: 132,213
RAC: 0
Message 31365 - Posted: 12 Jul 2017, 9:32:45 UTC - in response to Message 31364.  
Last modified: 12 Jul 2017, 9:37:30 UTC

This sounds a lot like a problem that a previous version of VboxWrapper had. Could you please confirm that you are using the latest version (by detaching from and reattaching to the project)? Also, which application have you selected? If it is something other than Theory, please try selecting Theory and see whether you can complete a task without problems.

Thank you for your support.
ID: 31365 · Report as offensive     Reply Quote
Profile Yeti
Volunteer moderator
Avatar

Send message
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 31366 - Posted: 12 Jul 2017, 9:57:23 UTC - in response to Message 31364.  

When the VM suspends because of processor usage, the VM will not run again in the same user session!

In addition, on reboot, progress has to start from the beginning!

You should look into my checklist, especially No. 2 regarding Win10.


Supporting BOINC, a great concept !
ID: 31366 · Report as offensive     Reply Quote
tullio

Send message
Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 33434 - Posted: 18 Dec 2017, 11:32:08 UTC

I would like very much to know why all LHC tasks fail except ATLAS and SixTrack. This happens both on a Windows 10 PC with 22 GB RAM and on a Linux box with 8 GB RAM running SuSE Leap 42.2, so it does not depend on a badly configured PC. VirtualBox is 5.2.2 on all PCs.
Tullio
ID: 33434 · Report as offensive     Reply Quote
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Avatar

Send message
Joined: 29 Aug 05
Posts: 997
Credit: 6,264,307
RAC: 71
Message 33436 - Posted: 18 Dec 2017, 13:35:51 UTC - in response to Message 33434.  

I would like very much to know why all LHC tasks fail except ATLAS and SixTrack. This happens both on a Windows 10 PC with 22 GB RAM and on a Linux box with 8 GB RAM running SuSE Leap 42.2, so it does not depend on a badly configured PC. VirtualBox is 5.2.2 on all PCs.
Tullio

Your tasks seem to be unable to collect jobs from the Condor servers, and eventually time out. Unfortunately I'm not expert enough to say why that might be. You seem to contact the servers OK -- it's suspiciously like a firewall problem of some sort.
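
One way to test the firewall theory from the host is to try opening outgoing TCP connections on the two ports named in the first post of this thread (3125 and 9618). The sketch below does only that. The server host name is not given anywhere in this thread, so `host` is a parameter you have to supply yourself, and the function names `can_connect` and `check_ports` are illustrative.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if an outgoing TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_ports(host, ports=(3125, 9618)):
    """Map each port to True/False depending on whether it is reachable."""
    return {port: can_connect(host, port) for port in ports}


if __name__ == "__main__":
    import sys
    if len(sys.argv) != 2:
        sys.exit("usage: python check_ports.py <server-host>")
    for port, ok in check_ports(sys.argv[1]).items():
        print(f"port {port}: {'open' if ok else 'blocked or unreachable'}")
```

A successful connection from the host does not prove that the VM itself can reach the server, but a failure here is a strong hint that a firewall or router is blocking the traffic.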
ID: 33436 · Report as offensive     Reply Quote
maeax

Send message
Joined: 2 May 07
Posts: 2071
Credit: 156,126,074
RAC: 105,437
Message 33437 - Posted: 18 Dec 2017, 13:37:52 UTC - in response to Message 33434.  
Last modified: 18 Dec 2017, 13:38:54 UTC

Tullio,
when you use the same Computer with the same preferences:
1. ATLAS is running with two CPUs and finished successfully. I saw this in a task.
2. Theory, CMS and LHCb therefore cannot run with the same preferences, because they use only ONE CPU.
In LHC-dev, multicore is possible for Theory, CMS and LHCb!
ID: 33437 · Report as offensive     Reply Quote
tullio

Send message
Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 33566 - Posted: 29 Dec 2017, 20:12:28 UTC - in response to Message 33437.  

I am using 2 CPUs for ATLAS tasks on the Windows 10 PC, because the CPU has 4 cores, or 4 logical processors according to the Task Manager. I am using only one CPU on the Linux boxen and, again, all LHC tasks fail except ATLAS and SixTrack. I have a new 30 Mbit connection via fiber on my modem, but it goes up to 40 Mbit in download and 10 Mbit in upload. The Windows PC has 22 GB RAM, mostly unused, and the two Linux boxen have 8 GB RAM. All other BOINC projects run happily on both CPUs and GPUs (SETI, SETI Beta, Einstein, Climateprediction.net).
Tullio
ID: 33566 · Report as offensive     Reply Quote
Ernst van Loon

Send message
Joined: 11 Feb 14
Posts: 2
Credit: 2,173,775
RAC: 0
Message 36136 - Posted: 30 Jul 2018, 17:21:09 UTC

On an 8-core i7-4770 CPU on Windows 10, I have been getting this consistent error for a couple of weeks: "Communication with VM Hypervisor failed. (6 CPU's)". I tried re-installing VirtualBox a couple of times (both the latest stand-alone version and the bundled BOINC+VirtualBox version), but nothing helps...
What can I do further to perform ATLAS simulations?

Best regards, Ernst
ID: 36136 · Report as offensive     Reply Quote
Profile Yeti
Volunteer moderator
Avatar

Send message
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 36139 - Posted: 30 Jul 2018, 17:38:48 UTC - in response to Message 36136.  

What can I do further to perform ATLAS simulations?

Go through my checklist

and don't use VB 5.2.x, stay with latest 5.1.x (I'm on 5.1.30 and it works fine)


Supporting BOINC, a great concept !
ID: 36139 · Report as offensive     Reply Quote
Profile Bill F
Avatar

Send message
Joined: 2 Jun 07
Posts: 32
Credit: 1,583,340
RAC: 0
Message 41018 - Posted: 20 Dec 2019, 3:59:44 UTC
Last modified: 20 Dec 2019, 4:08:08 UTC

I have a Dell Work Station that after enabling Virtualization I have been able to complete an Atlas WU. In my BIOS there were two settings one for CPU Virtualization and one for VM Direct I/O control. I enabled only the first setting for the CPU and not the second as I saw a Forum posting that it could cause issues with the Suspend function.

Should I have enabled both ?

On WU / Task 128721484 there were errors at the bottom (the WU validated and credit was issued)

Thank you in advance for any advice.

Bill F
In October 1969 I took an oath to support and defend the Constitution of the United States against all enemies, foreign and domestic;
There was no expiration date.


ID: 41018 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 15 Jun 08
Posts: 2386
Credit: 222,936,497
RAC: 137,523
Message 41019 - Posted: 20 Dec 2019, 7:17:52 UTC - in response to Message 41018.  

Your WU succeeded.
This can be seen here:
2019-12-19 20:54:22 (13808): Guest Log: HITS file was successfully produced

This means you are using the correct BIOS setup.



Nonetheless there are some suggestions based on other logfile entries:
2019-12-19 10:36:56 (13808): Guest Log: BIOS: VirtualBox 5.2.8

Inside the VM, Guest Additions 5.2.32 are used.
You may consider using either the same version or the most recent VirtualBox version:
https://www.virtualbox.org/wiki/Downloads



2019-12-19 10:36:56 (13808): Setting CPU throttle for VM. (85%)

If this is set too low it sometimes causes WUs to fail.
Consider setting it to a higher value.


Pausing and resuming the VM at such a high rate often results in a failed WU and should be avoided:
2019-12-19 12:30:55 (13808): VM state change detected. (old = 'Running', new = 'Paused')
2019-12-19 12:34:46 (13808): VM state change detected. (old = 'Paused', new = 'Running')
2019-12-19 12:37:54 (13808): VM state change detected. (old = 'Running', new = 'Paused')
2019-12-19 12:39:55 (13808): VM state change detected. (old = 'Paused', new = 'Running')
2019-12-19 12:42:30 (13808): VM state change detected. (old = 'Running', new = 'Paused')
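
To see how often this happens in a longer log, a small script can count the pauses and measure how long each one lasted. This is only a sketch: it assumes the log lines look exactly like the ones quoted above, and the names `STATE_CHANGE` and `paused_intervals` are illustrative.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2019-12-19 12:30:55 (13808): VM state change detected. (old = 'Running', new = 'Paused')
STATE_CHANGE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \(\d+\): "
    r"VM state change detected\. \(old = '(\w+)', new = '(\w+)'\)"
)

SAMPLE = """\
2019-12-19 12:30:55 (13808): VM state change detected. (old = 'Running', new = 'Paused')
2019-12-19 12:34:46 (13808): VM state change detected. (old = 'Paused', new = 'Running')
2019-12-19 12:37:54 (13808): VM state change detected. (old = 'Running', new = 'Paused')
2019-12-19 12:39:55 (13808): VM state change detected. (old = 'Paused', new = 'Running')
2019-12-19 12:42:30 (13808): VM state change detected. (old = 'Running', new = 'Paused')
"""


def paused_intervals(log_text):
    """Return (number of pauses, durations in seconds of completed pauses)."""
    pause_count = 0
    durations = []
    paused_at = None
    for line in log_text.splitlines():
        match = STATE_CHANGE.match(line.strip())
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        new_state = match.group(3)
        if new_state == "Paused":
            pause_count += 1
            paused_at = stamp
        elif new_state == "Running" and paused_at is not None:
            durations.append((stamp - paused_at).total_seconds())
            paused_at = None
    return pause_count, durations


if __name__ == "__main__":
    count, durations = paused_intervals(SAMPLE)
    print(f"{count} pauses, completed pause lengths (s): {durations}")
```

For the five lines quoted above this reports 3 pauses within about 12 minutes, with the two completed pauses lasting 231 and 121 seconds.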
ID: 41019 · Report as offensive     Reply Quote
benefique pour tous

Send message
Joined: 9 Sep 19
Posts: 32
Credit: 2,856,470
RAC: 0
Message 41205 - Posted: 8 Jan 2020, 18:31:28 UTC

I used an HP Omen with an RTX. I did my best to be successful.
Twice the tasks went wrong.
The error occurred in the last seconds of computing.
Now this has happened 3 times.
Something must be wrong.
I don't know on which computer you can solve it, because it failed on 2 big i7s and on a good AMD Ryzen 5.
Sorry.
Below you will find the list of errors that occurred:

Workunit: Theory_2279-784485-196
Application: Theory Simulation
Created: 19 Dec 2019, 17:43:43 UTC
Errors: Too many total results

Task 255962069 (computer 10536006)
  Sent: 19 Dec 2019, 19:15:11 UTC
  Reported: 24 Dec 2019, 9:30:26 UTC
  Status: Error while computing
  Run time (sec): 44,307.94
  CPU time (sec): 19,950.05
  Credit: ---
  Application: Theory Simulation v300.02 (vbox64_theory), windows_x86_64

Task 256748413 (computer 10623629)
  Sent: 24 Dec 2019, 9:38:14 UTC
  Reported: 5 Jan 2020, 1:05:58 UTC
  Status: Error while computing
  Run time (sec): 357,010.95
  CPU time (sec): 352,895.60
  Credit: ---
  Application: Theory Simulation v300.02 (vbox64_theory), x86_64-pc-linux-gnu

Task 257549284 (computer 10621995)
  Sent: 4 Jan 2020, 9:38:20 UTC
  Reported: 8 Jan 2020, 18:19:04 UTC
  Status: Error while computing
  Run time (sec): 376,418.99
  CPU time (sec): 376,092.60
  Credit: ---
  Application: Theory Simulation v300.02 (vbox64_theory), windows_x86_64
ID: 41205 · Report as offensive     Reply Quote
Jim1348

Send message
Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 43098 - Posted: 23 Jul 2020, 21:31:02 UTC - in response to Message 43097.  
Last modified: 23 Jul 2020, 21:33:11 UTC

Don't know. But since you are on Ubuntu, just install CVMFS. This is a fairly foolproof procedure:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5342&postid=41861

Then, in your profile, set "Run native if available".
It usually works these days, and if it fails, it will do so faster.

NOTE: This works as-is for Theory, but native ATLAS is another matter.
You need singularity for that; best left for another discussion (or just use VBox).
ID: 43098 · Report as offensive     Reply Quote


©2024 CERN