Message boards : Number crunching : VM Applications Errors

[AF>France>Est>Alsace]PFLIEGER...

Joined: 30 Nov 20
Posts: 9
Credit: 950,380
RAC: 0
Message 45577 - Posted: 31 Oct 2021, 1:31:38 UTC

I had 4 computing errors last night, but I think my computing load was set wrong, because the computers were processing on all processors.
The setting I forgot to apply is my usual 50% of processors, so that I compute on cores rather than on all logical processors:
2 processors = 1 core
That was my mistake.
Sorry.
maeax

Joined: 2 May 07
Posts: 2242
Credit: 173,901,170
RAC: 2,804
Message 45579 - Posted: 31 Oct 2021, 2:40:43 UTC - in response to Message 45577.  

This happens every year for CMS between 2 and 2 o'clock, i.e. during the change from summer time (CEST) to winter time (CET).
Drago75

Joined: 22 Jan 21
Posts: 5
Credit: 270,750
RAC: 32
Message 45591 - Posted: 2 Nov 2021, 9:59:34 UTC

Why can't LHC offer a version of VirtualBox on their website that already includes all the right settings? This would be so much more helpful than reading through forums for hours on end and still only receiving calculation errors.
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1417
Credit: 9,441,051
RAC: 885
Message 45592 - Posted: 2 Nov 2021, 10:17:49 UTC - in response to Message 45591.  

Why can't LHC offer a version of VirtualBox on their website that already includes all the right settings? This would be so much more helpful than reading through forums for hours on end and still only receiving calculation errors.

Maybe you have the Hyper-V feature of Windows 10 Pro enabled on your system.
If so, disable Hyper-V and reboot.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2531
Credit: 253,722,201
RAC: 41,981
Message 45594 - Posted: 2 Nov 2021, 11:36:16 UTC - in response to Message 45591.  

This is a message from one of your logs:
00:00:00.987093 AMD-V is disabled in the BIOS (or by the host OS) (VERR_SVM_DISABLED)}, preserve=false aResultDetail=-4053

Nobody but you would be able to correctly configure your BIOS.
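
For anyone who has to check many VM logs for this condition, here is a rough Python sketch that pulls the status name and the numeric detail out of a line like the one quoted above. The function name `extract_status` and both regular expressions are assumptions made for illustration, based only on the shape of this single log line; they are not part of VirtualBox or BOINC.

```python
import re

# The log line quoted above, copied verbatim.
LOG_LINE = (
    "00:00:00.987093 AMD-V is disabled in the BIOS (or by the host OS) "
    "(VERR_SVM_DISABLED)}, preserve=false aResultDetail=-4053"
)

# Assumption: status names appear in parentheses and start with "VERR_";
# the numeric detail follows "aResultDetail=".
STATUS_RE = re.compile(r"\((VERR_[A-Z0-9_]+)\)")
DETAIL_RE = re.compile(r"aResultDetail=(-?\d+)")


def extract_status(line):
    """Return (status_name, detail_code); each part is None if not found."""
    status = STATUS_RE.search(line)
    detail = DETAIL_RE.search(line)
    return (
        status.group(1) if status else None,
        int(detail.group(1)) if detail else None,
    )


print(extract_status(LOG_LINE))  # ('VERR_SVM_DISABLED', -4053)
```

A hit on VERR_SVM_DISABLED points at the BIOS or host OS setting, not at the task itself.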
David M

Joined: 2 Dec 13
Posts: 5
Credit: 3,494,287
RAC: 0
Message 45713 - Posted: 18 Nov 2021, 5:46:53 UTC
Last modified: 18 Nov 2021, 5:53:58 UTC

I'm not quite sure what is going on, but I just installed the new BOINC 7.16.20 package with VBox 6.1.12. After the combined install I am getting task failures after 17 seconds. The job log for one task is below. Several tasks seem to have the same set of error messages. Am I right that VBox didn't install correctly? Or is there some other issue with the BOINC package?

I'm running Windows 11 on an AMD-based, home-built system. The system is several months old and stable, as is the Windows 11 install. Virtual tasks were running successfully until the new BOINC package was loaded. I run a few different projects and no virtual tasks are running successfully.

Any thoughts?

<core_client_version>7.16.20</core_client_version>
<![CDATA[
<message>
Incorrect function.
(0x1) - exit code 1 (0x1)</message>
<stderr_txt>
2021-11-18 00:13:39 (17692): Detected: vboxwrapper 26202
2021-11-18 00:13:39 (17692): Detected: BOINC client v7.16.20
2021-11-18 00:13:47 (17692): Error in guest additions for VM: -2147467262
Command:
VBoxManage -q list systemproperties
Output:
VBoxManage.exe: error: Failed to create the VirtualBox object!
VBoxManage.exe: error: Completely failed to instantiate CLSID_VirtualBox: E_NOINTERFACE
VBoxManage.exe: error: Details: code E_NOINTERFACE (0x80004002), component VirtualBoxClientWrap, interface IVirtualBoxClient

2021-11-18 00:13:47 (17692): Detected: VirtualBox VboxManage Interface (Version: 6.1.12)
2021-11-18 00:13:53 (17692): Error in host info for VM: -2147467262
Command:
VBoxManage -q list hostinfo
Output:
VBoxManage.exe: error: Failed to create the VirtualBox object!
VBoxManage.exe: error: Completely failed to instantiate CLSID_VirtualBox: E_NOINTERFACE
VBoxManage.exe: error: Details: code E_NOINTERFACE (0x80004002), component VirtualBoxClientWrap, interface IVirtualBoxClient

2021-11-18 00:13:53 (17692): WARNING: Communication with VM Hypervisor failed.
2021-11-18 00:13:53 (17692): ERROR: VBoxManage list hostinfo failed
00:13:53 (17692): called boinc_finish(1)

</stderr_txt>
]]>
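
The two numbers in this log are the same error written in two ways: vboxwrapper prints the code as a signed 32-bit integer (-2147467262), while VBoxManage prints it as a hexadecimal HRESULT (0x80004002, E_NOINTERFACE). A minimal Python sketch of the conversion follows; the helper name `to_hresult` is made up for this example.

```python
def to_hresult(signed_code):
    """Reinterpret a signed 32-bit integer as an unsigned hex HRESULT string."""
    # Masking with 0xFFFFFFFF maps negative values onto their
    # two's-complement 32-bit representation.
    return "0x{:08X}".format(signed_code & 0xFFFFFFFF)


print(to_hresult(-2147467262))  # 0x80004002
```

This makes it easier to match the "Error in ... for VM" lines with the VBoxManage output that follows them.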
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2531
Credit: 253,722,201
RAC: 41,981
Message 45714 - Posted: 18 Nov 2021, 7:14:43 UTC - in response to Message 45713.  

Your computer details page claims you are running VirtualBox 6.0.14.
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10680644
Your task logfiles claim you are running VirtualBox 6.1.12.

The suggestion would be to upgrade to 6.1.26 directly from VirtualBox:
https://www.virtualbox.org/wiki/Download_Old_Builds_6_1
I read somewhere that the more recent 6.1.28 might have a bug on Windows, but I can't find that source at the moment.


Ensure all keys from older versions are removed from the Windows registry.
The computer should also be rebooted.


This is from one of your logfiles:
2021-11-16 16:41:36 (18236): Setting CPU throttle for VM. (45%)

The CPU throttle should not be set that low; it is best left at 100%.
The reason is that throttling makes the OS force the complete VM plus vboxwrapper to a stop, even while timing-critical processes are running.
Sooner or later this leads to errors.
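
To spot this setting in other task logs, a small Python sketch like the following could be used. It assumes the throttle line always has the form shown above; the function name `throttle_percent` is hypothetical.

```python
import re

# Assumption: vboxwrapper reports the throttle as "(NN%)" on this line.
THROTTLE_RE = re.compile(r"Setting CPU throttle for VM\. \((\d+)%\)")


def throttle_percent(line):
    """Return the throttle percentage found in a log line, or None."""
    match = THROTTLE_RE.search(line)
    return int(match.group(1)) if match else None


line = "2021-11-16 16:41:36 (18236): Setting CPU throttle for VM. (45%)"
pct = throttle_percent(line)
if pct is not None and pct < 100:
    print(f"CPU throttle is {pct}%; VM tasks may run into timing errors")
```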
maeax

Joined: 2 May 07
Posts: 2242
Credit: 173,901,170
RAC: 2,804
Message 45715 - Posted: 18 Nov 2021, 7:21:57 UTC

2021-11-18 00:13:47 (17692): Error in guest additions for VM: -2147467262
Have you installed the extension pack for your VirtualBox version?
Is Hyper-V active? It must be disabled.
David M

Joined: 2 Dec 13
Posts: 5
Credit: 3,494,287
RAC: 0
Message 45723 - Posted: 18 Nov 2021, 19:29:49 UTC - in response to Message 45715.  

I'm running Win11 on an AMD processor. I believe Hyper-V is Intel specific, and I never install anything but what is in the BOINC releases. All has worked fine for about a decade.
David M

Joined: 2 Dec 13
Posts: 5
Credit: 3,494,287
RAC: 0
Message 45724 - Posted: 18 Nov 2021, 19:36:19 UTC - in response to Message 45714.  

I'll try the download from VirtualBox but I haven't had good luck with those installs when I have tried in the past. BOINC doesn't seem to know how to communicate with it.

On the CPU throttling, I found that some projects (Rosetta, particularly) were causing my CPU to heat to close to maximum acceptable temperature which was unacceptable to me and this has been my solution for the last few months. I should probably revisit my cooling design, but have not had another round of time for that. It's been working fine for the last few months (since May, I believe).
maeax

Joined: 2 May 07
Posts: 2242
Credit: 173,901,170
RAC: 2,804
Message 45725 - Posted: 18 Nov 2021, 19:53:40 UTC - in response to Message 45723.  

I'm running Win11 on an AMD processor. I believe Hyper-V is Intel specific, and I never install anything but what is in the BOINC releases. All has worked fine for about a decade.

Hyper-V is a virtualisation feature of Windows.
The stability of the hardware comes first.
I hope you find the solution.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2531
Credit: 253,722,201
RAC: 41,981
Message 45726 - Posted: 18 Nov 2021, 19:59:39 UTC - in response to Message 45724.  

I'll try the download from VirtualBox but I haven't had good luck with those installs when I have tried in the past. BOINC doesn't seem to know how to communicate with it.

Whatever you use, it's important to keep the version numbers in sync.
As long as the environment and the registry report different version numbers, this will sooner or later cause problems.
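
As a rough illustration of such a consistency check, the Python sketch below compares the VirtualBox version found in a task log with the version shown on the computer details page. The log line format is taken from the logs in this thread; the function names and the idea of typing in the details-page version by hand are assumptions.

```python
import re

# Assumption: vboxwrapper always reports the version in this form.
VERSION_RE = re.compile(r"VirtualBox VboxManage Interface \(Version: ([\d.]+)\)")


def logged_version(stderr_text):
    """Return the VirtualBox version reported in a task log, or None."""
    match = VERSION_RE.search(stderr_text)
    return match.group(1) if match else None


def versions_in_sync(details_page_version, stderr_text):
    """True only if the task log reports the same version as the details page."""
    return logged_version(stderr_text) == details_page_version


log = "2021-11-18 00:13:47 (17692): Detected: VirtualBox VboxManage Interface (Version: 6.1.12)"
print(logged_version(log))              # 6.1.12
print(versions_in_sync("6.0.14", log))  # False
```

With the values from this thread (6.0.14 on the details page, 6.1.12 in the log) the check reports a mismatch.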



I believe Hyper-V is Intel specific

It's a Microsoft hypervisor included in Windows.



On the CPU throttling ... were causing my CPU to heat to close to maximum acceptable temperature

Indeed, if you have temperature trouble whenever the throttle is set above 45%, you really should revisit the cooling design.

As a temporary workaround you may set <ncpus>7</ncpus> in BOINC's cc_config.xml and leave the CPU throttle at 100%.
This results in an average CPU load of 43.75% but leaves the timing of VM tasks at 100%.
Regarding cc_config.xml see:
https://boinc.berkeley.edu/wiki/Client_configuration
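
The arithmetic behind the 43.75% figure, plus a minimal cc_config.xml, can be sketched in Python as below. The total of 16 logical CPUs is an assumption inferred from 7/16 = 43.75%; it is not stated in the thread. Placing `<ncpus>` inside `<options>` follows the client configuration page linked above. The function names are made up for this example.

```python
import xml.etree.ElementTree as ET


def average_load(ncpus, total_threads):
    """Average CPU load in percent when BOINC uses ncpus of total_threads."""
    return 100.0 * ncpus / total_threads


def build_cc_config(ncpus):
    """Return a minimal cc_config.xml document that sets only <ncpus>."""
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(ncpus)
    return ET.tostring(root, encoding="unicode")


# Assumption: the host has 16 logical CPUs.
print(average_load(7, 16))  # 43.75
print(build_cc_config(7))
# <cc_config><options><ncpus>7</ncpus></options></cc_config>
```

An existing cc_config.xml may already contain other options, so in practice the `<ncpus>` line would be added to that file by hand instead of overwriting it with this output.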
David M

Joined: 2 Dec 13
Posts: 5
Credit: 3,494,287
RAC: 0
Message 45731 - Posted: 22 Nov 2021, 5:59:36 UTC

I downloaded/updated VirtualBox to 6.1.26 and that appears to have solved my problem. You are certainly correct about needing all pointers to be consistent. And thank you for the correction on Hyper-V.

I have used preferences to play with the number of available threads and the percentage of processing time. At the moment (and for the last 2 months or so) I have ended up with 8 CPU threads plus a GPU running at 45%. This is the largest proportion of processing power that has given stable temps across all projects, and it has not shown any issues, at least as far as I have noticed while monitoring production across projects. Raising the runtime percentage consistently leads to temperature elevations in Rosetta and WCG. I assume they pound the processor, which causes the problem. This is my first AMD build, so it's all new to me, but I have already spent my time limit on troubleshooting the temps issue and my current solution seems to have been working for the last 6 months. Should I notice it failing, I'll revisit the issue. This is a secondary app on the box, so as long as it works reliably and consistently, it's not worth another 6-10 hours for an improved resolution.

Thanks for your help in getting LHC and Cosmology@Home back up and running. They can now rejoin Einstein, Rosetta and Climate (when they have processing). At least I can get a few more results for those projects.
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 45732 - Posted: 22 Nov 2021, 10:57:28 UTC - in response to Message 45731.  
Last modified: 22 Nov 2021, 11:25:22 UTC

I downloaded/updated VirtualBox to 6.1.26 and that appears to have solved my problem.

You may have jumped from the frying pan into the fire. After a while, you will probably see "Vm job unmanageable" suspensions where the tasks don't run.
If so, you could try downgrading to VirtualBox 5.2.44. At least it works on Windows 10. It may not be compatible with Windows 11.
https://www.virtualbox.org/wiki/Download_Old_Builds_5_2

Otherwise, you have to wait about a day for the tasks to start running again, or else you have to reboot.

PS - LHC is much better than most. I have seen it more often on Cosmology and some others.
David M

Joined: 2 Dec 13
Posts: 5
Credit: 3,494,287
RAC: 0
Message 45739 - Posted: 22 Nov 2021, 19:18:17 UTC - in response to Message 45732.  

I have occasionally seen "Job Unmanageable" messages, but they usually occurred when there were reboots before the VBox app had time to close the jobs and settle down. Hopefully, I won't find new ones occurring. I believe that the current BOINC package includes VBox 6.1.12, so hopefully a downgrade to a Vbox 5 version won't be necessary. Let's hope it all continues well.
maeax

Joined: 2 May 07
Posts: 2242
Credit: 173,901,170
RAC: 2,804
Message 45740 - Posted: 22 Nov 2021, 19:24:41 UTC - in response to Message 45739.  

Yes, I also did not go back to 5.2.44.
I have now upgraded to Win11 Pro and also live with some unmanageable tasks (1 or 2 a day).
maeax

Joined: 2 May 07
Posts: 2242
Credit: 173,901,170
RAC: 2,804
Message 45742 - Posted: 24 Nov 2021, 1:15:03 UTC - in response to Message 45740.  
Last modified: 24 Nov 2021, 2:08:52 UTC

The VirtualBox app cannot be started again:
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10548292
2021-11-24 01:56:31 (17572): Detected: vboxwrapper 26197
2021-11-24 01:56:31 (17572): Detected: BOINC client v7.7
2021-11-24 01:56:32 (17572): Detected: VirtualBox VboxManage Interface (Version: 6.1.12)
2021-11-24 01:56:32 (17572): Successfully copied 'init_data.xml' to the shared directory.
2021-11-24 01:57:19 (17572): Error: Timeout
2021-11-24 01:57:19 (17572): ERROR: VM failed to start
2021-11-24 01:57:24 (17572):
BOINC will be notified that it needs to clean up the environment.
This is a temporary problem and so this job will be rescheduled for another time.

The OS is Win11 Pro.
Edit: Today I will upgrade VirtualBox from 6.1.28 to 6.1.30 and update the Win11 Pro OS.
maeax

Joined: 2 May 07
Posts: 2242
Credit: 173,901,170
RAC: 2,804
Message 45743 - Posted: 24 Nov 2021, 10:20:26 UTC - in response to Message 45742.  

The error was caused by a yellow triangle on a disrupted ATLAS task.
After deleting this vm_image.vdi in the VirtualBox app, other ATLAS tasks started fine.
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1417
Credit: 9,441,051
RAC: 885
Message 45744 - Posted: 24 Nov 2021, 10:26:13 UTC - in response to Message 45742.  

Is this on your RAM-disk BOINC installation, and does it only happen with ATLAS jobs?

Btw: It would be good if the LHC project used the same newest vboxwrapper for ATLAS and Theory as it does for CMS.
maeax

Joined: 2 May 07
Posts: 2242
Credit: 173,901,170
RAC: 2,804
Message 45745 - Posted: 24 Nov 2021, 11:11:44 UTC - in response to Message 45744.  

No, Crystal,
I have no RAM disk in use. ATLAS is running alongside WCG MCM tasks and Einstein GPU tasks.
This crash of the ATLAS tasks was at 22 UTC.
At 2 UTC ;-) I saw some network disconnects on this PC and have no explanation for them.
Another PC (RYZENMP) also had two vm_image.vdi entries with a yellow triangle this morning (after some sleep).
Yes, it would be a good idea to use the best vboxwrapper for ALL VirtualBox tasks at CERN.