Message boards : Number crunching : computation error

Fenshome

Joined: 1 Mar 15
Posts: 14
Credit: 833,756
RAC: 0
Message 46255 - Posted: 16 Feb 2022, 20:38:42 UTC
Last modified: 16 Feb 2022, 20:40:47 UTC

Hi

I had to rebuild my PC, and since then LHC tasks run for 10 seconds and then report a computation error. The Event Log has no useful info. I've reinstalled the VM software too. Can you point me to a fix?
ID: 46255
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 2099
Credit: 161,782,826
RAC: 131,719
Message 46256 - Posted: 16 Feb 2022, 21:00:21 UTC - in response to Message 46255.  

I had to rebuild my PC, ... then says computation error. The Event Log has no useful info.

The log clearly points out what's wrong.
LHC VMs require hardware virtualization. Either it is switched off in your BIOS, or you did not completely disable Hyper-V (and all related components).
VBoxManage.exe: error: Not in a hypervisor partition (HVP=0) (VERR_NEM_NOT_AVAILABLE).
VBoxManage.exe: error: VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED)
ID: 46256
metalius
Joined: 3 Oct 06
Posts: 101
Credit: 8,896,547
RAC: 48
Message 46306 - Posted: 21 Feb 2022, 11:17:44 UTC - in response to Message 46255.  

VBox tasks of all types are ending with "Computation Error".
Any ideas or useful links?
ID: 46306
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1143
Credit: 6,989,294
RAC: 1,081
Message 46307 - Posted: 21 Feb 2022, 12:05:37 UTC - in response to Message 46306.  

VBox tasks of all types are ending with "Computation Error".
Any ideas or useful links?
In your result:

VBoxManage.exe: error: Not in a hypervisor partition (HVP=0) (VERR_NEM_NOT_AVAILABLE).
VBoxManage.exe: error: AMD-V is disabled in the BIOS (or by the host OS) (VERR_SVM_DISABLED)
ID: 46307
Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 277
Credit: 11,518,630
RAC: 2,290
Message 46308 - Posted: 21 Feb 2022, 13:44:08 UTC

With WCG shutting down for a month or more, I have moved back to LHC, but I am getting similar problems on my 2 laptops, which previously worked faultlessly.
The VM is created but instantly fails on start, and I get the same "VT-x is disabled" message. I have checked that it is enabled in the BIOS, and LeoMoon gives 2 ticks.
I had updated VBox so will revert to the previous version in case that is to blame.
There have also been a couple of Windows updates so one of them may be the culprit.
ID: 46308
maeax

Joined: 2 May 07
Posts: 1663
Credit: 94,399,269
RAC: 316,588
Message 46309 - Posted: 21 Feb 2022, 15:09:32 UTC - in response to Message 46308.  
Last modified: 21 Feb 2022, 15:10:59 UTC

The good news is that WCG is starting again on the 28th of this month with a stress test.
Yes, I have also spent a few days getting LHC tasks to run better.
Your PC shows no active VT-x, whatever the reason is.
ID: 46309
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1143
Credit: 6,989,294
RAC: 1,081
Message 46310 - Posted: 21 Feb 2022, 15:09:46 UTC - in response to Message 46308.  
Last modified: 21 Feb 2022, 15:11:33 UTC

@Ray: Is <p_vm_extensions_disabled> set to 1 (yes) in client_state.xml?
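
A minimal Python sketch for checking that flag without opening the file by hand. It assumes the BOINC data directory is at the Windows default, C:\ProgramData\BOINC (adjust the path otherwise). The helper names are made up for illustration and are not part of BOINC.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: default BOINC data directory on Windows; change it for your install.
CLIENT_STATE = Path(r"C:\ProgramData\BOINC\client_state.xml")

# Matches the flag element and captures its numeric value.
_FLAG = re.compile(
    r"<p_vm_extensions_disabled>\s*(\d+)\s*</p_vm_extensions_disabled>"
)


def vm_extensions_disabled(state_text: str) -> Optional[bool]:
    """Return True if the flag is non-zero, False if it is 0, None if absent."""
    match = _FLAG.search(state_text)
    if match is None:
        return None
    return match.group(1) != "0"


def read_flag(path: Path = CLIENT_STATE) -> Optional[bool]:
    """Read client_state.xml from disk and report the flag (hypothetical helper)."""
    return vm_extensions_disabled(path.read_text(errors="replace"))
```

Calling read_flag() on the affected host should return False when the element holds 0, which is the value wanted here.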
ID: 46310
Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 277
Credit: 11,518,630
RAC: 2,290
Message 46311 - Posted: 21 Feb 2022, 15:15:35 UTC - in response to Message 46310.  
Last modified: 21 Feb 2022, 15:19:18 UTC

I didn't check that, as there should have been no reason for it to change, but I'll look when I get home and set it to 0 again if it has changed.
Thanks
ID: 46311
metalius
Joined: 3 Oct 06
Posts: 101
Credit: 8,896,547
RAC: 48
Message 46313 - Posted: 21 Feb 2022, 21:53:33 UTC - in response to Message 46307.  

In your result:
VBoxManage.exe: error: Not in a hypervisor partition (HVP=0) (VERR_NEM_NOT_AVAILABLE).
VBoxManage.exe: error: AMD-V is disabled in the BIOS (or by the host OS) (VERR_SVM_DISABLED)

So, my machine is not supported... Right?
ID: 46313
Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 277
Credit: 11,518,630
RAC: 2,290
Message 46314 - Posted: 21 Feb 2022, 22:41:50 UTC
Last modified: 21 Feb 2022, 23:33:34 UTC

No joy.
Virtualisation is enabled in the BIOS, and Task Manager and CPU-V both confirm that.
<p_vm_extensions_disabled> is 0 (unchanged).
There is no Hyper-V, as both hosts are Win10 Home, but I have made sure that Virtual Machine Platform and Windows Hypervisor Platform are both unchecked.

I'm currently doing a System Restore on the other host to see if removing the last 2 Windows updates will resolve the problem, but it is taking ages. I might have to leave it running overnight, so I won't know if it's fixed until tomorrow. If that does work, I'll get an Update Blocker, as neither machine is compatible with Win11, so I'll run them as they are until they die.

The Restore didn't fix it so I don't know what's going on.
Hopefully the new XTrack, Sixtrack upgrade won't be far away and my machines can concentrate on that instead.
ID: 46314
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1143
Credit: 6,989,294
RAC: 1,081
Message 46315 - Posted: 22 Feb 2022, 7:07:57 UTC - in response to Message 46314.  

The Restore didn't fix it so I don't know what's going on.
@Ray: Could you try one CMS task? CMS is using the newest vboxwrapper. I don't think it will help, but it's better to have tested that.
ID: 46315
Erich56

Joined: 18 Dec 15
Posts: 1542
Credit: 52,016,061
RAC: 37,169
Message 46316 - Posted: 22 Feb 2022, 7:57:47 UTC - in response to Message 46309.  
Last modified: 22 Feb 2022, 7:58:54 UTC

maeax wrote:
The good news is that WCG is starting again on the 28th of this month with a stress test.
maeax, are you sure? They are supposed to be closed down until about mid-April, being in the final phase of the transition from IBM to Krembil.
ID: 46316
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1143
Credit: 6,989,294
RAC: 1,081
Message 46317 - Posted: 22 Feb 2022, 9:08:13 UTC - in response to Message 46316.  

maeax wrote:
The good news is that WCG is starting again on the 28th of this month with a stress test.
maeax, are you sure? They are supposed to be closed down until about mid-April, being in the final phase of the transition from IBM to Krembil.
WCG wrote:
Ensuring a smooth and safe transition is our first priority.
As a result, we will need to pause new work units on February 14, to gracefully shut down all computation by February 28.
We will then backup and stress test the new system before proceeding into the final phase of the transition.


I'm not sure how that stress test is performed, or whether BOINC users, hosts and data are involved.
ID: 46317
Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 277
Credit: 11,518,630
RAC: 2,290
Message 46318 - Posted: 22 Feb 2022, 13:29:40 UTC - in response to Message 46315.  
Last modified: 22 Feb 2022, 13:48:53 UTC

Same result, sadly.
It gets as far as creating the VM, and all looks well until it starts the VM, instantly stops it again, and throws the error.
So for some reason, as yet unknown, virtualization is not being seen as available whereas it IS shown as available and active in Task Manager and CPU-V.

Thanks for the PM, Samson.
I suspended the CMS task during download so that I could resume it when I had a chance to observe what it would do but, alas, it didn't go well. I also ran it on its own so there would be no contention for memory or processor usage.

I skived off work to do that test, so I'd better get back to that and will investigate further this evening.

I am puzzled that both hosts exhibit the same symptoms, having previously been working fine.
ID: 46318
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 2099
Credit: 161,782,826
RAC: 131,719
Message 46319 - Posted: 22 Feb 2022, 13:53:04 UTC - in response to Message 46318.  

Hi Ray,

This is from your logfile here:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=345405352
2022-02-22 12:56:37 (4696): 
Command: VBoxManage -q startvm "boinc_bcabce4bc5f50f29" --type headless
Exit Code: -2147467259
Output:
Waiting for VM "boinc_bcabce4bc5f50f29" to power on...
VBoxManage.exe: error: Not in a hypervisor partition (HVP=0) (VERR_NEM_NOT_AVAILABLE).
VBoxManage.exe: error: VT-x is disabled in the BIOS for all CPU modes (VERR_VMX_MSR_ALL_VMX_DISABLED)
VBoxManage.exe: error: Details: code E_FAIL (0x80004005), component ConsoleWrap, interface IConsole



BOINC correctly copies all it needs to the worker slot and vboxwrapper correctly configures a VM.
The failure happens when vboxwrapper starts the VM via VBoxManage.


Have you searched for the vbox error messages given there?
VERR_NEM_NOT_AVAILABLE
VERR_VMX_MSR_ALL_VMX_DISABLED
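
As a rough aid for that search, here is a minimal Python sketch that pulls the VERR_ codes out of a pasted log so they can be looked up one by one. The function name and the small lookup table are illustrative assumptions; the descriptions are simply the message texts quoted in this thread.

```python
import re
from typing import List

# Message texts as they appear in the logs quoted in this thread.
# The table is an illustrative assumption, not an official VirtualBox list.
KNOWN_MESSAGES = {
    "VERR_NEM_NOT_AVAILABLE": "Not in a hypervisor partition (HVP=0)",
    "VERR_VMX_MSR_ALL_VMX_DISABLED": "VT-x is disabled in the BIOS for all CPU modes",
    "VERR_SVM_DISABLED": "AMD-V is disabled in the BIOS (or by the host OS)",
}

_VERR = re.compile(r"\bVERR_[A-Z0-9_]+\b")


def extract_verr_codes(log_text: str) -> List[str]:
    """Return each VERR_ code once, in order of first appearance."""
    return list(dict.fromkeys(_VERR.findall(log_text)))


def describe(log_text: str) -> List[str]:
    """Pair every code found with its known message text, if any."""
    return [
        f"{code}: {KNOWN_MESSAGES.get(code, 'not in the table, search for it')}"
        for code in extract_verr_codes(log_text)
    ]
```

Feeding it the stderr section of a failed result gives a short list of search terms instead of the whole log.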

There are lots of posts about them all over the internet, such as:
https://techsupportwhale.com/not-in-a-hypervisor-partition/
https://stackoverflow.com/questions/33304393/vt-x-is-disabled-in-the-bios-for-both-all-cpu-modes-verr-vmx-msr-all-vmx-disabl

Have you tried starting VirtualBox Manager and creating/starting a (dummy, diskless) VM using the same user account you use for BOINC?
This sometimes shows an error message you don't see when you (or BOINC) start a headless VM.
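
For anyone who prefers the command line, the same check can be sketched with VBoxManage, assuming it is on the PATH and the script is run under the account BOINC uses. The VM name boinc_vtx_probe and the helper functions are hypothetical; createvm, startvm and unregistervm are standard VBoxManage subcommands.

```python
import subprocess
from typing import List

PROBE_NAME = "boinc_vtx_probe"  # hypothetical name for the throwaway VM


def dummy_vm_commands(name: str = PROBE_NAME, frontend: str = "gui") -> List[List[str]]:
    """Commands that register an empty, diskless VM and then try to start it."""
    return [
        ["VBoxManage", "createvm", "--name", name, "--register"],
        ["VBoxManage", "startvm", name, "--type", frontend],
    ]


def cleanup_command(name: str = PROBE_NAME) -> List[str]:
    """Command that unregisters the throwaway VM and deletes its files."""
    return ["VBoxManage", "unregistervm", name, "--delete"]


def run(cmd: List[str]) -> subprocess.CompletedProcess:
    """Run one command and capture its output instead of raising on failure."""
    return subprocess.run(cmd, capture_output=True, text=True)
```

A typical session would call run() on each entry of dummy_vm_commands(), read stderr for the VERR_ messages, power the VM off if it did start, and then run cleanup_command(). Using the "gui" frontend rather than "headless" is the point of the exercise, since a headless start would probably just repeat what BOINC already reports.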
ID: 46319
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1143
Credit: 6,989,294
RAC: 1,081
Message 46320 - Posted: 22 Feb 2022, 14:38:42 UTC

Another simple reason from the past comes to mind.
Did you reinstall BOINC lately, and maybe choose the service install?
ID: 46320
Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 277
Credit: 11,518,630
RAC: 2,290
Message 46321 - Posted: 22 Feb 2022, 20:55:58 UTC
Last modified: 22 Feb 2022, 21:07:40 UTC

I think I'm getting sucked further down the rabbit hole, now.

Virtualization is definitely enabled in the BIOS and confirmed in Task Manager and CPU-V, but for some unknown reason it is not being detected by VBox.
BOINC was last updated in November, so it's not that, and I downgraded VBox to 6.1.30 in case that was the issue, even though earlier tasks had worked with 6.1.32.

In Turn Windows Features on/off, checking Virtual Machine Platform and then rebooting to get that to stick allows an empty VM to boot up after attaching the Guest Additions ISO. However, in that configuration BOINC says No Tasks available. After turning that back off and rebooting, the empty VM fails on start as before, BUT BOINC gets a task, which promptly fails as before. Bizarre behaviour.

Thanks for the input, guys, but my brain's getting a bit frazzled and I'm starting to forget what I did or didn't do, so I'm going to give up for now and revisit it on Sunday when I will have more time to investigate.
ID: 46321
maeax

Joined: 2 May 07
Posts: 1663
Credit: 94,399,269
RAC: 316,588
Message 46322 - Posted: 23 Feb 2022, 5:56:26 UTC

Press the Windows key and open Settings
Choose Update & Security
Select Recovery
Under Advanced startup, click Restart now
Troubleshoot > Advanced options > UEFI Firmware Settings
Click Restart
Activate virtualisation (on AMD this is the SVM parameter for AMD-V; on Intel it is VT-x)
I hope it works when you have time on Sunday.
ID: 46322
Peter Hucker

Joined: 12 Aug 06
Posts: 294
Credit: 2,100,100
RAC: 2,891
Message 46328 - Posted: 24 Feb 2022, 0:37:34 UTC
Last modified: 24 Feb 2022, 1:31:23 UTC

I'm getting the same problem on 6 of my 7 machines with CMS and ATLAS. I have none of the problems suggested earlier. The only things that have changed on these machines since I last ran LHC OK are BOINC being upgraded and the OS changing from Windows 10 to Windows 11. I have checked that it didn't turn Hyper-V on.

Rosetta Python VB and LHC are running OK on only 1 machine. The only difference I can think of is that this is the only one with TPM, "required" for Windows 11. On the other 6 I bypassed that check to force Windows 11 to install. I'm not sure why that would affect VB.

These are my machines:
https://lhcathome.cern.ch/lhcathome/hosts_user.php?sort=rpc_time&rev=0&show_all=0&userid=55945

The only one that works is ID 10772275 (the i5).
ID: 46328
Peter Hucker

Joined: 12 Aug 06
Posts: 294
Credit: 2,100,100
RAC: 2,891
Message 46329 - Posted: 24 Feb 2022, 1:49:22 UTC - in response to Message 46328.  
Last modified: 24 Feb 2022, 2:13:13 UTC

Mind you, Rosetta had problems before I upgraded to Windows 11, so ignore that. They have different errors anyway.

And ignore the TPM thing too; I was wrong. The i5 running LHC OK does NOT have TPM; it's the Ryzen that has TPM. So I have no idea what makes only that one work. Also, my Ryzen 3900XT is running it OK (I hadn't tried that one before). So the difference is now that the two newest PCs run LHC OK. Has there been a change that means it doesn't like older processors?
ID: 46329


©2022 CERN