Message boards : Number crunching : All-out errors on LHC seemingly due to virtualbox.

BelgianEnthousiast

Joined: 5 Apr 15
Posts: 18
Credit: 5,910,849
RAC: 0
Message 34202 - Posted: 1 Feb 2018, 7:48:42 UTC

Hi,

I've been crunching LHC for ages now and everything worked fine,
until yesterday, when I enabled not just LHC but all the other sub-projects too.
This morning - to my horror - all tasks were listed as "FAILED".

When checking the log, I found the error messages below. I haven't got a clue what they refer to.
Can anyone help, please?

Big thanks !

31/01/2018 23:58:57 | LHC@home | Starting task Theory_17686_1517416129.405500_0
31/01/2018 23:59:01 | | Vbox app stderr indicates CPU VM extensions disabled
31/01/2018 23:59:02 | LHC@home | Computation for task Theory_17888_1497842526.551771_0 finished
31/01/2018 23:59:02 | LHC@home | Starting task LHCb_17627_1517416128.197336_0
31/01/2018 23:59:10 | LHC@home | Computation for task w15_ats2017_b1_qp_2_ats2017_b1_QP_2_IOCT_24__52__s__64.27_59.295__11_13__5__7.5_1_sixvf_boinc3983_0 finished
31/01/2018 23:59:10 | LHC@home | Starting task Theory_17820_1497842525.300976_0
31/01/2018 23:59:13 | LHC@home | Started upload of w15_ats2017_b1_qp_2_ats2017_b1_QP_2_IOCT_24__52__s__64.27_59.295__11_13__5__7.5_1_sixvf_boinc3983_0_r1915472003_0
31/01/2018 23:59:16 | | Vbox app stderr indicates CPU VM extensions disabled
31/01/2018 23:59:16 | LHC@home | Computation for task Theory_17685_1517416129.384875_0 finished
31/01/2018 23:59:16 | LHC@home | Starting task CMS_17712_1517416129.756855_0
31/01/2018 23:59:18 | | Vbox app stderr indicates CPU VM extensions disabled
31/01/2018 23:59:18 | LHC@home | Computation for task LHCb_29353_1517417334.615865_0 finished
31/01/2018 23:59:18 | LHC@home | Starting task Theory_17821_1497842525.321976_0
31/01/2018 23:59:20 | LHC@home | Finished upload of w15_ats2017_b1_qp_2_ats2017_b1_QP_2_IOCT_24__52__s__64.27_59.295__11_13__5__7.5_1_sixvf_boinc3983_0_r1915472003_0
31/01/2018 23:59:23 | | Vbox app stderr indicates CPU VM extensions disabled
31/01/2018 23:59:23 | LHC@home | Computation for task Theory_574_1517417637.429044_0 finished
31/01/2018 23:59:23 | LHC@home | Starting task CMS_29450_1517417336.133922_0
31/01/2018 23:59:34 | | Vbox app stderr indicates CPU VM extensions disabled
31/01/2018 23:59:35 | LHC@home | Computation for task Theory_17686_1517416129.405500_0 finished
31/01/2018 23:59:35 | LHC@home | Starting task LHCb_17643_1517416128.494939_0
31/01/2018 23:59:39 | | Vbox app stderr indicates CPU VM extensions disabled
31/01/2018 23:59:40 | LHC@home | Computation for task LHCb_17627_1517416128.197336_0 finished
31/01/2018 23:59:40 | LHC@home | Starting task Theory_17691_1517416129.506237_0
31/01/2018 23:59:44 | | Vbox app stderr indicates CPU VM extensions disabled
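
For anyone scanning a long event log for the same symptom, here is a minimal Python sketch. It assumes the log was saved as plain text in the `timestamp | project | message` layout of the lines above; the function name `count_vm_extension_errors` is hypothetical and not part of BOINC.

```python
def count_vm_extension_errors(log_text: str) -> int:
    """Count event-log lines that report disabled CPU VM extensions."""
    marker = "CPU VM extensions disabled"
    count = 0
    for line in log_text.splitlines():
        # Assumed layout: "timestamp | project | message".
        # maxsplit=2 keeps any "|" inside the message itself intact.
        parts = [p.strip() for p in line.split("|", 2)]
        if len(parts) == 3 and marker in parts[2]:
            count += 1
    return count


sample = "\n".join([
    "31/01/2018 23:58:57 | LHC@home | Starting task Theory_17686_1517416129.405500_0",
    "31/01/2018 23:59:01 | | Vbox app stderr indicates CPU VM extensions disabled",
    "31/01/2018 23:59:02 | LHC@home | Computation for task Theory_17888_1497842526.551771_0 finished",
])
print(count_vm_extension_errors(sample))  # prints 1
```

A non-zero count points at the virtualization setting rather than at any individual task.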
marmot

Joined: 5 Nov 15
Posts: 144
Credit: 6,301,268
RAC: 0
Message 34203 - Posted: 1 Feb 2018, 7:59:02 UTC - in response to Message 34202.  

That error seems to indicate that the extension pack isn't installed.
Did you install the matching Virtual Box Extensions pack for your current version of VBox?
https://www.virtualbox.org/wiki/Downloads

It could also be that the CPU virtualization features are not turned on in your computer's BIOS.
Check the BIOS advanced features.
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,336,573
RAC: 102,088
Message 34205 - Posted: 1 Feb 2018, 8:30:01 UTC - in response to Message 34203.  

That error seems to indicate that the extension pack isn't installed.
The VM extension pack is not required for the VM to run properly.

It could also be that the CPU virtualization features are not turned on in your computer's BIOS. Check the BIOS advanced features.
This is exactly what I suspect.
Yeti
Volunteer moderator

Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 34210 - Posted: 1 Feb 2018, 12:16:28 UTC
Last modified: 1 Feb 2018, 12:17:47 UTC

VirtualBox does not seem to be able to use the VT-x features of your CPU.

One reason may be that VT-x got disabled in the BIOS (for example by a BIOS update), or that you installed or activated a different program that uses the VT-x features of the CPU.

You may want to work through my checklist: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161&postid=29359#29359


Supporting BOINC, a great concept !
BelgianEnthousiast

Joined: 5 Apr 15
Posts: 18
Credit: 5,910,849
RAC: 0
Message 34271 - Posted: 4 Feb 2018, 17:03:50 UTC - in response to Message 34210.  

Hi All, Yeti,

Big thanks for your very quick and indeed very precise support !

I followed your checklist and it indeed turned out that, for some reason, virtualisation had been disabled.
As you mentioned, I suspect it was disabled by a BIOS update which I performed just before year end.

The "<p_vm_extensions_disabled>1</p_vm_extensions_disabled>" entry was indeed also at "1", so I put that back to "0".
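
A minimal Python sketch of how that flag could be checked programmatically. The `<host_info>` wrapper in the sample fragment and the function name `vm_extensions_disabled` are assumptions for illustration; only the `<p_vm_extensions_disabled>` element itself is taken from this post.

```python
import xml.etree.ElementTree as ET


def vm_extensions_disabled(xml_text: str) -> bool:
    """Return True if the XML marks the CPU VM extensions as disabled."""
    root = ET.fromstring(xml_text)
    # iter() also matches the root element itself, unlike find(".//tag").
    node = next(root.iter("p_vm_extensions_disabled"), None)
    # A missing element is treated as "not flagged" (an assumption).
    return node is not None and (node.text or "").strip() == "1"


# Hypothetical fragment; the wrapper element is assumed.
flagged = "<host_info><p_vm_extensions_disabled>1</p_vm_extensions_disabled></host_info>"
cleared = "<host_info><p_vm_extensions_disabled>0</p_vm_extensions_disabled></host_info>"

print(vm_extensions_disabled(flagged))
print(vm_extensions_disabled(cleared))
```

The first call reports the flag as set and the second as cleared, which matches the before and after state described here.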

Now everything works like a charm !

Maybe a quick side question : does 17 hours as crunch time seem correct to you for CMS and Theory WU's ?

Traditionally, I saw rather 2.5 to max. 7 à 7 hours for LHC or ATLAS WU's...

Again thanks for your support !

Have a nice Sunday afternoon :-)
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,336,573
RAC: 102,088
Message 34272 - Posted: 4 Feb 2018, 17:26:26 UTC - in response to Message 34271.  

Maybe a quick side question : does 17 hours as crunch time seem correct to you for CMS and Theory WU's ?
It depends a lot on the speed of your processor and some other components in your rig.
It also depends on whether all the processor cores are already fully loaded with other work.

With my i7-4930k @ 3.9GHz (6 cores + 6 HT), and use of 10 cores (i.e. about 86% of total processor usage), CMS and Theory WUs usually run for about 13 hours.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Joined: 15 Jun 08
Posts: 2386
Credit: 222,890,073
RAC: 138,318
Message 34274 - Posted: 4 Feb 2018, 17:39:22 UTC - in response to Message 34271.  

Maybe a quick side question : does 17 hours as crunch time seem correct to you for CMS and Theory WU's ?

Traditionally, I saw rather 2.5 to max. 7 à 7 hours for LHC or ATLAS WU's...

Theory, CMS and LHCb run subtasks with typical runtimes between a few minutes and 2 h.
When a subtask ends, the VM checks whether a total runtime of 12 h has been reached.
If so, the VM shuts down after a grace period of 10 minutes.
If 12 h have not yet been reached, the VM requests another subtask.

A couple of watchdogs ensure that
- a starting VM that gets no subtask shuts down after about 15 minutes.
- the VM shuts down after a total runtime of 18 h.
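
The rules above can be sketched as a small Python model. This is only an illustration under stated assumptions, not the actual VM scripts: the function name `vm_runtime_hours` is hypothetical, subtask durations are passed in as a plain list, and the 18 h watchdog is assumed to cut off a subtask that is still running.

```python
SOFT_LIMIT_H = 12.0              # checked each time a subtask ends
GRACE_MIN = 10.0                 # grace period before shutdown
NO_SUBTASK_TIMEOUT_MIN = 15.0    # watchdog: VM started but got no subtask
HARD_LIMIT_H = 18.0              # watchdog: absolute runtime cap


def vm_runtime_hours(subtask_hours: list[float]) -> float:
    """Estimate total VM runtime for a sequence of subtask durations."""
    if not subtask_hours:
        # A starting VM that gets no subtask shuts down after ~15 minutes.
        return NO_SUBTASK_TIMEOUT_MIN / 60.0
    elapsed = 0.0
    for duration in subtask_hours:
        if elapsed + duration >= HARD_LIMIT_H:
            # Assumption: the 18 h watchdog stops the VM mid-subtask.
            return HARD_LIMIT_H
        elapsed += duration
        if elapsed >= SOFT_LIMIT_H:
            # 12 h reached at a subtask boundary: grace period, then stop.
            return min(elapsed + GRACE_MIN / 60.0, HARD_LIMIT_H)
    # Assumption: if the list runs out early, just report the elapsed time.
    return elapsed


print(vm_runtime_hours([]))           # prints 0.25
print(vm_runtime_hours([2.0] * 10))   # stops after the sixth subtask
```

In this model a runtime well above 12 h only appears when the last subtask is still running long after the 12 h mark.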

LHCb may have problems with its watchdogs or with the generation of enough subtasks, as the VM runtimes are often very short.

The runtime of ATLAS tasks differs depending on the number of events your VM gets at startup.
Most common are 50 events, but 100 or 200 are also possible.
Each event runtime can be between a few minutes and 20 minutes.
Maximum total runtimes normally do not exceed 18 h on a single-core VM.
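
As a rough worked example of that arithmetic, here is a short Python sketch. It assumes every event takes the same time and that events run one after another on a single core; the function name `atlas_runtime_hours` is hypothetical.

```python
def atlas_runtime_hours(events: int, minutes_per_event: float) -> float:
    """Estimate single-core runtime as events times per-event time."""
    return events * minutes_per_event / 60.0


# 50 events at the 20-minute upper bound mentioned above.
print(round(atlas_runtime_hours(50, 20.0), 1))  # prints 16.7
```

Tasks with 100 or 200 events only stay within a similar total if the average time per event is well below that 20-minute upper bound.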
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,336,573
RAC: 102,088
Message 34275 - Posted: 4 Feb 2018, 17:48:57 UTC - in response to Message 34274.  

LHCb may have problems with its watchdogs or with the generation of enough subtasks, as the VM runtimes are often very short.
The same is true of CMS tasks if the WMAgent fails for one reason or another and hence does not deliver jobs (sub-tasks).
BelgianEnthousiast

Joined: 5 Apr 15
Posts: 18
Credit: 5,910,849
RAC: 0
Message 34276 - Posted: 4 Feb 2018, 20:16:49 UTC - in response to Message 34275.  

OK, that's clear. Thanks for the Sunday evening responses, guys ! Most appreciated !
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist

Joined: 29 Aug 05
Posts: 997
Credit: 6,264,307
RAC: 71
Message 34277 - Posted: 4 Feb 2018, 21:59:55 UTC - in response to Message 34271.  

Maybe a quick side question : does 17 hours as crunch time seem correct to you for CMS and Theory WU's ?

Last I heard, CMS stops tasks when the total jobs have exceeded 12 hours' CPU time; a hard limit then activates at (IIRC) 18 hours.
So 17 hours is not unreasonable.
