Message boards : News : Aborted Work Units

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 843
Credit: 1,578,195
RAC: 42
Message 31600 - Posted: 24 Jul 2017, 12:26:54 UTC

After deleting many really old results from 2013 until March 2017 (was meant to
be December 2016) it seems many Tasks have been aborted. A full analysis
and report will be posted. No action required by volunteers. Eric.
ID: 31600
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 843
Credit: 1,578,195
RAC: 42
Message 31602 - Posted: 24 Jul 2017, 16:20:51 UTC - in response to Message 31600.  

Apologies all round; it was MY fault. My MySQL query was missing
an AND clause on server_state=5... Sorry about that. Eric.
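
To illustrate the kind of slip described above, here is a minimal sketch using Python's built-in sqlite3 module rather than the project's MySQL server. The table layout, the rows, and the cutoff date are made up for illustration. The only detail taken from the post is the server_state=5 condition, which in BOINC's result table is understood to mark results that are over; treating 4 as "in progress" is likewise an assumption here.

```python
import sqlite3

# Hypothetical, heavily simplified stand-in for a BOINC "result" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE result (id INTEGER PRIMARY KEY, server_state INTEGER, create_time TEXT)"
)
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [
        (1, 5, "2013-05-01"),  # old and over: safe to purge
        (2, 5, "2016-11-30"),  # old and over: safe to purge
        (3, 4, "2017-02-15"),  # older than the cutoff but still in progress
        (4, 4, "2017-07-20"),  # recent and in progress
    ],
)

CUTOFF = "2017-03-01"  # illustrative cutoff, compared as an ISO date string
OVER = 5               # assumed meaning of server_state=5


def selected_ids(sql, params):
    """Return the ids matched by a query, in ascending order."""
    return [row[0] for row in conn.execute(sql, params)]


# Filtering on age alone also catches a task that is still running.
without_state_filter = selected_ids(
    "SELECT id FROM result WHERE create_time < ? ORDER BY id", (CUTOFF,)
)

# Adding the AND clause restricts the selection to finished results.
with_state_filter = selected_ids(
    "SELECT id FROM result WHERE create_time < ? AND server_state = ? ORDER BY id",
    (CUTOFF, OVER),
)

print(without_state_filter, with_state_filter)
```

In this toy data the query without the state filter also selects result 3, which is the sort of in-progress task that would then show up as aborted on a volunteer's machine.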
ID: 31602
Bill F

Joined: 2 Jun 07
Posts: 25
Credit: 1,105,015
RAC: 397
Message 31608 - Posted: 24 Jul 2017, 21:53:04 UTC - in response to Message 31602.  

Were the work units recycled back into the to-be-distributed stack for reassignment?

Bill F
ID: 31608
PFLIEGER Guy

Joined: 22 Jun 17
Posts: 12
Credit: 272,799
RAC: 0
Message 31609 - Posted: 24 Jul 2017, 22:30:39 UTC

I encountered another problem while crunching LHC@home:
Some WUs with more CPUs stopped after around 10 minutes. The messages were:
-waiting for memory
-reported: waiting for hypervisor....

So I abandoned these files because they couldn't run to the end like other WUs.
I regularly had this or similar problems when crunching for LHC@home.
I had SixTrack or ATLAS simulations.
Every ATLAS simulation failed.
ID: 31609
FurryGuy

Joined: 1 Aug 05
Posts: 2
Credit: 627,709
RAC: 632
Message 31610 - Posted: 24 Jul 2017, 22:42:09 UTC - in response to Message 31609.  

I encountered another problem while crunching LHC@home:
Some WUs with more CPUs stopped after around 10 minutes. The messages were:
-waiting for memory
-reported: waiting for hypervisor...

I have been getting that message more and more, and not just from LHC@Home. Simply let it go, and eventually the BOINC scheduler will run the job.

I saw one of my LHC@Home jobs get aborted; I never knew projects could do this remotely.
ID: 31610
PFLIEGER Guy

Joined: 22 Jun 17
Posts: 12
Credit: 272,799
RAC: 0
Message 31611 - Posted: 24 Jul 2017, 22:52:24 UTC - in response to Message 31610.  

I gave my answer based on what I observed on my screen, not to make anyone worry but to help the engineers understand the origin of the problem.
A little peace and love in this world, please, and let's avoid getting in each other's hair. Engineers usually don't have much hair left on their heads because they have enough problems to solve each day. Sometimes they use their own heads to stay quiet, if you understand what I mean!

Best Regards

Guy PFLIEGER
ID: 31611
EeqMC252

Joined: 27 Jul 05
Posts: 11
Credit: 2,700,982
RAC: 755
Message 31613 - Posted: 25 Jul 2017, 3:49:27 UTC - in response to Message 31610.  

I had the same problems. They would come and go but eventually became very severe, affecting 100% of the LHC tasks. I resolved my issues by modifying the "Disk and memory" settings under "Computing preferences" in BOINC: I increased the "Use no more than x GB" value and decreased the "Leave at least x GB free" value. My computer has worked perfectly since I allowed additional resources for BOINC projects.
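
As a rough picture of why those two values matter, here is a small sketch. It is a simplified model and not BOINC's actual code: it assumes the client may use the most restrictive of the "Use no more than" cap and whatever remains once "Leave at least ... free" is honoured. The function name and all numbers are hypothetical.

```python
def disk_allowance_gb(use_no_more_than_gb, leave_at_least_free_gb, free_gb, boinc_used_gb):
    """Simplified model: disk space (GB) BOINC may occupy under both limits."""
    # Limit 1: the absolute cap from "Use no more than x GB".
    by_cap = use_no_more_than_gb
    # Limit 2: current BOINC usage plus free space, minus the reserve
    # demanded by "Leave at least x GB free".
    by_free = boinc_used_gb + free_gb - leave_at_least_free_gb
    # The most restrictive limit wins; never report a negative allowance.
    return max(0, min(by_cap, by_free))


# Hypothetical machine: 8 GB free on disk, BOINC already using 4 GB.
before = disk_allowance_gb(10, 5, free_gb=8, boinc_used_gb=4)
after = disk_allowance_gb(50, 1, free_gb=8, boinc_used_gb=4)
print(before, after)
```

Under these made-up numbers, raising the cap and lowering the reserve moves the allowance from 7 GB to 11 GB, which matches the direction of the change described in the post.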
ID: 31613
PFLIEGER Guy

Joined: 22 Jun 17
Posts: 12
Credit: 272,799
RAC: 0
Message 31620 - Posted: 25 Jul 2017, 11:20:17 UTC

The internal error handling should record every parameter of the computing machine:
-software
-BOINC version
-state of the registers
-allocated memory
-disallowed operations
-state of the memory
-name of the WU
...

Guy PFLIEGER
ID: 31620
PFLIEGER Guy

Joined: 22 Jun 17
Posts: 12
Credit: 272,799
RAC: 0
Message 31621 - Posted: 25 Jul 2017, 11:23:34 UTC

At the moment I don't have a job, but I have ideas.

Best regards

Guy PFLIEGER
ID: 31621
maeax

Joined: 2 May 07
Posts: 436
Credit: 14,281,748
RAC: 20,381
Message 31622 - Posted: 25 Jul 2017, 11:39:17 UTC

Sorry,

this is a NEWS thread!

For crunching questions, please post in Number crunching.
ID: 31622
Carlos

Joined: 10 Nov 17
Posts: 6
Credit: 164,580
RAC: 838
Message 33147 - Posted: 28 Nov 2017, 6:05:14 UTC
Last modified: 28 Nov 2017, 6:13:35 UTC

Hi everybody out there. I'm an absolute beginner on LHC@home; I started about 1 month ago. After receiving the message "VB not installed" I installed it, with the extension. I didn't configure anything. (Must I?) Now nearly all ATLAS simulations report "Berechnungsfehler" (computation error). Is it a failure of my machine or a failure in the script of the WU? I'm running BOINC on 3 of 4 cores and have adjusted the amount of memory and disk space it may use a few times, so I don't need to pause it while working with other software.
What's also funny is that after this failure the "progress" shows "100.000%". Most simulations run for no more than 20 sec. Only 1 was "on the run" for more than 3 h.
Can somebody tell me what to do, if necessary?
(Sorry about my English!)
ID: 33147
tullio

Joined: 19 Feb 08
Posts: 500
Credit: 2,154,877
RAC: 176
Message 33148 - Posted: 28 Nov 2017, 8:22:04 UTC - in response to Message 33147.  

I believe your memory is too small. Try running on one core only, then on two and see how it goes. I am running on 2 cores on a Windows 10 PC with 22 GB RAM and on one core on two Linux boxes with 8 GB RAM.
Tullio
ID: 33148
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 469
Credit: 3,346,968
RAC: 821
Message 33151 - Posted: 28 Nov 2017, 15:04:43 UTC - in response to Message 33147.  

VirtualBox 5.2 is not 100% compatible with vboxwrapper used by the ATLAS project.
Try downgrading to 5.1.30 https://www.virtualbox.org/wiki/Download_Old_Builds_5_1
ID: 33151
tullio

Joined: 19 Feb 08
Posts: 500
Credit: 2,154,877
RAC: 176
Message 33153 - Posted: 29 Nov 2017, 2:52:39 UTC

I am using 5.2.2, on both Windows and Linux, for ATLAS tasks.
Tullio
ID: 33153
marmot

Joined: 5 Nov 15
Posts: 119
Credit: 5,250,392
RAC: 0
Message 33186 - Posted: 2 Dec 2017, 23:01:38 UTC - in response to Message 33151.  

VirtualBox 5.2 is not 100% compatible with vboxwrapper used by the ATLAS project.
Try downgrading to 5.1.30 https://www.virtualbox.org/wiki/Download_Old_Builds_5_1


You might have something there.

My Windows 7 Pro machine was getting (on ATLAS WU's only) a constant stream of "Postponed: VM is unmanageable" errors on 5.2.0 and 5.2.2 so after downgrading to 5.1.30, there have been no more of those errors in 3 days.
ID: 33186
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 469
Credit: 3,346,968
RAC: 821
Message 33191 - Posted: 3 Dec 2017, 10:13:26 UTC - in response to Message 33186.  

You might have something there.

My Windows 7 Pro machine was getting (on ATLAS WU's only) a constant stream of "Postponed: VM is unmanageable" errors on 5.2.0 and 5.2.2 so after downgrading to 5.1.30, there have been no more of those errors in 3 days.

The same may happen with the CMS-subproject.

ATLAS and CMS are using wrapper version 26196 and that conflicts with VBox 5.2.
Theory (wrapper 26198ab5) and LHCb (wrapper 26198ab7) probably will work together with VirtualBox version 5.2.
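
For anyone who wants to check their own combination against the report above, here is a minimal sketch. It only encodes what this post states (wrapper 26196 reportedly conflicting with VirtualBox 5.2); the table and function names are hypothetical, and other posters in this thread report 5.2.x working for them, so treat a True result as a hint and not a verdict.

```python
# Wrapper versions and the VirtualBox major.minor series they reportedly
# conflict with, taken from the post above. Hypothetical table, easy to extend.
REPORTED_CONFLICTS = {
    "26196": (5, 2),  # used by ATLAS and CMS according to the post
}


def parse_version(text):
    """Turn a dotted version string such as '5.1.30' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def may_conflict(wrapper_version, vbox_version):
    """Return True if this wrapper/VirtualBox pair matches a reported conflict."""
    bad_series = REPORTED_CONFLICTS.get(wrapper_version)
    if bad_series is None:
        # No conflict has been reported for this wrapper version.
        return False
    # Compare only major.minor, since the report names the whole 5.2 series.
    return parse_version(vbox_version)[:2] == bad_series


print(may_conflict("26196", "5.2.2"))     # reported conflict
print(may_conflict("26196", "5.1.30"))    # the suggested downgrade
print(may_conflict("26198ab5", "5.2.0"))  # Theory wrapper, no conflict reported
```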
ID: 33191
Erich56

Joined: 18 Dec 15
Posts: 783
Credit: 5,514,891
RAC: 9,020
Message 33192 - Posted: 3 Dec 2017, 13:44:36 UTC - in response to Message 33191.  

ATLAS and CMS are using wrapper version 26196 and that conflicts with VBox 5.2. ...


Well, as far as CMS is concerned, I can say that on my system it works with VBox 5.2, although the third line in stderr reads:
Error creating VirtualBox instance! rc = 0x80004002

However, this does not seem to be a real problem: all CMS tasks complete properly.

For ATLAS I cannot tell, because I don't run ATLAS on this machine due to its memory requirements.
ID: 33192
tullio

Joined: 19 Feb 08
Posts: 500
Credit: 2,154,877
RAC: 176
Message 33193 - Posted: 3 Dec 2017, 14:13:22 UTC

ATLAS tasks run on my Windows and Linux PCs with VBox 5.2.2. This is probably due to the fact that the kernel is still 4.4 on the Linux boxen. Windows 10 is being updated constantly by Microsoft. I only have problems with SETI@home GPU tasks running on it, while Einstein@home GPU tasks run perfectly.
Tullio
ID: 33193
maeax

Joined: 2 May 07
Posts: 436
Credit: 14,281,748
RAC: 20,381
Message 33194 - Posted: 3 Dec 2017, 14:54:07 UTC - in response to Message 33191.  
Last modified: 3 Dec 2017, 14:54:51 UTC

ATLAS and CMS are using wrapper version 26196 and that conflicts with VBox 5.2.
Theory (wrapper 26198ab5) and LHCb (wrapper 26198ab7) probably will work together with VirtualBox version 5.2.


The ATLAS native app for Linux uses wrapper_26015_x for vboxwrapper.
ID: 33194
Erich56

Joined: 18 Dec 15
Posts: 783
Credit: 5,514,891
RAC: 9,020
Message 33195 - Posted: 3 Dec 2017, 17:49:53 UTC - in response to Message 33193.  

I only have problems with SETI@home GPU tasks running on it, while Einstein@home GPU tasks run perfectly.
Tullio

But this cannot be related to any VM version, since these two projects don't use a VM.
ID: 33195



©2018 CERN