Message boards : Number crunching : A sudden huge increase in computation errors
Spatzthecat

Joined: 2 Apr 10
Posts: 15
Credit: 8,604,036
RAC: 0
Message 27721 - Posted: 21 Mar 2016, 21:11:18 UTC - in response to Message 27720.  

I run Atlas, vLHC and LHC and experience something similar (with Milkyway on the GPU at a resource share of 100).
My resource share settings for the CERN projects are 100, 300 and 300 respectively, but BOINC seems to disregard them to a point.
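
For reference, here is a rough sketch of what those resource shares would mean if BOINC split work strictly in proportion to them. It assumes the 100/300/300 values map to Atlas, vLHC and LHC in the order listed, and that Milkyway's share of 100 counts in the same pool; the function name `share_fractions` is made up for this illustration. As far as I understand it, BOINC treats resource share as a long-term target rather than an instant split, which may be why it looks "disregarded" over short periods.

```python
def share_fractions(shares):
    """Return each project's fraction of the total resource share."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


# Values from the post above; the project-to-value mapping is an assumption.
shares = {"Milkyway": 100, "Atlas": 100, "vLHC": 300, "LHC": 300}
fractions = share_fractions(shares)

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```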

Try letting the projects sort themselves out. LHC task availability is normally lower than that of the others, so I view it as a benefit when LHC takes the lead; when LHC has no work, Atlas takes the lead. vLHC is the constant that will give way for an hour on occasion, but it is the easiest to manage.
Hope this helps
Spatzthecat

Joined: 2 Apr 10
Posts: 15
Credit: 8,604,036
RAC: 0
Message 27722 - Posted: 21 Mar 2016, 23:49:04 UTC - in response to Message 27720.  

Additionally: try restricting your CPU usage to 75%, which will allow your CPU to "breathe", e.g. by leaving 1 core/2 threads available for your system. This will still allow 10 tasks: on my Core i7 990X at 75% I can run 1 x Milkyway on the GPU, 7 LHC/Atlas and 2 x vLHC tasks without locking up.
I think the recent problems with errors are unusual for LHC and have now been rectified.
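
To put numbers on the 75% setting, here is a small sketch of the arithmetic. It assumes the limit applies to logical CPUs (12 threads on a Core i7 990X), that the result is rounded down with a minimum of one, and that the GPU task is counted separately from the CPU tasks; `usable_threads` is a made-up helper, not a BOINC function.

```python
import math


def usable_threads(logical_cpus, percent):
    """Threads left for crunching under a CPU usage limit (rounded down, at least 1)."""
    return max(1, math.floor(logical_cpus * percent / 100))


logical_cpus = 12  # Core i7 990X: 6 cores, 12 threads
cpu_tasks = usable_threads(logical_cpus, 75)
free_threads = logical_cpus - cpu_tasks

print(f"CPU tasks allowed: {cpu_tasks}, threads left free: {free_threads}")
```

Under those assumptions the limit gives 9 CPU tasks, which matches the 7 LHC/Atlas plus 2 vLHC tasks mentioned above, with the Milkyway GPU task making 10 in total.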
BelgianEnthousiast

Joined: 5 Apr 15
Posts: 18
Credit: 5,910,849
RAC: 0
Message 27723 - Posted: 22 Mar 2016, 14:25:23 UTC

BOINC is only using 10 out of the 12 cores to leave headroom for VB and other tasks. I usually run 3 Climate, 4 Rosetta and 3 Atlas tasks in parallel, which has worked perfectly for about a year now. In addition to that, I run GPUGrid on the GPU.

It's only recently that my system suddenly started hanging...

As for the distribution mechanism, that is how it behaves most of the time; only when LHC pops up with a new batch of WUs does LHC take all resources for about a day to a day and a half. I don't mind it doing that, as work availability tends to be intermittent (a couple of days with WUs available, then two weeks of nothing, etc.).
Spatzthecat

Joined: 2 Apr 10
Posts: 15
Credit: 8,604,036
RAC: 0
Message 27725 - Posted: 22 Mar 2016, 19:13:28 UTC - in response to Message 27723.  

Hi,
3 Climate, 4 Rosetta and 3 Atlas tasks plus the GPUGrid unit make 11.
Try adjusting the CPU usage so that it only crunches 10 in total.
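
The count can be checked with a quick sketch like the one below. It assumes the GPUGrid unit is counted against the same limit of 10 as the CPU tasks, which is how the tally above treats it; the variable names are only for illustration.

```python
# Task counts taken from the earlier post.
cpu_tasks = {"Climate": 3, "Rosetta": 4, "Atlas": 3}
gpu_tasks = {"GPUGrid": 1}
limit = 10  # cores BOINC is allowed to use, per the earlier post

total = sum(cpu_tasks.values()) + sum(gpu_tasks.values())
excess = max(0, total - limit)

print(f"Running {total} tasks against a limit of {limit}: {excess} over")
```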
Spatzthecat

Joined: 2 Apr 10
Posts: 15
Credit: 8,604,036
RAC: 0
Message 27733 - Posted: 24 Mar 2016, 16:47:10 UTC

There seems to be a problem with WUs similar to:

w7_newhllhc10_round_226_w7__3__s__62.31_60.32__12_14__5__45_1_sixvf_boinc4121

They fail with the error "Too many total results".

