Message boards : Number crunching : Host messing up tons of results
alvin
Joined: 12 Mar 12
Posts: 128
Credit: 20,013,377
RAC: 0
Message 27353 - Posted: 9 Apr 2015, 22:49:43 UTC
Last modified: 9 Apr 2015, 23:04:23 UTC

ID: 27353 · Report as offensive     Reply Quote
Phil
Joined: 26 Jul 05
Posts: 63
Credit: 4,083,755
RAC: 0
Message 27354 - Posted: 9 Apr 2015, 23:22:33 UTC - in response to Message 27352.  

Ah good to hear 1 of them replied! :)

One of us)

We're all one team.
So I urge everyone to check their error, invalid and inconclusive tasks on a regular basis for any irregularities, so they can be fixed ASAP.

Yeah, but the problems are:

  • Only a certain percentage ever come on here to read the boards, or even see the news headline on the front page.
  • Some people will have changed email addresses and won't receive messages.
  • Some people install BOINC without the manager and don't even see what's happening any more.


ID: 27354 · Report as offensive     Reply Quote
William C Wilson
Joined: 11 Sep 08
Posts: 25
Credit: 384,225
RAC: 0
Message 27355 - Posted: 10 Apr 2015, 0:04:56 UTC - in response to Message 27352.  

I upgraded the motherboard, RAID drives, and CPU, still with Windows 8.1 Pro (64-bit), and had only one problem: a system shutdown when the CPU overheated. Intel's supplied cooler could not handle the 186 watts. I went to water cooling and have had no more problems. Last weekend I went to Windows 10 Enterprise (64-bit) build 9926, and I am upgrading to 1041 or 1046 this weekend. I have set no more tasks to download.

It seems that every other task is coming back invalid or inconclusive, even with the CPU not overclocked.

I watch the message board but see nothing on Windows 10 issues. My other projects are not generating any errors at all.

HELP please.

Bill in Brazil
William C Wilson
São Paulo Brazil
ID: 27355 · Report as offensive     Reply Quote
alvin
Joined: 12 Mar 12
Posts: 128
Credit: 20,013,377
RAC: 0
Message 27356 - Posted: 10 Apr 2015, 0:11:32 UTC - in response to Message 27355.  

It seems that every other task is coming back invalid or inconclusive, even with the CPU not overclocked.


Bill,
I'd suggest running BOINC in compatibility mode and trying the 32-bit and 64-bit versions with the Win7, Win8, Vista etc. compatibility settings to see what comes out.
Also, in Properties, set "Always Run as Administrator".
Make sure your BIOS is the latest and that there are no fancy CPU or memory settings in it.
Finally, play with the CPU settings in BOINC and the LHC project.
ID: 27356 · Report as offensive     Reply Quote
alvin
Joined: 12 Mar 12
Posts: 128
Credit: 20,013,377
RAC: 0
Message 27357 - Posted: 10 Apr 2015, 4:45:39 UTC
Last modified: 10 Apr 2015, 4:47:14 UTC

Here are more details about that other host with a huge number of errors:
http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=10301941

All (978) In progress (32) Validation pending (3) Validation inconclusive (2) Valid (47) Invalid (0) Error (894)

GenuineIntel
Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz [Family 6 Model 60 Stepping 3]
Number of processors 8
Coprocessors NVIDIA GeForce GTX TITAN (4095MB) driver: 347.52 OpenCL: 1.1
Operating System
Microsoft Windows 7 Professional x64 Edition, Service Pack 1, (06.01.7601.00)
I assume the BOINC client has to be 64-bit, and all tasks have to be 64-bit too?
Or can it run both x86 and 64-bit?

http://lhcathomeclassic.cern.ch/sixtrack/host_app_versions.php?hostid=10301941

SixTrack 451.07 windows_x86_64 (sse2)
Number of tasks completed 228
Max tasks per day 7
Number of tasks today 27
Consecutive valid tasks 0
Average processing rate 20.86 GFLOPS
Average turnaround time 0.44 days


SixTrack 451.07 windows_intelx86 (pni)
Number of tasks completed 910
Max tasks per day 38
Number of tasks today 0
Consecutive valid tasks 10
Average processing rate 9.20 GFLOPS
Average turnaround time 1.20 days


SixTrack 451.07 windows_intelx86
Number of tasks completed 20
Max tasks per day 48
Number of tasks today 0
Consecutive valid tasks 0
Average processing rate 6.02 GFLOPS
Average turnaround time 3.99 days
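
For anyone who wants to put a number on a host like this, here is a minimal Python sketch that parses the status line quoted above and works out what share of the listed tasks ended in error. The function names (`parse_status_counts`, `error_share`) are made up for illustration and are not part of BOINC or the project site; the only input is the counts line as it appears on the results page.

```python
import re

# Status line copied from the results page for host 10301941 (see above).
STATUS_LINE = (
    "All (978) In progress (32) Validation pending (3) "
    "Validation inconclusive (2) Valid (47) Invalid (0) Error (894)"
)


def parse_status_counts(line):
    """Turn 'All (978) In progress (32) ...' into {'All': 978, 'In progress': 32, ...}."""
    pairs = re.findall(r"([A-Za-z ]+?)\s*\((\d+)\)", line)
    return {name.strip(): int(count) for name, count in pairs}


def error_share(counts):
    """Fraction of all listed tasks that ended in error (hypothetical helper)."""
    if counts["All"] == 0:
        raise ValueError("no tasks listed")
    return counts["Error"] / counts["All"]


if __name__ == "__main__":
    counts = parse_status_counts(STATUS_LINE)
    print(f"{counts['Error']} of {counts['All']} tasks in error "
          f"({error_share(counts):.1%})")
```

On the line above this comes to roughly 91% of the listed tasks in error. The same two functions should work on any host's status line pasted in the same format.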
ID: 27357 · Report as offensive     Reply Quote
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 27358 - Posted: 10 Apr 2015, 6:22:18 UTC - in response to Message 27357.  

Thanks; I have e-mailed the user.
In fact all errors seem to be
DISK_LIMIT_EXCEEDED.
Still, not good for him nor us. Eric.
ID: 27358 · Report as offensive     Reply Quote
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 27359 - Posted: 10 Apr 2015, 6:54:52 UTC - in response to Message 27355.  

Thanks Bill; this could be really interesting and useful.
Fast machine, Windows 10, ......
I am now watching closely all your WUs and trying to determine
the cause of the Invalid. Shame it takes so much CPU time
but there you are. Eric.
ID: 27359 · Report as offensive     Reply Quote
alvin
Joined: 12 Mar 12
Posts: 128
Credit: 20,013,377
RAC: 0
Message 27360 - Posted: 10 Apr 2015, 10:18:15 UTC - in response to Message 27355.  

Eric
so users do crunch x86 apps on their 64-bit PCs, right? Couldn't that be a possible point of incompatibility?
ID: 27360 · Report as offensive     Reply Quote
Richard Haselgrove

Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27361 - Posted: 10 Apr 2015, 10:30:15 UTC - in response to Message 27360.  

Eric
so users do crunch x86 apps on their 64-bit PCs, right? Couldn't that be a possible point of incompatibility?

I run SixTrack and other 32-bit BOINC project applications on 64-bit Windows 7 with no problem at all. If anyone is having problems with their SysWOW64 environment, it's specific to their computer, not general or widespread (Windows 10 - as yet unreleased - is another question entirely, of course).

32-bit Linux apps on a 64-bit system are more of a problem, because many projects rely on 32-bit compatibility libraries and not every user installs them. But that results in an immediate application crash, not the sort of pseudo-valid results that Eric is wrestling with.
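
If you just want to see what a "32-bit process on a 64-bit machine" looks like from the inside, here is a rough Python sketch. It reports on the Python interpreter running it, not on BOINC or SixTrack, and the helper names (`describe_bitness`, `looks_like_32_on_64`) are invented for this example. It only uses the standard `struct` and `platform` modules, and the "64 appears in the machine name" test is a crude assumption rather than a proper detection method.

```python
import platform
import struct


def describe_bitness():
    """Report the pointer width of this process and the machine type the OS reports."""
    return {
        "process_bits": struct.calcsize("P") * 8,  # pointer size of this process in bits
        "machine": platform.machine(),             # e.g. 'AMD64' or 'x86_64'
        "os": platform.system(),                   # e.g. 'Windows' or 'Linux'
    }


def looks_like_32_on_64(info):
    """Crude check: a 32-bit process on a machine whose name mentions 64."""
    return info["process_bits"] == 32 and "64" in info["machine"]


if __name__ == "__main__":
    info = describe_bitness()
    print(info)
    if looks_like_32_on_64(info):
        print("32-bit process on a 64-bit machine (compatibility layer in use)")
    else:
        print("process width matches the machine, or the machine is 32-bit")
```

This says nothing about whether results will validate; it only shows which case a given process falls into.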
ID: 27361 · Report as offensive     Reply Quote
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 27362 - Posted: 10 Apr 2015, 10:32:10 UTC - in response to Message 27360.  

It could be, but all my tests work and there is no
consistent failure. Still there might be something
on perhaps very new machines.

Once I have cleared all the "noise" I'll get back to the
basic problem of genuine result differences. I have one
CERN customer (maybe two) who have different Validated results
for the same task! Bill seems to be suffering from a real
problem too. I don't feel I can publish my work until these
are resolved.

(I shall be on vacation from 14th to 28th April and this will
be a great chance to investigate in real detail :-)

Just want to leave the service in good shape and the customers and
clients happy. Eric.
ID: 27362 · Report as offensive     Reply Quote
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 27363 - Posted: 10 Apr 2015, 10:38:22 UTC - in response to Message 27361.  

Also my executables are all statically linked, though
there could still be other compatibility issues.
I think Microsoft are very strong on backwards compatibility; many many users/owners run 32-bit apps.

I am trying to keep an open mind and analyse the evidence.
Remember we are still talking about very few differences
among millions of tasks. Seems to be getting worse though. Eric.
ID: 27363 · Report as offensive     Reply Quote
alvin
Joined: 12 Mar 12
Posts: 128
Credit: 20,013,377
RAC: 0
Message 27364 - Posted: 10 Apr 2015, 11:20:50 UTC

Another host with empty results:
http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=10297384
All (235) In progress (12) Validation pending (2) Validation inconclusive (0) Valid (14) Invalid (0) Error (207)
ID: 27364 · Report as offensive     Reply Quote
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 59
Credit: 4,012,100
RAC: 0
Message 27366 - Posted: 10 Apr 2015, 17:22:47 UTC - in response to Message 27364.  

Not sure if this host has already been flagged, but it's got loads of invalids.

http://lhcathomeclassic.cern.ch/sixtrack/show_host_detail.php?hostid=9996388
Team AnandTech - WCG, Uni@H, F@H, MW@H, Ast@H, LHC@H, R@H, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RTX 3060 Ti 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64
ID: 27366 · Report as offensive     Reply Quote
Richard Haselgrove

Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27367 - Posted: 10 Apr 2015, 17:50:02 UTC - in response to Message 27366.  

You'll find dozens of references to host 9996388 in this thread.
ID: 27367 · Report as offensive     Reply Quote
Phil
Joined: 26 Jul 05
Posts: 63
Credit: 4,083,755
RAC: 0
Message 27368 - Posted: 10 Apr 2015, 17:54:52 UTC - in response to Message 27358.  

DISK_LIMIT_EXCEEDED.

Apropos of nothing really, I had a bunch of LHC and SETI stop with this error yesterday, but I was running my first CMS-dev job which had also crashed...
ID: 27368 · Report as offensive     Reply Quote
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1177
Credit: 54,887,670
RAC: 3,877
Message 27369 - Posted: 11 Apr 2015, 8:44:34 UTC - in response to Message 27368.  

DISK_LIMIT_EXCEEDED.

Apropos of nothing really, I had a bunch of LHC and SETI stop with this error yesterday, but I was running my first CMS-dev job which had also crashed...



Yes CMS-dev is what caused that to happen.
Volunteer Mad Scientist For Life
ID: 27369 · Report as offensive     Reply Quote
Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,180,005
RAC: 0
Message 27370 - Posted: 11 Apr 2015, 12:26:19 UTC

Hello, my machine (id: 10327477) has started to get some invalids. Is it normal?
ID: 27370 · Report as offensive     Reply Quote
[TA]Assimilator1
Joined: 29 Nov 13
Posts: 59
Credit: 4,012,100
RAC: 0
Message 27371 - Posted: 11 Apr 2015, 16:18:29 UTC - in response to Message 27370.  

I only see 3 of yours as invalid; 2 of those were also invalid for other hosts, so it looks like those were dodgy WUs. The 3rd may be down to your machine, but I wouldn't worry about just 1. Just keep an eye on your tasks page to make sure they don't grow in number.

Btw why have you cancelled so many WUs?

Host linked, as someone seems to have forgotten to do that ;) http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=10327477
Team AnandTech - WCG, Uni@H, F@H, MW@H, Ast@H, LHC@H, R@H, CPDN, E@H.
Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RTX 3060 Ti 8GB, Win10 64bit
2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64
ID: 27371 · Report as offensive     Reply Quote
Luigi R.
Joined: 7 Feb 14
Posts: 99
Credit: 5,180,005
RAC: 0
Message 27372 - Posted: 11 Apr 2015, 18:21:34 UTC - in response to Message 27371.  

About cancelled WUs...

Because I often have network issues with my repeater, and "flash" tasks (which terminate in a few seconds) could leave my machine without work. I edited ncpus in cc_config to get about ~150 WUs and to ensure workload for an entire week, but I got too many ~8h tasks that weren't finishing on time.
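
A back-of-the-envelope check shows why a cache of that size can overrun. This is only a sketch in Python: it assumes the worst case where every one of the ~150 WUs takes the full ~8 h (in practice only some did), that the machine crunches 24 hours a day, and the core counts are hypothetical since the real number is not stated here. The function name `days_to_finish` is made up for the example.

```python
def days_to_finish(task_count, hours_per_task, cores, hours_per_day=24.0):
    """Days needed to work through a cache of tasks on a given number of real cores."""
    if cores <= 0 or hours_per_day <= 0:
        raise ValueError("cores and hours_per_day must be positive")
    total_cpu_hours = task_count * hours_per_task
    return total_cpu_hours / (cores * hours_per_day)


if __name__ == "__main__":
    # 150 WUs at ~8 h each is 1200 CPU-hours in the worst case.
    for cores in (2, 4, 8):  # hypothetical core counts, not taken from the host page
        print(f"{cores} cores: {days_to_finish(150, 8, cores):.2f} days")
```

Under these assumptions the whole cache fits inside a week only with about 8 real cores running flat out; raising ncpus makes the client ask for more work, but it does not make the real cores finish it any faster.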
ID: 27372 · Report as offensive     Reply Quote
alvin
Joined: 12 Mar 12
Posts: 128
Credit: 20,013,377
RAC: 0
Message 27373 - Posted: 12 Apr 2015, 4:18:00 UTC - in response to Message 27372.  
Last modified: 12 Apr 2015, 4:46:49 UTC

to get about ~150 WUs and to ensure workload for an entire week

Why would you need that many? If the router fails you will notice within a day or two, not a week, I guess?
If units fail due to timeout, I think you put other users' work in the waste bin, as they lose their already-executed tasks because of your timeout. Admins, correct me if it's not like that.
ID: 27373 · Report as offensive     Reply Quote