Message boards : Number crunching : Status and Plans 25th May, 2013.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 25603 - Posted: 25 May 2013, 11:01:50 UTC

Well the server is "down" indeed and thanks for pointing it out.
Hope it will be restarted soonest.
I am currently trying to finish the second intensity scan (the results
from the first look good). For result discrepancies please see the
thread with that name on this Message Board. Once we have the test server
set up we shall try doing 10 million turns again, but with restarts and new work units
every million turns. I am also about to try some GPUs, but don't hold your breath.
Running SixTrack on GPU is (very) non-trivial. I am also looking for some
financial/manpower support for that and for further work on result
replication. I would like to buy some more compilers (and I await complete
Fortran 2003 implementations) and also to acquire a "Kepler" GPU. Eric.
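
As a rough illustration of the plan above (10 million turns, with a restart and a new work unit every million turns), here is a minimal Python sketch of the bookkeeping involved. It is not SixTrack or BOINC code: the names `WorkUnit` and `split_into_work_units` are hypothetical, and the sketch assumes each unit resumes from the state saved by the one before it.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WorkUnit:
    index: int       # position in the chain, starting at 0
    start_turn: int  # first turn tracked by this unit
    end_turn: int    # one past the last turn tracked by this unit


def split_into_work_units(total_turns: int, turns_per_unit: int) -> List[WorkUnit]:
    """Cut one long tracking run into consecutive, restartable chunks."""
    if total_turns <= 0 or turns_per_unit <= 0:
        raise ValueError("total_turns and turns_per_unit must be positive")
    units: List[WorkUnit] = []
    start = 0
    while start < total_turns:
        # The last unit may be shorter if the total is not an exact multiple.
        end = min(start + turns_per_unit, total_turns)
        units.append(WorkUnit(index=len(units), start_turn=start, end_turn=end))
        start = end
    return units


# Assumed figures from the post: 10 million turns, one unit per million.
units = split_into_work_units(10_000_000, 1_000_000)
print(len(units))  # 10
```

Each unit starts exactly where the previous one ends, so a failed or lost unit only costs one million turns rather than the whole run.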

Robish
Joined: 23 May 06
Posts: 2
Credit: 2,014,992
RAC: 0
Message 25606 - Posted: 26 May 2013, 18:56:06 UTC - in response to Message 25603.  

Hey Eric

Why not set up a beta project like "seti@home beta" alongside lhc@home for gpu testing only?

I have 4 nvidia gtx 690s and 10 ati 5870s that you can use for testing if you do?

just let me know.

Cheers

Rob.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 25608 - Posted: 28 May 2013, 7:49:44 UTC - in response to Message 25606.  

Hi Rob; I have been trying to do that for over two years now!
It is exactly what we need. However, my "support" does
not seem to be hearing me....... Thanks. Eric.

Tom95134
Joined: 4 May 07
Posts: 250
Credit: 826,541
RAC: 0
Message 25609 - Posted: 28 May 2013, 16:14:53 UTC

Does the amount of work (Tasks) on LHC justify the effort necessary to develop and test GPU based crunching?

Coleslaw
Joined: 29 Apr 08
Posts: 24
Credit: 4,972,971
RAC: 9,875
Message 25610 - Posted: 28 May 2013, 21:49:20 UTC

Why do you need yet another project for BETA? Just have separate BETA work units that people can opt into. Many projects have that option. Just make sure to have it turned off by default so people have to elect to participate.

Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 880
Credit: 746,950,531
RAC: 326,675
Message 25611 - Posted: 28 May 2013, 21:53:28 UTC

I agree with Coleslaw, it seems pointless to create another project when the option is in the current project.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 25615 - Posted: 29 May 2013, 8:21:10 UTC

Thanks for the feedback. I was not aware of the Beta test option!
I am just a user. Sounds like it might be a better option for
testing, for one million turns, etc etc. Eric.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 25616 - Posted: 29 May 2013, 8:26:26 UTC - in response to Message 25609.  

Good question! However, I am interested in GPU because:
1. It is there and I want to be all inclusive.
2. I want to check replication of identical results.
3. It looks like a good, even essential, option for studying
space charge effects with a very very large number of particles.
Finally I believe we could get a performance improvement, reducing
the real time to get a study completed, but that said, with up to
100,000 processors being volunteered we are in pretty good
shape. Eric.
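
To make point 2 concrete, here is a minimal Python sketch of what "identical results" can mean in practice: two hosts replicate each other only if every returned double has the same bit pattern, which is stricter than agreeing to within a tolerance. The function names and sample values are hypothetical, and this is not a description of how the project's own validator works.

```python
import struct
from typing import Sequence


def bit_pattern(values: Sequence[float]) -> bytes:
    """Pack doubles into their big-endian IEEE 754 byte representation."""
    return struct.pack(f">{len(values)}d", *values)


def results_replicate(a: Sequence[float], b: Sequence[float]) -> bool:
    """True only if both result lists match bit for bit."""
    return len(a) == len(b) and bit_pattern(a) == bit_pattern(b)


# Hypothetical results from two hosts. The first values differ only in the
# last bit, which is the kind of gap different hardware or compilers can cause.
host_a = [0.1 + 0.2, 1.0]
host_b = [0.3, 1.0]

print(results_replicate(host_a, host_a))  # True
print(results_replicate(host_a, host_b))  # False
```

A bitwise comparison also treats 0.0 and -0.0 as different results, so any rounding or sign difference introduced by a GPU code path would show up immediately.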

tullio
Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 25617 - Posted: 29 May 2013, 9:51:01 UTC - in response to Message 25616.  

In SETI@home they get a huge number of invalid results from users with GPU boards and the wrong drivers. There is even a thread dedicated to this problem. If a user has hidden computers and is therefore anonymous, it is impossible to PM him and warn him that he is producing thousands of invalid results.
Tullio

Richard Haselgrove
Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 25618 - Posted: 29 May 2013, 11:46:03 UTC - in response to Message 25617.  

In SETI@home they get a huge number of invalid results from users with GPU boards and the wrong drivers. There is even a thread dedicated to this problem. If a user has hidden computers and is therefore anonymous, it is impossible to PM him and warn him that he is producing thousands of invalid results.
Tullio

Not exactly.

SETI has two GPU problems that might loosely be described as 'driver related', but they're both more subtle than that. But I agree with Tullio - there are lessons to be learned from both of them, and both are worth thinking about before embarking on the GPU treadmill.

First, be advised that SETI - like you, Eric! - are grossly under financed, under resourced, and under staffed for the scale of the search they're attempting, and the size of the user base they've attracted. In particular, they have nobody 'in house' with the time and the skillset to program GPU applications from scratch: they've had to rely exclusively on third-party programmers. So:

Problem #1: Their first GPU application - back at the very tail-end of 2008 - was for NVIDIA cards, and was supplied gratis by the NVIDIA corp themselves. It was rushed out, without adequate testing, in time to feed the Christmas marketing frenzy. But it worked, after a fashion (and with some bugfixes) on the hardware of the day.

Two years later (Spring 2010), NVidia released the next generation 'Fermi' hardware: the original applications wouldn't run on it. NVidia, again gratis, supplied an updated application which ran, and still runs, on this generation of hardware.

Two years later again, NVidia released another new generation of hardware, the 'Kepler' cards. Again, the previous (Fermi) applications wouldn't run on the new cards. This time, the NVidia programmers didn't supply an updated application: the most they could be persuaded to do was to build a compatibility mode switch into their drivers, but this requires manual activation by the user: see my screenshot thread. But of course there are many, many users who throw new hardware into older computers, or simply buy new computers and install BOINC, without checking that things are running properly, or visiting the message boards when they aren't. Kepler cards without the compatibility switch cause a lot of the errors Tullio describes. GPUGrid is having similar compatibility problems with the even-newer 'Titan' cards at the moment.

Problem #2: ATI cards. It's a generalisation, but I think a widely accepted one, that AMD provide much poorer developer support for general-purpose computing on the range of cards they bought in from ATI. They didn't write an application for SETI, but of course there was a strong 'me too' pressure from ATI owners in the user base to be able to match their NVidia colleagues.

Without manufacturer support, it fell to a single, independent, volunteer developer to supply applications for AMD/ATI hardware. The ATI hardware seems to have produced fewer complications than the NVidia changes - newer cards introduce new features, of course, but don't appear to break older applications.

Instead, it's the driver changes which do that. AMD release a new GPU driver every calendar month, without fail - ready or not. Sometimes the changes are not even notified to developers, and sometimes the claims made for software support on their driver download pages are just plain false. (For example, before I get sued for libel: the download pages for the Windows XP versions of the February and March 2012 drivers - 12.2 and 12.3 - stated OpenCL support, but this was silently removed for Windows XP after the 12.1 drivers). Similarly, our volunteer developer had huge difficulties getting his applications to work under drivers 13.2/13.3/13.4, but this was eventually traced to the undocumented cessation of support for a single compiler option switch. They're working now, but many errors were caused by users updating to the 'latest and greatest' drivers before the application was fixed.

Lessons to be learned?

The GPU market is fast-moving and immature.

Manufacturer support is patchy at best, and often poor.

Even if you can get a single hardware/software combination to work, external forces - largely gaming driven - can cause it to break later. It's a treadmill, as I said at the beginning.

And finally - largely in consequence of the previous comment - if you accept third-party assistance with the development process, make sure both of you go into it with your eyes open, and a clear understanding of the obligations (of either party, or both) for long-term and frequent support.

tullio
Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 25619 - Posted: 29 May 2013, 14:52:57 UTC - in response to Message 25617.  
Last modified: 29 May 2013, 14:54:45 UTC

Thank you, Richard, for your clear explanation. I am not a GPU user, although I have bought a HD 7770 card but haven't so far found the courage to install it in a Linux box with an Opteron 1210 as CPU. My other Linux box, a HP laptop, has an AMD APU E-450 which itself has some graphics capabilities. But I declare myself a complete novice in the graphics field, being an Old UNIX Hand. I've read that newer AMD chips will have more graphics capabilities. Maybe my next box shall have one of these. Cheers.
Tullio

Coleslaw
Joined: 29 Apr 08
Posts: 24
Credit: 4,972,971
RAC: 9,875
Message 25621 - Posted: 30 May 2013, 17:46:49 UTC

And don't forget that Intel is finally getting their butts in the game. A few projects now have apps for those GPUs as well.

I would recommend going with the ARM devices first. There is a growing user base there and only a handful of projects that are options. Yes, they are slower than modern PCs. However, they are advancing quickly and the power consumption is much lower. Given time, these could be a very valuable resource. I'm up to 3 older phones doing nothing but running nativeBOINC now and my current somewhat modern phone running it part time while on the charger. Tablets and Raspberry Pi boards are also joining in the dance. You could even contact the developer at www.nativeboinc.org. He has offered assistance to many projects already.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 25622 - Posted: 30 May 2013, 22:34:13 UTC - in response to Message 25617.  

Thanks Tullio; I'll have a look at that thread when I find it.
Eric.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 25623 - Posted: 30 May 2013, 22:37:28 UTC - in response to Message 25621.  

Fascinating; I have ARM in mind. We have lots of so called
"trivial" parallelism, the number of cases, bulk computing.
Strength in numbers. It is the time to complete a study that
is important.

Richard Haselgrove
Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 25624 - Posted: 30 May 2013, 22:55:13 UTC - in response to Message 25622.  

Thanks Tullio; I'll have a look at that thread when I find it.
Eric.

I think he probably means http://setiathome.berkeley.edu/forum_thread.php?id=69782.

It's not just drivers. It's badly managed machines and absentee owners of all sorts. Heat, overclocking, bad memory. Outdated and incompatible applications. Third-party applications which have outlived their usefulness but not been removed. Huge graphics cards paired with puny power supplies. You name it, somebody has ballsed it up. I could go on..... :P

tullio
Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 25626 - Posted: 31 May 2013, 8:45:43 UTC - in response to Message 25624.  
Last modified: 31 May 2013, 8:46:26 UTC

Thanks Richard. That is the thread. There is also a MMMM Invalids? thread. Now SETI@home has a new version, 7.0.x and entropy is increasing.
Tullio

Coleslaw
Joined: 29 Apr 08
Posts: 24
Credit: 4,972,971
RAC: 9,875
Message 25627 - Posted: 31 May 2013, 15:41:17 UTC - in response to Message 25623.  

As far as ARM goes, nativeBOINC (which is Android only right now) currently has apps available at Albert, Asteroids, Enigma, Milkyway, OProject (this one is on hold), Primegrid, SETI, SubsetSum, WUProp, YoYo, and theSkyNet POGS. If you look at the stats page on www.nativeboinc.org, you will also see the various devices being added. This does not include all of the Raspberry Pi boards (the non-Android ones that some of the projects also support) or ARM devices using Windows 8 (which I haven't seen any BOINC capable apps for yet).

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 25631 - Posted: 2 Jun 2013, 8:32:21 UTC - in response to Message 25627.  

Thanks for that update; I would love to get going on
Raspberry Pi :-) Fun. Maybe ARM first.
As for Windows 8 :-( Eric.

tullio
Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 25632 - Posted: 2 Jun 2013, 13:14:23 UTC

ARM used to stand for Acorn RISC Machine. I saw an Acorn PC in Hannover in the Eighties, then the firm collapsed and was acquired by Olivetti, which promptly resold it to a financial institution. Now ARM stands for Advanced RISC Machine and ARM-based processors power most smartphones. I used a RISC-based minicomputer from 1991 to 1995 and could appreciate its speed.
Tullio

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 25633 - Posted: 3 Jun 2013, 12:44:35 UTC

Humble pie time again. I had been asking for another
server when all I need is a Beta Project on our
existing infrastructure. Sorry about that. Eric.