Message boards : Number crunching : Sixtrack with newer CPU extensions

Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 877
Credit: 742,765,804
RAC: 252,375
Message 26160 - Posted: 23 Jan 2014, 4:33:27 UTC

Hi Eric,

I know you don't have much time, but I thought this might be easy to implement with a compiler switch.

I read an article on AnandTech that quoted some performance figures for newer CPUs.

For fp64 code, the AVX instructions give about a 2x performance improvement over SSE, and AVX+FMA gives a further 2x over AVX alone.

Would it be easy to do an AVX/FMA binary?
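
As a rough illustration of where those two factors of 2 would come from, here is a minimal sketch of the peak fp64 arithmetic per instruction. It assumes the usual register widths (128-bit SSE, 256-bit AVX) and counts a fused multiply-add as two operations per lane. The function name is made up for this example, and these are theoretical peaks only; real speedups depend on how well the code vectorises and on memory bandwidth.

```python
def peak_fp64_ops_per_instruction(register_bits, fused_multiply_add):
    """Theoretical fp64 operations done by one vector instruction.

    Hypothetical helper for illustration; not a measurement.
    """
    lanes = register_bits // 64          # one fp64 value needs 64 bits
    ops_per_lane = 2 if fused_multiply_add else 1  # FMA = multiply + add
    return lanes * ops_per_lane


sse = peak_fp64_ops_per_instruction(128, fused_multiply_add=False)
avx = peak_fp64_ops_per_instruction(256, fused_multiply_add=False)
avx_fma = peak_fp64_ops_per_instruction(256, fused_multiply_add=True)

print("SSE:", sse, "AVX:", avx, "AVX+FMA:", avx_fma)
print("AVX over SSE:", avx / sse)
print("AVX+FMA over AVX:", avx_fma / avx)
```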
ID: 26160
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26211 - Posted: 1 Feb 2014, 18:08:13 UTC - in response to Message 26160.  

Thanks Toby; no real problem, and a factor of 2!
Eric
ID: 26211
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26212 - Posted: 1 Feb 2014, 18:13:28 UTC

...but FMA is not really possible because its results are numerically
incompatible with non-FMA results... :-(

I'll try and find time to test and measure SixTrack. Eric.
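
To make the incompatibility concrete, here is a minimal sketch (not SixTrack code). A separate multiply and add rounds twice, while a fused multiply-add rounds once, so the two can disagree in the last bits. The single rounding is emulated here with exact rational arithmetic rather than a hardware FMA instruction, and the function names and input values are chosen purely for illustration.

```python
from fractions import Fraction


def mul_add_two_roundings(a, b, c):
    # Ordinary float arithmetic: a*b is rounded, then the sum is rounded.
    return a * b + c


def mul_add_one_rounding(a, b, c):
    # Emulated fused multiply-add: compute a*b + c exactly,
    # then round to the nearest double only once.
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


# Inputs picked so that a*b = 1 - 2**-54, which is not a double
# and rounds up to exactly 1.0 when the product is rounded on its own.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
c = -1.0

unfused = mul_add_two_roundings(a, b, c)
fused = mul_add_one_rounding(a, b, c)

print("two roundings:", unfused)
print("one rounding: ", fused)
print("identical:    ", unfused == fused)
```

Both answers are legitimate floating-point results; they are simply not bit-for-bit the same, which is the problem when results from FMA and non-FMA hosts have to match.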
ID: 26212
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 877
Credit: 742,765,804
RAC: 252,375
Message 26213 - Posted: 1 Feb 2014, 18:51:43 UTC

No problem, I just thought it could be an easy thing with a compiler switch.

Numeric equivalence is another matter ;)
ID: 26213
Ananas

Joined: 17 Jul 05
Posts: 102
Credit: 542,016
RAC: 0
Message 26227 - Posted: 4 Mar 2014, 6:11:46 UTC
Last modified: 4 Mar 2014, 6:12:27 UTC

The project does not seem to enforce using the latest version: I see a lot of PNI results on hosts that would be capable of running SSE3.

I'm using a little patch on my (outdated 5.10.28) core clients so that they report that my CPUs can do SSE3, but it needed a reset to make the hosts pull the SSE3 project application. Before the reset, my hosts reported SSE3 properly (as can be seen in the sched_request) but still crunched using the PNI application.
ID: 26227
Harri Liljeroos

Joined: 28 Sep 04
Posts: 780
Credit: 59,447,886
RAC: 46,059
Message 26228 - Posted: 4 Mar 2014, 7:56:13 UTC - in response to Message 26227.  

If I remember correctly, PNI = SSE3, and the applications are also identical. Please correct if I'm wrong.
ID: 26228
Richard Haselgrove

Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 26229 - Posted: 4 Mar 2014, 8:08:01 UTC - in response to Message 26228.  

If I remember correctly, PNI = SSE3, and the applications are also identical. Please correct if I'm wrong.

You are correct. PNI is short for Prescott New Instructions, and is exactly the same thing as SSE3.

I did a binary file compare on the older pair of Sixtrack applications (for Windows), and they were identical. I haven't repeated that with 446.03, but I expect they would be again.
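
For anyone who wants to repeat that kind of check, a byte-for-byte comparison can be sketched as below. The file names are placeholders written to a temporary directory for the demonstration, not the real application file names, and `binaries_identical` is a helper invented for this example.

```python
import filecmp
import os
import tempfile


def binaries_identical(path_a, path_b):
    # shallow=False forces a comparison of the file contents,
    # not just size and modification time.
    return filecmp.cmp(path_a, path_b, shallow=False)


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder files standing in for two application binaries.
    pni_path = os.path.join(tmp, "app_pni.bin")
    sse3_path = os.path.join(tmp, "app_sse3.bin")
    other_path = os.path.join(tmp, "app_other.bin")

    for path, payload in [
        (pni_path, b"\x4d\x5a\x90\x00same bytes"),
        (sse3_path, b"\x4d\x5a\x90\x00same bytes"),
        (other_path, b"\x4d\x5a\x90\x00different"),
    ]:
        with open(path, "wb") as handle:
            handle.write(payload)

    same = binaries_identical(pni_path, sse3_path)
    different = binaries_identical(pni_path, other_path)

print("pni vs sse3 identical: ", same)
print("pni vs other identical:", different)
```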
ID: 26229
Ananas

Joined: 17 Jul 05
Posts: 102
Credit: 542,016
RAC: 0
Message 26230 - Posted: 4 Mar 2014, 14:42:29 UTC - in response to Message 26229.  

Oops, I thought that SSE3 came later than the P4, but I skipped that CPU type and went from Tualatin and Thoroughbred (and one Dothan) to Core2. Thanks for clearing that up.
ID: 26230
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 26231 - Posted: 5 Mar 2014, 14:38:01 UTC - in response to Message 26229.  

Well, we just copy the SSE3 executable to PNI. We build only
32-bit generic ia32, sse2 and sse3. Eric.
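
Since PNI and SSE3 name the same feature, choosing among those three builds could be sketched roughly as follows. This is only an illustration and not the BOINC scheduler's actual logic; the function name, the build labels and the flag strings are assumptions, with the flag strings modelled on what Linux lists in /proc/cpuinfo (where SSE3 appears as "pni").

```python
# Builds mentioned in the thread, from least to most demanding.
BUILDS = ["generic", "sse2", "sse3"]

# PNI (Prescott New Instructions) is another name for SSE3.
ALIASES = {"pni": "sse3"}


def best_build(cpu_flags):
    """Pick the most capable build that the given CPU flags allow.

    cpu_flags is a whitespace-separated string of feature names.
    """
    flags = {ALIASES.get(flag, flag) for flag in cpu_flags.lower().split()}
    chosen = "generic"
    for build in BUILDS[1:]:
        if build in flags:
            chosen = build
    return chosen


print(best_build("fpu mmx sse sse2 pni"))   # host reporting PNI
print(best_build("fpu mmx sse sse2 sse3"))  # host reporting SSE3
print(best_build("fpu mmx sse sse2"))       # SSE2 only
print(best_build("fpu mmx"))                # neither
```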
ID: 26231
