AVX/AVX2 support


Message boards : Sixtrack Application : AVX/AVX2 support

Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 358
Credit: 78,217,893
RAC: 112,259
Message 28616 - Posted: 22 Jan 2017, 20:48:56 UTC

Hello Eric,

Is it possible to add AVX support to the app? It seems like it could offer a significant improvement in floating-point calculations.

Thanks

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 28628 - Posted: 23 Jan 2017, 8:21:05 UTC - in response to Message 28616.

Thanks, noted, will do soonest. Now I can finally produce Windows executables again. Eric.
____________

kyrsjo
Project administrator
Project developer
Project tester
Project scientist
Joined: 23 Jan 17
Posts: 29
Credit: 94,364
RAC: 2,045
Message 30639 - Posted: 5 Jun 2017, 14:04:26 UTC

The test execs have an AVX version. From our initial testing the gain is rather small :( But let's see what happens when they are run on more machines...

Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 358
Credit: 78,217,893
RAC: 112,259
Message 30643 - Posted: 5 Jun 2017, 19:24:16 UTC

Sorry then, I assumed that this would be a good candidate; from these slides it seems like it would need to vectorize well

https://twiki.cern.ch/twiki/bin/view/Main/VinInn

http://docplayer.net/22964956-Haswell-conundrum-avx-or-not-avx.html

MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 494
Credit: 14,290,993
RAC: 12,147
Message 30645 - Posted: 6 Jun 2017, 4:02:16 UTC - in response to Message 30643.

Sorry then, I assumed that this would be a good candidate; from these slides it seems like it would need to vectorize well

https://twiki.cern.ch/twiki/bin/view/Main/VinInn

http://docplayer.net/22964956-Haswell-conundrum-avx-or-not-avx.html


Yeah, I was wondering about that myself since I have a few Sandy Bridge machines.

https://twiki.cern.ch/twiki/bin/view/LCG/VIAVXBenchMarks

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30653 - Posted: 6 Jun 2017, 11:00:36 UTC

I think Kyrre was misunderstood here. SSE2/SSE3 (or pni) gives about a factor of
2 speedup and should always be used when available, which I understand is
not always the case. I'll check with him as I think he means AVX doesn't give much more.
Overall vectorisation/pipelining gives a factor of 2 speedup for SixTrack.
(The code was vectorised many years ago for the Cray and IBM VF.) Eric.
____________

maeax
Joined: 2 May 07
Posts: 182
Credit: 11,291,167
RAC: 10,971
Message 30657 - Posted: 6 Jun 2017, 11:57:10 UTC - in response to Message 30653.
Last modified: 6 Jun 2017, 11:57:40 UTC

(The code was vectorised many years ago for the Cray and IBM VF.) Eric.

We are so sad not to have this computer at home ;-).
Or is that no longer true, given the FLOPS of today's computers?

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30660 - Posted: 6 Jun 2017, 12:23:46 UTC - in response to Message 30657.

Well I think there is a CRAY in Lausanne Museum :-)
Today even Cray is using massive parallelisation, with
a multitude of "cheap" processors as we predicted many
years ago! Mark you I am sure they are doing it very
professionally in the great Cray tradition. Eric.
____________

MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 494
Credit: 14,290,993
RAC: 12,147
Message 30665 - Posted: 6 Jun 2017, 16:46:13 UTC

These SixTrack tasks remind me of 2004, with tasks that run about 60 seconds or less... When you watch them they look like a pni error, but when you check they are valid... I did have a few that actually ran over 7 hours, though.

Not much to see in the stderr output, but imagine running those multi-core.
____________
Volunteer Mad Scientist For Life

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30668 - Posted: 6 Jun 2017, 18:07:03 UTC - in response to Message 30665.

Well they may or may not be valid. We are looking very hard at this.
It is just very complicated to rerun every "short" task and check whether it
is valid or not! But we are trying. Eric.
____________

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30669 - Posted: 6 Jun 2017, 18:11:41 UTC - in response to Message 30668.

...and sorry, I forgot to say that the files you downloaded may no longer be
on the server, and we may not be able to access the user's files either. Hence I
am concentrating on tasks/WUs with name wzero_jtbb2cm1.......
If you ever spot a null (zero-length) fort.10 I'd be very interested. Eric.
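
For anyone who wants to look for such files, here is a minimal sketch that walks a directory tree and lists every zero-length file named fort.10. The function name `find_empty_fort10` is made up for this example, and where the task output actually lives on your machine is an assumption you will need to fill in yourself.

```python
from pathlib import Path


def find_empty_fort10(root):
    """Return the sorted paths of zero-length files named fort.10 under root."""
    return sorted(
        p
        for p in Path(root).rglob("fort.10")  # search all subdirectories
        if p.is_file() and p.stat().st_size == 0  # keep only empty regular files
    )
```

Calling `find_empty_fort10(".")` from the directory that holds the task folders returns a list of `Path` objects, empty if no null results are present.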
____________

maeax
Joined: 2 May 07
Posts: 182
Credit: 11,291,167
RAC: 10,971
Message 30681 - Posted: 7 Jun 2017, 6:15:34 UTC
Last modified: 7 Jun 2017, 6:16:14 UTC

The finished ones are waiting for the second task to finish. All ran more than 40,000 seconds and are named wzero_jtbb2cm1...
https://lhcathome.cern.ch/lhcathome/results.php?userid=75468&offset=0&show_names=0&state=2&appid=1

Erich56
Joined: 18 Dec 15
Posts: 304
Credit: 3,432,451
RAC: 8,423
Message 30682 - Posted: 7 Jun 2017, 6:34:38 UTC - in response to Message 30681.

https://lhcathome.cern.ch/lhcathome/results.php?userid=75468&offset=0&show_names=0&state=2&appid=1

"Access denied" :-)

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30776 - Posted: 14 Jun 2017, 8:48:56 UTC

Well this is actually about the "null" empty results.
I have identified them and there are several different
causes, platform independent. This will be fixed in our
upcoming release. Eric.
____________

Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 358
Credit: 78,217,893
RAC: 112,259
Message 30781 - Posted: 14 Jun 2017, 16:59:10 UTC

Great news

Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 358
Credit: 78,217,893
RAC: 112,259
Message 31523 - Posted: 20 Jul 2017, 19:57:30 UTC

Since the new Skylake chips have AVX-512, is this something that the app would benefit from?

James Molson
Project administrator
Project developer
Project tester
Project scientist
Joined: 25 Jan 17
Posts: 16
Credit: 864,017
RAC: 1,034
Message 31526 - Posted: 20 Jul 2017, 20:41:38 UTC - in response to Message 31523.

yes

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 31618 - Posted: 25 Jul 2017, 4:28:48 UTC - in response to Message 31523.

Yes, but we can't use FMA because it rounds differently, which would break numerical portability of the results. Eric.

(Since the new Skylake chips have AVX-512, is this something that the app would benefit from?)
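
For readers wondering why FMA changes results: a fused multiply-add computes a*b + c with one rounding at the end, while a separate multiply and add round twice. The sketch below emulates the fused operation in Python with exact rational arithmetic. It is only an illustration, not SixTrack code; the function names and the input values are invented for the example, and it assumes finite inputs.

```python
from fractions import Fraction


def unfused_multiply_add(a, b, c):
    """Multiply, round to double, then add and round again (two roundings)."""
    return a * b + c


def emulated_fma(a, b, c):
    """Emulate a fused multiply-add: exact a*b + c, rounded to double once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no rounding here
    return float(exact)  # single rounding to the nearest double


# Illustrative inputs: the exact product is 1 - 2**-60, which a double
# cannot hold, so the separate multiply rounds it to exactly 1.0.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print(unfused_multiply_add(a, b, c))
print(emulated_fma(a, b, c))
```

With these inputs the two-step version returns 0.0 while the fused version keeps the small residual -2**-60. Both answers are correctly rounded for their own sequence of operations, but they are not bit-identical, so hosts with and without FMA could produce results that no longer match when compared against each other.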
____________
