Thread 'GPU advertised for LHC, but they don't do it?'

Author	Message
Jim1348 Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0	Message 40265 - Posted: 24 Oct 2019, 16:32:49 UTC - in response to Message 40263.
For some curious reason, my Radeon RX 570 does well on MilkyWay (Win7 64-bit), even though it is merely OK, not great, on DP. https://milkyway.cs.rpi.edu/milkyway/results.php?hostid=803662 So there are various factors involved. We will really need to try out a card and see how it does.

Magic Quantum Mechanic Joined: 24 Oct 04 Posts: 1282 Credit: 95,114,454 RAC: 40,869	Message 40266 - Posted: 24 Oct 2019, 17:31:08 UTC
The Einstein Project is the place to go if you want to run GPU tasks, or to find out anything about running them every day from the experts.

[VENETO] boboviz Joined: 7 May 08 Posts: 272 Credit: 2,128,847 RAC: 160	Message 40271 - Posted: 24 Oct 2019, 18:57:10 UTC - in response to Message 40258.
Quote: We are still in the process of developing SixTrackLib, validating the physics and tackling the integration into the main SixTrack. Our time scale is still probably a year or two to expose the functionality into LHC@Home, but we are already performing simulations using SixTrackLib on GPUs for specific problems.
That's great! If you need help debugging the GPU app... (I think you will use the lhc-dev project) :-P

Mr P Hucker Joined: 12 Aug 06 Posts: 449 Credit: 14,923,549 RAC: 2,016	Message 40276 - Posted: 25 Oct 2019, 11:05:16 UTC - in response to Message 40245.
Quote: Then how come there are projects which only use ATI but not Nvidia?
Quote: Which project? All OpenCL apps support both AMD and Nvidia GPUs. The only "problem" may be the different optimization of the code.
There aren't, but there are ones that don't list Intel but do list AMD, which is strange if OpenCL works on both.

Mr P Hucker Joined: 12 Aug 06 Posts: 449 Credit: 14,923,549 RAC: 2,016	Message 40278 - Posted: 25 Oct 2019, 11:10:45 UTC - in response to Message 40258. Last modified: 25 Oct 2019, 11:13:35 UTC
Quote: Despite this, GPUs also support 64-bit floating point calculations but at a lower speed, typically 16/32 times more slowly; still, the computing capabilities they provide are comparable to, and often exceed, those of CPUs. Even the latest embedded GPUs in notebooks are fairly capable. There are also GPUs that are strong in 64-bit arithmetic, such as Nvidia Quadro GV100, Tesla V100/P100, Titan V, AMD Radeon VII, Firepro W9100, W8100, although they are very expensive and less common in workstations (instead more common in supercomputers and at cloud providers).
2nd hand cards are great for 64 bit. The R9 280X is the best I've found: £60 2nd hand, and it does 1024 GFLOPS at 64 bit (that's a 1:4 ratio; older cards tended to have more double precision). More expensive ones are faster, but you're better off buying several of these instead for the same price and getting even more computing power. I'm going to experiment with PCI Express splitters at some point when I get more cards. It could be possible to get about 20 GPUs connected to one motherboard; I know bitcoin miners have done it.

Mr P Hucker Joined: 12 Aug 06 Posts: 449 Credit: 14,923,549 RAC: 2,016	Message 40280 - Posted: 25 Oct 2019, 11:32:19 UTC - in response to Message 40258.
Quote: Generally GPUs have very strong 32-bit floating point capabilities, but we will not be able to profit from them. The calculations needed by SixTrack require variables with large dynamic range, which is very different from tasks like machine learning.
Is it true that you can do a 64-bit calculation using two or more 32-bit floating point units on the card? What's the performance hit if you did?

zepingouin Joined: 7 Jan 07 Posts: 41 Credit: 16,121,918 RAC: 25	Message 41137 - Posted: 1 Jan 2020, 14:46:35 UTC - in response to Message 40280.
Quote: Is it true that you can do a 64bit calculation using two or more 32bit floating point units on the card? What's the performance hit if you did?
Typically reported performance is ~40% of FP32 performance, which is an order of magnitude better than the 1/24 (~4%) FP64:FP32 performance available in almost all consumer-grade GPUs. (Source)
HNY 2020!
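
To make the "two FP32 values for one number" idea concrete, here is a minimal sketch of what is usually called float-float (or double-single) arithmetic. It is an illustration only, not SixTrack or SixTrackLib code: FP32 is emulated on the CPU by rounding through `struct`, and the helper names `f32`, `two_sum`, `split` and `ff_add` are made up for this example. Each number is stored as an unevaluated sum `hi + lo` of two FP32 values.

```python
import struct

def f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def two_sum(a, b):
    """Error-free FP32 addition: returns (s, e) with s + e equal to a + b."""
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def split(x):
    """Represent x as a (hi, lo) pair of FP32 values."""
    hi = f32(x)
    lo = f32(x - hi)
    return hi, lo

def ff_add(x, y):
    """Add two (hi, lo) pairs using only FP32 operations."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))
    hi = f32(s + e)                # renormalise so that lo stays small
    lo = f32(e - f32(hi - s))
    return hi, lo

tiny = 2.0 ** -30                  # far below FP32 resolution near 1.0
plain = f32(f32(1.0 + tiny) + f32(tiny))        # ordinary FP32: tiny is lost
pair = ff_add(split(1.0 + tiny), split(tiny))   # float-float keeps it in lo

print("plain FP32 :", plain)
print("float-float:", pair[0] + pair[1])
```

In this sketch one addition costs 11 FP32 operations, which is where the performance hit comes from. The pair also carries only up to about 48 significant bits (against 53 for real FP64) and still has the FP32 exponent range, so it is not a drop-in replacement for double precision.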

Mr P Hucker Joined: 12 Aug 06 Posts: 449 Credit: 14,923,549 RAC: 2,016	Message 41139 - Posted: 1 Jan 2020, 18:37:06 UTC - in response to Message 41137.
Quote: Is it true that you can do a 64bit calculation using two or more 32bit floating point units on the card? What's the performance hit if you did?
Quote: Typically reported performance is ~40% of FP32 performance, which is an order of magnitude better than the 1/24 (~4%) FP64:FP32 performance available in almost all consumer-grade GPUs. (Source) HNY 2020!
Interesting. I tend to buy up 2nd hand cards from back when they had a better ratio. Still, mine is 1:4, so 25%. So I wonder why Milkyway uses double precision? According to what you said, they'd get more calculations out of even my card.
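
The arithmetic behind that comparison can be written out as a rough sketch. The figures are the ones quoted in this thread (1024 GFLOPS FP64 at a 1:4 ratio, a 1:24 ratio on typical consumer cards, and the reported ~40% of FP32 throughput for emulation); the function names are invented for the example, and the 40% value is taken on trust, not measured.

```python
def native_fp64_gflops(fp32_gflops, ratio):
    """Peak native FP64 throughput for a given FP64:FP32 ratio."""
    return fp32_gflops * ratio

def emulated_gflops(fp32_gflops, efficiency=0.40):
    """Throughput of FP32-based emulation at the efficiency reported above."""
    return fp32_gflops * efficiency

# 1024 GFLOPS FP64 at a 1:4 ratio implies 4096 GFLOPS FP32 peak.
fp32_peak = 1024 * 4

for label, ratio in [("1:4", 1 / 4), ("1:24", 1 / 24)]:
    native = native_fp64_gflops(fp32_peak, ratio)
    emulated = emulated_gflops(fp32_peak)
    print(f"{label:>4}: native {native:7.1f} GFLOPS, "
          f"emulated {emulated:7.1f} GFLOPS, "
          f"emulated/native = {emulated / native:.1f}x")
```

On paper the emulated path wins at both ratios, but the comparison is not like for like: a pair of FP32 values holds fewer significant bits than a real FP64 number and keeps the FP32 exponent range, so a project that needs genuine double precision cannot simply switch.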

zepingouin Joined: 7 Jan 07 Posts: 41 Credit: 16,121,918 RAC: 25	Message 41141 - Posted: 2 Jan 2020, 15:09:33 UTC - in response to Message 41139. Last modified: 2 Jan 2020, 15:11:03 UTC
Quote: Interesting. I tend to buy up 2nd hand cards from back when they had a better ratio. Still, mine is 1:4, so 25%. So I wonder why Milkyway uses double precision? According to what you said, they'd get more calculations out of even my card.
It's all about the precision needed to do the maths: no rounding errors, no overflow, etc. Some reading for clues: Floating Point's not Real

David E. Merchant Joined: 11 Apr 17 Posts: 39 Credit: 7,735,161 RAC: 0	Message 41946 - Posted: 18 Mar 2020, 9:16:13 UTC - in response to Message 41141.
Well, 64 bit floats still give you rounding errors but they tend to be way out there past the decimal point. :)

Mr P Hucker Joined: 12 Aug 06 Posts: 449 Credit: 14,923,549 RAC: 2,016	Message 41991 - Posted: 24 Mar 2020, 19:45:28 UTC - in response to Message 41946.
Quote: Well, 64 bit floats still give you rounding errors but they tend to be way out there past the decimal points. :)
Can more calculations at FP32 not be done to make the accuracy better? On cards with stupidly low ratios like 1:24, even doing 10 calculations at FP32 would be faster than 1 at FP64.

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2724 Credit: 299,800,971 RAC: 52,749	Message 41994 - Posted: 24 Mar 2020, 20:34:24 UTC - in response to Message 41991.
https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html This might be a good starting point to understand why it is not a good idea to work with reduced precision.

Mr P Hucker Joined: 12 Aug 06 Posts: 449 Credit: 14,923,549 RAC: 2,016	Message 41995 - Posted: 24 Mar 2020, 20:56:29 UTC - in response to Message 41994.
Quote: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html This might be a good starting point to understand why it is not a good idea to work with reduced precision.
I know reduced precision is bad. I don't have enough maths skills to understand any papers on it. But can't more calculations be done at FP32 to make it more accurate? Even if it requires several, it would still be faster than FP64 on most cards.
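
A small experiment shows both sides of this question. The sketch below is an illustration only (FP32 is emulated on the CPU via `struct`, and the function names are made up). It adds 0.1 one hundred thousand times in three ways: plain FP32, FP32 with Kahan compensated summation (a standard way of spending extra FP32 operations to recover lost bits), and ordinary FP64.

```python
import struct

def f32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def naive_sum_f32(value, n):
    """Add value n times, rounding to FP32 after every addition."""
    total = 0.0
    v = f32(value)
    for _ in range(n):
        total = f32(total + v)
    return total

def kahan_sum_f32(value, n):
    """Same sum, but with Kahan compensation; every operation is still FP32."""
    total = 0.0
    comp = 0.0                      # running estimate of the lost low bits
    v = f32(value)
    for _ in range(n):
        y = f32(v - comp)
        t = f32(total + y)
        comp = f32(f32(t - total) - y)
        total = t
    return total

def sum_f64(value, n):
    """Reference: the same loop in ordinary FP64."""
    total = 0.0
    for _ in range(n):
        total += value
    return total

n = 100_000
exact = 10_000.0
err_naive = abs(naive_sum_f32(0.1, n) - exact)
err_kahan = abs(kahan_sum_f32(0.1, n) - exact)
err_f64 = abs(sum_f64(0.1, n) - exact)

print("naive FP32 error:", err_naive)
print("Kahan FP32 error:", err_kahan)
print("FP64 error      :", err_f64)
```

So extra FP32 work can help, but only when the algorithm is restructured around it: the compensated loop uses four FP32 operations per term instead of one, the final answer is still stored in a single FP32 value, and the trick does nothing for the limited exponent range. Simply repeating the same FP32 calculation more times does not make it more accurate.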

[VENETO] boboviz Joined: 7 May 08 Posts: 272 Credit: 2,128,847 RAC: 160	Message 42848 - Posted: 12 Jun 2020, 15:08:31 UTC - in response to Message 40258.
Quote: As some of you argued, we are focusing on using OpenCL 1.2 because it allows us to run on AMD, Intel and Nvidia GPUs, although it is less advanced than other options (OpenCL 1.2 dates back to 2011).
You can use the upcoming OpenCL 3.0, which is OpenCL 1.2 plus some extras...

[VENETO] boboviz Joined: 7 May 08 Posts: 272 Credit: 2,128,847 RAC: 160	Message 43473 - Posted: 7 Oct 2020, 8:10:12 UTC - in response to Message 42848. Last modified: 7 Oct 2020, 8:10:47 UTC
It seems that development of the sixtrack code is frozen... last modification 15 Jul, latest version 19 Dec 2019...

Sesson Joined: 4 Apr 19 Posts: 31 Credit: 4,860,362 RAC: 0	Message 43478 - Posted: 7 Oct 2020, 16:30:38 UTC - in response to Message 41995.
Quote: https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html This might be a good starting point to understand why it is not a good idea to work with reduced precision.
Quote: I know reduced precision is bad. I don't have enough maths skills to understand any papers on it. But can't more calculations be done at FP32 to make it more accurate? Even if it requires several, it would still be faster than FP64 on most cards.
One of the features of Sixtrack is that it always tries to produce the same result across all the platforms it supports. In order to do that, the scientists even added brackets to expressions to prohibit the compiler from freely choosing a mathematically equivalent computation. You can see the scientific publications of Sixtrack and LHC@home for stories like that. Now here comes the GPU. Not only can it not reproduce the bit-exact result of the CPU, it must also operate at reduced precision for best performance. I doubt scientists would ever accept it.
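
The point about brackets is easy to demonstrate, because floating point addition is not associative: the same numbers added in a different order can round differently. The snippet below is a generic illustration in ordinary FP64 and is not taken from Sixtrack; `ordered_sum` is a made-up helper that adds strictly left to right.

```python
def ordered_sum(values):
    """Add the values strictly left to right, one rounding per step."""
    total = 0.0
    for v in values:
        total = total + v
    return total

# Mathematically identical expressions, different rounding.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
print(left == right)

# A more drastic case: the 1.0 is absorbed by the huge term in one order
# and survives in the other.
print(ordered_sum([1e16, 1.0, -1e16]))
print(ordered_sum([1e16, -1e16, 1.0]))
```

Explicit brackets pin down one evaluation order, so every compiler has to round at the same places. A GPU typically sums in parallel, which changes the order of operations, and that is one reason bit-identical results between CPU and GPU are hard to guarantee.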

[VENETO] boboviz Joined: 7 May 08 Posts: 272 Credit: 2,128,847 RAC: 160	Message 43509 - Posted: 16 Oct 2020, 13:53:37 UTC - in response to Message 43478.
Quote: Now here comes GPU. Not only it cannot reproduce the bit-exact result as CPU, but must operate at reduced precision for best performance. I'd doubt if scientists would ever accept it.
I don't know. In the past they have often spoken about GPUs in sixtrack (I also spoke with Mcintosh) and it seems that they are very interested in GPU calculation. But it depends on them...

[VENETO] boboviz Joined: 7 May 08 Posts: 272 Credit: 2,128,847 RAC: 160	Message 43630 - Posted: 17 Nov 2020, 9:52:44 UTC
Interesting video about GPU and LHC

[VENETO] boboviz Joined: 7 May 08 Posts: 272 Credit: 2,128,847 RAC: 160	Message 43631 - Posted: 17 Nov 2020, 9:56:54 UTC
AMD released the first GPU with over 10 TFLOPS FP64 (and 23 TFLOPS FP32): AMD MI100

Ben Segal Volunteer moderator Project administrator Joined: 1 Sep 04 Posts: 143 Credit: 2,579 RAC: 0	Message 43645 - Posted: 18 Nov 2020, 12:22:43 UTC - in response to Message 43630.
Yes, very interesting but nothing to do with Sixtrack or LHC@home. This describes the LHCb online triggering system.
Quote: Interesting video about gpu and LHC