Message boards : Theory Application : Theory Simulation v263.90 (vbox64_mt_mcore): Big differences in scoring
Joined: 16 Dec 16 Posts: 3 Credit: 27,993,679 RAC: 0

Hi, there are big differences in the credits for Theory Simulation v263.90 (vbox64_mt_mcore). My AMD 2700X is getting about 0.015 credits per second (CPU time), while for example a 1700 reaches 0.40 credits per second (CPU time). The operating systems are different, but a difference of factor 26?! What is the secret? :D

2700X: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10572024&offset=0&show_names=0&state=4&appid=13
1700: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10389189&offset=0&show_names=0&state=0&appid=13

Thanks and happy crunching!
Joined: 18 Dec 15 Posts: 1835 Credit: 120,858,731 RAC: 80,017

> What is the secret? :D

I have brought up this topic in the past, since I am also experiencing huge differences which are totally illogical. But so far, no one has been able to provide a comprehensible explanation.
Joined: 18 Dec 15 Posts: 1835 Credit: 120,858,731 RAC: 80,017

The big differences in scoring also exist with the latest version, 263.98. Here is an example of two tasks finished on 7/19 and 7/20 on the same host, with no changes whatsoever made by me:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=237289911
runtime: 45,643.06 - CPU time: 44,278.19 - credit points: 1,227.11

https://lhcathome.cern.ch/lhcathome/result.php?resultid=237326272
runtime: 47,549.99 - CPU time: 46,239.37 - credit points: 639.19

Can someone enlighten me as to why the second task delivers only half the points? It would be very interesting to know, just out of curiosity.
Joined: 14 Jan 10 Posts: 1435 Credit: 9,599,192 RAC: 3,468

Your first mentioned task reports: Device peak FLOPS 11.61 GFLOPS
Your second task, with the lower credit, reports: Device peak FLOPS 5.81 GFLOPS (half)

Where does that difference come from?

The first task: "Setting Memory Size for VM. (1030MB)", which indicates the task thinks 4 cores are used.
The second task: "Setting Memory Size for VM. (830MB)", which looks like 'only' 2 cores are needed.
Joined: 18 Dec 15 Posts: 1835 Credit: 120,858,731 RAC: 80,017

Crystal Pellet, thanks for the hint regarding memory size and the number of cores - I hadn't noticed this (the difference in the peak FLOPS value I only saw after I had written my posting, and I was wondering about it). The question now is: why is this so? As said above, I didn't make any changes to any settings at all.
Joined: 28 Sep 04 Posts: 738 Credit: 50,193,546 RAC: 25,054

I think the "Max # CPUs" setting in the web preferences is used as a multiplier when the server computes GFLOPS for the host. It does not use the number of CPUs the user is actually using (from app_config => stderr). The GFLOPS value is used in the calculation of the granted credits, the required memory, and the time limit for error 197 EXIT_TIME_LIMIT_EXCEEDED. If you set a smaller number of CPUs in your app_config.xml than in the web preferences, you will get higher credits, but you are also more likely to get the 197 error if the task is a long one. At least this is what I think is happening; please correct me if I am wrong.
Joined: 18 Dec 15 Posts: 1835 Credit: 120,858,731 RAC: 80,017

> At least this is what I think is happening, please correct me if I am wrong.

Well, the thing is (as mentioned before): I didn't change anything. So why, all of a sudden, is there a change in behaviour?
Joined: 18 Dec 15 Posts: 1835 Credit: 120,858,731 RAC: 80,017

> ... but you are also more likely to get the 197 error if the task is a long one.

This problem should no longer occur, since Laurence wrote:

> I have pushed out a new version (263.98) which doubles the lifetime of the VM. This should allow more time for the last job to run.
Joined: 28 Sep 04 Posts: 738 Credit: 50,193,546 RAC: 25,054

> This problem should no longer occur, since Laurence wrote: ...

Yes, let's hope that this solves that problem. But the mechanism is still there.
Joined: 18 Dec 15 Posts: 1835 Credit: 120,858,731 RAC: 80,017

I have made the following interesting observation on a notebook with an Intel i5 M 480 @ 2.67GHz CPU:

Task - Total runtime - CPU time - credits
Theory 1-core: 129,937.35 - 129,148.50 - 2,623.52
Theory 2-core: 129,949.30 - 92,688.76 - 2,623.76

Hence, the following questions:
- the CPU time for a 2-core task should be around double that of a 1-core task, not LESS!
- the credit for a 2-core task should be around double that of a 1-core task, right?

What is going wrong here?
Joined: 14 Jan 10 Posts: 1435 Credit: 9,599,192 RAC: 3,468

> Hence, the following questions: ...

It's just calculating the credits as already mostly explained in another Theory thread, "New version 263.90". Example post there: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4890&postid=39793

The extra info to add is that the run time, not the CPU time, is used for the calculation. It seems you had set "Max # CPUs" to 3 in your preferences for that machine, but reduced the number of usable CPUs to 1 for one task and to 2 for the other by means of app_config.xml.
Joined: 18 Dec 15 Posts: 1835 Credit: 120,858,731 RAC: 80,017

> It seems you had set "Max # CPUs" to 3 in your preferences for that machine, but reduced the number of usable CPUs to 1 for one task and to 2 for the other by means of app_config.xml.

Yes, this is exactly what I did. And the results make me wonder. So why should one ever run a 2-core (or even higher-core) task if the result, in terms of CPU time and credits, is the same or even worse? Am I missing something?
Joined: 14 Jan 10 Posts: 1435 Credit: 9,599,192 RAC: 3,468

> So why should one ever run a 2-core (or even higher-core) task, if the result, in terms of CPU time and of credits, is the same, or even worse? Am I missing something?

The best practice for Theory is to set up single-core VMs to process as many jobs as possible for MC Production inside the VM(s). On my 8-threaded machine I run 8 single-core VMs (executing cap set to 90% to avoid sluggishness) and I'm using the snapshot mechanism for safety. In fact, LHC@home could skip the Theory mt application, in my opinion, except for the very few users with very low RAM. Only they could run more jobs inside a multi-core VM, because their RAM is too low to set up 2 VMs with 730MB RAM each. (The current incorrect server setting of 750MB + (750MB * cores) doesn't help here.)
Joined: 15 Jun 08 Posts: 2564 Credit: 257,164,581 RAC: 113,553

> In fact LHC@home could skip the Theory mt-application ...

Right. Theory vbox should be made a single-core app, at least until BOINC has much better multi-core support. This would avoid typical misconfigurations as well as lots of discussions.
Joined: 18 Dec 15 Posts: 1835 Credit: 120,858,731 RAC: 80,017

> Theory vbox should be made a single-core app.

I fully agree. My experience with Theory multi-core processing in the recent past (on more than one machine) has shown that it definitely does not work as intended or as expected.
Joined: 27 Sep 08 Posts: 853 Credit: 696,407,422 RAC: 127,087

I have always done this; the only reason I can see not to run single-core tasks is to save on RAM usage. After the changes to the Working Set in the latest version, I tried running some 8-core tasks to see if I could push the CPU usage up, as each task was assigned 22 GB of RAM. When you look inside the VM, it seemingly would not even use 8 cores; maybe one time I think it did. I'm running 4-core tasks at the moment and it seems good, so the most I could recommend is a 4-core WU.
©2025 CERN