Message boards : Number crunching : Inefficient usage of AMD threadripper 16c32t CPU

far

Joined: 27 May 11
Posts: 5
Credit: 9,747,819
RAC: 0
Message 40172 - Posted: 16 Oct 2019, 23:27:23 UTC

Hi,
LHC projects appear to use an AMD Threadripper CPU extremely inefficiently.
Currently a 32-CPU Theory Simulation 263.98 job is running. While it locks out all 32 threads, it only uses 18% of the CPU capacity.
Other LHC jobs behave in a similar manner, even when they are 8-CPU jobs, e.g. ATLAS tasks cause the same sort of issue.

All other CPU-based BOINC projects I run, which use one thread per task, consistently use 95-98% of the CPU's capacity.

The AMD CPU has chiplets: small groups of cores (8 each, from memory) that sit on separate pieces of silicon and are connected via a substrate under the CPU cover.

Q1 - Would it cause any issues if I upgraded the VirtualBox installation to whatever is current? From its release notes, the more recent versions appear to be more aware of the quirks AMD CPUs may have compared to Intel ones, and maybe this could help.

Q2 - Is there any advice regarding preference settings that could assist in more fully using the CPU?

Thanks, Far
ID: 40172
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,800
RAC: 1,930
Message 40173 - Posted: 17 Oct 2019, 5:48:17 UTC - in response to Message 40172.  

Q1 - Upgrade to a recent VirtualBox version

Q2 - For Theory, set 'Max # of jobs for this project' to 'No limit' and 'Max # of CPUs for this project' to '1' in your preferences.
For ATLAS the settings could be 'No limit' and 4. If you want a combination of tasks, you have to use a local app_config.xml, but first try running only Theory or only ATLAS.
ID: 40173
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Joined: 15 Jun 08
Posts: 2386
Credit: 223,033,372
RAC: 137,035
Message 40174 - Posted: 17 Oct 2019, 6:17:30 UTC - in response to Message 40172.  

Would it cause any issues if I upgraded the virtual box installation to whatever is current?

In most cases it is recommended to use the most recent VirtualBox version.
A downgrade makes sense if you encounter crashes.

Be aware that this will not really solve your problems as they are mainly caused by your setup.




Is there any advice regarding preference settings that could assist in more fully using the CPU?

ATLAS

It looks like your ATLAS tasks are configured to use 8 cores.
In an 8-core setup the RAM requirement per task will be 10200 MB.
2019-10-15 15:07:06 (8852): Setting Memory Size for VM. (10200MB)
2019-10-15 15:07:07 (8852): Setting CPU Count for VM. (8)

Your computer has 32 GB RAM.
Hence it will be able to run up to 3 ATLAS tasks, most likely only 2 as some RAM is required for your OS and other work.
If 3 ATLAS tasks are running and something else needs RAM, your BOINC client will pause and later restart 1 ATLAS task, which costs additional resources.
See the timestamps from this example:
2019-10-15 15:09:57 (8852): VM state change detected. (old = 'running', new = 'paused')
2019-10-15 15:10:07 (8852): VM state change detected. (old = 'paused', new = 'running')
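
The RAM arithmetic above can be checked with a short Python sketch. The 10200 MB per task and the 32 GB total come from this thread; the 4096 MB set aside for the OS and other work is only an assumed figure for illustration, and the function name is hypothetical.

```python
def max_concurrent_tasks(total_ram_mb, ram_per_task_mb, reserved_mb=0):
    """How many VM tasks fit into RAM once reserved_mb is set aside."""
    usable_mb = total_ram_mb - reserved_mb
    return max(usable_mb // ram_per_task_mb, 0)

TOTAL_RAM_MB = 32 * 1024    # 32 GB machine from this thread
ATLAS_8CORE_MB = 10200      # value taken from the log line above
OS_RESERVE_MB = 4096        # ASSUMPTION: RAM kept free for the OS and other work

upper_bound = max_concurrent_tasks(TOTAL_RAM_MB, ATLAS_8CORE_MB)
realistic = max_concurrent_tasks(TOTAL_RAM_MB, ATLAS_8CORE_MB, OS_RESERVE_MB)

print(f"Upper bound: {upper_bound} ATLAS tasks")
print(f"With {OS_RESERVE_MB} MB reserved: {realistic} ATLAS tasks")
```

With these numbers the upper bound is 3 tasks and the reserved-RAM estimate is 2, which matches the "up to 3, most likely only 2" statement.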


ATLAS is the only real multithreading app here at LHC@home, but only during its calculation phase. During startup and shutdown of a task it runs single-core scripts, e.g. to fill the local CVMFS cache or to expand/compress the job data.
Idle cores can't be used by other BOINC tasks during these phases.



Theory

This is not a real multithreading app.
If you configure an n-core setup, it provides an n-core VM, but inside the VM it runs n subtask slots and each subtask slot runs an independent scientific app.
Each scientific app needs anywhere from 10-15 min to much more than a day to finish its calculations.
Empty subtask slots will be repopulated if the total task runtime is lower than 12 h.
After the 12 h mark the subtask slots will not be repopulated.
So, if you get one of those longrunners shortly before the 12 h mark and all other subtasks finish early, you will have n-1 idle cores for many hours.

What to do?
Theory runs best as n 1-core tasks instead of 1 n-core tasks.
The 32-core setup you described is the most inefficient you can have.
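
A rough back-of-envelope sketch of the idle-core effect described above, in Python. The slot durations are made up for illustration (a hypothetical 30 h longrunner picked up at hour 11); only the 12 h repopulation cutoff and the 32-core setup come from this thread.

```python
def vm_core_utilization(slot_busy_hours):
    """Fraction of reserved core-hours that one n-core VM actually uses.

    slot_busy_hours[i] is how long subtask slot i was busy. The VM keeps all
    n cores reserved until its last slot finishes, so the reserved time is
    n * max(slot_busy_hours).
    """
    n = len(slot_busy_hours)
    wall_clock = max(slot_busy_hours)
    return sum(slot_busy_hours) / (n * wall_clock)

# HYPOTHETICAL 32-core task: 31 slots stay busy until the 12 h mark and are
# not repopulated; one slot started a 30 h longrunner at hour 11, so it is
# busy until hour 41.
multi_core = vm_core_utilization([12.0] * 31 + [41.0])

# The same longrunner as a 1-core task only ever reserves the core it uses.
single_core = vm_core_utilization([41.0])

print(f"32-core VM: {multi_core:.0%} of reserved core-hours used")
print(f"1-core VM:  {single_core:.0%} of reserved core-hours used")
```

In this made-up case the 32-core VM uses only about a third of the core-hours it blocks, while 1-core tasks free their core as soon as they finish so BOINC can schedule other work on it.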
ID: 40174
far

Joined: 27 May 11
Posts: 5
Credit: 9,747,819
RAC: 0
Message 40248 - Posted: 23 Oct 2019, 4:28:54 UTC - in response to Message 40174.  

Hi there
VirtualBox is now upgraded to the current version. Hopefully this makes it more aware of the AMD architecture and of ways to group CPUs efficiently.
The BOINC group had recommended not to upgrade, but as you are the only project on that machine using it, it is all good.
To stop idle time on this machine, which was basically built just for BOINC, I've disabled ATLAS and Theory tasks, so hopefully it will now run as close to 100% as possible most of the time.
If there are better preference settings, e.g. limiting the number of CPUs available for LHC tasks instead, please let me know.
Thanks and regards, Far
ID: 40248
the Kris

Joined: 29 Mar 10
Posts: 2
Credit: 1,183,960
RAC: 0
Message 40249 - Posted: 23 Oct 2019, 8:55:03 UTC - in response to Message 40174.  
Last modified: 23 Oct 2019, 8:59:41 UTC

I have the same problem on a Ryzen 1700.
The Theory task locks out 15 CPUs but used only 4 in the beginning, and now it is using only 1, while not even at 50% progress.

"Theory runs best as n 1-core tasks instead of 1 n-core tasks."

How do I force *only* Theory to run as "n 1-core tasks"?
ID: 40249
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,800
RAC: 1,930
Message 40250 - Posted: 23 Oct 2019, 9:19:59 UTC - in response to Message 40249.  

How do I force *only* Theory to run as "n 1-core tasks"?
In your project preferences, set Max # CPUs to 1.

If you want to run a mix of Theory with other LHC VBox applications you have to use an app_config.xml.
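
As a hedged illustration of what such a file might contain, the Python sketch below generates an app_config.xml skeleton. The element names (`app`, `name`, `max_concurrent`, `app_version`, `app_name`, `avg_ncpus`) are the ones used by BOINC's client configuration format; the app names "Theory" and "ATLAS" and all numbers are placeholders that would need to be checked against the names in your own client_state.xml, and whether the VM applications honour `avg_ncpus` without further options is an assumption to verify.

```python
import xml.etree.ElementTree as ET

def build_app_config(apps):
    """Build an <app_config> tree from {app_name: settings} (illustrative helper)."""
    root = ET.Element("app_config")
    for app_name, settings in apps.items():
        # Limit how many tasks of this app run at the same time.
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = app_name
        ET.SubElement(app, "max_concurrent").text = str(settings["max_concurrent"])
        # Tell the client how many CPUs one task of this app should count as.
        version = ET.SubElement(root, "app_version")
        ET.SubElement(version, "app_name").text = app_name
        ET.SubElement(version, "avg_ncpus").text = str(settings["avg_ncpus"])
    return root

# PLACEHOLDER app names and limits; adjust to your own host and project.
config = build_app_config({
    "Theory": {"max_concurrent": 8, "avg_ncpus": 1},
    "ATLAS": {"max_concurrent": 2, "avg_ncpus": 4},
})

if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
    ET.indent(config)
xml_text = ET.tostring(config, encoding="unicode")
print(xml_text)
```

The printed XML would go into a file named app_config.xml in the project's directory inside the BOINC data folder, after which the client has to re-read its config files.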
ID: 40250
Mr P Hucker

Joined: 12 Aug 06
Posts: 418
Credit: 5,667,249
RAC: 48
Message 40300 - Posted: 27 Oct 2019, 16:12:27 UTC
Last modified: 27 Oct 2019, 16:21:17 UTC

This does not make any sense. What is the program doing when the CPU is at 5% and the disk is not in use, and there's no network activity? There has to be a bottleneck somewhere! What is preventing the program from maxing out the CPU?

Ignore the above paragraph. I was only looking at a 6-core task; the 1-core tasks weren't downloaded yet. I can understand a 6-core task having 1 core waiting on the others. Anyway, it's jumped up to 60% now, and once I install the new VirtualBox maybe even more.

I was going to delete this post, but this forum seems to have no delete function.
ID: 40300
far

Joined: 27 May 11
Posts: 5
Credit: 9,747,819
RAC: 0
Message 40422 - Posted: 13 Nov 2019, 9:06:12 UTC

Updating VirtualBox to the current version (not the old one bundled by BOINC), plus disabling ATLAS and Theory tasks, got the machine running at about 95% consistently (albeit over a spread of tasks). That was quite an improvement over the previous average, but about the same as the previous peak usage.

After applying Windows 10 feature update 19H2 today, the PC finally sits consistently at 100% usage! After nearly 2 years..
There was something in the update about being more aware of cores and tasks + favoured cores etc, and it's made a difference for me :-)

Hopefully this helps others with the same issues.
ID: 40422



©2024 CERN