CPUs left unused???
Author | Message |
---|---|
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,905,247 RAC: 60,096 |
For example, an E5-2699v4 has 22c/44t and runs 4 GPU WUs. That leaves 40 threads for LHC WUs. I do not use VirtualBox, so I get WUs for SixTrack, native Theory & native ATLAS. I'm not running any other CPU project. Yet it's only running 23 WUs and leaving 17 threads idle. I also have Max # CPUs set to 1 in preferences. It has 32 GB RAM and is using 17.5 GB now. I have not created an app_config.xml file and my cc_config.xml has <ncpus>-1</ncpus>. Any idea why so many resources are not being used??? |
Send message Joined: 15 Jun 08 Posts: 2411 Credit: 226,401,842 RAC: 131,719 |
Your BOINC client may be set to use not more than 55-60% of the available RAM. Check this setting in your BOINC manager. |
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,905,247 RAC: 60,096 |
Memory is set to 95, 95 & 95% with 100% CPU. Plenty of space left on the SSD. Might there be an L3 Cache limitation??? An E5-2699v4 has 55 MB. |
Send message Joined: 15 Jun 08 Posts: 2411 Credit: 226,401,842 RAC: 131,719 |
"L3 Cache limitation" Nice joke! ;-D "Plenty of space left on the SSD" Did you also check how much disk space your BOINC client is allowed to use? Do you have unstarted tasks waiting in your buffer? If not you may have hit a project limit (max #tasks). In this case you can only solve the issue if you set up additional BOINC clients on that machine. |
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,905,247 RAC: 60,096 |
"L3 Cache limitation. Nice joke! ;-D" Wish I knew some computer jokes. The MIP project on WCG admitted to improperly programming the L3 Cache usage. Limited to 5 MB per WU; above that, CPU performance drops over 60% for all work. Not that it might change the number of WUs the server DLs. "Did you also check how much disk space your BOINC client is allowed to use?" That computer is using 4.79 GB for LHC with 57.5 GB available to BOINC. "Do you have unstarted tasks waiting in your buffer?" No. I increased my buffer from 0.2/0.2 to 0.5/0.5 and still no waiting WUs. "If not you may have hit a project limit (max #tasks). In this case you can only solve the issue if you set up additional BOINC clients on that machine." Bizarre! Why would they set a project limit? Don't they want the work to get done??? I'm not willing to set up additional BOINC clients so I guess I'll move to another project. Just Allowed New Work for another CPU project & it immediately filled the 17 idle CPU threads. |
Send message Joined: 14 Jan 10 Posts: 1280 Credit: 8,491,903 RAC: 2,069 |
|
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,905,247 RAC: 60,096 |
Did you set |
Send message Joined: 14 Jan 10 Posts: 1280 Credit: 8,491,903 RAC: 2,069 |
There is something wrong with the server setting "Max # CPUs: 1". If you set it to No limit, I think you will fill up your buffer with SixTrack tasks. See if that works, but disable ATLAS and Theory native for the time being. If your problem is solved (getting all cores busy), you could use an app_config.xml to manage the cores for Theory and ATLAS. |
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,905,247 RAC: 60,096 |
I just checked all my computers: CPUs with 28 or fewer threads fill up with LHC WUs, but CPUs with 32, 36, 40 & 44 threads get fewer WUs than a 28-thread CPU. I just allowed another CPU project, TN-Grid, for all big CPUs. I'll suspend it and try the CPU unlimited setting. Using an app_config to limit the number of WUs for an application throws the queue out of balance. The server DLs too many of the restricted WUs and maybe not enough of the unrestricted applications. This approach requires babysitting and aborting excess restricted WUs. Not the way I like to go. |
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,905,247 RAC: 60,096 |
Just suspended TN-Grid on those 13 computers and switched preferences to CPU No Limit. Got one or two SixTrack tasks but mostly 2C Theory_native and 12C Atlas_native. |
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,905,247 RAC: 60,096 |
For CPUs with 32 or more threads, if LHC plus another CPU project are allowed, then the total number of running WUs is limited. If the LHC project is suspended, then the non-LHC project starts running all the WUs it rightly should. So until LHC fixes this problem I'll direct those 13 computers to TN-Grid and turn off LHC. |
Send message Joined: 12 Jun 18 Posts: 126 Credit: 53,905,247 RAC: 60,096 |
Sure would be nice if LHC staff would look at this bug and comment. TIA |
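
As a rough illustration of the numbers discussed in this thread, here is a minimal Python sketch. It first reproduces the thread accounting from the first post (44 threads, 4 GPU WUs, 23 running CPU WUs), assuming each GPU WU reserves one full CPU thread as that post does. It then builds the text of an app_config.xml of the kind suggested above for Theory and ATLAS. The helper names `idle_threads` and `build_app_config`, the app names `ATLAS` and `Theory`, and the limits 2 and 8 are assumptions for illustration only; the real app names have to match what the client lists in client_state.xml.

```python
import xml.etree.ElementTree as ET


def idle_threads(total_threads, gpu_tasks, running_cpu_tasks, threads_per_task=1):
    """Return the number of hardware threads left idle.

    Assumption: every GPU task reserves one full CPU thread,
    which is how the first post in the thread counts them.
    """
    available = total_threads - gpu_tasks
    return available - running_cpu_tasks * threads_per_task


def build_app_config(limits):
    """Build app_config.xml text from a {app_name: max_concurrent} mapping.

    The app names passed in are placeholders (hypothetical); they must
    match the names the BOINC client reports for the project.
    """
    lines = ["<app_config>"]
    for app_name, max_concurrent in limits.items():
        lines += [
            "  <app>",
            f"    <name>{app_name}</name>",
            f"    <max_concurrent>{int(max_concurrent)}</max_concurrent>",
            "  </app>",
        ]
    lines.append("</app_config>")
    text = "\n".join(lines)
    ET.fromstring(text)  # raises ParseError if the result is not well-formed XML
    return text


if __name__ == "__main__":
    # Numbers from the first post: 44 threads, 4 GPU WUs, 23 running CPU WUs.
    print(idle_threads(44, 4, 23))  # 17
    # Hypothetical limits: at most 2 ATLAS and 8 Theory tasks at once.
    print(build_app_config({"ATLAS": 2, "Theory": 8}))
```

The generated text would be saved as app_config.xml in the project's folder inside the BOINC data directory. As noted earlier in the thread, capping one application this way can leave the work buffer unbalanced, so it is a workaround rather than a fix for the idle threads.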
©2024 CERN