Message boards : Number crunching : Not all cores / threads being utilized .
Perle
Joined: 25 Oct 04
Posts: 83
Credit: 78,462,065
RAC: 27,137
Message 45945 - Posted: 24 Dec 2021, 18:14:38 UTC

I just built this system. ID: 10748799
Xeonster
4,288.09 50,713 7.16.20
GenuineIntel Intel(R) Xeon(R) CPU E5-2650L v3 @ 1.80GHz [Family 6 Model 63 Stepping 2]
(48 processors)
NVIDIA NVIDIA GeForce GTX 1070 (4095MB) driver: 497.29 OpenCL: 3.0
Microsoft Windows 10
Professional x64 Edition, (10.00.19043.00)

All power settings are set to High Performance, both in the BIOS and in Windows power management.

The BIOS sees both CPUs, and Windows correctly reports both CPUs with their cores and threads (2 CPUs x 12 cores = 24 cores, 48 threads).

Yet it only seems to be running 32 WUs at a time.

While they are running, Task Manager shows about 60% CPU usage and about the same RAM usage.

Thanks for any ideas.
Harri Liljeroos
Joined: 28 Sep 04
Posts: 675
Credit: 43,556,368
RAC: 15,479
Message 45947 - Posted: 24 Dec 2021, 21:03:38 UTC

You have run Theory and CMS tasks on that host. Note that CMS tasks require 2-3 GB of memory each, so you may hit a memory limit with them. Theory tasks require about 700 MB each. Check the memory limit you allow BOINC to use when crunching.
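To put those per-task figures next to the 48 threads mentioned above, here is a minimal sketch of the arithmetic. The function name is made up for illustration, and the per-task sizes are only the rough figures quoted in this post (with 1 GB taken as 1000 MB), not measured values.

```python
def memory_needed_mb(task_count: int, per_task_mb: int) -> int:
    """Rough memory demand if `task_count` tasks of one type run at once."""
    return task_count * per_task_mb


THREADS = 48           # logical processors reported for the host
THEORY_MB = 700        # approximate figure quoted for Theory
CMS_MB_LOW = 2000      # lower end of the 2-3 GB range quoted for CMS
CMS_MB_HIGH = 3000     # upper end of that range

theory_total = memory_needed_mb(THREADS, THEORY_MB)
cms_low_total = memory_needed_mb(THREADS, CMS_MB_LOW)
cms_high_total = memory_needed_mb(THREADS, CMS_MB_HIGH)

print(f"48 Theory tasks: about {theory_total} MB")
print(f"48 CMS tasks: about {cms_low_total} to {cms_high_total} MB")
```

If the host has less RAM than the CMS figure, a full load of CMS tasks cannot fit, whatever the CPU count.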
Perle
Joined: 25 Oct 04
Posts: 83
Credit: 78,462,065
RAC: 27,137
Message 45948 - Posted: 25 Dec 2021, 5:50:50 UTC - in response to Message 45947.  
Last modified: 25 Dec 2021, 5:54:26 UTC

Thanks for the response. It is set to 95%.

Right at this moment it shows 71% CPU and 62% memory at 32 threads.


Another thing that I don't understand: on a 500 GB drive that is used only for BOINC, I also set the limit to 95%.

Whenever I look at the disk usage, there is 300+ GB available for BOINC, yet it is always asking for a mere 300 MB or 7696 MB.

The whole drive is open and available.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2402
Credit: 225,608,518
RAC: 121,604
Message 45949 - Posted: 25 Dec 2021, 9:09:08 UTC - in response to Message 45945.  

Your logfiles show lots of entries like this:
2021-12-23 19:41:49 (1712): Stopping VM.
2021-12-23 19:41:49 (1712): Error in stop VM for VM: -108
Command:
VBoxManage -q controlvm "boinc_1f03da790d8d87e4" savestate
Output:

2021-12-23 19:41:49 (1712): VM did not stop when requested.
2021-12-23 19:41:49 (1712): VM was successfully terminated.

The more VMs are running concurrently, the more important it becomes not to start/stop/suspend/resume them all at the same time.
BOINC doesn't provide that out of the box; it just sees a small program called vboxwrapper.
Hence it must be done manually or with a self-made script.
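
As a rough idea of what such a self-made script could look like, the sketch below suspends or resumes a list of tasks one at a time with a pause in between, using `boinccmd --task <project URL> <task name> <op>`. It assumes boinccmd is on the PATH and can reach the local client. The function names, the task names and the delay are placeholders for illustration, and a suitable delay has to be found by trial.

```python
import subprocess
import time
from typing import Callable, List, Sequence


def build_task_command(project_url: str, task_name: str, op: str) -> List[str]:
    """Build one boinccmd call that suspends or resumes a single task."""
    if op not in ("suspend", "resume"):
        raise ValueError(f"unsupported operation: {op}")
    return ["boinccmd", "--task", project_url, task_name, op]


def staggered_task_op(
    project_url: str,
    task_names: Sequence[str],
    op: str,
    delay_s: float,
    run: Callable = subprocess.run,
    sleep: Callable[[float], None] = time.sleep,
) -> List[List[str]]:
    """Apply `op` to each task in turn, waiting `delay_s` seconds between tasks."""
    issued = []
    for index, name in enumerate(task_names):
        if index > 0:
            sleep(delay_s)  # give the previous VM time to save or restore its state
        cmd = build_task_command(project_url, name, op)
        run(cmd, check=True)  # raises if boinccmd reports an error
        issued.append(cmd)
    return issued


# Example call (placeholder task names, not run here):
# staggered_task_op("https://lhcathome.cern.ch/lhcathome/",
#                   ["task_name_1", "task_name_2"], "suspend", delay_s=60)
```

The `run` and `sleep` parameters exist only so the schedule can be tried out without touching a real client.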


In addition, it is strongly recommended that clusters with more than 5 cores be attached via a local Squid proxy.
Your cluster has more than 90 cores.
One single Squid instance within your LAN would be enough.
Setup should take no more than an hour.
See:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5473
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5474
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2402
Credit: 225,608,518
RAC: 121,604
Message 45950 - Posted: 25 Dec 2021, 10:23:46 UTC - in response to Message 45945.  

it seems to only be running 32 wu's at a time

As far as I understand BOINC's calculation method, for VM tasks it does not use the real memory consumption (as shown in Task Manager) but a dummy value called "working_set_size_smoothed".
For CMS on Linux, this is set to "3000000000" (bytes).
On a computer with 64 GB RAM, this would allow 22 active CMS tasks (with no RAM left for any other processes).
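
The figure of 22 can be checked with a few lines. This is only a simplified sketch of the arithmetic, not BOINC's actual scheduling code. It assumes that "64 GB" means 64 GiB and that the memory limit from the preferences is applied as a plain fraction. The function name is made up for illustration.

```python
def max_concurrent_tasks(total_ram_bytes: int, per_task_bytes: int,
                         usable_fraction: float = 1.0) -> int:
    """How many tasks fit if each one is booked at `per_task_bytes`."""
    usable = int(total_ram_bytes * usable_fraction)
    return usable // per_task_bytes


RAM_64_GIB = 64 * 2**30        # assumption: 64 GB read as 64 GiB
CMS_BOOKED = 3_000_000_000     # value quoted above for CMS

all_ram = max_concurrent_tasks(RAM_64_GIB, CMS_BOOKED)
with_95_percent = max_concurrent_tasks(RAM_64_GIB, CMS_BOOKED, 0.95)

print(all_ram)          # tasks if all RAM is given to BOINC
print(with_95_percent)  # tasks with a 95% memory limit
```

With a 95% limit, as set on the host in question, one task fewer fits than with all RAM available.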

The corresponding value for Theory can be found in client_state.xml.
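
To look the values up without scrolling through the file, something like the following sketch could be used. It simply collects every "working_set_size_smoothed" entry from the text of client_state.xml and does not work out which task each value belongs to. The function name is made up, and the location of client_state.xml depends on where the BOINC data directory is on the host.

```python
import re
from typing import List

# Matches entries such as
# <working_set_size_smoothed>3000000000.000000</working_set_size_smoothed>
_WSS_PATTERN = re.compile(
    r"<working_set_size_smoothed>\s*([0-9.eE+-]+)\s*</working_set_size_smoothed>"
)


def smoothed_working_sets(client_state_text: str) -> List[float]:
    """Return all working_set_size_smoothed values (bytes) found in the text."""
    return [float(value) for value in _WSS_PATTERN.findall(client_state_text)]


# Example use (the path is whatever your BOINC data directory is):
# with open(path_to_client_state, encoding="utf-8", errors="replace") as fh:
#     for size in smoothed_working_sets(fh.read()):
#         print(f"{size / 1e9:.2f} GB")
```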