Questions and Answers : Unix/Linux : CMS Simulation 47.60 (vbox64) CPU core load is much less than allowed

Marton Nemeth

Joined: 10 Feb 17
Posts: 6
Credit: 39,117
RAC: 0
Message 28898 - Posted: 17 Feb 2017, 5:57:41 UTC

I'm running BOINC 7.6.33 and Oracle VirtualBox 5.1.14 r112924 (Qt5.6.1) on Ubuntu 16.10. When the computer gets a "CMS Simulation 47.60 (vbox64)" work unit, it hardly loads any of the cores: "top" shows less than 6% core usage. In boincmgr, under Options/Computing Preferences/Computing/Usage limits, "Use at most 100% of the CPUs" and "Use at most 100% of CPU time" are set, while "Suspend when computer is in use" and "Suspend when non-BOINC CPU usage is above ...%" are not set.

For testing purposes I installed Ubuntu 16.10 inside a VirtualBox VM with 1024 MB base memory, 1 CPU, execution cap 100%, VT-x/AMD-V enabled and nested paging enabled. In that VM I ran memtester 4.3.0 (64-bit) to test 512 MB of the available memory ("memtester 512M 1"). During the test, "top" on the host reports around 100% CPU usage for the VM.

Is there any other setting that can be tuned so that CMS Simulation uses 100% of the available core computing capacity?
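
For anyone wanting to double-check the same settings from the file rather than the GUI, here is a minimal Python sketch that reads a global_prefs_override.xml like the one in the logs below and lists any preference that would throttle BOINC. The function name `throttling_prefs` and the `UNRESTRICTED` table are made up for illustration and are not part of BOINC; the four tags are taken from the file shown below, and the "no limit" values are an assumption based on the GUI settings described above.

```python
import xml.etree.ElementTree as ET

# Assumed "no limit" value for each preference that can throttle CPU use.
UNRESTRICTED = {
    "max_ncpus_pct": 100.0,     # "Use at most N% of the CPUs"
    "cpu_usage_limit": 100.0,   # "Use at most N% of CPU time"
    "suspend_cpu_usage": 0.0,   # "Suspend when non-BOINC CPU usage is above N%" (0 = off)
    "run_if_user_active": 1.0,  # 1 = keep running while the computer is in use
}


def throttling_prefs(xml_text):
    """Return {tag: value} for every listed preference that differs from 'no limit'."""
    root = ET.fromstring(xml_text)
    found = {}
    for tag, unrestricted in UNRESTRICTED.items():
        node = root.find(tag)
        if node is not None and node.text is not None:
            value = float(node.text)
            if value != unrestricted:
                found[tag] = value
    return found


# Trimmed copy of the relevant lines from the override file in the logs.
sample = """<global_preferences>
<run_if_user_active>1</run_if_user_active>
<suspend_cpu_usage>0.000000</suspend_cpu_usage>
<max_ncpus_pct>100.000000</max_ncpus_pct>
<cpu_usage_limit>100.000000</cpu_usage_limit>
</global_preferences>"""

print(throttling_prefs(sample))  # prints {} because nothing is throttled
```

An empty result for the file on this host would suggest that the low load is not caused by these preferences.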


root@quinkana3:~# top -b -u boinc -n 1
top - 07:35:38 up 14:30, 2 users, load average: 1.71, 1.76, 1.73
Tasks: 1103 total, 7 running, 1096 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.2 us, 0.7 sy, 0.3 ni, 96.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 52644848+total, 48294528+free, 12452732 used, 31050460 buff/cache
KiB Swap: 2946044 total, 2945236 free, 808 used. 51069523+avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND WCHAN
168672 boinc 39 19 2127140 86248 43800 S 5.3 0.0 6:51.77 VBoxHeadless poll_sche+
172135 boinc 39 19 2149668 87528 43976 R 5.3 0.0 6:55.30 VBoxHeadless poll_sche+
172480 boinc 39 19 2118948 87232 43788 S 5.3 0.0 6:53.19 VBoxHeadless poll_sche+
2558 boinc 30 10 4496 768 692 S 0.0 0.0 0:00.00 sh wait
2560 boinc 30 10 286684 15984 11684 S 0.0 0.0 6:42.48 boinc poll_sche+
153474 boinc 39 19 165564 12624 10068 S 0.0 0.0 63:30.74 VBoxXPCOMIPCD poll_sche+
153479 boinc 39 19 3384060 35336 18652 S 0.0 0.0 173:26.31 VBoxSVC poll_sche+
155171 boinc 39 19 13396 3644 2808 S 0.0 0.0 1:15.40 vboxwrapper_261 hrtimer_n+
156619 boinc 39 19 13004 3512 2840 S 0.0 0.0 1:14.10 vboxwrapper_261 hrtimer_n+
156921 boinc 39 19 13000 3464 2796 S 0.0 0.0 1:14.15 vboxwrapper_261 pipe_wait
157100 boinc 39 19 13000 3532 2864 S 0.0 0.0 1:13.75 vboxwrapper_261 pipe_wait
157280 boinc 39 19 13004 3480 2808 S 0.0 0.0 1:14.83 vboxwrapper_261 hrtimer_n+
157401 boinc 39 19 13000 3432 2764 S 0.0 0.0 1:14.55 vboxwrapper_261 hrtimer_n+
157461 boinc 39 19 13000 3464 2796 S 0.0 0.0 1:15.02 vboxwrapper_261 hrtimer_n+
157641 boinc 39 19 13004 3504 2836 S 0.0 0.0 1:16.11 vboxwrapper_261 hrtimer_n+
157822 boinc 39 19 13004 3436 2764 S 0.0 0.0 1:14.21 vboxwrapper_261 hrtimer_n+
157882 boinc 39 19 13000 3532 2864 S 0.0 0.0 1:13.43 vboxwrapper_261 hrtimer_n+
158062 boinc 39 19 13000 3484 2816 S 0.0 0.0 1:14.50 vboxwrapper_261 pipe_wait
158122 boinc 39 19 13000 3484 2816 S 0.0 0.0 1:14.16 vboxwrapper_261 hrtimer_n+
158182 boinc 39 19 12996 3488 2824 S 0.0 0.0 1:14.23 vboxwrapper_261 hrtimer_n+
158422 boinc 39 19 13000 3508 2840 S 0.0 0.0 1:14.19 vboxwrapper_261 pipe_wait
158542 boinc 39 19 12996 3388 2724 S 0.0 0.0 1:13.73 vboxwrapper_261 hrtimer_n+
165402 boinc 39 19 2143040 86880 43916 S 0.0 0.0 6:49.78 VBoxHeadless poll_sche+
168566 boinc 39 19 2067616 87012 43508 S 0.0 0.0 6:49.73 VBoxHeadless poll_sche+
169870 boinc 39 19 2079904 86840 44132 S 0.0 0.0 6:39.25 VBoxHeadless poll_sche+
170761 boinc 39 19 2086048 87332 44096 S 0.0 0.0 6:44.15 VBoxHeadless poll_sche+
170990 boinc 39 19 2149668 86492 43724 S 0.0 0.0 6:58.66 VBoxHeadless poll_sche+
171396 boinc 39 19 2051232 87420 43980 S 0.0 0.0 7:08.20 VBoxHeadless poll_sche+
171586 boinc 39 19 2139428 91108 43692 S 0.0 0.0 7:07.92 VBoxHeadless poll_sche+
171942 boinc 39 19 2155704 88552 45736 S 0.0 0.0 6:46.32 VBoxHeadless poll_sche+
171943 boinc 39 19 2116900 87292 43532 S 0.0 0.0 7:15.02 VBoxHeadless poll_sche+
172134 boinc 39 19 2123044 90596 46048 S 0.0 0.0 6:54.14 VBoxHeadless poll_sche+
172228 boinc 39 19 2073760 87396 43996 S 0.0 0.0 6:53.35 VBoxHeadless poll_sche+
172261 boinc 39 19 2129188 86144 43716 R 0.0 0.0 6:55.01 VBoxHeadless poll_sche+
178698 boinc 39 19 4496 700 624 S 0.0 0.0 0:00.00 sh wait
178699 boinc 39 19 4496 700 628 S 0.0 0.0 0:00.00 sh wait
178700 boinc 39 19 4496 784 712 S 0.0 0.0 0:00.00 sh wait
178701 boinc 39 19 4496 708 636 S 0.0 0.0 0:00.00 sh wait
178702 boinc 39 19 17344 876 800 R 0.0 0.0 0:00.00 VBoxManage -
178703 boinc 39 19 309928 12896 11056 R 0.0 0.0 0:00.00 VBoxManage -
178704 boinc 39 19 23360 1064 968 R 0.0 0.0 0:00.00 VBoxManage -
178706 boinc 39 19 25632 1200 1092 R 0.0 0.0 0:00.00 VBoxManage -
root@quinkana3:~# boinccmd --get_tasks

======== Tasks ========
1) -----------
name: CMS_29065_1487279674.946249_0
WU name: CMS_29065_1487279674.946249
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:26:16 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 401.340000
current CPU time: 409.780000
fraction done: 0.373256
swap size: 2106 MB
working set size: 2384 MB
estimated CPU time remaining: 40664.013609
2) -----------
name: CMS_29066_1487279674.977954_0
WU name: CMS_29066_1487279674.977954
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:26:16 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 405.190000
current CPU time: 413.420000
fraction done: 0.373179
swap size: 2038 MB
working set size: 2384 MB
estimated CPU time remaining: 40669.022453
3) -----------
name: CMS_32355_1487280276.285156_0
WU name: CMS_32355_1487280276.285156
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:26:47 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 419.020000
current CPU time: 427.990000
fraction done: 0.373181
swap size: 2102 MB
working set size: 2384 MB
estimated CPU time remaining: 40668.866738
4) -----------
name: CMS_29064_1487279674.918296_0
WU name: CMS_29064_1487279674.918296
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:26:47 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 391.210000
current CPU time: 399.300000
fraction done: 0.373195
swap size: 2044 MB
working set size: 2384 MB
estimated CPU time remaining: 40667.977863
5) -----------
name: CMS_23363_1487278171.671498_0
WU name: CMS_23363_1487278171.671498
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:27:13 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 405.900000
current CPU time: 414.220000
fraction done: 0.373179
swap size: 2086 MB
working set size: 2384 MB
estimated CPU time remaining: 40669.022453
6) -----------
name: CMS_25608_1487278773.168180_0
WU name: CMS_25608_1487278773.168180
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:27:23 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 396.170000
current CPU time: 404.210000
fraction done: 0.373194
swap size: 2050 MB
working set size: 2384 MB
estimated CPU time remaining: 40668.023280
7) -----------
name: CMS_32352_1487280276.171441_0
WU name: CMS_32352_1487280276.171441
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:27:23 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 425.890000
current CPU time: 435.100000
fraction done: 0.373179
swap size: 2080 MB
working set size: 2384 MB
estimated CPU time remaining: 40669.022453
8) -----------
name: CMS_25606_1487278773.131620_0
WU name: CMS_25606_1487278773.131620
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:27:44 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 406.410000
current CPU time: 415.360000
fraction done: 0.373179
swap size: 2112 MB
working set size: 2384 MB
estimated CPU time remaining: 40669.022453
9) -----------
name: CMS_32354_1487280276.258223_0
WU name: CMS_32354_1487280276.258223
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:27:44 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 406.080000
current CPU time: 415.080000
fraction done: 0.373179
swap size: 2092 MB
working set size: 2384 MB
estimated CPU time remaining: 40669.022453
10) -----------
name: CMS_32353_1487280276.203482_0
WU name: CMS_32353_1487280276.203482
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:28:00 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 404.100000
current CPU time: 413.260000
fraction done: 0.373179
swap size: 2082 MB
working set size: 2384 MB
estimated CPU time remaining: 40669.022453
11) -----------
name: CMS_25610_1487278773.246978_0
WU name: CMS_25610_1487278773.246978
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:28:00 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 403.410000
current CPU time: 411.800000
fraction done: 0.373211
swap size: 2090 MB
working set size: 2384 MB
estimated CPU time remaining: 40666.926784
12) -----------
name: CMS_1155_1487280578.765914_0
WU name: CMS_1155_1487280578.765914
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:28:12 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 410.650000
current CPU time: 418.720000
fraction done: 0.373194
swap size: 2112 MB
working set size: 2384 MB
estimated CPU time remaining: 40668.023280
13) -----------
name: CMS_24478_1487278472.511551_0
WU name: CMS_24478_1487278472.511551
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:28:12 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 397.780000
current CPU time: 406.390000
fraction done: 0.373179
swap size: 2118 MB
working set size: 2384 MB
estimated CPU time remaining: 40669.022453
14) -----------
name: CMS_32358_1487280276.388765_0
WU name: CMS_32358_1487280276.388765
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:28:27 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 401.430000
current CPU time: 409.770000
fraction done: 0.373212
swap size: 2032 MB
working set size: 2384 MB
estimated CPU time remaining: 40666.907320
15) -----------
name: CMS_32357_1487280276.359628_0
WU name: CMS_32357_1487280276.359628
project URL: https://lhcathome.cern.ch/lhcathome/
report deadline: Sat Mar 18 23:28:27 2017
ready to report: no
got server ack: no
final CPU time: 0.000000
state: downloaded
scheduler state: scheduled
exit_status: 0
signal: 0
suspended via GUI: no
active_task_state: EXECUTING
app version num: 4760
checkpoint CPU time: 420.650000
current CPU time: 428.260000
fraction done: 0.373179
swap size: 2384 MB
working set size: 2384 MB
estimated CPU time remaining: 40669.022453
root@quinkana3:~# boinccmd --get_host_info
timezone: 3600
domain name: quinkana3
IP addr: 192.168.2.10
#CPUS: 112
CPU vendor: GenuineIntel
CPU model: 06/55 [Family 6 Model 85 Stepping 2]
CPU FP OPS: 2613250214.725667
CPU int OPS: 34274369110.431229
CPU mem BW: 1000000000.000000
OS name: Linux
OS version: 4.8.0-37-generic
mem size: 539083247616.000000
cache size: 40370176.000000
swap size: 3016749056.000000
disk size: 125488705536.000000
disk free: 89301372928.000000
root@quinkana3:~# cat /etc/boinc-client/cc_config.xml
<!--
This is a minimal configuration file cc_config.xml of the BOINC core client.
For a complete list of all available options and logging flags and their
meaning see: https://boinc.berkeley.edu/wiki/client_configuration
-->
<cc_config>
<log_flags>
<task>1</task>
<file_xfer>1</file_xfer>
<sched_ops>1</sched_ops>
</log_flags>
</cc_config>
root@quinkana3:~# cat /etc/boinc-client/global_prefs_override.xml
<global_preferences>
<run_on_batteries>0</run_on_batteries>
<run_if_user_active>1</run_if_user_active>
<run_gpu_if_user_active>1</run_gpu_if_user_active>
<suspend_cpu_usage>0.000000</suspend_cpu_usage>
<start_hour>0.000000</start_hour>
<end_hour>0.000000</end_hour>
<net_start_hour>0.000000</net_start_hour>
<net_end_hour>0.000000</net_end_hour>
<leave_apps_in_memory>0</leave_apps_in_memory>
<confirm_before_connecting>1</confirm_before_connecting>
<hangup_if_dialed>0</hangup_if_dialed>
<dont_verify_images>0</dont_verify_images>
<work_buf_min_days>0.010000</work_buf_min_days>
<work_buf_additional_days>0.020000</work_buf_additional_days>
<max_ncpus_pct>100.000000</max_ncpus_pct>
<cpu_scheduling_period_minutes>60.000000</cpu_scheduling_period_minutes>
<disk_interval>600.000000</disk_interval>
<disk_max_used_gb>40.000000</disk_max_used_gb>
<disk_max_used_pct>100.000000</disk_max_used_pct>
<disk_min_free_gb>0.000000</disk_min_free_gb>
<vm_max_used_pct>75.000000</vm_max_used_pct>
<ram_max_used_busy_pct>50.000000</ram_max_used_busy_pct>
<ram_max_used_idle_pct>90.000000</ram_max_used_idle_pct>
<max_bytes_sec_up>0.000000</max_bytes_sec_up>
<max_bytes_sec_down>0.000000</max_bytes_sec_down>
<cpu_usage_limit>100.000000</cpu_usage_limit>
<daily_xfer_limit_mb>0.000000</daily_xfer_limit_mb>
<daily_xfer_period_days>0</daily_xfer_period_days>
</global_preferences>
root@quinkana3:~# boinc --version
7.6.33 x86_64-pc-linux-gnu
root@quinkana3:~# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.10
Release: 16.10
Codename: yakkety
root@quinkana3:~#
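
To make the "top" snapshot above easier to read, a rough Python sketch like the following could sum the %CPU column per command. It assumes the column layout shown above (PID first, %CPU in the ninth column, COMMAND in the twelfth) and a locale that prints decimals with a dot; the function name `cpu_by_command` is hypothetical.

```python
from collections import defaultdict


def cpu_by_command(top_output):
    """Return {command: (process_count, summed_%CPU)} from `top -b -n 1` output."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in top_output.splitlines():
        fields = line.split()
        # Process rows start with a numeric PID and have at least 12 columns:
        # PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND [WCHAN]
        if len(fields) >= 12 and fields[0].isdigit():
            command = fields[11]
            totals[command] += float(fields[8])
            counts[command] += 1
    return {cmd: (counts[cmd], totals[cmd]) for cmd in totals}


# Three rows copied from the snapshot above, plus the column header.
sample = """PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND WCHAN
168672 boinc 39 19 2127140 86248 43800 S 5.3 0.0 6:51.77 VBoxHeadless poll_sche+
172135 boinc 39 19 2149668 87528 43976 R 5.3 0.0 6:55.30 VBoxHeadless poll_sche+
2560 boinc 30 10 286684 15984 11684 S 0.0 0.0 6:42.48 boinc poll_sche+"""

for cmd, (n, total) in cpu_by_command(sample).items():
    print(f"{cmd}: {n} processes, {total:.1f}% CPU")
```

Applied to the full snapshot, this should come out at 15 VBoxHeadless processes totalling roughly 15.9% (three at 5.3%, the rest at 0.0%), on a host that boinccmd reports as having 112 CPUs.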


Logs for loading a test VirtualBox instance with memtester:

root@quinkana3:~# top -b -u bhumanem -n 1
top - 07:49:10 up 14:44, 2 users, load average: 1.70, 2.04, 1.91
Tasks: 1096 total, 1 running, 1095 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.1 us, 0.7 sy, 0.3 ni, 96.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 52644848+total, 48200105+free, 13439328 used, 31008088 buff/cache
KiB Swap: 2946044 total, 2945236 free, 808 used. 50970128+avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND WCHAN
95388 bhumanem 20 0 3200676 500992 426120 S 100.0 0.1 1:10.75 VirtualBox poll_sche+
33291 bhumanem 20 0 62896 6472 5340 S 0.0 0.0 0:00.04 systemd ep_poll
33293 bhumanem 20 0 242268 7856 0 S 0.0 0.0 0:00.00 (sd-pam) sigtimedw+
33313 bhumanem 20 0 43128 3944 3416 S 0.0 0.0 0:00.05 dbus-daemon ep_poll
33341 bhumanem 20 0 187424 5168 4648 S 0.0 0.0 0:00.00 dconf-service poll_sche+
33398 bhumanem 20 0 282208 6372 5660 S 0.0 0.0 0:00.01 gvfsd poll_sche+
33536 bhumanem 20 0 350208 5512 4940 S 0.0 0.0 0:00.00 gvfsd-fuse futex_wai+
67253 bhumanem 20 0 34512 20788 10456 S 0.0 0.0 0:32.02 Xvnc4 poll_sche+
68345 bhumanem 20 0 346896 5888 5372 S 0.0 0.0 0:00.01 at-spi-bus-laun poll_sche+
68509 bhumanem 20 0 43000 3440 3096 S 0.0 0.0 0:00.00 dbus-daemon ep_poll
68526 bhumanem 20 0 213860 6872 6188 S 0.0 0.0 0:00.01 at-spi2-registr poll_sche+
82371 bhumanem 20 0 80568 8072 5688 S 0.0 0.0 0:00.14 xterm poll_sche+
82477 bhumanem 20 0 21452 5444 3404 S 0.0 0.0 0:00.12 bash wait
94135 bhumanem 20 0 1048684 72820 56620 S 0.0 0.0 0:01.61 VirtualBox poll_sche+
94171 bhumanem 20 0 165544 12676 10048 S 0.0 0.0 0:00.53 VBoxXPCOMIPCD poll_sche+
94176 bhumanem 20 0 690904 22728 18192 S 0.0 0.0 0:01.19 VBoxSVC poll_sche+
95616 bhumanem 20 0 334488 7180 5972 S 0.0 0.0 0:00.01 pulseaudio poll_sche+
131855 bhumanem 20 0 105988 4216 3180 S 0.0 0.0 0:00.42 sshd -
131931 bhumanem 20 0 21396 5508 3520 S 0.0 0.0 0:00.08 bash wait
150743 bhumanem 20 0 67016 5508 5000 S 0.0 0.0 0:00.03 gconfd-2 poll_sche+
root@quinkana3:~#
marmot

Joined: 5 Nov 15
Posts: 144
Credit: 6,301,268
RAC: 0
Message 28912 - Posted: 18 Feb 2017, 21:16:12 UTC
Last modified: 18 Feb 2017, 21:28:49 UTC

CMS and Theory do not become CPU-intensive until 8 to 20 minutes into their run. In my experience, they have lapses in CPU usage over their entire runs and keep the CPUs cooler than many other projects, even when task managers show 100% usage. (Perhaps related to how intensively the floating-point units are used?)

To get 100% core usage 24/7 (and maximum winter heat), I ran other CPU-intensive projects alongside vLHC and avoided multi-core Theory/CMS WUs, because I noticed some inefficiencies in multi-core VBox VMs: the VM's OS would take up a complete virtual core for OS services.
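
Because of the lapses described above, a single "top" snapshot can mislead. A rough way to average over them is to take two samples of a task's CPU time (for example the `current CPU time` line from `boinccmd --get_tasks`) some minutes apart and divide the difference by the wall-clock time between them. This is only a sketch: the function name `core_utilisation` and the sample numbers are made up for illustration.

```python
def core_utilisation(cpu_start, cpu_end, wall_start, wall_end):
    """Fraction of one core used between two samples (all values in seconds)."""
    elapsed = wall_end - wall_start
    if elapsed <= 0:
        raise ValueError("second sample must be taken later than the first")
    return (cpu_end - cpu_start) / elapsed


# Hypothetical samples: CPU time grows from 409.78 s to 439.78 s
# while 600 s of wall-clock time pass.
print(f"{core_utilisation(409.78, 439.78, 0.0, 600.0):.1%}")  # prints 5.0%
```

A value near 100% over a long enough window would mean the task keeps one core busy on average, whatever an individual snapshot shows.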
Marton Nemeth

Joined: 10 Feb 17
Posts: 6
Credit: 39,117
RAC: 0
Message 28920 - Posted: 19 Feb 2017, 10:14:19 UTC - in response to Message 28912.  

It turned out that connections to port 3125 are rejected by the remote end. This might be caused by a network firewall, but that has not been confirmed yet. It looks like on my system the failure to reach port 3125 prevents CMS Simulation from starting the actual computation, but this is just speculation for now.
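
A quick way to repeat this kind of check from the host is a small connection test, sketched below in Python. It assumes the service on port 3125 uses TCP, which this thread does not state, and it treats "refused", "filtered" and "host not found" all the same way. The function name `can_connect` is hypothetical, and the host in the example call is a placeholder because the remote end is not named here.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example call (placeholder host, not taken from this thread):
#   can_connect("remote.example.org", 3125)
```

If this returns False from the BOINC host but True from a machine outside the local network, that would point towards the firewall rather than the remote server.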


