1) Message boards : Number crunching : Issues Controlling Number of Threads Used (Message 42144)
Posted 12 Apr 2020 by Profile RueiKe
Post:
Thanks @computezrmle and @Gunde for the helpful feedback.

I was expecting a "waiting for memory" error on the 2990WX systems and assumed they were not attempting to run more than 16 threads for some other reason. It makes sense that it limits the number of threads based on memory. I wonder if the cost of memory has come down. I will get a quote.

For the Intel system, I had fixed the virtualization setting in the BIOS by the time of my post. The system had been running LHC for a while, so I wrongly assumed virtualization was already enabled. Once I enabled it, I observed no issues.

Not sure why I saw one of the 1950X systems running more threads than the system has. More precisely, the sum of the MT allocation for the Atlas tasks plus the other tasks was greater than the number of threads, though the number actually in use may have been lower. When in this state, it would sporadically stop tasks and indicate it was waiting for memory. Seems OK now. Perhaps an app_config.xml is needed here.
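
As a rough illustration of the app_config.xml idea, here is a minimal Python sketch that writes out such a file. The element names (`app`, `max_concurrent`, `app_version`, `avg_ncpus`) follow the BOINC client configuration format; the app name "ATLAS" and the values 4 and 4 are assumptions for illustration only, and the real short app name should be checked on the host (for example in client_state.xml) before use.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int, threads_per_task: int) -> str:
    """Return app_config.xml text that caps concurrent tasks and CPUs per task."""
    root = ET.Element("app_config")

    # Limit how many tasks of this app may run at the same time.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # Tell the client how many CPUs to budget for each multi-threaded task.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "avg_ncpus").text = str(threads_per_task)

    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# "ATLAS", 4 and 4 are placeholder values, not recommendations.
xml_text = build_app_config("ATLAS", max_concurrent=4, threads_per_task=4)
print(xml_text)
```

The resulting text would go into a file named app_config.xml in the project's directory under the BOINC data directory, and the client needs to re-read its config files (or be restarted) before the limits apply.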

I did have a VM become corrupt on the Epyc system. Not sure why, but perhaps an upgrade to VirtualBox is needed. What is the best version people are using on Ubuntu?

Seems like it is tricky to make sure systems don't download CMS. I thought I had it so all of my systems would only download Atlas and Sixtrack, but one system is still downloading Theory.

One more question, if I set MT to 4 CPUs, will it select cores based on NUMA nodes?
2) Message boards : Number crunching : Issues Controlling Number of Threads Used (Message 42140)
Posted 12 Apr 2020 by Profile RueiKe
Post:
I am having trouble controlling the number of tasks running on my machines. Previously, I only had LHC download a limited number of tasks and run whenever available, so I had not experienced problems over the past few years. Now I am ramping up LHC on all of my systems as the primary project. My experience with loading has varied by system type:

    1) Intel 10-core/20-thread system running Windows and the latest client: loads as options->computing-preferences-usage_limits specifies. No issues.
    2) Two Threadripper 1950Xs (32 threads) on Windows: they seem to max out loading (> number of threads available), ignoring options->computing-preferences-usage_limits. These are running 17.4.2 boincmgr.
    3) Two Threadripper 2990WXs (32 threads) on Linux: they only run 16 threads each, no matter what I set options->computing-preferences-usage_limits to. I am using TBar's 7.8.3 build of boincmgr from previous SETI work.
    4) A 7702P on Linux: it seems to follow options->computing-preferences-usage_limits. It is also running TBar's build, same as my other 2 Linux systems.



Am I missing a configuration setting? Is there a known issue with older versions of boincmgr?

I am also working on other issues: CMS tasks erroring out and Theory tasks erroring after several days. For now I am focusing on just getting Atlas running and will troubleshoot the others later.

3) Message boards : Number crunching : Notices: Needs More Disk Space (Message 42112)
Posted 9 Apr 2020 by Profile RueiKe
Post:
You may set a value for "Use no more than...".

Just set to 250GB and will monitor...
4) Message boards : Number crunching : Notices: Needs More Disk Space (Message 42110)
Posted 9 Apr 2020 by Profile RueiKe
Post:
I should probably add that this is a 64core 7702P Rome CPU with 256GB of memory, 1TB ssd for home partition and 250GB ssd for system:
Filesystem     1K-blocks      Used Available Use% Mounted on
udev           131950444         0 131950444   0% /dev
tmpfs           26401552      2200  26399352   1% /run
/dev/nvme0n1p2 238798492  16285752 210312704   8% /
tmpfs          132007740    494736 131513004   1% /dev/shm
tmpfs               5120         4      5116   1% /run/lock
tmpfs          132007740         0 132007740   0% /sys/fs/cgroup
/dev/nvme0n1p1    523248      6152    517096   2% /boot/efi
/dev/nvme1n1   960381672 235680796 675846364  26% /home
tmpfs           26401548        48  26401500   1% /run/user/1000
5) Message boards : Number crunching : Notices: Needs More Disk Space (Message 42109)
Posted 9 Apr 2020 by Profile RueiKe
Post:
Looks more like a BOINC client message.
What (diskspace) quota do you allow your BOINC client to use?
You may check if that quota is already full.


boincmgr indicates 644 GB free and available to BOINC

In computing preferences:
Use no more than ___ is unchecked
Leave at least 0.1GB free is checked
Use no more than 90% of total is checked.
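
For reference, a small Python sketch relating these preferences to the `df` figures for /home listed in Message 42110 above. It assumes the BOINC data directory lives on /home and that BOINC's "GB" means 2^30 bytes; both are assumptions, and the `df` snapshot may not have been taken at exactly the same moment.

```python
KIB_PER_GIB = 1024 ** 2  # df reports 1K-blocks, so 2^20 KiB make one GiB


def kib_to_gib(kib: int) -> float:
    """Convert a df 1K-block count to GiB."""
    return kib / KIB_PER_GIB


# Values copied from the /home row of the df output in Message 42110.
home_total_kib = 960381672
home_available_kib = 675846364

available_gib = kib_to_gib(home_available_kib)
pct_cap_gib = kib_to_gib(home_total_kib) * 0.90  # "Use no more than 90% of total"
after_min_free_gib = available_gib - 0.1         # "Leave at least 0.1GB free"

print(f"available: {available_gib:.1f} GiB")
print(f"90% of total: {pct_cap_gib:.1f} GiB")
print(f"available minus 0.1 GB reserve: {after_min_free_gib:.1f} GiB")
```

Under those assumptions the Available column works out to roughly 644.5 GiB, in line with the 644 GB that boincmgr shows, so none of the three preferences looks like it should bring the available space down to zero.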
6) Message boards : Number crunching : Notices: Needs More Disk Space (Message 42107)
Posted 9 Apr 2020 by Profile RueiKe
Post:
On one of my systems, I get the following messages:
LHC@home: Notice from server
SixTrack needs 190.73MB more disk space. You currently have 0.00 MB available and it needs 190.73 MB.
Thu 09 Apr 2020 07:16:09 PM CST
LHC@home: Notice from server
ATLAS Simulation needs 9536.74MB more disk space. You currently have 0.00 MB available and it needs 9536.74 MB.
Thu 09 Apr 2020 07:16:09 PM CST
LHC@home: Notice from server
Theory Simulation needs 7629.39MB more disk space. You currently have 0.00 MB available and it needs 7629.39 MB.
Thu 09 Apr 2020 07:16:09 PM CST
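
A side note on the numbers in these notices: the odd-looking sizes become round values if "MB" is read as 2^20 bytes and converted to decimal gigabytes. The short Python sketch below does that conversion. The interpretation of the units is an inference from the arithmetic, not something the notices themselves state.

```python
MIB = 1024 ** 2  # bytes per MiB, assuming the notices use binary megabytes

# Sizes exactly as printed in the server notices above.
reported_mib = {
    "SixTrack": 190.73,
    "ATLAS Simulation": 9536.74,
    "Theory Simulation": 7629.39,
}


def mib_to_decimal_gb(mib: float) -> float:
    """Convert a MiB figure to decimal gigabytes (10^9 bytes)."""
    return mib * MIB / 1e9


needs_gb = {app: round(mib_to_decimal_gb(v), 2) for app, v in reported_mib.items()}
print(needs_gb)
```

Read this way, the requests are about 0.2 GB, 10 GB and 8 GB, which suggests each notice is reporting the full per-task disk requirement because the client believes it has 0.00 MB available.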


I have plenty of empty disk space, but noticed many /dev/loop devices that appear to be full. Is the low disk space message due to low space in a VM? If so, can I configure the VM to be larger?


