Message boards : Number crunching : Max # jobs and Max # CPUs

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 336
Credit: 237,918
RAC: 0
Message 40651 - Posted: 25 Nov 2019, 12:58:15 UTC
Last modified: 26 Nov 2019, 12:57:14 UTC

The Max # jobs and Max # CPUs settings in the project preferences were initially added so that we could limit the number of jobs and CPUs used by a new volunteer by providing default values that could be changed later. The aim was to stop a machine maxing out on VM tasks and rendering the host unusable for anything else. The current behavior for a multi-threaded application is as follows (threads are CPUs for the VM apps):
Max 1 CPU, Max 1 Job => 1 single-threaded job
Max 2 CPUs, Max 1 Job => 1 two-threaded job
Max 1 CPU, Max 2 Jobs => 1 single-threaded job
Max 2 CPUs, Max 2 Jobs => 2 two-threaded jobs

In practice, Max CPUs is used to set the number of threads, and hence the number of CPUs, used by a VM. As a result, the case Max 1 CPU, Max 2 Jobs => 1 single-threaded job does not function as expected: it should run two single-CPU jobs. As far as I understand, we could remove the Max CPUs setting, and nthreads could be set for the app in app_config.xml. The two reasons given for not doing this are:

    * As nthreads is set after the job has been sent out via the scheduler, this value is not taken into consideration when assigning credit
    * Prefer to set it via the Web page rather than editing XML
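
For reference, a minimal sketch of the app_config.xml route might look like the following. The app name `ATLAS` and the plan class `vbox64_mt_mcore_atlas` are assumptions for illustration; the values actually used on a given host are listed in client_state.xml. The thread count of 2 is just an example.

```xml
<app_config>
  <app_version>
    <!-- Assumed app name and plan class; check client_state.xml on your host -->
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <!-- Number of CPUs the BOINC client budgets for each task -->
    <avg_ncpus>2</avg_ncpus>
    <!-- Number of threads (CPUs for the VM) passed to the wrapper -->
    <cmdline>--nthreads 2</cmdline>
  </app_version>
</app_config>
```

The file would sit in the LHC@home project directory inside the BOINC data folder, and the client picks it up on restart or via "Read config files" in the BOINC Manager.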


Since the recent changes to Theory, this mainly affects the ATLAS application, as setting Max # CPUs is only relevant for that application.

Further comments welcome.

maeax
Joined: 2 May 07
Posts: 928
Credit: 33,756,172
RAC: 2,611
Message 40652 - Posted: 25 Nov 2019, 13:18:15 UTC

"Run native if available?" is now in the ATLAS preferences. Do we need to activate it now?
Is the beta preference obsolete?

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 336
Credit: 237,918
RAC: 0
Message 40653 - Posted: 25 Nov 2019, 13:59:57 UTC - in response to Message 40652.  

"Run native if available?" is now in the ATLAS preferences. Do we need to activate it now?
Is the beta preference obsolete?

I haven't enabled this for ATLAS yet, am waiting for the green light.

computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1443
Credit: 76,443,429
RAC: 99,305
Message 40655 - Posted: 25 Nov 2019, 14:49:57 UTC

An average user would always expect both settings to be independent of each other.


Hence, Max #CPUs should set all of the following variables correctly:
- <avg_ncpus>
- --nthreads
- --memory_size_mb
- <rsc_memory_bound>


Max #tasks
- should act like <project_max_concurrent>


Only then would a default setting via the web preferences, e.g. 1:1, make sense, and a user could set higher limits without dealing with an app_config.xml.



All complaints on the message board revolve around the fact that Max #CPUs also limits Max #tasks.
Worse, it even limits the number of tasks that can be downloaded.
That's what nobody, especially new users, understands!
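
A hedged sketch of how those variables map onto a client-side app_config.xml today. The app name, plan class and all numbers are illustrative assumptions, not project-recommended values. `<rsc_memory_bound>` is a workunit property set on the server, so as far as I know it has no app_config.xml counterpart and is left out here.

```xml
<app_config>
  <!-- Upper limit on simultaneously running tasks for the whole project -->
  <project_max_concurrent>2</project_max_concurrent>
  <app_version>
    <!-- Assumed app name and plan class; check client_state.xml on your host -->
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <!-- CPUs the BOINC client budgets per task -->
    <avg_ncpus>2</avg_ncpus>
    <!-- Threads and VM memory handed to the wrapper; 4800 is a placeholder -->
    <cmdline>--nthreads 2 --memory_size_mb 4800</cmdline>
  </app_version>
</app_config>
```

Note that `<project_max_concurrent>` only limits how many tasks run at once; it does not limit how many tasks are downloaded.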

Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 598
Credit: 373,243,160
RAC: 59,149
Message 40657 - Posted: 25 Nov 2019, 17:08:37 UTC - in response to Message 40655.  
Last modified: 25 Nov 2019, 17:09:16 UTC

I agree with computezrmle

One additional point I have found challenging: since the only option that works for me is unlimited (due to the job/task limit), managing the amount of memory used can be tricky. With the previous config I had to script a workaround for the working set. As it is now, I set --memory_size_mb higher than the project-defined numbers, so no script is needed, but I have to take care not to use more RAM than I actually have, since BOINC doesn't know how much RAM is really being used.

I also run all my ATLAS WUs in a separate instance of BOINC, since unlimited for this project is 1 job, so the settings needed for ATLAS are not compatible with the settings for Theory and CMS.

Erich56
Joined: 18 Dec 15
Posts: 1283
Credit: 23,077,796
RAC: 2,904
Message 40665 - Posted: 25 Nov 2019, 19:58:14 UTC - in response to Message 40655.  

An average user would always expect both settings to be independent of each other. ...

All complaints on the message board revolve around the fact that Max #CPUs also limits Max #tasks.
Worse, it even limits the number of tasks that can be downloaded.
That's what nobody, especially new users, understands!
+ 1

Henry Nebrensky
Joined: 13 Jul 05
Posts: 116
Credit: 13,379,669
RAC: 8,575
Message 40673 - Posted: 26 Nov 2019, 9:22:02 UTC - in response to Message 40655.  

If someone's going to be diving into the code:
Max #CPUs should set all of the following variables correctly...
Also, "Max #CPUs" is misleadingly named since it is a specification, not a limit.
E.g. my 4-core machine asks for 4-core ATLAS tasks: if it were sent a 3-core one, there would have to be a slot completely wasted, and then if the project tried to send a (long-running) single-core task to fill the space, it would take days to untangle the mess...!

(And single-threaded-only apps will just ignore that setting, whatever it's called).

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 951
Credit: 6,307,994
RAC: 493
Message 40676 - Posted: 26 Nov 2019, 11:30:12 UTC - in response to Message 40673.  

If someone's going to be diving into the code:
Max #CPUs should set all of the following variables correctly...
Also, "Max #CPUs" is misleadingly named since it is a specification, not a limit.
E.g. my 4-core machine asks for 4-core ATLAS tasks: if it were sent a 3-core one, there would have to be a slot completely wasted, and then if the project tried to send a (long-running) single-core task to fill the space, it would take days to untangle the mess...!

The server does not send single-, dual-, quad-, etc.-core tasks. It just sends a task, and your preference setting decides how it will be treated on your system, i.e. how many cores the VM to be created must have.

And yes, you're right it's not a limit, but a spec.

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 336
Credit: 237,918
RAC: 0
Message 40681 - Posted: 26 Nov 2019, 14:33:15 UTC - in response to Message 40653.  

"Run native if available?" is now in the ATLAS preferences. Do we need to activate it now?
Is the beta preference obsolete?

I haven't enabled this for ATLAS yet, am waiting for the green light.

It is now enabled.

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 336
Credit: 237,918
RAC: 0
Message 40682 - Posted: 26 Nov 2019, 14:52:27 UTC - in response to Message 40655.  

As far as I understand the scheduler code, the issue is that the project preferences setting limits ncpus, and that this value is used to set the number of threads. We probably don't want to touch ncpus and should just set the number of threads. I will switch over to the dev project to test out some changes. Those who want to join can see me there.

Harri Liljeroos
Joined: 28 Sep 04
Posts: 431
Credit: 23,051,674
RAC: 14,096
Message 40683 - Posted: 26 Nov 2019, 15:20:31 UTC

Something has changed now. My settings are "No limit" for Max # jobs and 4 for Max # CPUs, and my host now has 8 ATLAS and 8 Theory tasks downloaded. Three ATLAS tasks are running with 4 CPUs each (set by app_config.xml) plus 1 Theory task (15/16 CPUs allowed for BOINC to use, 2 CPUs reserved for GPU tasks).

maeax
Joined: 2 May 07
Posts: 928
Credit: 33,756,172
RAC: 2,611
Message 40684 - Posted: 26 Nov 2019, 15:57:00 UTC - in response to Message 40681.  

"Run native if available?" is now in the ATLAS preferences. Do we need to activate it now?
Is the beta preference obsolete?

I haven't enabled this for ATLAS yet, am waiting for the green light.

It is now enabled.

Test applications must be enabled.
"Run native if available?" must be enabled as well. This option is not translated in the German-language version.
I have now received one native task in a Linux VM.

Aurum
Joined: 12 Jun 18
Posts: 88
Credit: 35,825,464
RAC: 8,387
Message 40708 - Posted: 27 Nov 2019, 14:44:18 UTC - in response to Message 40655.  
Last modified: 27 Nov 2019, 14:44:48 UTC

Max #tasks
- should act like <project_max_concurrent>
It would be even better if Max #tasks behaved like:
<max_concurrent>
and there was a setting in Preferences for each project.
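
For comparison, a sketch of the two limits as they exist in app_config.xml: `<max_concurrent>` applies to a single application, while `<project_max_concurrent>` applies to the project as a whole. The app names `ATLAS` and `Theory` and the numbers are assumptions for illustration.

```xml
<app_config>
  <!-- Per-application limits on simultaneously running tasks -->
  <app>
    <name>ATLAS</name>
    <max_concurrent>2</max_concurrent>
  </app>
  <app>
    <name>Theory</name>
    <max_concurrent>4</max_concurrent>
  </app>
  <!-- Project-wide cap, applied on top of the per-app limits -->
  <project_max_concurrent>6</project_max_concurrent>
</app_config>
```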

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 336
Credit: 237,918
RAC: 0
Message 40709 - Posted: 27 Nov 2019, 15:48:15 UTC - in response to Message 40708.  

I have updated the scheduler. I hope it is an improvement over what we have now, even though there is still a small issue. For those who are interested, I have opened an issue on GitHub.

wolfman1360
Joined: 17 Feb 17
Posts: 16
Credit: 119,862
RAC: 0
Message 40965 - Posted: 15 Dec 2019, 18:15:26 UTC

Hello,
Please let me know if this is the wrong thread. I've been doing some forum searching and just want to make sure I'm understanding this correctly.
I am one of those new users and am struggling to understand, but think I have it figured out.
All projects, apart from ATLAS, are single-threaded, regardless of whether they run natively on Linux or in vbox on Windows. This is why max number of CPUs is offered, and if it is changed, VMs will use that many cores, regardless of whether a task from another project is running. For instance, on the single little Core 2 Duo I have running on here, if there was an Asteroids@home task running, and BOINC decided a native Theory task should start, with max number of jobs set to 2, there would be 2 Theory tasks using 0.5 CPUs (sharing one core) and Asteroids would use the second if max CPUs was also set to 2. Am I understanding this correctly or does this also apply to ATLAS?

I have a few Haswell machines with 32 GB of RAM, an older Sandy Bridge i7 with 16, and an FX-8350. I take it setting max number of CPUs and jobs depends on which subprojects I'd like to run? Keep in mind these are dedicated crunchers. 3 out of the 4 are running Linux, but I'm unsure how to start off and don't want to overburden myself with work or end up with failed tasks because I didn't configure something correctly.

Any recommendations on a per project basis? I'm still trying to figure out the ins and outs of each - runtimes, resources, etc. I like how it's a little more involved and requires a little more work on the Linux side. I feel like I'm actually contributing to something special.

Thanks.

computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1443
Credit: 76,443,429
RAC: 99,305
Message 40966 - Posted: 15 Dec 2019, 19:19:57 UTC - in response to Message 40965.  

Hello,
Please let me know if this is the wrong thread. I've been doing some forum searching and just want to make sure I'm understanding this correctly.
I am one of those new users and am struggling to understand, but think I have it figured out.
All projects, apart from ATLAS, are single-threaded, regardless of whether they run natively on Linux or in vbox on Windows. This is why max number of CPUs is offered, and if it is changed, VMs will use that many cores, regardless of whether a task from another project is running.

Max number of CPUs only affects ATLAS since this is the only multicore app here at LHC@home.

In case of ATLAS native it sets
- the #CPUs so the BOINC client knows how apps from all projects interact.
- the #threads the native app will start to transform EVNTS to HITS

In case of ATLAS vbox (windows or linux) it sets
- the #CPUs so the BOINC client knows how apps from all projects interact.
- the vbox VM is set to use #CPUs
- the vbox VM is configured to use more RAM if more CPUs are configured

The original idea to introduce this parameter was to easily configure #CPUs and RAM for ATLAS without an app_config.xml.


Now you have switched from Max #CPUs to Max #tasks!
For instance, on the single little Core 2 Duo I have running on here, if there was an Asteroids@home task running, and BOINC decided a native Theory task should start, with max number of jobs set to 2,...

... there would be 2 Theory tasks using 0.5 CPUs (sharing one core) and Asteroids would use the second if max CPUs was also set to 2. Am I understanding this correctly or does this also apply to ATLAS?

As explained above, the #CPUs setting only affects ATLAS.
Theory tasks remain single-core, regardless of the web setting.


I have a few Haswell machines with 32 GB of RAM, an older Sandy Bridge i7 with 16, and an FX-8350. I take it setting max number of CPUs and jobs depends on which subprojects I'd like to run? Keep in mind these are dedicated crunchers. 3 out of the 4 are running Linux, but I'm unsure how to start off and don't want to overburden myself with work or end up with failed tasks because I didn't configure something correctly.

As a rule of thumb: ATLAS is more efficient if it uses just a few cores (<= 4).

I would start with a 2-core setup (ATLAS) and would leave 2 cores free for the OS and GPU tasks.
It mainly depends on your physical RAM compared to your total #cores.

Jim1348
Joined: 15 Nov 14
Posts: 436
Credit: 11,956,914
RAC: 3,262
Message 40967 - Posted: 15 Dec 2019, 19:27:43 UTC - in response to Message 40965.  

Any recommendations on a per project basis? I'm still trying to figure out the ins and outs of each - runtimes, resources, etc.

Since you are already running Theory OK, you have CVMFS installed properly, so try native ATLAS.
And since you have 32 GB of memory, you can set Max # CPUs to 1 and use some of it, as that is a little more efficient than using more cores per work unit (which is what you would do if you wanted to save on memory).

BUT: I have found that to run native ATLAS, you have to grant additional permissions after attaching to LHC and downloading at least one native ATLAS task.
Then you run "sudo chmod -R 777 /var/lib/boinc-client" and reboot. Then it might work.

And don't even think of upgrading to BOINC 7.16.3, or native ATLAS falls apart.

Everything else is pretty easy beyond that. To run CMS all you need is VirtualBox.
And for SixTrack you don't even need that.

wolfman1360
Joined: 17 Feb 17
Posts: 16
Credit: 119,862
RAC: 0
Message 40969 - Posted: 15 Dec 2019, 20:34:48 UTC - in response to Message 40967.  

you are already running Theory OK, you have CVMFS installed properly, so try native ATLAS.
And since you have 32 GB of memory, you can set Max # CPUs to 1 and use some of it, as that is a little more efficient than using more cores per work unit (which is what you would do if you wanted to save on memory).

BUT: I have found that to run native ATLAS, you have to grant additional permissions after attaching to LHC and downloading at least one native ATLAS task.
Then you run "sudo chmod -R 777 /var/lib/boinc-client" and reboot. Then it might work.

So, set number of cores to 1 and number of tasks to unlimited for the 32 GB machine. The Core 2 Duo only has 4 GB, but is also running Linux. Should that be okay with ATLAS too? It seems to be crunching away at Theory right now. I'll keep that command in mind when I get an ATLAS task.

And don't even think of upgrading to BOINC 7.16.3, or native ATLAS falls apart.

Everything else is pretty easy beyond that. To run CMS all you need is VirtualBox.
And for SixTrack you don't even need that.

Well. I am unfortunately running BOINC 7.16.3 thanks to costamagnagianfranco/boinc. Is there a quick method to downgrade to an earlier version, or am I out of luck?

So to summarize: number of CPUs only applies to ATLAS and I should set it to 1 (with 32 GB RAM), with number of tasks set to unlimited since I want all cores filled in that case. For machines with half the RAM I should set an alternative profile with CPU cores set to 2 and jobs again set to unlimited. Should I specifically run ATLAS tasks to start with under both, since I hear multicore tasks can wreak havoc on single-core tasks?

I do have Windows hosts; they're just finishing up on another project and then I'll start porting everything over here. This does raise an interesting question, though, since I can't find the answer in the forums. I have a Ryzen 1800X which, for some unknown reason, does not offer a virtualization option in the BIOS. Does native on Linux let me get by without virtualization, or does it need to be enabled even if I'm not using VirtualBox? On a similar note, do I need to install additional packages for VirtualBox, or are those just for viewing the current VM status? Under Linux most of my machines are headless, so would that be pointless?

lazlo_vii
Joined: 20 Nov 19
Posts: 21
Credit: 1,074,330
RAC: 0
Message 40970 - Posted: 15 Dec 2019, 20:43:34 UTC - in response to Message 40967.  

...run "sudo chmod -R 777 /var/lib/boinc-client"...


Friends do not let Friends chmod 777. Don't do it. It is dangerous.

maeax
Joined: 2 May 07
Posts: 928
Credit: 33,756,172
RAC: 2,611
Message 40971 - Posted: 16 Dec 2019, 0:01:51 UTC - in response to Message 40969.  

Hi Wolfman1360,
you need more than 4 GB of RAM for ATLAS. SVM or AMD-V must be enabled in the BIOS for hardware acceleration.
You will find most of the answers in Yeti's Checklist in the ATLAS folder.
Thank you.

©2020 CERN