Message boards : Theory Application : Move TheoryN Back Into Theory.

maeax

Joined: 2 May 07
Posts: 2242
Credit: 173,902,375
RAC: 2,454
Message 40582 - Posted: 22 Nov 2019, 7:30:41 UTC

Is this a way to get only one task in -native?
<fetch_minimal_work>0|1</fetch_minimal_work>
Fetch one job per device (see --fetch_minimal_work).
https://boinc.berkeley.edu/wiki/Client_configuration#Logging_flags
ID: 40582
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 380
Credit: 238,712
RAC: 0
Message 40588 - Posted: 22 Nov 2019, 12:30:15 UTC - in response to Message 40578.  

This will affect all VBox apps. I will investigate how to disable Max # of CPUs for single threaded apps.

I believe the relevant line is here. I don't think we can just disable this, as it is used by ATLAS and CMS to select the number of CPUs for a VM. There needs to be an AND condition with something, where that something is essentially !Theory.
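
To make the suggested condition concrete, here is a minimal Python sketch. It is not the actual server code, which is not reproduced in this thread; the function name, its parameters, and the use of plain strings for app names are all hypothetical.

```python
from typing import Optional

def use_max_cpus_pref(app_name: str, max_cpus_pref: Optional[int]) -> bool:
    """Decide whether the 'Max # of CPUs' preference should size the VM.

    Hypothetical helper: the preference is honoured only if it is set
    AND the app is not Theory (the "!Theory" part of the condition).
    """
    return max_cpus_pref is not None and app_name != "Theory"

# Multi-core capable apps keep using the preference, Theory ignores it.
atlas_uses_pref = use_max_cpus_pref("ATLAS", 4)
theory_uses_pref = use_max_cpus_pref("Theory", 4)
```

With this shape, ATLAS and CMS behave as before, while Theory is never affected by the preference.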
ID: 40588
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 380
Credit: 238,712
RAC: 0
Message 40592 - Posted: 22 Nov 2019, 15:57:42 UTC - in response to Message 40588.  

This will affect all VBox apps. I will investigate how to disable Max # of CPUs for single threaded apps.

I believe the relevant line is here. I don't think we can just disable this, as it is used by ATLAS and CMS to select the number of CPUs for a VM. There needs to be an AND condition with something, where that something is essentially !Theory.


So, thinking about this a bit more, I think the use of Max # CPUs is a mistake. For a start, from the BOINC scheduling perspective this value means threads, whereas the vboxwrapper interprets it as CPUs. Also, in the current implementation it affects the whole project, whereas you may want to define the VM size by host and project. The best way to do this is in the app_config.xml on the client, for example:

<app_config>
   <app_version>
       <app_name>ATLAS</app_name>
       <plan_class>vbox64_mt_mcore_atlas</plan_class>
       <avg_ncpus>x</avg_ncpus>
   </app_version>
</app_config>

or
<app_config>
   <app_version>
       <app_name>ATLAS</app_name>
       <plan_class>vbox64_mt_mcore_atlas</plan_class>
       <cmdline>--nthreads 7</cmdline>
   </app_version>
</app_config>


My suggestion would be therefore to disable the Max # CPU functionality and control it with the app_config.xml. Comments?
ID: 40592
Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 1417
Credit: 9,441,837
RAC: 794
Message 40593 - Posted: 22 Nov 2019, 16:16:57 UTC - in response to Message 40592.  
Last modified: 22 Nov 2019, 16:18:38 UTC

My suggestion would be therefore to disable the Max # CPU functionality and control it with the app_config.xml. Comments?
There is one big BUT.
When the server treats tasks as single core, BOINC calculates and grants credit based on the elapsed time multiplied by the reported GFLOPS.
When a user sets up VMs as dual core, quad core or whatever by using app_config.xml, the elapsed time will drop and their credit will be significantly lower.
A lot of crunchers will not appreciate that.
We know that only ATLAS really benefits from multi-core. As long as ATLAS tasks are roughly equal, one could change to a fixed credit per task for ATLAS.
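
A rough sketch of the arithmetic behind this concern, assuming credit is simply proportional to elapsed time multiplied by the reported GFLOPS, as described above. The function name, the scale factor, and all numbers are made up for illustration; this is not BOINC's actual credit code.

```python
def estimated_credit(elapsed_seconds: float, reported_gflops: float,
                     credit_per_gflop_second: float = 1.0) -> float:
    """Toy model: credit grows linearly with elapsed time and reported GFLOPS."""
    return elapsed_seconds * reported_gflops * credit_per_gflop_second

# Same task, same reported GFLOPS (the server still assumes one core).
single_core = estimated_credit(8 * 3600, 4.0)  # runs 8 hours on one core
quad_core = estimated_credit(2 * 3600, 4.0)    # runs 2 hours on four cores

# Fraction of the single-core credit that the quad-core run receives.
ratio = quad_core / single_core
```

In this toy model the four-core run finishes the same work in a quarter of the time and is granted a quarter of the credit, because the server never learns that more cores were used.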
ID: 40593
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2531
Credit: 253,722,201
RAC: 34,439
Message 40595 - Posted: 22 Nov 2019, 16:47:43 UTC - in response to Message 40592.  

One of the reasons Max#CPUs was introduced was to simplify the ATLAS multicore configuration for users who didn't want to deal with an app_config.xml.

When a workunit is sent to the client via scheduler reply it includes an <app_version> section and within this section <avg_ncpus> is set.
The client copies this <avg_ncpus> value to its client_state.xml but overwrites it with the value from app_config.xml.
From now on the client as well as the vboxwrapper use the value that has been set last.

Unfortunately <avg_ncpus> is never reported back to the server via scheduler request.
It is the user's responsibility to keep the values in sync.

If server and client are not in sync this affects credit calculation as well as the amount of work the server will send in future requests (at least until the FLOPS value is adjusted).
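
The precedence described above can be sketched in a few lines of Python. This is only a model of the behaviour as stated in this post, not the BOINC client code; the function names and example values are hypothetical.

```python
from typing import Optional

def client_avg_ncpus(from_scheduler: float,
                     from_app_config: Optional[float]) -> float:
    """Value the client and vboxwrapper end up using.

    The scheduler reply sets the initial value; an app_config.xml entry,
    if present, is applied afterwards and therefore wins.
    """
    if from_app_config is not None:
        return from_app_config
    return from_scheduler

def values_in_sync(server_value: float, client_value: float) -> bool:
    """The server never sees the client value, so a mismatch goes unnoticed."""
    return server_value == client_value

server_value = 8.0                                 # what the server assumes
client_value = client_avg_ncpus(server_value, 4.0)  # app_config.xml sets 4
mismatch = not values_in_sync(server_value, client_value)
```

Here the server keeps assuming 8 threads while the task actually runs with 4, which is the out-of-sync situation that skews credit and work fetch.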



If Max#CPUs is deactivated, every workunit will be treated as singlecore on the server, even ATLAS.
This is not a problem for other apps since all of them are now singlecore, but ATLAS may require a reworked default policy, e.g. a singlecore default combined with a new method to identify whether the client runs tasks with n threads.
ID: 40595
maeax
Joined: 2 May 07
Posts: 2242
Credit: 173,902,375
RAC: 2,454
Message 40596 - Posted: 22 Nov 2019, 17:34:34 UTC

native Theory(300.02) and native-Atlas(2.73) get only ONE task for me.
native Theory(1.01) got two tasks, always with ONE CPU in use.
I had changed nothing in prefs or app_config.
ID: 40596
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 380
Credit: 238,712
RAC: 0
Message 40602 - Posted: 22 Nov 2019, 20:57:36 UTC - in response to Message 40596.  

native Theory(300.02) and native-Atlas(2.73) get only ONE task for me.
native Theory(1.01) got two tasks, always with ONE CPU in use.
I had changed nothing in prefs or app_config.

This agrees with the configuration. Both ATLAS and Theory set a limit of one task per cpu and TheoryN had two.
ID: 40602
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 380
Credit: 238,712
RAC: 0
Message 40603 - Posted: 22 Nov 2019, 21:01:06 UTC - in response to Message 40593.  

My suggestion would be therefore to disable the Max # CPU functionality and control it with the app_config.xml. Comments?
There is one big BUT.
When the server treats tasks as single core, BOINC calculates and grants credit based on the elapsed time multiplied by the reported GFLOPS.
When a user sets up VMs as dual core, quad core or whatever by using app_config.xml, the elapsed time will drop and their credit will be significantly lower.
A lot of crunchers will not appreciate that.
We know that only ATLAS really benefits from multi-core. As long as ATLAS tasks are roughly equal, one could change to a fixed credit per task for ATLAS.

Thanks for pointing this out. I didn't appreciate the effect on the credit.
ID: 40603
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 380
Credit: 238,712
RAC: 0
Message 40604 - Posted: 22 Nov 2019, 21:12:09 UTC - in response to Message 40595.  
Last modified: 22 Nov 2019, 21:14:34 UTC

One of the reasons Max#CPUs was introduced was to simplify the ATLAS multicore configuration for users who didn't want to deal with an app_config.xml.

This leads to a wider discussion, but essentially there are policies and the implementation of those policies. We should first understand the policy that we need, how it should be implemented, then how it can be implemented within the limitations of the existing code base.

When a workunit is sent to the client via scheduler reply it includes an <app_version> section and within this section <avg_ncpus> is set.
The client copies this <avg_ncpus> value to its client_state.xml but overwrites it with the value from app_config.xml.
From now on the client as well as the vboxwrapper use the value that has been set last.

Unfortunately <avg_ncpus> is never reported back to the server via scheduler request.
It is the user's responsibility to keep the values in sync.

If server and client are not in sync this affects credit calculation as well as the amount of work the server will send in future requests (at least until the FLOPS value is adjusted).

Thanks, I didn't appreciate this subtlety.


If Max#CPUs is deactivated, every workunit will be treated as singlecore on the server, even ATLAS.
This is not a problem for other apps since all of them are now singlecore, but ATLAS may require a reworked default policy, e.g. a singlecore default combined with a new method to identify whether the client runs tasks with n threads.

I think the current implementation is wrong. This sets max_cpus to be effective_ncpus but you are talking about avg_ncpus. Will dig a little more into the code.
ID: 40604
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2531
Credit: 253,722,201
RAC: 34,439
Message 40605 - Posted: 22 Nov 2019, 21:40:29 UTC - in response to Message 40604.  

... you are talking about avg_ncpus

See:
https://boinc.berkeley.edu/wiki/Client_configuration
https://github.com/BOINC/boinc/blob/3a132e1ee2964fc8a03adcff7b369bd32231377b/client/app_config.cpp
Only <avg_ncpus> can be modified by app_config.xml.
Other (CPU-)options not mentioned at the referenced pages will be ignored by the client.
ID: 40605
Aurum
Joined: 12 Jun 18
Posts: 126
Credit: 53,906,164
RAC: 0
Message 40612 - Posted: 23 Nov 2019, 14:53:35 UTC - in response to Message 40595.  

Unfortunately <avg_ncpus> is never reported back to the server via scheduler request.
It's the user's responsibility to keep the values in sync.
Should clients be adding a line to our ATLAS app_config files???
ID: 40612
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2531
Credit: 253,722,201
RAC: 34,439
Message 40613 - Posted: 23 Nov 2019, 16:08:53 UTC - in response to Message 40612.  

Should clients be adding a line to our ATLAS app_config files???

This would be better asked in the ATLAS threads.

The lower the #threads, the more efficient an ATLAS task will be.
Two reasons lead to higher #threads:
- ATLAS (vbox) requires lots of RAM per task
- fewer #threads ==> higher overall runtimes

If your web preferences are set to "unlimited", control the #threads using <avg_ncpus> in your app_config.xml.
The server will set up the task as an n-core task with n = the number of reported CPUs but with an upper limit of 12.
This affects only ATLAS, since all other LHC@home apps are now singlecore.
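
As a small illustration of the rule above, the following Python sketch models how many cores the server assumes for an ATLAS task when the web preference is "unlimited", and what the client actually uses if <avg_ncpus> is set locally. The function names are hypothetical and the host sizes are example values; only the limit of 12 comes from this post.

```python
from typing import Optional

ATLAS_CORE_LIMIT = 12  # upper limit quoted in this post

def server_side_cores(reported_cpus: int) -> int:
    """Cores the server assumes for an ATLAS task with 'unlimited' preferences."""
    return min(reported_cpus, ATLAS_CORE_LIMIT)

def threads_actually_used(reported_cpus: int,
                          app_config_avg_ncpus: Optional[int] = None) -> int:
    """A local <avg_ncpus> in app_config.xml takes precedence on the client."""
    if app_config_avg_ncpus is not None:
        return app_config_avg_ncpus
    return server_side_cores(reported_cpus)

big_host_default = threads_actually_used(32)      # capped by the server limit
big_host_tuned = threads_actually_used(32, 4)     # user picks 4 threads
small_host_default = threads_actually_used(8)     # below the limit
```

On a 32-thread host the default would be a 12-core task, which is why setting a lower <avg_ncpus> is the usual way to trade runtime against RAM use and efficiency.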
ID: 40613
Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 380
Credit: 238,712
RAC: 0
Message 40650 - Posted: 25 Nov 2019, 12:43:19 UTC - in response to Message 40613.  

I am going to move this discussion to the number crunching topic as it affects all apps.
ID: 40650