Thread 'Move TheoryN Back Into Theory.'

Author	Message
maeax Joined: 2 May 07 Posts: 2305 Credit: 179,727,092 RAC: 5,767	Message 40582 - Posted: 22 Nov 2019, 7:30:41 UTC
Is this a possible way to get only one task in -native?
<fetch_minimal_work>0|1</fetch_minimal_work> Fetch one job per device (see --fetch_minimal_work).
https://boinc.berkeley.edu/wiki/Client_configuration#Logging_flags
ID: 40582

Laurence Project administrator Project developer Joined: 20 Jun 14 Posts: 431 Credit: 256,317 RAC: 27	Message 40588 - Posted: 22 Nov 2019, 12:30:15 UTC - in response to Message 40578.
This will affect all VBox apps. I will investigate how to disable Max # of CPUs for single-threaded apps. I believe the relevant line is here. I don't think we can just disable this, as it is used for ATLAS and CMS to select the number of CPUs to use for a VM, so there needs to be an AND statement with something, where that something is essentially !Theory.
ID: 40588

Laurence Project administrator Project developer Joined: 20 Jun 14 Posts: 431 Credit: 256,317 RAC: 27	Message 40592 - Posted: 22 Nov 2019, 15:57:42 UTC - in response to Message 40588.
"This will affect all VBox apps. I will investigate how to disable Max # of CPUs for single-threaded apps. I believe the relevant line is here. I don't think we can just disable this, as it is used for ATLAS and CMS to select the number of CPUs to use for a VM, so there needs to be an AND statement with something, where that something is essentially !Theory."
So thinking about this a bit more, I think the use of Max # CPUs is a mistake. For a start, from the BOINC scheduling perspective this is threads, whereas in the vboxwrapper this parameter is interpreted as CPUs. Also, in the current implementation this affects the whole project, where you may want to define the VM size by host and project. The best way to do this is in the app_config.xml on the client, for example:
<app_config>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>x</avg_ncpus>
  </app_version>
</app_config>
or
<app_config>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <cmdline>--nthreads 7</cmdline>
  </app_version>
</app_config>
My suggestion would therefore be to disable the Max # CPU functionality and control it with the app_config.xml. Comments?
ID: 40592
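
For hosts that are set up by script, the first variant above could be generated rather than typed by hand. This is a minimal sketch using only Python's standard library. The function name `build_app_config` and the value 4 are illustrative assumptions; the element names and the plan class are taken from the post above.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, plan_class: str, avg_ncpus: int) -> str:
    """Return app_config.xml content that sets avg_ncpus for one app version."""
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = str(avg_ncpus)
    return ET.tostring(root, encoding="unicode")


# Hypothetical choice of 4 cores for the ATLAS VM on this host.
xml_text = build_app_config("ATLAS", "vbox64_mt_mcore_atlas", 4)
print(xml_text)
```

The resulting text would be saved as app_config.xml in the project's directory on the client; the number of cores is a per-host choice, which is the point made in the post.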

Crystal Pellet Volunteer moderator Volunteer tester Joined: 14 Jan 10 Posts: 1561 Credit: 10,110,280 RAC: 1,268	Message 40593 - Posted: 22 Nov 2019, 16:16:57 UTC - in response to Message 40592. Last modified: 22 Nov 2019, 16:18:38 UTC
"My suggestion would therefore be to disable the Max # CPU functionality and control it with the app_config.xml. Comments?"
There is one big BUT. When a task is single-core from the server's perspective, BOINC calculates and grants credit based on the elapsed time multiplied by the reported GFLOPS. When a user sets up VMs as dual-core, quad-core or whatever by using app_config.xml, the elapsed time will drop and the credit granted will be significantly lower. A lot of crunchers will not appreciate that. We know that only ATLAS really benefits from multi-core. As long as ATLAS tasks are roughly equal, one could change to a fixed credit per task for ATLAS.
ID: 40593
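
To put rough numbers on this, here is a minimal sketch that assumes credit is simply proportional to elapsed time multiplied by the reported GFLOPS, as described above. The scale factor, the GFLOPS figure and the run times are made-up illustrative values, and the real BOINC credit calculation has more inputs than this.

```python
def estimated_credit(elapsed_hours: float, reported_gflops: float,
                     credit_per_gflops_hour: float = 1.0) -> float:
    """Simplified model: credit grows linearly with elapsed time and reported GFLOPS."""
    return elapsed_hours * reported_gflops * credit_per_gflops_hour


# Hypothetical task: 8 hours on one core, about 2 hours when the VM gets four cores.
# The server still treats the task as single-core, so the GFLOPS figure is unchanged.
single_core = estimated_credit(elapsed_hours=8.0, reported_gflops=5.0)
quad_core_vm = estimated_credit(elapsed_hours=2.0, reported_gflops=5.0)
ratio = quad_core_vm / single_core

print(single_core, quad_core_vm, ratio)
```

Under these assumptions the same work earns a quarter of the credit, even though the host spent roughly the same total CPU time on it.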

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2760 Credit: 305,490,006 RAC: 124,853	Message 40595 - Posted: 22 Nov 2019, 16:47:43 UTC - in response to Message 40592.
One of the reasons why Max#CPUs was introduced was to simplify the ATLAS multicore configuration for users who didn't want to deal with an app_config.xml.
When a workunit is sent to the client via scheduler reply, it includes an <app_version> section, and within this section <avg_ncpus> is set. The client copies this <avg_ncpus> value to its client_state.xml but overwrites it with the value from app_config.xml. From then on the client as well as the vboxwrapper use the value that was set last.
Unfortunately <avg_ncpus> is never reported back to the server via scheduler request. It is the user's responsibility to keep the values in sync. If server and client are not in sync, this affects credit calculation as well as the amount of work the server will send in future requests (at least until the FLOPS value is adjusted).
If Max#CPUs is deactivated, every workunit will be treated as singlecore on the server, even ATLAS. That is not a problem for the other apps, since all of them are now singlecore, but ATLAS may require a reworked default policy, e.g. a singlecore default combined with a new method to identify whether the client runs tasks with n threads.
ID: 40595
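
The override order described above can be sketched as follows. This only illustrates the behaviour as described in the post and is not the client's actual code; both function names are hypothetical.

```python
from typing import Optional


def effective_avg_ncpus(scheduler_value: float,
                        app_config_value: Optional[float]) -> float:
    """Value the client and vboxwrapper end up using: app_config.xml wins if set."""
    if app_config_value is not None:
        return app_config_value
    return scheduler_value


def in_sync(scheduler_value: float, app_config_value: Optional[float]) -> bool:
    """True when the server's assumption matches what the client really runs."""
    return effective_avg_ncpus(scheduler_value, app_config_value) == scheduler_value


# The server sends a 4-core task and the user forces 2 cores locally.
# The server never learns about the local value, so the two sides disagree.
print(effective_avg_ncpus(4, 2), in_sync(4, 2))
# Without an app_config.xml entry the scheduler's value is used unchanged.
print(effective_avg_ncpus(4, None), in_sync(4, None))
```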

maeax Joined: 2 May 07 Posts: 2305 Credit: 179,727,092 RAC: 5,767	Message 40596 - Posted: 22 Nov 2019, 17:34:34 UTC
Native Theory (300.02) and native ATLAS (2.73) get only ONE task for me. Native Theory (1.01) got two tasks, always with ONE CPU in use. I had changed nothing in prefs or app_config.
ID: 40596

Laurence Project administrator Project developer Joined: 20 Jun 14 Posts: 431 Credit: 256,317 RAC: 27	Message 40602 - Posted: 22 Nov 2019, 20:57:36 UTC - in response to Message 40596.
"Native Theory (300.02) and native ATLAS (2.73) get only ONE task for me. Native Theory (1.01) got two tasks, always with ONE CPU in use. I had changed nothing in prefs or app_config."
This agrees with the configuration. Both ATLAS and Theory set a limit of one task per CPU and TheoryN had two.
ID: 40602

Laurence Project administrator Project developer Joined: 20 Jun 14 Posts: 431 Credit: 256,317 RAC: 27	Message 40603 - Posted: 22 Nov 2019, 21:01:06 UTC - in response to Message 40593.
"My suggestion would therefore be to disable the Max # CPU functionality and control it with the app_config.xml. Comments?"
"There is one big BUT. When a task is single-core from the server's perspective, BOINC calculates and grants credit based on the elapsed time multiplied by the reported GFLOPS. When a user sets up VMs as dual-core, quad-core or whatever by using app_config.xml, the elapsed time will drop and the credit granted will be significantly lower. A lot of crunchers will not appreciate that. We know that only ATLAS really benefits from multi-core. As long as ATLAS tasks are roughly equal, one could change to a fixed credit per task for ATLAS."
Thanks for pointing this out. I didn't appreciate the effect on the credit.
ID: 40603

Laurence Project administrator Project developer Joined: 20 Jun 14 Posts: 431 Credit: 256,317 RAC: 27	Message 40604 - Posted: 22 Nov 2019, 21:12:09 UTC - in response to Message 40595. Last modified: 22 Nov 2019, 21:14:34 UTC
"One of the reasons why Max#CPUs was introduced was to simplify the ATLAS multicore configuration for users who didn't want to deal with an app_config.xml."
This leads to a wider discussion, but essentially there are policies and the implementation of those policies. We should first understand the policy that we need, then how it should be implemented, then how it can be implemented within the limitations of the existing code base.
"When a workunit is sent to the client via scheduler reply, it includes an <app_version> section, and within this section <avg_ncpus> is set. The client copies this <avg_ncpus> value to its client_state.xml but overwrites it with the value from app_config.xml. From then on the client as well as the vboxwrapper use the value that was set last. Unfortunately <avg_ncpus> is never reported back to the server via scheduler request. It is the user's responsibility to keep the values in sync. If server and client are not in sync, this affects credit calculation as well as the amount of work the server will send in future requests (at least until the FLOPS value is adjusted)."
Thanks, I didn't appreciate this subtlety.
"If Max#CPUs is deactivated, every workunit will be treated as singlecore on the server, even ATLAS. That is not a problem for the other apps, since all of them are now singlecore, but ATLAS may require a reworked default policy, e.g. a singlecore default combined with a new method to identify whether the client runs tasks with n threads."
I think the current implementation is wrong. This sets max_cpus to be effective_ncpus but you are talking about avg_ncpus. Will dig a little more into the code.
ID: 40604

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2760 Credit: 305,490,006 RAC: 124,853	Message 40605 - Posted: 22 Nov 2019, 21:40:29 UTC - in response to Message 40604.
"... you are talking about avg_ncpus"
See:
https://boinc.berkeley.edu/wiki/Client_configuration
https://github.com/BOINC/boinc/blob/3a132e1ee2964fc8a03adcff7b369bd32231377b/client/app_config.cpp
Only <avg_ncpus> can be modified by app_config.xml. Other (CPU-)options not mentioned at the referenced pages will be ignored by the client.
ID: 40605

Aurum Joined: 12 Jun 18 Posts: 142 Credit: 57,421,670 RAC: 4	Message 40612 - Posted: 23 Nov 2019, 14:53:35 UTC - in response to Message 40595.
"Unfortunately <avg_ncpus> is never reported back to the server via scheduler request. It's the user's responsibility to keep the values in sync."
Should clients be adding a line to our ATLAS app_config files???
ID: 40612

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Joined: 15 Jun 08 Posts: 2760 Credit: 305,490,006 RAC: 124,853	Message 40613 - Posted: 23 Nov 2019, 16:08:53 UTC - in response to Message 40612.
"Should clients be adding a line to our ATLAS app_config files???"
This would be better asked in the ATLAS threads.
The lower the #threads, the more efficient an ATLAS task will be. Two reasons lead to higher #threads:
- ATLAS (vbox) requires lots of RAM per task
- fewer #threads ==> higher overall runtimes
If your web preferences are set to "unlimited", then control the #threads using <avg_ncpus> in your app_config.xml. The server will set up the task as an n-core task with n = the number of reported CPUs, but with an upper limit of 12.
This affects only ATLAS, since all other LHC@home apps are now singlecore.
ID: 40613
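
The rule described above for how the server sizes an ATLAS task amounts to a simple cap. A minimal sketch, assuming the web preference is set to "unlimited"; the function and constant names are hypothetical, and only the limit of 12 comes from the post.

```python
ATLAS_MAX_THREADS = 12  # upper limit mentioned in the post above


def server_side_ncpus(reported_cpus: int) -> int:
    """Cores the server assumes for an ATLAS task when Max # CPUs is unlimited."""
    return min(reported_cpus, ATLAS_MAX_THREADS)


for cpus in (4, 12, 32):
    print(cpus, "->", server_side_ncpus(cpus))
```

A host reporting 32 CPUs would therefore be sent 12-core tasks unless <avg_ncpus> in app_config.xml says otherwise, in which case the server and client values differ as discussed earlier in the thread.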

Laurence Project administrator Project developer Joined: 20 Jun 14 Posts: 431 Credit: 256,317 RAC: 27	Message 40650 - Posted: 25 Nov 2019, 12:43:19 UTC - in response to Message 40613.
I am going to move this discussion to the number crunching topic as it affects all apps.
ID: 40650