Message boards :
ATLAS application :
Wrong memory usage for multicore
Author | Message |
---|---|
Send message Joined: 27 Sep 08 Posts: 798 Credit: 644,763,678 RAC: 232,370 |
I was trying to get multicore working and it was almost there: the CPU load was good, but I hit the same working-set problems as before. I compared ATLAS with the other projects. For CMS, the file CMS_2016_03_22.xml contains <memory_size_mb>2048. In BOINC the task is shown as using 2.33 GB, and in the logs the VM is using 2048 MB, so I assume 0.33 GB is added as some kind of buffer. These numbers match the <rsc_memory_bound> in init_data.xml. Taking ATLAS with an app_config of 5000 MB and 2 cores, BOINC shows the task using 18.16 GB! So where does this come from? Breaking it down, it seems that the app_config doesn't override the setting properly. For ATLAS the <rsc_memory_bound> in init_data.xml is 18.16 GB too. So it would seem that the memory could be calculated like this: 5000 MB from ATLAS_2017_01_09.xml (not sure why this is 5000 MB?), 2 x 5000 MB from app_config.xml, 3400 MB from the normal single-core task setting (for ATLAS it's unclear how the normal VM settings are applied?), plus a little overage as before (it's a different amount from before, so I'm not sure how it's calculated?). Anyway, maybe a BOINC expert can work it out? I'll go back to single-core tasks, because I get "waiting for memory" when BOINC thinks the WUs are using 18 GB while they are actually using 4.xx GB. |
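The breakdown proposed in the post above can be checked with a little arithmetic. The sketch below only illustrates the poster's hypothesis; it is not a description of how the server actually computes <rsc_memory_bound>. The variable names are made up for this example, and it assumes that a "GB" as shown by BOINC is 1024 MB.

```python
MB_PER_GB = 1024  # assumption: BOINC reports binary gigabytes

# Components suggested in the post (all values in MB).
from_atlas_xml = 5000          # ATLAS_2017_01_09.xml
from_app_config = 2 * 5000     # 2 cores x 5000 MB from app_config.xml
single_core_setting = 3400     # normal single-core task setting

total_mb = from_atlas_xml + from_app_config + single_core_setting
total_gb = total_mb / MB_PER_GB

# Difference between the reported bound and the proposed sum.
atlas_overage_gb = 18.16 - total_gb

# Same comparison for CMS: 2048 MB VM versus 2.33 GB shown in BOINC.
cms_overage_gb = 2.33 - 2048 / MB_PER_GB

print(f"proposed sum: {total_mb} MB = {total_gb:.2f} GB")
print(f"ATLAS overage: {atlas_overage_gb:.2f} GB, CMS overage: {cms_overage_gb:.2f} GB")
```

Under these assumptions the proposed components account for most of the 18.16 GB, and the leftover differs from the CMS buffer, which matches the remark that the overage is "a different amount from before".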
Send message Joined: 2 Jan 11 Posts: 23 Credit: 5,986,899 RAC: 0 |
I had the same problem. The point is (somebody told me) that the value of Max # CPU on the LHC page is used to calculate the amount of memory. I set Max # CPU to a lower value (6 instead of 24) and use a cc_config for the real number. |
Send message Joined: 2 Sep 04 Posts: 453 Credit: 193,369,412 RAC: 10,065 |
"I had the same problem. The point is (somebody told me) that the value of Max # CPU on the LHC page is used to calculate the amount of memory. I set Max # CPU to a lower value (6 instead of 24) and use a cc_config for the real number." Yes, at the moment this is the only working trick! Supporting BOINC, a great concept! |
Send message Joined: 15 Jun 08 Posts: 2386 Credit: 222,951,161 RAC: 137,233 |
"... For CMS in the file CMS_2016_03_22.xml there is <memory_size_mb>2048 ..." This value is used as the default if neither of the following applies: 1. The server sends a <cmdline>--memory_size_mb nnnn</cmdline>. 2. The user defines a <cmdline>--memory_size_mb nnnn</cmdline> in app_config.xml. A setting with a higher number overrides a setting with a lower number. This value directly controls the VM's RAM setting. "... Looking at the memory in BOINC, the task uses 2.33GB ..." What you see here is the result of the value in <rsc_memory_bound>. This value is sent by the server; a user cannot change it. How does it work? Suppose you have a host with 16 GB RAM and you allow BOINC to use 60% of it. That makes 9.6 GB. Now your BOINC client will run up to 4 WUs (2.33 x 4 = 9.32 GB) and, besides those, (third-party) tasks that together need less than <rsc_memory_bound>0.28</rsc_memory_bound>. If your BOINC client has 3 CERN WUs (2.33 GB each) and a (third-party) task with <rsc_memory_bound>0.3</rsc_memory_bound> running, or at least in memory, only 2.31 GB would be left. Although an additional VM (2.0 GB) would fit, the BOINC client would not start it (9.62 > 9.6). |
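The worked example in the post above can be reproduced in a few lines. This is a minimal sketch of the rule as described in that post, not the BOINC client's actual scheduling code; the function names `ram_budget_gb` and `fits` are hypothetical, and whether the real comparison is strict or not is an assumption.

```python
def ram_budget_gb(total_ram_gb, allowed_fraction):
    """RAM that BOINC may use, e.g. 60% of 16 GB."""
    return total_ram_gb * allowed_fraction


def fits(budget_gb, running_bounds_gb, candidate_bound_gb):
    """True if the candidate's memory bound still fits into the budget.

    Per the post, the sum is taken over <rsc_memory_bound> values,
    not over the RAM the VMs really use.
    """
    return sum(running_bounds_gb) + candidate_bound_gb <= budget_gb


budget = ram_budget_gb(16, 0.60)  # 9.6 GB

# Three CMS tasks running: a fourth one still fits (4 x 2.33 = 9.32).
fourth_cms_ok = fits(budget, [2.33, 2.33, 2.33], 2.33)

# Three CMS tasks plus a third-party task bound to 0.3 GB:
# the fourth CMS task is refused (9.62 > 9.6).
with_third_party_ok = fits(budget, [2.33, 2.33, 2.33, 0.3], 2.33)

print(fourth_cms_ok, with_third_party_ok)
```

The second case is the one that matters for this thread: the VM itself would need only 2.0 GB, but the decision is made on the 2.33 GB bound, so an inflated <rsc_memory_bound> keeps tasks waiting for memory.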
Send message Joined: 27 Sep 08 Posts: 798 Credit: 644,763,678 RAC: 232,370 |
I set the max number of CPUs to "no limit", which gave 0 in the config files, so I would imagine this either has no impact or messes things up totally (anything * 0 = 0). I think the best option would then be to set the number of CPUs to 1 on the config page, as this shouldn't interfere with anything. To me there are still some inconsistencies in how the memory is calculated, given that the point of multicore is to use less RAM. |
Send message Joined: 27 Sep 08 Posts: 798 Credit: 644,763,678 RAC: 232,370 |
Setting the app_config to use 4600 MB, with the web settings at unlimited for both, and forcing the number of cores to 20 (as per my CPU), I now get 15.04 GB for <rsc_memory_bound>. |
Send message Joined: 2 Sep 04 Posts: 453 Credit: 193,369,412 RAC: 10,065 |
|
Send message Joined: 27 Sep 08 Posts: 798 Credit: 644,763,678 RAC: 232,370 |
With a setting of 1 the RSC bound is 3.32 GB; here BOINC could schedule more work than there is RAM for, because it assumes too little RAM per task. With a setting of 2 it's 4.10 GB, still less than what is set in the app_config. With a setting of 3 it's 4.88 GB, which is slightly over the true amount set in the app_config but not excessive. |
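The three values reported above rise by the same amount per core, so they can be compared against a straight line. The sketch below is a guess fitted to these numbers, not a formula confirmed by the project: `BASE_MB` and `PER_CORE_MB` are inferred here, and it assumes that a "GB" as shown by BOINC is 1024 MB.

```python
MB_PER_GB = 1024  # assumption: BOINC reports binary gigabytes

# <rsc_memory_bound> values reported in the post, by "Max # CPU" setting.
observed_gb = {1: 3.32, 2: 4.10, 3: 4.88}

# Inferred from the observations, not taken from any documentation.
BASE_MB = 2600
PER_CORE_MB = 800


def guessed_bound_gb(cores):
    """Guessed memory bound in GB for a given core setting."""
    return round((BASE_MB + PER_CORE_MB * cores) / MB_PER_GB, 2)


for cores, seen in observed_gb.items():
    print(f"{cores} core(s): observed {seen:.2f} GB, guessed {guessed_bound_gb(cores):.2f} GB")
```

The same line gives 18.16 GB for 20 cores, the value reported in the first post of this thread, which would fit the suggestion that the bound is derived from the core setting rather than from the app_config. That match may still be a coincidence.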
Send message Joined: 2 Sep 04 Posts: 453 Credit: 193,369,412 RAC: 10,065 |
|
Send message Joined: 15 Jun 08 Posts: 2386 Credit: 222,951,161 RAC: 137,233 |
The 2.33 GB you wrote about here is for CMS? CMS is a 1-core app, so I guess the <rsc_memory_bound> is set to a fixed value on the server. For ATLAS, in contrast, <rsc_memory_bound> should be calculated from the core setting. I'm not sure this calculation is error-free. Another point: if you play around with app_config.xml, some parameters may not be reset when you delete them from the file. Unfortunately, a hint about this in the BOINC documentation seems to have disappeared from the current page. Try at least a BOINC restart or a project reset. |
Send message Joined: 27 Sep 08 Posts: 798 Credit: 644,763,678 RAC: 232,370 |
Looks like it, although from all the testing people did it was 4400 MB for dual core, so it seems like it could do with a tweak. |
Send message Joined: 27 Sep 08 Posts: 798 Credit: 644,763,678 RAC: 232,370 |
Yes, I took CMS as the baseline for the investigation. The unlimited setting on the web seems to mess all the calculations up; I assume division-by-zero problems. I normally shut down BOINC completely, as this seems to let it read the app_config properly. With 3 cores set on the web, the calculation of the RSC bound for RAM usage and multicore comes out OK. What's really happening in the VM is fine too, as this is set in the app_config. |
©2024 CERN