Running ATLAS task switching to "waiting for memory" status
Author | Message |
---|---|
Joined: 18 Dec 15 Posts: 1559 Credit: 57,703,398 RAC: 42,315 |
I have noticed something strange over the past few days: once in a while, one of the three 1-core ATLAS tasks halts, and its status in the BOINC Manager is shown as "waiting for memory". When this happens, the task releases several GB of RAM. What I then do is suspend one of the other two ATLAS tasks; after a short while the "waiting" task starts running again, taking back the several GB of RAM it had released before. All in all, there should be no memory shortage, since 8-9 GB of my 32 GB RAM are always unused. Also, per app_config.xml each ATLAS task is assigned 8 GB RAM, of which up to 6 GB are normally used (as seen in console 3). So I am wondering what the problem is, all of a sudden. For a long time, this setup has worked without this specific problem. Any ideas? |
Joined: 27 Sep 08 Posts: 744 Credit: 554,923,984 RAC: 292,113 |
BOINC does not know the information from the app_config.xml, so it uses the calculation based on the web settings: "The memory allocated to the virtual machine is calculated based on the number of cores following the formula: 2.6GB + 0.9GB * ncores." E.g. if the app_config says to use 5GB but the web config says 8 cores, then BOINC will think the task is using 9.8GB and stop tasks if there isn't enough RAM. I assume you could set the number of cores on the web to 1; then it will only use 3.6GB as far as BOINC is concerned, and start up to 8 tasks if you have an 8-core machine. |
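As a rough illustration of the formula quoted above, here is a minimal Python sketch. The function name `atlas_vm_memory_gb` is made up for this example; only the constants 2.6 and 0.9 come from the quoted text. Evaluated literally, the formula gives 3.5 GB for one core, close to the 3.6GB figure mentioned in the post.

```python
def atlas_vm_memory_gb(ncores: int) -> float:
    """Memory (GB) assumed for one ATLAS VM, using the quoted formula
    2.6GB + 0.9GB * ncores. Hypothetical helper, not part of BOINC."""
    if ncores < 1:
        raise ValueError("ncores must be at least 1")
    return 2.6 + 0.9 * ncores


# Compare the web-preference core counts discussed in this thread.
for ncores in (1, 3, 8):
    print(f"{ncores} core(s): {atlas_vm_memory_gb(ncores):.1f} GB")
# prints:
# 1 core(s): 3.5 GB
# 3 core(s): 5.3 GB
# 8 core(s): 9.8 GB
```

So with the web setting at 3 cores, each task would be counted as roughly 5.3 GB under this formula, whatever the app_config.xml says.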
Joined: 18 Dec 15 Posts: 1559 Credit: 57,703,398 RAC: 42,315 |
Quote: "e.g. if the app_config says to use 5GB but the web config says 8 cores, then BOINC will think the task is using 9.8GB and stop tasks if there isn't enough RAM" The thing is: there is definitely enough RAM. At any time, 7-8 GB of RAM are free. Quote: "I assume you could set the number of cores on the web to 1; then it will only use 3.6GB as far as BOINC is concerned, and start up to 8 tasks if you have an 8-core machine." The problem is, as already discussed in another thread here, that due to a misconfiguration in the web settings, the settings "max # jobs" and "max # CPUs" are mixed up. Unfortunately, this means that if "max # CPUs" is set to 1, only 1 task can be downloaded. So, if I want to process three 1-core tasks concurrently, I have to set both "max # jobs" and "max # CPUs" to 3, and in the app_config.xml I set the number of cores to 1. As a result, I can download 3 tasks, each of which is then processed as a 1-core task. As said before, this has worked well all along. |
Joined: 28 Sep 04 Posts: 604 Credit: 36,910,884 RAC: 16,460 |
Check your BOINC settings: what percentage of memory are you allowing BOINC to use when the computer is in use, and what percentage when it is idle? |
Joined: 18 Dec 15 Posts: 1559 Credit: 57,703,398 RAC: 42,315 |
Quote: "Check your BOINC settings: what percentage of memory are you allowing BOINC to use when the computer is in use, and what percentage when it is idle?" Both are 95%, i.e. about 30GB. So whenever this strange behaviour happens, 5-6GB are still free for BOINC. Note that this does not happen every time, but it has happened once in a while during the past few days, and never before. Really strange. Also, as I recently wrote here, I cannot process more than 3 ATLAS 1-core tasks concurrently, even if the number in the app_config.xml is set to 4 (with a 7000MB RAM setting for each task). Any idea why this is so? |
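The arithmetic behind these figures can be sketched as follows. This is only a back-of-the-envelope check under stated assumptions: both helper names are hypothetical, 1 GB is treated as 1000 MB, and BOINC's real decision relies on its own memory estimates, which are not modelled here.

```python
def boinc_memory_budget_gb(total_ram_gb: float, max_percent: float) -> float:
    """RAM (GB) that BOINC may use, given the 'use at most X %' preference.
    Hypothetical helper for illustration only."""
    return total_ram_gb * max_percent / 100.0


def max_tasks_that_fit(budget_gb: float, per_task_gb: float) -> int:
    """How many tasks of a given size fit into the budget (naive division)."""
    return int(budget_gb // per_task_gb)


budget = boinc_memory_budget_gb(32, 95)        # 32 GB host, 95 % allowed
print(f"budget: {budget:.1f} GB")              # budget: 30.4 GB
print(max_tasks_that_fit(budget, 7.0))         # 7000 MB per task -> 4
print(max_tasks_that_fit(budget, 8.0))         # 8 GB per task    -> 3
```

By this naive count, four 7000MB tasks would fit into the 95% budget, so the limit of three tasks is not explained by the percentage setting alone.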
Joined: 28 Sep 04 Posts: 604 Credit: 36,910,884 RAC: 16,460 |
Quote: "Both are 95%, i.e. about 30GB. So whenever this strange behaviour happens, 5-6GB are still free for BOINC." Sorry, I don't have an answer to that. Is LHC the only project you are running on that host? Have you tried setting some debug flags in cc_config.xml to see what BOINC thinks is happening? There is at least <mem_usage_debug> you could try. More info here: http://boinc.berkeley.edu/wiki/Client_configuration |
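For anyone wanting to try the suggested flag, here is a small Python sketch that builds a minimal cc_config.xml with the log flag switched on. The helper name `build_cc_config` is made up for this example; the `<cc_config>`/`<log_flags>` layout follows the client configuration page linked above. It assumes there is no existing cc_config.xml to merge with; if there is one, add the flag to its `<log_flags>` section by hand instead.

```python
import xml.etree.ElementTree as ET


def build_cc_config(flags):
    """Return cc_config.xml text with each named log flag set to 1.
    Hypothetical helper; it does not read or merge an existing file."""
    root = ET.Element("cc_config")
    log_flags = ET.SubElement(root, "log_flags")
    for name in flags:
        ET.SubElement(log_flags, name).text = "1"
    return ET.tostring(root, encoding="unicode")


xml_text = build_cc_config(["mem_usage_debug"])
print(xml_text)
# prints:
# <cc_config><log_flags><mem_usage_debug>1</mem_usage_debug></log_flags></cc_config>
```

The resulting text would go into cc_config.xml in the BOINC data directory, and the client has to re-read its config files (or be restarted) before the extra memory messages show up in the event log.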
©2023 CERN