Message boards : Number crunching : Memory requirements for LHC applications
bronco

Joined: 13 Apr 18
Posts: 443
Credit: 8,438,885
RAC: 0
Message 37374 - Posted: 19 Nov 2018, 4:58:11 UTC - in response to Message 37372.  
Last modified: 19 Nov 2018, 4:59:54 UTC

Formula suggestion for native ATLAS tasks with some additional safety margin and considering the fact that the longer the tasks run the more memory they need (at least the “used memory” value rises with run time):

2100MB + 300MB*nCPUs
Nice work, gyllic, thank you :-)
Due to the additional safety margin, I think 2100MB + 300MB*nCPUs is the better formula suggestion.
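For quick reference, the rule of thumb above can be written as a tiny helper (just a sketch; the constants 2100 MB and 300 MB per core are the values discussed in this thread, not anything official):

```python
def atlas_native_ram_mb(ncpus: int) -> int:
    """Rough RAM estimate (MB) for a native ATLAS task on ncpus cores,
    per the rule of thumb from this thread: 2100MB + 300MB per core."""
    return 2100 + 300 * ncpus

# e.g. a 4-core task needs roughly 3300 MB, an 8-core task roughly 4500 MB
for n in (1, 4, 8):
    print(n, atlas_native_ram_mb(n))
```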
ID: 37374
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 377
Credit: 14,339,491
RAC: 6,625
Message 37385 - Posted: 20 Nov 2018, 14:29:19 UTC

Hi all, the thing to remember with ATLAS native is that the memory requirement configured on the BOINC server (the minimum memory your machine must have to be sent a task) is different from the memory actually available to the task. With vbox, the memory requirement translates directly into the memory capacity of the VM, and if the task uses more than that it crashes. With native, the memory requirement is only used to decide whether or not your host can run the tasks; the task in principle has the whole memory of your host available (and more, depending on whether you have configured swap). Of course we try to specify a reasonable requirement so as not to overload hosts.

The memory requirement we use for ATLAS native is a fixed 4GB and does not vary with the number of cores. This means that any host with 4GB or more of RAM can run the tasks, no matter how many cores it has.

As other very nice studies have shown here, the actual memory used is roughly 2100MB + 300MB*nCPUs and this is what you should use when figuring out the configuration of your ATLAS tasks. But as far as the BOINC scheduler is concerned, as long as you have at least 4GB you will be sent tasks.

The reason for the 4GB limit is that there is a point near the beginning of each task where memory consumption spikes to slightly above 4GB, and we found that on a host with 4GB of RAM (in reality ~3.8GB available to the OS) and no swap, the task would crash with out-of-memory errors. Specifying 4GB as the requirement means that hosts with "4GB RAM" will be excluded, because the RAM actually available is always slightly less than 4GB.
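The scheduler-side effect described above can be illustrated with a small sketch (this is not the actual BOINC server code; the 4096 MB figure is simply the fixed 4GB requirement mentioned in this post):

```python
REQUIRED_MB = 4096  # fixed ATLAS native requirement, independent of core count

def eligible(host_ram_mb: float) -> bool:
    """Whether the scheduler would send an ATLAS native task to this host."""
    return host_ram_mb >= REQUIRED_MB

# A machine sold as "4GB" typically reports ~3.8GB usable to the OS,
# so it falls just under the requirement and is (deliberately) excluded:
print(eligible(3891))   # ~3.8GB usable
print(eligible(8192))   # 8GB host
```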
ID: 37385
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 377
Credit: 14,339,491
RAC: 6,625
Message 37398 - Posted: 22 Nov 2018, 8:42:16 UTC

I mined some numbers from our ATLAS database, here is the average max memory used per task grouped by number of cores:



[Plot not reproduced: average max memory used per task, grouped by number of cores. Note: the y-axis label says MB but the values are in GB.]

This includes both native and vbox tasks, but the value is what the task itself reports, so it is independent of whether vbox is used.
ID: 37398
gyllic

Joined: 9 Dec 14
Posts: 202
Credit: 2,533,875
RAC: 0
Message 37400 - Posted: 22 Nov 2018, 15:28:44 UTC - in response to Message 37398.  

Thanks for the info, David!

Just out of interest, did you get the values used for the plot from the memory_monitor_out (or similar) files? If not, how do you get them?
How big are the differences in used/needed RAM between task IDs (probably small, because the vbox app uses a fixed value for all task IDs)?
ID: 37400
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 377
Credit: 14,339,491
RAC: 6,625
Message 37409 - Posted: 23 Nov 2018, 8:13:46 UTC - in response to Message 37400.  

Here is the same plot with each column split into different task IDs:

[Plot not reproduced: average max memory used per task, grouped by number of cores, with each column broken down by task ID.]



As you can see it is very similar for each task.

Every ATLAS task monitors itself and, while it is running, writes information to several files, including ones like memory_monitor. At the end of the task all these measurements are reported back to the ATLAS central databases, from which we can make plots like this.
ID: 37409
Harri Liljeroos
Joined: 28 Sep 04
Posts: 594
Credit: 35,801,747
RAC: 18,390
Message 37411 - Posted: 23 Nov 2018, 9:02:19 UTC - in response to Message 37409.  

Are you talking about Boinc tasks or VB jobs or what?
ID: 37411
gyllic

Joined: 9 Dec 14
Posts: 202
Credit: 2,533,875
RAC: 0
Message 37412 - Posted: 23 Nov 2018, 9:29:48 UTC - in response to Message 37411.  
Last modified: 23 Nov 2018, 9:30:52 UTC

Thanks David!


Are you talking about Boinc tasks or VB jobs or what?
About the VB jobs (which run inside the vbox/BOINC tasks) and native ATLAS tasks (these are the same jobs as the ones that run inside the vbox/BOINC tasks). So an entire vbox/BOINC task will need much more RAM than shown in David's plots, because of the guest OS and everything else that has to be virtualized/emulated.
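As a rough illustration of the point above (purely a sketch: VM_OVERHEAD_MB is a hypothetical placeholder for the guest OS/virtualization overhead, not a number measured in this thread):

```python
VM_OVERHEAD_MB = 1000  # hypothetical guest OS + virtualization overhead (illustrative only)

def vbox_task_ram_mb(ncpus: int) -> int:
    """Rough total RAM for a vbox/BOINC ATLAS task: the job itself
    (2100MB + 300MB per core, per this thread) plus VM overhead."""
    job_mb = 2100 + 300 * ncpus
    return job_mb + VM_OVERHEAD_MB

print(vbox_task_ram_mb(4))  # job ~3300MB plus the assumed overhead
```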
ID: 37412


©2022 CERN