2-core tasks with process "athena.py" running 4 times
Author | Message |
---|---|
Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0 |
Hi, a few times I have seen ATLAS tasks where athena.py runs 4 times inside the VM instead of 2. I have configured both "LHC Preferences" and "app_config.xml" to use only 2 cores, but from time to time I see tasks where: - "Alt-F2" in the console shows "Event nr. 1" 4 times, and likewise for every other event. - "Alt-F3" in the console shows 4 "athena.py" processes, each running at 50% (expected, since the VM has only 2 cores allocated to it). The result is a task that takes twice as long to finish. Is this a known issue? If so, what can be done to avoid it? If not, how can I help investigate it further? Regards, Herve We are the product of random evolution. |
Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0 |
Examples of such tasks: https://lhcathome.cern.ch/lhcathome/result.php?resultid=157723308 https://lhcathome.cern.ch/lhcathome/result.php?resultid=157723749 We are the product of random evolution. |
Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0 |
This task is supposed to run with 3 cores only, but actually has "athena.py" running 6 times within the VM: https://lhcathome.cern.ch/lhcathome/result.php?resultid=157814626 We are the product of random evolution. |
Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0 |
Another 3-core task with "athena.py" running 6 times within the VM: https://lhcathome.cern.ch/lhcathome/result.php?resultid=157867841 And this 3-core task has "athena.py" running 9 times within the VM: https://lhcathome.cern.ch/lhcathome/result.php?resultid=157867240 We are the product of random evolution. |
Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0 |
These tasks all have duplicated (or triplicated) log messages, so I wonder if the task is running twice (or three times) in parallel inside the same VM. I never got to the bottom of why we sometimes get these duplicate log messages. The line "Guest Log: ATHENA_PROC_NUMBER=3" shows what is passed to the task to set the number of cores used. |
Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0 |
Thanks David, I think I have had this issue for quite some time, but only recently linked it to the duplicated or triplicated athena.py processes. Are the results of the task valid for you guys when this happens? I just checked one of my computers. Out of 6 running tasks, 3 have triplicated log messages, 2 have duplicated log messages and one is OK. So right now this computer is spending the CPU time of 14 ATLAS tasks (3 × 3 + 2 × 2 + 1) to actually run only 6 tasks. It would be great if the computer's crunching capacity could be better used. Is there anything I can do to investigate? We are the product of random evolution. |
Joined: 25 Sep 17 Posts: 99 Credit: 3,425,566 RAC: 0 |
Another possibly odd thing in the logs is the memory assigned to the VM. It looks like 9 GB is assigned for a 3-processor work unit. My 4-processor work units end up with 6.2 GB for the virtual machine. This follows the formula 2.6 GB + (0.9 GB × number of processors). |
Joined: 18 Dec 16 Posts: 123 Credit: 37,495,365 RAC: 0 |
"Another possibly odd thing in the logs is the memory assigned to the VM. It looks like 9 GB is assigned for a 3-processor work unit. My 4-processor work units end up with 6.2 GB for the virtual machine. This follows the formula 2.6 GB + (0.9 GB × number of processors)." That configuration is intentional, because I recently had a few tasks that did not get through the starting phase with 7 GB. Could it be that, because I give it 9 GB, the ATLAS process ends up running multiple times since it finds a lot of memory available? We are the product of random evolution. |
Joined: 28 Sep 04 Posts: 722 Credit: 48,340,580 RAC: 29,745 |
I have a three-core task running, and in top I see a fourth athena.py running every once in a while. It uses very little CPU time (< 2%) and it doesn't seem to run any jobs, as job numbers appear only three times on the Alt+F2 display. I have also given the task "a little extra memory" (5400 MB) via app_config.xml. |
Joined: 25 Sep 17 Posts: 99 Credit: 3,425,566 RAC: 0 |
I think you would both be fine going back to the stock setting for the ATLAS tasks. It looks like the VM memory was bumped up from 0.8 GB to 0.9 GB per core in the calculation. |
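
For anyone who wants to compare their own memory override with the stock value, here is a minimal Python sketch of the rule of thumb quoted in this thread (2.6 GB plus 0.9 GB per core). The function names are made up for illustration, and the formula is only what was reported above, not something taken from the ATLAS application itself. It also assumes 1 GB = 1000 MB when reading the 5400 MB figure.

```python
# Rule of thumb quoted in the thread: 2.6 GB base + 0.9 GB per core.
# These constants come from the posts above, not from the ATLAS code.
BASE_GB = 2.6
PER_CORE_GB = 0.9  # the thread suggests this was 0.8 before being raised


def stock_vm_memory_gb(cores: int, per_core_gb: float = PER_CORE_GB) -> float:
    """Return the VM memory (in GB) that the thread's formula gives."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return round(BASE_GB + per_core_gb * cores, 1)


def extra_memory_gb(configured_gb: float, cores: int) -> float:
    """Return how far a manual setting sits above the formula's value."""
    return round(configured_gb - stock_vm_memory_gb(cores), 1)


if __name__ == "__main__":
    for cores in (2, 3, 4):
        print(f"{cores} cores: {stock_vm_memory_gb(cores)} GB by the formula")
    # Settings mentioned in the thread for 3-core tasks:
    print("9 GB override is above the formula by", extra_memory_gb(9.0, 3), "GB")
    print("5400 MB override is above the formula by", extra_memory_gb(5.4, 3), "GB")
```

By this estimate the 4-core value matches the 6.2 GB reported above, while the 9 GB setting for a 3-core task is well above what the formula would assign. Whether the extra memory has anything to do with the duplicated athena.py processes is not established in this thread.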
©2024 CERN