Message boards : ATLAS application : Running (0.517 CPUs)
Author | Message |
---|---|
Joined: 21 Jun 10 Posts: 38 Credit: 8,869,220 RAC: 28,501 |
Got an ATLAS task that shows it is running using 0.517 CPUs. Is this normal? If you need more info, please let me know. |
Joined: 21 Jun 10 Posts: 38 Credit: 8,869,220 RAC: 28,501 |
Additional information: It appears to me that since BOINC Manager on the desktop thinks the ATLAS task needs only a fraction of a CPU, it starts more tasks than it has CPUs available. When all started tasks get going and ask for the resources they need, a task gets suspended and then restarted frequently. When an ATLAS task has been running longer than any of the other tasks, it gets suspended and restarted more than any other task. When this happens, ATLAS tasks run much longer than they would if left uninterrupted. |
Joined: 30 Aug 14 Posts: 145 Credit: 10,847,070 RAC: 0 |
Hello captainjack, ATLAS using a fraction of a CPU thread is not normal. It only uses a fraction at the start (for about 6 to 10 minutes) and at the very end of a task. I don't know exactly about the VirtualBox version of ATLAS, but I seem to remember that an ATLAS task needs to run uninterrupted (200 events), because if it is interrupted the complete task starts again from the very beginning each time. You should visit Yeti's checklist for ATLAS and check it step by step: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4161&postid=29359#29359 Regards, djoser. Why mine when you can research? - GRIDCOIN - Real cryptocurrency without wasting hashes! https://gridcoin.us |
Joined: 21 Jun 10 Posts: 38 Credit: 8,869,220 RAC: 28,501 |
djoser, Thanks for the reply. Yes, I am running the VirtualBox version of ATLAS and yes, I have been through Yeti's checklist. I have a screen capture of BOINC Manager showing one of the ATLAS tasks in question using 0.517 CPUs, which I would be glad to send to a project admin if someone will tell me where to send it. |
Joined: 15 Jun 08 Posts: 2141 Credit: 175,285,684 RAC: 106,151 |
Since last year this has been reported a couple of times, and as far as I remember nobody was able to find out what really causes it. The only thing you can do is set fixed CPU values via an app_config.xml, which would at least minimize the bad impact on your own client. https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5231&postid=40762 It is not the first thread that mentions the issue, but if you follow the discussion it explains what to do. |
Joined: 21 Jun 10 Posts: 38 Credit: 8,869,220 RAC: 28,501 |
If anybody wants to look into this further, the Task number is 283389454 and the work unit number is 145065835. I just aborted the task after it ran for 4 days 15 hours 56 min 18 sec without a successful completion. I would be glad to help test a possible solution, just let me know when and where. |
Joined: 30 Aug 14 Posts: 145 Credit: 10,847,070 RAC: 0 |
Furthermore, I can see that your computers are working on other projects concurrently. You could try to set those on hold (or even remove them from BOINC) and see how ATLAS behaves with no other projects interfering. |
Joined: 21 Jun 10 Posts: 38 Credit: 8,869,220 RAC: 28,501 |
djoser suggested: "You could try to set those on hold (or even remove them from BOINC) and see how ATLAS behaves with no other projects interfering." Good question. I will let the queue drain down to empty on one of my machines, then let it download as many single-CPU ATLAS tasks as it wants and see what happens. I already know it will be constrained by memory, but it will be interesting to see what happens. |
Joined: 21 Jun 10 Posts: 38 Credit: 8,869,220 RAC: 28,501 |
The machine I used for this test has 12 threads and 15.9 GB of memory. I let LHC download 6 single-core ATLAS tasks and started them one at a time. After each task had time to download additional data and go through all the initialization steps, I checked memory usage and started the next task. With one task running, Windows plus the task was using 6.6 GB of memory. With two tasks running, Windows plus 2 tasks were using 10.8 GB of memory. With three tasks running, Windows plus 3 tasks were using 14.8 GB of memory. With four tasks running, memory usage got up to 15.8 GB, the system started banging away on the swap file, locked up, and rebooted itself. No task started with a partial CPU. Once the system came back up, I limited LHC to 3 concurrent tasks and will let them run to completion with no other tasks running. |
Joined: 2 May 07 Posts: 1718 Credit: 129,407,119 RAC: 288,025 |
ATLAS is not able to tell you that you are running out of memory. With 16 GB you can run 3 ATLAS tasks (4.8 GB for every task). Your three tasks will run well. |
Joined: 21 Jun 10 Posts: 38 Credit: 8,869,220 RAC: 28,501 |
Just got another one of these. The task says it is "Running (0.849 CPUs)". More tasks get started than the system can support, and the ATLAS task gets suspended. I had to put in an app_config to restrict some of the other work to get the ATLAS task to restart. Does anybody besides me think this is a problem? |
Joined: 28 Sep 04 Posts: 604 Credit: 36,988,303 RAC: 16,826 |
I have never seen a situation like that. But I always have an app_config.xml file present where I set the avg_ncpus to what I want. |
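
For readers following the app_config.xml suggestion in this thread, here is a minimal Python sketch of what such a file might contain. It estimates how many tasks fit in memory from the figures captainjack measured (15.9 GB total, roughly 4.2 GB added per extra task, roughly 2.4 GB for Windows itself), then builds an app_config.xml that pins `avg_ncpus` and caps concurrency. The app name `ATLAS` and the plan class `vbox64_mt_mcore_atlas` are assumptions, not taken from this thread; check your own client_state.xml for the real values. The helper names are hypothetical, and `ET.indent` needs Python 3.9 or later.

```python
import xml.etree.ElementTree as ET


def max_tasks_for_memory(total_gb, per_task_gb, reserve_gb):
    """Return how many tasks fit in RAM after reserving memory for the OS."""
    usable = total_gb - reserve_gb
    if usable <= 0 or per_task_gb <= 0:
        return 0
    return int(usable // per_task_gb)


def build_app_config(app_name, plan_class, avg_ncpus, max_concurrent):
    """Build the text of an app_config.xml with a fixed CPU count and task cap."""
    root = ET.Element("app_config")

    # <app> limits how many tasks of this application run at once.
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    # <app_version> tells the client how many CPUs to budget per task,
    # overriding a fractional value such as 0.517.
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = str(avg_ncpus)

    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Figures from the memory test in this thread; adjust for your machine.
    limit = max_tasks_for_memory(total_gb=15.9, per_task_gb=4.2, reserve_gb=2.4)
    # "ATLAS" and the plan class below are assumed names, not confirmed here.
    print(build_app_config("ATLAS", "vbox64_mt_mcore_atlas", 1, limit))
```

The printed text would be saved as app_config.xml in the LHC@home project folder inside the BOINC data directory and picked up after the client re-reads its config files. This only changes how the local client budgets CPUs; it does not address why the fractional value is reported in the first place.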
©2023 CERN