Long event runtimes for 2000 eventers
Author | Message |
---|---|
Joined: 15 Jun 08 Posts: 2473 Credit: 246,407,538 RAC: 91,753 |
Got one of the rare 2000-event tasks (a resend) running in slot 2. Compared to slots 0/1 (400 eventers), the average runtime per event is significantly higher.
slots/0/PanDA_Pilot-5972920516/log.EVNTtoHITS
(209th event for this worker) took 79.89 s. New average 172.1 +- 5.887
(191th event for this worker) took 115.8 s. New average 187.8 +- 7.3
slots/1/PanDA_Pilot-5972641056/log.EVNTtoHITS
(87th event for this worker) took 66.32 s. New average 175.2 +- 10.05
(89th event for this worker) took 133.6 s. New average 169.9 +- 8.096
slots/2/PanDA_Pilot-5961031867/log.EVNTtoHITS
(162th event for this worker) took 744.8 s. New average 706.2 +- 8.818
(160th event for this worker) took 546.7 s. New average 710.3 +- 8.53
Another 2000 eventer a week ago also had averages above 700 s, but that task failed after ~3 days. |
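For anyone who wants to compare slots the same way, here is a minimal sketch that pulls the per-event time and running average out of lines shaped like the ones quoted above. The regular expression is inferred only from these sample lines, and the function name `parse_event_lines` is made up for illustration; real log.EVNTtoHITS files may contain other formats that it will simply skip.

```python
import re

# Matches lines such as:
# "(209th event for this worker) took 79.89 s. New average 172.1 +- 5.887"
# The pattern is an assumption based on the lines quoted in this thread.
EVENT_RE = re.compile(
    r"\((\d+)(?:st|nd|rd|th) event for this worker\) took ([\d.]+) s\. "
    r"New average ([\d.]+) \+- ([\d.]+)"
)


def parse_event_lines(lines):
    """Return (event_no, seconds, running_avg, avg_err) for each matching line."""
    results = []
    for line in lines:
        m = EVENT_RE.search(line)
        if m:
            results.append(
                (int(m.group(1)), float(m.group(2)),
                 float(m.group(3)), float(m.group(4)))
            )
    return results


# Sample lines copied from the post (slot 0 = 400 eventer, slot 2 = 2000 eventer).
slot0 = [
    "(209th event for this worker) took 79.89 s. New average 172.1 +- 5.887",
    "(191th event for this worker) took 115.8 s. New average 187.8 +- 7.3",
]
slot2 = [
    "(162th event for this worker) took 744.8 s. New average 706.2 +- 8.818",
    "(160th event for this worker) took 546.7 s. New average 710.3 +- 8.53",
]

for name, lines in (("slot 0", slot0), ("slot 2", slot2)):
    for event_no, seconds, avg, err in parse_event_lines(lines):
        print(f"{name}: event {event_no} took {seconds} s, running average {avg} +- {err}")
```

To run it against a real log, the lists could be replaced by `open(path)` on the file in the slot directory.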
Joined: 18 Dec 15 Posts: 1735 Credit: 114,053,636 RAC: 84,153 |
I have a question, although it is not directly related to the long eventers. Regarding the amount of RAM for multi-core ATLAS tasks, this formula was published a long time ago: 3900 MB for 1 core, plus 900 MB for each additional core. I think I saw a post somewhere here some time ago saying that this formula is no longer relevant for the new type of ATLAS tasks. Is there now a fixed amount of RAM for a given task, regardless of the number of cores? Can anybody please enlighten me? |
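As a quick illustration of the old rule of thumb quoted in this post (3900 MB for one core plus 900 MB per additional core), a small sketch follows. The function name `legacy_atlas_ram_mb` is hypothetical, and the formula is only the one recalled in the post; the question being asked is precisely whether it still applies to the newer tasks.

```python
def legacy_atlas_ram_mb(cores: int) -> int:
    """RAM in MB under the old rule: 3900 MB for 1 core, +900 MB per extra core."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 3900 + 900 * (cores - 1)


for cores in (1, 2, 4, 8):
    print(f"{cores} core(s): {legacy_atlas_ram_mb(cores)} MB")
# 1 core(s): 3900 MB
# 2 core(s): 4800 MB
# 4 core(s): 6600 MB
# 8 core(s): 10200 MB
```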
Joined: 15 Jun 08 Posts: 2473 Credit: 246,407,538 RAC: 91,753 |
So far, a qualified answer to both posts can only be given by somebody who is familiar with the internal structure of ATLAS 3.x, or with the parameters used to create the few 2000 eventers still around. It looks like LRZ-LMU might currently be the only one able to answer this. |
Joined: 27 Sep 08 Posts: 817 Credit: 681,977,198 RAC: 141,894 |
The new tasks should need less RAM. https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5978&postid=47977 |
Joined: 4 May 17 Posts: 5 Credit: 118,785,284 RAC: 0 |
23.0.31 (like 23.0.19) uses significantly less memory. The reason is that it is properly multi-threaded, rather than the poor man's approach of 21.0.15. That version was multi-process: the spawned processes all shared the same copy of read-only RAM, but each still used some RAM of its own. There should be no more 2000-event tasks, only 400. Since you chewed through the 35Mevt task in a week, I just got another 30Mevt assigned. There will be a few hours with no jobs while the input EVNT files are merged. |
Joined: 15 Jun 08 Posts: 2473 Credit: 246,407,538 RAC: 91,753 |
@LRZ-LMU Got some fresh(?) tasks with 1.5 GB EVNT files each. Some volunteers can deal with this, but others can't. Hence, it would be nice if you could limit that to the usual 200-400 MB per file we had before. |
Joined: 18 Dec 15 Posts: 1735 Credit: 114,053,636 RAC: 84,153 |
"23.0.31 (as 23.0.19) has significantly less memory usage. The reason is that it is properly multi-threaded rather than the poor man approach of 21.0.15. This was multi-process where the spawned processes all used the same copy of read-only RAM, but still used some RAM themselves."
Is there still some kind of rule for how much RAM should be assigned to 1-core tasks, 2-core tasks and so on via app_config.xml? |
Joined: 28 Sep 04 Posts: 707 Credit: 47,035,930 RAC: 31,634 |
"23.0.31 (as 23.0.19) has significantly less memory usage. The reason is that it is properly multi-threaded rather than the poor man approach of 21.0.15. This was multi-process where the spawned processes all used the same copy of read-only RAM, but still used some RAM themselves."
"Is there still some kind of rule for how much RAM should be assigned to 1-core tasks, 2-core tasks and so on via app_config.xml?"
See this message: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5978&postid=47977#47977 |
Joined: 28 Sep 04 Posts: 707 Credit: 47,035,930 RAC: 31,634 |
There are quite a lot of re-sends of the failed 2000 eventers in circulation, so be aware. |
Joined: 18 Dec 15 Posts: 1735 Credit: 114,053,636 RAC: 84,153 |
"There are quite a lot of re-sends of the failed 2000 eventers in circulation, so be aware."
Yesterday, one of these was aborted by the server on my machine after about 36 hours of CPU time :-( |
Joined: 18 Dec 15 Posts: 1735 Credit: 114,053,636 RAC: 84,153 |
Well, this message says exactly what I was referring to in my post above. But obviously, this is no longer valid, is it?
"23.0.31 (as 23.0.19) has significantly less memory usage. The reason is that it is properly multi-threaded rather than the poor man approach of 21.0.15. This was multi-process where the spawned processes all used the same copy of read-only RAM, but still used some RAM themselves."
"Is there still some kind of rule for how much RAM should be assigned to 1-core tasks, 2-core tasks and so on via app_config.xml?" |
Joined: 28 Sep 04 Posts: 707 Credit: 47,035,930 RAC: 31,634 |
Well, I'm running the 400- and 2000-event tasks with the setting --nthreads 4 --memory_size_mb 4400 without a problem. |
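For readers unsure where such a setting goes, here is a rough sketch that generates an app_config.xml carrying the command line quoted above. The element names follow the general BOINC app_config.xml layout; the app name "ATLAS" and plan class "vbox64_mt_mcore_atlas" are assumptions not stated in this thread and should be checked against your own client before use. The helper name `build_app_config` is made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, plan_class: str, nthreads: int, memory_mb: int) -> str:
    """Build an app_config.xml string with one <app_version> entry."""
    root = ET.Element("app_config")
    app_version = ET.SubElement(root, "app_version")
    ET.SubElement(app_version, "app_name").text = app_name
    ET.SubElement(app_version, "plan_class").text = plan_class
    ET.SubElement(app_version, "avg_ncpus").text = str(nthreads)
    # Command line as quoted in the post: thread count and VM memory in MB.
    ET.SubElement(app_version, "cmdline").text = (
        f"--nthreads {nthreads} --memory_size_mb {memory_mb}"
    )
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# "ATLAS" and "vbox64_mt_mcore_atlas" are ASSUMED values, not taken from this thread.
print(build_app_config("ATLAS", "vbox64_mt_mcore_atlas", nthreads=4, memory_mb=4400))
```

The resulting file would normally be placed in the project's directory under the BOINC data folder and picked up after the client re-reads its config files.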
Joined: 2 May 07 Posts: 2181 Credit: 172,554,084 RAC: 49,909 |
+1 |
Joined: 27 Sep 08 Posts: 817 Credit: 681,977,198 RAC: 141,894 |
David said> On the development system we tested the new application with much lower memory and it was working fine even with 3000MB RAM, but here for safety it is set to 4000MB. |
©2024 CERN