Message boards : ATLAS application : Atlas task slowing right down near the end but still using all cores - continue?
Joined: 12 Aug 06 Posts: 429 Credit: 10,246,033 RAC: 16,382
This task https://lhcathome.cern.ch/lhcathome/result.php?resultid=345643716 is running on an admittedly slow 4-core CPU under Windows, but this machine can do ATLAS OK. This one is showing a slower and slower % complete, and has moved from 99.990% 12 hours ago to 99.999% now. Task Manager shows all 4 cores are still being used. Should I let it run? Is it doing anything useful?
Joined: 14 Jan 10 Posts: 1371 Credit: 9,144,180 RAC: 4,341
Don't look at BOINC's progress bar. Use the VM's console (Alt-F3 and Alt-F2) to see the processes/CPU usage and the event progress (200 events in total to do).
Joined: 12 Aug 06 Posts: 429 Credit: 10,246,033 RAC: 16,382
Thanks, it's showing it has done 111 of 200 so far, with a reasonable range of times (7 mins to 80 mins, average 29 minutes). I'll look again in a few hours and see if it's done any more. Four athena.py processes are running, each getting about 98% CPU (one core each, presumably). Edit: it has moved to 112. All is fine, just an unusually long task.
Joined: 15 Jun 08 Posts: 2500 Credit: 248,478,100 RAC: 126,724
Just to mention it: the computer running the task (https://lhcathome.cern.ch/lhcathome/result.php?resultid=345643716) reports an Intel Pentium N3700 CPU:
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10772270
https://ark.intel.com/content/www/us/en/ark/products/87261/intel-pentium-processor-n3700-2m-cache-up-to-2-40-ghz.html
The data sheet points out it is a 4C/4T CPU.

Regarding multicore setups there's a clear VirtualBox recommendation here:
https://forums.virtualbox.org/viewtopic.php?f=35&t=77413

To make it short: on this computer ATLAS VMs should not be configured to use more than 3 CPUs. At any time this advice can be ignored on your own responsibility.
Joined: 12 Aug 06 Posts: 429 Credit: 10,246,033 RAC: 16,382
I have 7 different machines; the other 6 are more powerful than that one and have varying core counts and HT. Are you recommending I lower the number of cores for ATLAS on all of them? Why doesn't BOINC / your server allocate something more sensible? Should I use the other cores for single-core non-VirtualBox BOINC projects, or leave them idle? What harm do I do by using all the cores? Will it slow ATLAS down, or just the host? On my main 24-thread machine I only run one 8-thread ATLAS task, limited in app_config; the other 6 machines run nothing but BOINC and I don't care if the Windows system is sluggish. But I do care if ATLAS is not running efficiently.
Joined: 15 Jun 08 Posts: 2500 Credit: 248,478,100 RAC: 126,724
Ah, the expected kind of reply. But I was looking at exactly 1 computer with an N3700 CPU, and I clearly wrote: "At any time this advice can be ignored ..."

Just in case you want to try it out: use an app_config.xml to run ATLAS as a 3-core VM, then let this setup run for a couple of days and compare it against the 4-core one. Without this test nobody knows which setup will be more efficient on that computer.
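For reference, a minimal app_config.xml along those lines might look like the sketch below. The plan_class value is an assumption here and should be checked against what your client actually reports (e.g. in client_state.xml or the task list); the file goes into the LHC@home project folder and takes effect after "Options → Read config files" or a client restart, and only newly downloaded tasks pick up the new core count.

```xml
<app_config>
  <app_version>
    <app_name>ATLAS</app_name>
    <!-- plan_class assumed; verify against your own client's tasks -->
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>3</avg_ncpus>
  </app_version>
</app_config>
```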
Joined: 12 Aug 06 Posts: 429 Credit: 10,246,033 RAC: 16,382
Hmph, you completely misunderstood me. I was simply looking for information. Presumably you've already tested what's best and should be issuing whatever will be more efficient. If it's better to leave 1 core free, then ATLAS tasks should be issued accordingly: an 8-core computer should get 7-core tasks. OK, maybe BOINC makes this impossible to do, or maybe a 1-core task from some other project running at the same time negates any benefit. I'm just asking! I'm not having a go, I'm trying to understand how it works and how I can make all my computers do the most work.

I can run a test on any or all of my computers if you like, but this information should be held centrally and the testing has probably already been done by someone? Perhaps there's a useful rule for Intel/AMD/HT that could benefit all users? Let me know what you want me to run; I'm quite willing to help. You can see my computers here: https://lhcathome.cern.ch/lhcathome/hosts_user.php
Joined: 15 Jun 08 Posts: 2500 Credit: 248,478,100 RAC: 126,724
As CP mentioned, each ATLAS task processes 200 events from a pool. Each event is processed by a thread (a real thread in this case!) and takes between a few seconds and 10..30..60.. min. When a thread has finished an event it starts the next one, until the pool is empty. The scientific app running in your VM sets up as many threads as your VM has cores. The setup phase and stage-out phase always run on 1 core; the rest of the cores remain idle but stay allocated to this VM by VirtualBox.

Long term (this is related to ATLAS only!):
- it's more efficient to run an even number of threads (avoids having 1 event left in the pool)
- it's more efficient to run fewer cores per VM but many VMs concurrently (1-4 cores per VM vs. 5-8 cores per VM)

Within each range you need to test which setup is the most efficient; expect only minor differences within the same range.

Exceptions:
- Running many VMs with few cores concurrently requires lots of RAM. Leave enough spare RAM for the disk cache, the OS and all other processes.
- VMs that allocate all available cores may run significantly slower (see the VirtualBox advice).
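A rough way to picture that effect is a toy model of one VM: a single-core setup phase, n cores greedily draining a 200-event pool, then a single-core stage-out phase. The event-time spread and the ~10-minute setup/stage-out figures are assumptions borrowed from numbers quoted elsewhere in this thread, not ATLAS internals:

```python
import random

def one_task(n_cores, n_events=200, setup_s=600, stageout_s=600, seed=7):
    """Toy model of one ATLAS vbox task: single-core setup and stage-out
    phases (the other cores sit idle but stay allocated), with n_cores
    workers greedily draining the event pool in between."""
    rng = random.Random(seed)
    # assumed per-event times, roughly the 401..4825 s spread quoted in this thread
    events = [rng.uniform(401, 4825) for _ in range(n_events)]
    finish = [0.0] * n_cores
    for d in events:
        i = finish.index(min(finish))   # the first core to become free takes the next event
        finish[i] += d
    wall = setup_s + max(finish) + stageout_s
    busy = sum(events) + setup_s + stageout_s    # core-seconds actually doing work
    return wall, busy / (n_cores * wall)         # wall time, core utilisation

for cores in (1, 2, 3, 4, 8):
    wall, util = one_task(cores)
    print(f"{cores}-core VM: wall {wall/3600:6.1f} h, core utilisation {util:5.1%}")
```

In this picture the single-threaded setup and stage-out stretches are what penalise high core counts: their cost is multiplied by the number of cores that stay allocated but idle.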
Joined: 12 Aug 06 Posts: 429 Credit: 10,246,033 RAC: 16,382
"Long term (this is related to ATLAS only!)"

Thanks for the advice, just one more quick question. Can you explain this one further? If there are 200 events to do, what's the harm in doing 3 at a time? Since they all take different amounts of time, won't you always end up with 1 left no matter how many threads you run?
Joined: 15 Jun 08 Posts: 2500 Credit: 248,478,100 RAC: 126,724
It's a question of long-term averages, since nobody can predict the required processing time per event.

200 events / 2 threads => each thread processes 100 events on average (can also be 98/102 or 97/103 ...)
200 events / 4 threads => each thread processes 50 events on average (can also be <calculate yourself>)
200 events / 3 threads => each thread processes 66 events on average, which accounts for only 198 and leaves 2 events in the pool. Once the pool is empty, 1 thread remains idle while the last 2 events are processed.
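The divisibility half of that argument is plain integer arithmetic. A few lines, assuming the 200-event pool size:

```python
# Events per thread and leftover events for a 200-event pool,
# ignoring the (random) per-event processing times entirely.
for n in range(1, 9):
    per_thread, leftover = divmod(200, n)
    print(f"{n} thread(s): {per_thread} events each, {leftover} left in the pool")
```

200 splits evenly across 1, 2, 4, 5 or 8 threads, leaves 2 events over for 3 or 6 threads, and 4 over for 7.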
Joined: 12 Aug 06 Posts: 429 Credit: 10,246,033 RAC: 16,382
I disagree. Certainly on the ATLAS task I'm running on the slow computer, the time for each event varies widely, from 401 to 4825 seconds. So chances are that as you approach the end of the pool, no number of threads will make them match up nicely. It's like 200 apples in a box: 4 people are taking them out one at a time each, at randomly varying speeds. You'd be no more likely to have some people idle at the end with 4 people than with 3.
Joined: 15 Jun 08 Posts: 2500 Credit: 248,478,100 RAC: 126,724
I didn't expect you to understand it.
Joined: 12 Aug 06 Posts: 429 Credit: 10,246,033 RAC: 16,382
"I didn't expect you to understand it."

You're always on the defensive; I'd like you to convince me I'm wrong. This is not about computers or ATLAS at all, just stats. I cannot see how, after 200 events have occurred with widely varying times, it matters how many workers there are. I'm interested in this idea; I've asked about it in some maths forums.
Joined: 12 Aug 06 Posts: 429 Credit: 10,246,033 RAC: 16,382
"I didn't expect you to understand it."

In fact, so far that computer has done:
Worker 1: 30 events
Worker 2: 37 events
Worker 3: 34 events
Worker 4: 39 events
That's 140 in total. By the time we get near the end it could be:
Worker 1: 43 events
Worker 2: 52 events
Worker 3: 49 events
Worker 4: 55 events
This leaves 1 event, but 200 is a multiple of 4... You cannot predict when each worker will finish with such random event sizes.
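The apples-in-a-box question can be put to a quick Monte Carlo test: N workers greedily draining a 200-event pool, with event times drawn uniformly from the 401-4825 s range quoted above, and the core time wasted after the pool runs empty averaged over many trials. A toy model under those assumptions, not the real scheduler:

```python
import random

def tail_idle(n_events, n_workers, rng):
    """Core-seconds spent idle after the pool is empty, i.e. while the
    remaining workers finish their last events."""
    finish = [0.0] * n_workers
    for _ in range(n_events):
        i = finish.index(min(finish))        # the first free worker takes the next event
        finish[i] += rng.uniform(401, 4825)  # assumed event-time spread
    wall = max(finish)
    return sum(wall - f for f in finish)

rng = random.Random(1)
trials = 2000
for workers in (2, 3, 4):
    avg = sum(tail_idle(200, workers, rng) for _ in range(trials)) / trials
    print(f"{workers} workers: average tail idle {avg / 3600:.2f} core-hours")
```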
Joined: 14 Jan 10 Posts: 1371 Credit: 9,144,180 RAC: 4,341
With 4 cores you will have 3 idle ones at the end (we don't know for how long); with 3 cores you'll have 2 idle cores at the end, and so on. So the most efficient task would be a single-core VM. However, when you want to run more tasks you must have enough memory (3900 MB for each task). Another point is the duration of the task: no problem if your machine runs 24/7, but ATLAS (and CMS) don't like interruptions (including network outages) for longer periods.
Joined: 2 May 07 Posts: 2189 Credit: 173,308,789 RAC: 66,579
I have changed from HT to real cores for all computers, with no separate free core, and all is running well (CMS, Theory and/or ATLAS). For ATLAS I use only two cores per task on Windows (10 Pro and 11 Pro); then the difference between the two cores for the last collisions is not so big. Yes, the beginning and the end of each ATLAS task take about 8-10 minutes. The CentOS 7 and CentOS 8 VMs run ATLAS with one core per task. All computers use Squid.
Joined: 12 Aug 06 Posts: 429 Credit: 10,246,033 RAC: 16,382
"With 4 cores you will have 3 idle ones at the end (we don't know for how long); with 3 cores you'll have 2 idle cores at the end ..."

I think I'd rather spend money on more processing power than more RAM. RAM would be the severe limitation if I ran more ATLAS VMs: even my largest computer has 64 GB and 24 threads, so not enough. The two dual Xeons have 32 GB and 40 GB, but 24 threads each.
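For the RAM side of that trade-off, a rough capacity estimate can be sketched. The 3900 MB single-core figure comes from the post above; the scaling of roughly 3000 MB + 900 MB per core and the 8 GB kept spare for the OS and disk cache are illustrative assumptions, not project settings:

```python
# How many concurrent ATLAS vbox tasks fit on a host, limited by RAM and threads.
def max_vms(ram_gb, threads, cores_per_vm, spare_gb=8):
    ram_per_vm_mb = 3000 + 900 * cores_per_vm   # assumed scaling; 1 core -> 3900 MB
    by_ram = int((ram_gb - spare_gb) * 1024 // ram_per_vm_mb)
    by_threads = threads // cores_per_vm
    return min(by_ram, by_threads)

for ram_gb, threads in [(64, 24), (40, 24), (32, 24)]:
    for cores in (1, 2, 3, 4):
        n = max_vms(ram_gb, threads, cores)
        print(f"{ram_gb} GB / {threads} threads, {cores}-core VMs: {n} concurrent tasks")
    print()
```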
Joined: 13 Jul 05 Posts: 169 Credit: 14,982,010 RAC: 52
"As CP mentioned, each ATLAS task processes 200 events from a pool."

It has struck me before that changing the task's pool size to 180 or 240 events would give better divisibility.
Joined: 15 Jun 08 Posts: 2500 Credit: 248,478,100 RAC: 126,724
I would vote for 240. This would be perfect for all setups between 1 and 8 cores, except 7. It would also be perfect for ATLAS native running a 12-core setup.
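A quick check of that divisibility claim for the suggested pool sizes:

```python
# Core counts (1..12) that divide each candidate pool size with no events left over.
for pool in (180, 200, 240):
    even = [n for n in range(1, 13) if pool % n == 0]
    print(f"pool of {pool} events divides evenly for: {even}")
```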
Joined: 2 May 07 Posts: 2189 Credit: 173,308,789 RAC: 66,579
ATLAS (long simulation) |