Message boards : ATLAS application : Atlas task slowing right down near the end but still using all cores - continue?

Peter Hucker of the Scottish B...

Joined: 12 Aug 06
Posts: 294
Credit: 2,008,652
RAC: 24
Message 46370 - Posted: 28 Feb 2022, 11:29:20 UTC

This task https://lhcathome.cern.ch/lhcathome/result.php?resultid=345643716 is running on an admittedly slow 4-core CPU under Windows, but this machine can do Atlas OK. The % complete is advancing more and more slowly: it has moved from 99.990% 12 hours ago to 99.999% now. Task Manager shows all 4 cores are still being used. Should I let it run? Is it doing anything useful?
ID: 46370
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1130
Credit: 6,937,570
RAC: 710
Message 46371 - Posted: 28 Feb 2022, 11:42:36 UTC - in response to Message 46370.  

Don't look at BOINC's progress.
Use the VM's console with Alt-F3 and Alt-F2 to see the processes/CPU usage and the event progress (200 events to do in total).
ID: 46371
Peter Hucker of the Scottish B...

Joined: 12 Aug 06
Posts: 294
Credit: 2,008,652
RAC: 24
Message 46373 - Posted: 28 Feb 2022, 12:03:55 UTC
Last modified: 28 Feb 2022, 12:07:38 UTC

Thanks, it shows 111 of 200 events done so far, with a reasonable range of times per event (7 to 80 minutes, average 29 minutes). I'll look again in a few hours and see if it's done any more.

4 athena.py processes are running, each getting about 98% CPU (of one core each, presumably).

Edit - moved to 112. All is fine, just an unusually long task.
ID: 46373
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 2049
Credit: 154,289,156
RAC: 146,798
Message 46374 - Posted: 28 Feb 2022, 12:07:50 UTC - in response to Message 46370.  

Just to mention it

The computer running the task (https://lhcathome.cern.ch/lhcathome/result.php?resultid=345643716) reports an Intel Pentium N3700 CPU:
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10772270
https://ark.intel.com/content/www/us/en/ark/products/87261/intel-pentium-processor-n3700-2m-cache-up-to-2-40-ghz.html

The data sheet lists it as a 4C/4T CPU (4 cores, 4 threads, no Hyper-Threading).

Regarding multicore setups there's a clear VirtualBox recommendation here:
https://forums.virtualbox.org/viewtopic.php?f=35&t=77413


To make it short:
On this computer ATLAS VMs should not be configured to use more than 3 CPUs.


At any time this advice can be ignored on your own responsibility.
ID: 46374
Peter Hucker of the Scottish B...

Joined: 12 Aug 06
Posts: 294
Credit: 2,008,652
RAC: 24
Message 46375 - Posted: 28 Feb 2022, 12:14:34 UTC
Last modified: 28 Feb 2022, 12:14:50 UTC

I have 7 different machines; the other 6 are more powerful than that one and have varying core counts and HT.

Are you recommending I lower the number of cores for Atlas on them all?

Why doesn't Boinc / your server allocate something more sensible?

Should I use the other cores for single core non-VB Boinc projects or leave them idle?

What harm do I do by using all the cores? Will it slow Atlas down or just the host? On my main 24 thread machine, I only run one 8-thread Atlas limited in app_config, the other 6 machines are only Boinc and I don't care if the Windows system is sluggish. But I do care if Atlas is not running efficiently.
ID: 46375
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 2049
Credit: 154,289,156
RAC: 146,798
Message 46376 - Posted: 28 Feb 2022, 12:31:11 UTC - in response to Message 46375.  

Ah, the expected kind of reply.
But I was looking at exactly 1 computer with an N3700 CPU, and I clearly wrote:
"At any time this advice can be ignored ..."

Just in case you want to try it out:
Use an app_config.xml to run ATLAS as a 3-core VM.
Then let this setup run for a couple of days and compare it against the 4-core setup.
Without this test nobody knows which setup will be more efficient on that computer.
ID: 46376
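
For anyone wanting to try the 3-core test, here is a minimal Python sketch that generates such an app_config.xml. The element names (app_config, app_version, app_name, plan_class, avg_ncpus) are the ones BOINC documents for this file. The app name "ATLAS" and the plan class "vbox64_mt_mcore_atlas" are assumptions that should be checked against the entries in your own client_state.xml, and build_app_config is only a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def build_app_config(n_cores, app_name="ATLAS",
                     plan_class="vbox64_mt_mcore_atlas"):
    """Return the text of an app_config.xml limiting one app version to n_cores.

    app_name and plan_class are ASSUMED values; verify them in client_state.xml.
    """
    root = ET.Element("app_config")
    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = app_name
    ET.SubElement(version, "plan_class").text = plan_class
    ET.SubElement(version, "avg_ncpus").text = str(n_cores)
    # Pretty-print where available (ET.indent was added in Python 3.9).
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


# Print the file content for a 3-core ATLAS VM.
print(build_app_config(3))
```

The output would be saved as app_config.xml in the project's folder under the BOINC data directory and loaded with "Options > Read config files" in BOINC Manager. The new core count may only apply to tasks started afterwards, so it is worth checking the next task before starting the comparison.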
Peter Hucker of the Scottish B...

Joined: 12 Aug 06
Posts: 294
Credit: 2,008,652
RAC: 24
Message 46377 - Posted: 28 Feb 2022, 12:46:29 UTC

Hmph, you completely misunderstood me. I was simply looking for information.

Presumably you've already tested what's best and should be issuing what will be more efficient. If it's better to leave 1 core free, then Atlas tasks should be issued according to this. An 8 core computer should get 7 core tasks. Ok, Boinc maybe makes this impossible to do, or maybe a 1 core task running some other project at the same time negates any benefit. I'm just asking!

I'm not having a go, I'm trying to understand how it works and how I can make all my computers do the most work.

I can run a test on any or all of my computers if you like, but this information should be held centrally and has probably already been done by someone? Perhaps there's a useful rule for Intel/AMD/HT that could benefit all users? Let me know what you want me to run, I'm quite willing to offer help, you can see my computers here: https://lhcathome.cern.ch/lhcathome/hosts_user.php
ID: 46377
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 2049
Credit: 154,289,156
RAC: 146,798
Message 46378 - Posted: 28 Feb 2022, 13:41:43 UTC - in response to Message 46377.  

As CP mentioned each ATLAS task processes 200 events from a pool.
Each event is processed by a thread (a real thread in this case!) and takes anywhere from a few seconds to 10, 30, 60 or more minutes.
When a thread finishes an event, it starts the next one, until the pool is empty.

The scientific app running in your VM sets up as many threads as your VM has cores.
The setup phase and the stage-out phase always run on 1 core; the remaining cores sit idle but stay allocated to this VM by VirtualBox.

Long term (this is related to ATLAS only!)
- it's more efficient to run an even number of threads (avoids having 1 event left in the pool)
- it's more efficient to run fewer cores per VM but many VMs concurrently (1-4 cores per VM vs. 5-8 cores per VM)

Within each range you need to test which setup is the most efficient.
Expect only minor differences within the same range.


Exceptions
Running many VMs with few cores concurrently requires lots of RAM.
Leave enough spare RAM for the disk cache, the OS and all other processes.
VMs that allocate all available cores may run significantly slower (see the VirtualBox advice).
ID: 46378
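
The cost of the single-core setup and stage-out phases can be estimated with simple arithmetic. The following Python sketch assumes the 200 events split perfectly evenly across the cores (so it ignores idle cores at the end of the pool), a mean event time of 29 minutes as reported earlier in this thread for one slow host, and a combined setup plus stage-out time of 20 minutes, which is a placeholder rather than a measured value. core_efficiency is a hypothetical name.

```python
def core_efficiency(n_cores, n_events=200, mean_event_s=29 * 60,
                    serial_s=20 * 60):
    """Fraction of allocated core time that does useful work.

    Model: serial_s seconds run on 1 core while the other cores are
    allocated but idle; the event work is then shared evenly by all cores.
    mean_event_s and serial_s are ASSUMED example values.
    """
    event_work_s = n_events * mean_event_s          # total CPU seconds of events
    wall_s = serial_s + event_work_s / n_cores      # elapsed time of the task
    useful_s = serial_s + event_work_s              # CPU seconds actually used
    return useful_s / (n_cores * wall_s)


# Compare a few VM sizes under these assumptions.
for n in (1, 2, 4, 8):
    print(n, "cores:", round(core_efficiency(n), 4))
```

Under these assumptions the loss from the serial phases grows with the number of cores per VM, which is one reason smaller VMs can come out ahead as long as there is enough RAM for them.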
Peter Hucker of the Scottish B...

Joined: 12 Aug 06
Posts: 294
Credit: 2,008,652
RAC: 24
Message 46379 - Posted: 28 Feb 2022, 13:58:21 UTC - in response to Message 46378.  

Long term (this is related to ATLAS only!)
- it's more efficient to run an even number of threads (avoids having 1 event left in the pool)
Thanks for the advice, just one more quick question. Can you explain this one further? If there are 200 events to do, what's the harm in doing 3 at a time? Since they all take different amounts of time, won't you always end up with 1 left no matter how many threads you run?
ID: 46379
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 2049
Credit: 154,289,156
RAC: 146,798
Message 46380 - Posted: 28 Feb 2022, 14:22:57 UTC - in response to Message 46379.  
Last modified: 28 Feb 2022, 14:24:26 UTC

It's a question of long term averages since nobody can predict the required processing time per event.

200 events / 2 threads => each thread processes 100 events (average: can also be 98/102 or 97/103 ...)
200 events / 4 threads => each thread processes 50 events (average: can also be <calculate yourself>)

198 events / 3 threads => each thread processes 66 events (average!), which leaves 2 of the 200 events in the pool.
While those last 2 events are processed, 1 thread remains idle.
ID: 46380
Peter Hucker of the Scottish B...

Joined: 12 Aug 06
Posts: 294
Credit: 2,008,652
RAC: 24
Message 46381 - Posted: 28 Feb 2022, 14:30:01 UTC - in response to Message 46380.  

I disagree. Certainly on the Atlas task I'm running on the slow computer, the time for each event varies widely from 401 to 4825 seconds. So chances are as you approach the end of the pool, no number of threads will make them match up nicely. It's like 200 apples in a box. 4 people are taking them out 1 at a time each, at a random varying speed. You'd be no more likely to have some people idle at the end with 4 people than 3.
ID: 46381
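
The apples-in-a-box argument can be tried numerically. The sketch below is a toy model, not the ATLAS scheduler: it assumes each event's duration is drawn independently and uniformly between 401 and 4825 seconds (the range reported above for one slow host; the uniform shape is an assumption), and that a free thread always takes the next event from the pool. It ignores the single-core setup and stage-out phases. simulate_task and mean_idle_fraction are hypothetical names.

```python
import heapq
import random


def simulate_task(n_threads, n_events=200, lo=401, hi=4825, rng=random):
    """Simulate one task; return (wall_seconds, idle_core_seconds).

    Event durations are ASSUMED uniform in [lo, hi] seconds.
    """
    durations = [rng.uniform(lo, hi) for _ in range(n_events)]
    # Min-heap of the times at which each thread becomes free.
    free_at = [0.0] * n_threads
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)      # earliest free thread takes the event
        heapq.heappush(free_at, start + d)
    wall = max(free_at)                     # the task ends when the last thread ends
    idle = sum(wall - t for t in free_at)   # waiting time of the threads that ended early
    return wall, idle


def mean_idle_fraction(n_threads, runs=2000, seed=1):
    """Average share of allocated core time spent idle at the end of the pool."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        wall, idle = simulate_task(n_threads, rng=rng)
        total += idle / (n_threads * wall)
    return total / runs


# Compare odd and even thread counts side by side.
for n in (1, 2, 3, 4, 5, 6, 8):
    print(n, "threads:", round(mean_idle_fraction(n), 4))
```

In this model a single thread never idles, and the printed values let odd and even thread counts be compared directly. It says nothing about effects outside the model, such as VirtualBox overhead when all host cores are allocated to one VM.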
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Avatar

Joined: 15 Jun 08
Posts: 2049
Credit: 154,289,156
RAC: 146,798
Message 46382 - Posted: 28 Feb 2022, 14:41:14 UTC - in response to Message 46381.  

I didn't expect that you understand it.
ID: 46382
Peter Hucker of the Scottish B...

Joined: 12 Aug 06
Posts: 294
Credit: 2,008,652
RAC: 24
Message 46383 - Posted: 28 Feb 2022, 14:44:46 UTC - in response to Message 46382.  
Last modified: 28 Feb 2022, 14:52:52 UTC

I didn't expect that you understand it.
You're always on the defensive. I'd like you to convince me I'm wrong. This is not about computers or Atlas at all, just stats. I cannot see how after 200 events have occurred, with widely varying times, that it matters how many workers there are.

I'm interested in this idea, and I've asked about it in some maths forums.
ID: 46383
Peter Hucker of the Scottish B...

Joined: 12 Aug 06
Posts: 294
Credit: 2,008,652
RAC: 24
Message 46384 - Posted: 28 Feb 2022, 15:35:07 UTC - in response to Message 46382.  

I didn't expect that you understand it.
In fact, so far that computer has done:

Worker 1: 30 events
Worker 2: 37 events
Worker 3: 34 events
Worker 4: 39 events

That's 140 in total.

By the time we get near the end it could be:

Worker 1: 43 events
Worker 2: 52 events
Worker 3: 49 events
Worker 4: 55 events
This leaves 1 event. But 200 is a multiple of 4....

You cannot predict when each worker will finish with such random event sizes.
ID: 46384
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1130
Credit: 6,937,570
RAC: 710
Message 46385 - Posted: 28 Feb 2022, 15:48:21 UTC

With 4 cores you will have 3 idle ones at the end (we don't know for how long), with 3 cores you'll have 2 idle cores at the end ....
So the most efficient task would be a single core VM. However, when you want to run more tasks you must have enough memory (3900MB for each task)
Another point will be the duration of the task. No problem if your machine runs 24/7.
ATLAS (and CMS) don't like interruptions (incl network) for longer periods.
ID: 46385
maeax

Joined: 2 May 07
Posts: 1622
Credit: 75,559,514
RAC: 223,163
Message 46386 - Posted: 28 Feb 2022, 16:00:59 UTC - in response to Message 46383.  

I have switched all computers from HT to real cores only, with no separate core left free.
Everything is running well (CMS, Theory and/or Atlas). For Atlas I use only two cores per task on Windows (10 Pro and 11 Pro).
With only two cores, the finishing times of the last collisions don't differ much between them.
Yes, the beginning and the end of each Atlas task take about 8-10 minutes.
On the CentOS 7 and CentOS 8 VMs, Atlas runs with one core per task.
All computers use Squid.
ID: 46386
Peter Hucker of the Scottish B...

Joined: 12 Aug 06
Posts: 294
Credit: 2,008,652
RAC: 24
Message 46387 - Posted: 28 Feb 2022, 16:03:47 UTC - in response to Message 46385.  

With 4 cores you will have 3 idle ones at the end (we don't know for how long), with 3 cores you'll have 2 idle cores at the end ....
So the most efficient task would be a single core VM. However, when you want to run more tasks you must have enough memory (3900MB for each task)
Another point will be the duration of the task. No problem if your machine runs 24/7.
ATLAS (and CMS) don't like interruptions (incl network) for longer periods.
I think I'd rather spend money on more processing power than more RAM. The RAM would be a severe limitation if I ran more Atlas VMs. Even my largest computer has 64GB and 24 threads, so not enough. The 2 dual xeons have 32 and 40GB, but 24 threads each.
ID: 46387
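
The RAM limit mentioned above is easy to put into numbers. This Python sketch takes the 3900 MB per task quoted in this thread at face value for every VM size (an assumption; the real requirement may depend on the core count) and reserves a hypothetical 4096 MB for the OS and disk cache. max_concurrent_vms is a hypothetical helper name.

```python
def max_concurrent_vms(ram_mb, threads, cores_per_vm,
                       mb_per_vm=3900, reserve_mb=4096):
    """How many ATLAS VMs fit on a host, limited by RAM and by CPU threads.

    mb_per_vm is the figure quoted in the thread; reserve_mb is an ASSUMED
    allowance for the OS, the disk cache and other processes.
    """
    by_ram = max(0, (ram_mb - reserve_mb) // mb_per_vm)
    by_cpu = threads // cores_per_vm
    return min(by_ram, by_cpu)


# 64 GB host with 24 threads, single-core VMs: RAM is the limit.
print(max_concurrent_vms(64 * 1024, 24, 1))   # → 15
# 32 GB host with 24 threads, single-core VMs.
print(max_concurrent_vms(32 * 1024, 24, 1))   # → 7
# 64 GB host with 24 threads, 8-core VMs: the CPU is the limit.
print(max_concurrent_vms(64 * 1024, 24, 8))   # → 3
```

With these numbers a 24-thread host would need roughly 24 x 3900 MB, about 94 GB, plus the reserve to keep every thread busy with single-core VMs.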
Henry Nebrensky

Joined: 13 Jul 05
Posts: 162
Credit: 14,768,010
RAC: 2,247
Message 46390 - Posted: 1 Mar 2022, 16:13:26 UTC - in response to Message 46378.  

As CP mentioned each ATLAS task processes 200 events from a pool.
It has struck me before that changing the task's pool size to 180 or 240 events would give better divisibility.
ID: 46390
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 2049
Credit: 154,289,156
RAC: 146,798
Message 46391 - Posted: 1 Mar 2022, 16:36:04 UTC - in response to Message 46390.  

I would vote for 240.
This would be perfect for all setups between 1 and 8, except 7.
It would also be perfect for ATLAS native running a 12-core setup.
ID: 46391
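
The divisibility claim can be checked in a few lines of Python. The sketch lists, for each candidate pool size, the core counts from 1 to 12 that divide it without remainder; even_split_core_counts is a hypothetical name, and the upper limit of 12 is taken from the 12-core native setup mentioned above.

```python
def even_split_core_counts(pool_size, max_cores=12):
    """Core counts up to max_cores that divide pool_size exactly."""
    return [n for n in range(1, max_cores + 1) if pool_size % n == 0]


for pool in (180, 200, 240):
    print(pool, even_split_core_counts(pool))
# 180 [1, 2, 3, 4, 5, 6, 9, 10, 12]
# 200 [1, 2, 4, 5, 8, 10]
# 240 [1, 2, 3, 4, 5, 6, 8, 10, 12]
```

A pool of 240 covers every core count from 1 to 8 except 7, as well as 12, while the current 200 misses 3 and 6.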
maeax

Joined: 2 May 07
Posts: 1622
Credit: 75,559,514
RAC: 223,163
Message 46392 - Posted: 1 Mar 2022, 16:50:55 UTC

ATLAS (long simulation)
ID: 46392


©2022 CERN