Message boards : ATLAS application : How is Work-Distribution calculated ?

AuthorMessage
Profile Yeti
Volunteer moderator
Avatar

Send message
Joined: 2 Sep 04
Posts: 431
Credit: 117,525,067
RAC: 0
Message 43669 - Posted: 22 Nov 2020, 17:05:37 UTC

Hi,

I'm a little bit confused by the work distribution for ATLAS.

All my clients get 10 WUs, regardless of how powerful or slow the individual workstation is.

So my slowest PC has enough work for up to 2 days.

My fastest PC has work for 6 hours at most.

These are my LHC-specific preferences:



As long as a box has 10 WUs locally, the server says "No ATLAS work available". If it has fewer workunits, it gets exactly the difference up to 10.

What can I do to get more work onto my powerful machines?


Supporting BOINC, a great concept !
ID: 43669 · Report as offensive     Reply Quote
Harri Liljeroos
Avatar

Send message
Joined: 28 Sep 04
Posts: 585
Credit: 33,205,410
RAC: 18,466
Message 43677 - Posted: 22 Nov 2020, 19:02:27 UTC - in response to Message 43669.  

Interesting. On my faster host I have also set the task limit to 'No limit', but my number of CPUs is set to 4 and I always get 8 tasks. The host is also running Theory tasks (to keep the CPU cores busy) and always gets 8 of those as well. With app_config.xml I have set ATLAS to run on only 1 CPU core. I am using 12 of my 16 cores for CPU tasks, so it varies from 8 ATLAS + 4 Theory to 4 ATLAS + 8 Theory, based on FIFO.

So you could experiment with the number of CPUs in your preferences. This affects the amount of memory BOINC thinks each task is using (the actual memory used can be set with app_config).
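For reference, the app_config.xml I mean looks roughly like this (a sketch only: the plan_class string and the memory value are placeholders, check the ones your own client reports in its logs before copying this):

```xml
<app_config>
   <app_version>
      <app_name>ATLAS</app_name>
      <!-- plan_class must match what the scheduler sent you; this name is an assumption -->
      <plan_class>vbox64_mt_mcore_atlas</plan_class>
      <!-- run each ATLAS VM on a single core -->
      <avg_ncpus>1</avg_ncpus>
      <!-- actual VM memory in MB; value here is illustrative -->
      <cmdline>--memory_size_mb 4000</cmdline>
   </app_version>
</app_config>
```

The file goes into the project directory, and you re-read config files from the BOINC Manager afterwards.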
ID: 43677 · Report as offensive     Reply Quote
Profile Yeti
Volunteer moderator
Avatar

Send message
Joined: 2 Sep 04
Posts: 431
Credit: 117,525,067
RAC: 0
Message 43679 - Posted: 22 Nov 2020, 19:33:26 UTC - in response to Message 43677.  

So you could experiment with the number of CPUs in your preferences. This affects the amount of memory BOINC thinks each task is using (the actual memory used can be set with app_config).
Nope, I can't play with the number of CPUs. If I raise that figure, the working set size of each workunit rises to 10,200 MB. With 5 CPUs the working set size is 7,500 MB.

The memory setting in app_config only controls the memory of the virtual machine.

The BOINC client reserves the amount of memory given by the working set size, even if the virtual machine needs less.


Supporting BOINC, a great concept !
ID: 43679 · Report as offensive     Reply Quote
Harri Liljeroos
Avatar

Send message
Joined: 28 Sep 04
Posts: 585
Credit: 33,205,410
RAC: 18,466
Message 43680 - Posted: 22 Nov 2020, 21:22:31 UTC - in response to Message 43679.  

So you could experiment with the number of CPUs in your preferences. This affects the amount of memory BOINC thinks each task is using (the actual memory used can be set with app_config).
Nope, I can't play with the number of CPUs. If I raise that figure, the working set size of each workunit rises to 10,200 MB. With 5 CPUs the working set size is 7,500 MB.

The memory setting in app_config only controls the memory of the virtual machine.

The BOINC client reserves the amount of memory given by the working set size, even if the virtual machine needs less.

That's what I meant with my memory comment, I just couldn't formulate it as precisely. :-)

So it looks like we are getting 2 x 'Max # CPUs' tasks to calculate even if we have set 'No limit' for 'Max # jobs'. This applies to ATLAS and Theory but not to SixTrack and CMS. I think SixTrack is limited by the cache size, if I remember correctly (hard to test as there are no tasks available). CMS follows its own rules (I don't know if anybody knows what they are), and you get over a hundred of those if you have 'No limit' for 'Max # jobs'. This is a very confusing topic.
ID: 43680 · Report as offensive     Reply Quote
David Cameron
Project administrator
Project developer
Project scientist

Send message
Joined: 13 May 14
Posts: 366
Credit: 13,262,778
RAC: 6,995
Message 43702 - Posted: 25 Nov 2020, 10:44:14 UTC

Hi Yeti, nice to have you back :)

There is a server-side limit for ATLAS and Theory of at most 2 tasks per CPU. I have asked the admins to increase this to 4 for ATLAS. I would rather not remove the limits completely, since many hosts would end up with tasks they would not be able to process before the deadline.
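In other words, the current allotment behaves roughly like this (a sketch of the rule as described above, not the actual scheduler code; function and parameter names are mine):

```python
def max_tasks(max_num_cpus, tasks_per_cpu=2):
    """Rough model of the server-side ATLAS/Theory limit:
    at most tasks_per_cpu tasks for each CPU in the
    'Max # of CPUs' preference."""
    return tasks_per_cpu * max_num_cpus

# With 'Max # of CPUs' = 5, every host gets the same allotment,
# regardless of its real core count:
print(max_tasks(5))     # 10 today (2 per CPU)
print(max_tasks(5, 4))  # 20 after the increase to 4 per CPU
```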
ID: 43702 · Report as offensive     Reply Quote
Harri Liljeroos
Avatar

Send message
Joined: 28 Sep 04
Posts: 585
Credit: 33,205,410
RAC: 18,466
Message 43703 - Posted: 25 Nov 2020, 12:39:57 UTC - in response to Message 43702.  
Last modified: 25 Nov 2020, 12:40:25 UTC

Hi Yeti, nice to have you back :)

There is a server-side limit for ATLAS and Theory of at most 2 tasks per CPU. I have asked the admins to increase this to 4 for ATLAS. I would rather not remove the limits completely, since many hosts would end up with tasks they would not be able to process before the deadline.

The 'Max # of jobs' setting would take care of that problem if it were possible to set it higher than 8. As it is, owners of high-performance hosts have to set it to 'No limit'.
ID: 43703 · Report as offensive     Reply Quote
tullio

Send message
Joined: 19 Feb 08
Posts: 703
Credit: 4,246,429
RAC: 500
Message 43704 - Posted: 25 Nov 2020, 12:56:03 UTC

What do you mean by CPU? I have a CPU with 6 processors; I think it has three cores with 2 threads each. I get 4 Theory tasks and one ATLAS task, all done and validated.
Tullio
ID: 43704 · Report as offensive     Reply Quote
maeax

Send message
Joined: 2 May 07
Posts: 1557
Credit: 57,663,728
RAC: 203,127
Message 43705 - Posted: 25 Nov 2020, 13:42:24 UTC - in response to Message 43704.  

This is a thread from 2018 about HyperThreading and CPUs:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4616
ID: 43705 · Report as offensive     Reply Quote
Profile Yeti
Volunteer moderator
Avatar

Send message
Joined: 2 Sep 04
Posts: 431
Credit: 117,525,067
RAC: 0
Message 43706 - Posted: 25 Nov 2020, 13:48:11 UTC - in response to Message 43702.  
Last modified: 25 Nov 2020, 13:49:01 UTC

Hi Yeti, nice to have you back :)
Yeah, it feels good to be back again.
There is a server-side limit for ATLAS and Theory of at most 2 tasks per CPU. I have asked the admins to increase this to 4 for ATLAS. I would rather not remove the limits completely, since many hosts would end up with tasks they would not be able to process before the deadline.

Hm, 4 is better than two, but it is still not optimal.

At the moment, "Max # of CPUs" is used for three things:

    1. It sets the number of cores a WU should use, when there is no override from app_config.
    2. It provides the basis for the working set size.
    3. It is used to calculate how many WUs a client gets (multiplied by 2 now, soon by 4).

I'm sure that if you don't change number 3 to a more realistic calculation, this will cause problems for smaller/older clients.

Examples:

A) Lando is an old 8-core box. At the moment it gets 10 WUs; in future it will get 20 WUs. Sorry, but that is way too much.

B) Manni is my current flagship and has 24 cores. It gets 10 WUs; in future it will get 20 WUs.

Couldn't you take the number of real cores into account in your calculation?

What about this or a similar formula: MaxWUs = Int(RealCores / "Max # CPUs") * Factor

Example-Calculations:

Manni (24 cores): Int(RealCores / "Max # CPUs") * Factor
Manni (24 cores): Int(24 / 5) * 5 = 20

Lando (8 cores): Int(RealCores / "Max # CPUs") * Factor
Lando (8 cores): Int(8 / 5) * 5 = 5

These results are much more realistic than with the old calculation!

The factor could be a fixed number (5 in the example, perhaps 6) or could be taken from "Max # of jobs".

This leads to a much more realistic local work balance.
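To make the comparison concrete, the proposed formula next to the current rule (a sketch; function and variable names are mine, and the current rule is modelled as described earlier in the thread):

```python
def max_wus_proposed(real_cores, max_num_cpus, factor=5):
    """Proposed: MaxWUs = Int(RealCores / 'Max # CPUs') * Factor."""
    return (real_cores // max_num_cpus) * factor

def max_wus_current(max_num_cpus, tasks_per_cpu=2):
    """Current server rule: 2 (soon 4) tasks per configured CPU,
    independent of the host's real core count."""
    return tasks_per_cpu * max_num_cpus

# Both hosts have 'Max # of CPUs' = 5:
print(max_wus_proposed(24, 5))  # Manni, 24 cores -> 20 WUs
print(max_wus_proposed(8, 5))   # Lando,  8 cores ->  5 WUs
print(max_wus_current(5))       # both hosts today -> 10 WUs
```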




Supporting BOINC, a great concept !
ID: 43706 · Report as offensive     Reply Quote
tullio

Send message
Joined: 19 Feb 08
Posts: 703
Credit: 4,246,429
RAC: 500
Message 43708 - Posted: 25 Nov 2020, 14:51:34 UTC

In QuChemPedlA@home I have wingmen, mostly AMD Ryzen Threadrippers with 160 processors. They run native Linux, and my humble 6-processor i5 9400F, running under VirtualBox because it is on a Windows 10 PC, can compete with them in CPU times.
Tullio
ID: 43708 · Report as offensive     Reply Quote
Henry Nebrensky

Send message
Joined: 13 Jul 05
Posts: 158
Credit: 14,665,461
RAC: 0
Message 43730 - Posted: 28 Nov 2020, 11:32:33 UTC - in response to Message 43702.  

Hi,
I would rather not remove the limits completely since many hosts will end up with tasks they will not be able to process before the deadline.
I'm not sure I understand this: a "maximum" is a limit, not a requirement that so many tasks be downloaded. We've seen the same issue with CMS.
There's already a mechanism in BOINC for requesting how much work to download: the local cache length ("Store at least" and "Store up to an additional" ... days of work). This works with SixTrack, and both CMS and ATLAS have steady, repeatably-sized tasks compatible with this approach. Respecting this user configuration setting would let the project raise "Max # jobs" without having to worry about the side effects on smaller machines.
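As a back-of-the-envelope sketch of what "respecting the cache" means (all numbers and names here are illustrative, not the actual scheduler code):

```python
def tasks_for_cache(store_days, extra_days, task_hours, ncpus):
    """How many tasks would fill the user's requested cache,
    given steady, repeatably-sized tasks: total requested hours
    of work divided by the typical runtime of one task."""
    wanted_hours = (store_days + extra_days) * 24 * ncpus
    return int(wanted_hours // task_hours)

# e.g. a 0.5-day cache on 4 cores with roughly 3-hour tasks:
print(tasks_for_cache(0.5, 0.0, 3.0, 4))  # 16
```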
ID: 43730 · Report as offensive     Reply Quote
maeax

Send message
Joined: 2 May 07
Posts: 1557
Credit: 57,663,728
RAC: 203,127
Message 43731 - Posted: 28 Nov 2020, 11:44:39 UTC - in response to Message 43730.  

("Store at least" and "Store up to an additional" ... days of work). This works with SixTrack, and both CMS and ATLAS have steady, repeatably-sized tasks compatible with this approach. Respecting this user configuration setting would let the project raise "Max # jobs" without having to worry about the side effects on smaller machines.

I have "Store at least" set to 0.5 days and zero for "Store up to an additional".
It is always a mix of WCG and ATLAS and/or Theory tasks.
Everything runs well.
ID: 43731 · Report as offensive     Reply Quote



©2022 CERN