Message boards : Number crunching : Project balance
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 798
Credit: 644,684,135
RAC: 235,456
Message 41387 - Posted: 27 Jan 2020, 18:28:23 UTC

I would like to contribute equally to the sub-projects, as I allow tasks from all of them.

At the moment it looks like my projects have settled into running as many CMS tasks as possible, with 2 ATLAS and 2 Theory; my computer has queued up plenty more CMS, with what looks like 1 in and 1 out for the other projects.

Anyone know how to tweak these balances?
ID: 41387
AuxRx

Joined: 16 Sep 17
Posts: 100
Credit: 1,618,469
RAC: 0
Message 41388 - Posted: 27 Jan 2020, 18:44:59 UTC - in response to Message 41387.  

As far as I know it's not possible to tweak the balance beyond what an app_config would provide (i.e. equally and permanently assigning cores).
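For what it's worth, a per-app cap can be sketched with an app_config.xml in the LHC@home project directory. The app names below (CMS, ATLAS, Theory) are assumptions — check client_state.xml for the exact short names your client reports:

```xml
<app_config>
    <!-- Run at most 2 tasks of each sub-project concurrently -->
    <app>
        <name>CMS</name>
        <max_concurrent>2</max_concurrent>
    </app>
    <app>
        <name>ATLAS</name>
        <max_concurrent>2</max_concurrent>
    </app>
    <app>
        <name>Theory</name>
        <max_concurrent>2</max_concurrent>
    </app>
</app_config>
```

After editing, use Options → Read config files in BOINC Manager. Note this only caps concurrency; it does not make the client fetch more work from the other apps.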
ID: 41388
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,149,324
RAC: 16,013
Message 41389 - Posted: 27 Jan 2020, 18:53:50 UTC

I have also seen the balance between sub-projects oscillate over time. Sometimes it is mainly ATLAS, sometimes mainly Theory or SixTrack. I don't know of any way to keep them in constant proportions, especially since SixTrack often runs out of tasks to process. Maybe BOINC adjusts the proportions of sub-projects in much the same way it handles different projects?
ID: 41389
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 798
Credit: 644,684,135
RAC: 235,456
Message 41392 - Posted: 27 Jan 2020, 19:25:10 UTC

Yes, SixTrack is impossible to plan for.

If I limit the number of tasks for CMS, the computer doesn't pick up the slack with the other projects; it just leaves the other cores idle.

I'll have to try some more options.
ID: 41392
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 41395 - Posted: 27 Jan 2020, 20:33:24 UTC - in response to Message 41392.  

I sometimes think I would like that too, but then realize there is no point to it. The project scientists/administrators have a better idea of what their priorities are than I do (or they should have).
What difference does it really make to me? One detector is as good as another insofar as I know. That is the way it is on WCG also; you pick the projects, and they set the priorities.
ID: 41395
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,894,316
RAC: 138,096
Message 41398 - Posted: 27 Jan 2020, 21:05:41 UTC - in response to Message 41387.  

... Anyone know how to tweak these balances?

The answer is very simple.
Neither the BOINC client nor the BOINC server provides a mechanism to select a particular task from the queue.
The server sends tasks out in the same order they were generated and stored in the server's shared memory.
Tasks from deselected apps are skipped.
If the client requests n seconds of work, the server sends out
- as many tasks as necessary to reach those n seconds, or
- fewer if a quota is set (server-side!).

In addition, LHC@home runs several servers concurrently for load balancing, and nobody knows which of them will serve the next request (the decision is made via random DNS name resolution).

The best method to balance work from LHC@home is to run multiple BOINC client instances.
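As a sketch of the multi-instance setup (the data directory and RPC port below are arbitrary examples, and the account key is a placeholder):

```shell
# Start a second client instance with its own data directory
# and a non-default GUI RPC port
mkdir -p ~/boinc2
boinc --dir ~/boinc2 --gui_rpc_port 31417 --daemon

# Attach the new instance to LHC@home
boinccmd --host localhost:31417 --project_attach \
    https://lhcathome.cern.ch/lhcathome/ YOUR_WEAK_ACCOUNT_KEY
```

Each instance then requests work independently, so one can be dedicated to each sub-project via the preference venues.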
ID: 41398
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 41400 - Posted: 27 Jan 2020, 21:20:46 UTC - in response to Message 41392.  
Last modified: 27 Jan 2020, 21:27:52 UTC

That is the way it is on WCG also; you pick the projects, and they set the priorities.
WCG has had very good options since last year: you can set, per science, a limit of 1 to 64 tasks, or unlimited.
Project Limits
The following settings allow you to set the maximum number of tasks assigned to one of your devices for a project.

Please note that use of these settings could cause your device to not always have work to run if one or more of the projects does not have work available at the time your device requests work.


Africa Rainfall Project		10

FightAIDS@Home - Phase 2	1	

Help Stop TB			unlimited

Mapping Cancer Markers		28	

Microbiome Immunity Project	unlimited	

Smash Childhood Cancer		unlimited

I have to try some more options
Options I see here are:
- Since you have several hosts, dedicate each computer to a single application, using the 4 venues (default, home, school and work).
- Run several instances of the BOINC client on one computer and use the 4 venues, one per client.

Edit: computezrmle was way faster; I was slower, reading and correcting over and over again ;)
ID: 41400
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 41402 - Posted: 27 Jan 2020, 21:33:02 UTC - in response to Message 41400.  

Options I see here are:
- Since you have several hosts, dedicate each computer to a single application, using the 4 venues (default, home, school and work).
- Run several instances of the BOINC client on one computer and use the 4 venues, one per client.

I have done that myself, with two instances. However, given the unreliability of the work, if one of them is empty, you are out of luck.
Since they are separate BOINC instances, they cannot share work between them.

I now use a single BOINC instance and choose at least two apps (native ATLAS and CMS), and hope that one of them has work.
ID: 41402
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1268
Credit: 8,421,616
RAC: 2,139
Message 41403 - Posted: 27 Jan 2020, 21:40:06 UTC - in response to Message 41402.  

Since they are on different BOINC instances, they can not share between them.
I did realize that, so you could set "If no work for selected applications is available, accept work from other applications?" to Yes,
or choose a different (non-LHC) project as a backup project with its resource share set to 0 (zero).
ID: 41403
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 798
Credit: 644,684,135
RAC: 235,456
Message 41405 - Posted: 27 Jan 2020, 21:44:26 UTC

I agree that multiple instances is the most reliable way.

After thinking about it more, Jim's thoughts are good: the project knows what's most important, so we should leave it up to them.

If there is no CMS work, then I do get more Theory or ATLAS.
ID: 41405
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,894,316
RAC: 138,096
Message 41406 - Posted: 27 Jan 2020, 21:45:57 UTC - in response to Message 41402.  

Since you are running Linux, you may slightly/moderately overload your computers and control the resource shares via cgroups, so that interactive processes don't get sluggish.
This gives better overall utilization when one of the app queues is empty, at the expense of longer runtimes per task when all work buffers are filled.
Mine are running at a factor of 2-2.3.
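One possible sketch of this on a systemd-based distribution with cgroup v2 (the service name boinc-client.service is the Debian/Ubuntu default; the weight value is illustrative):

```shell
# Lower the BOINC client's CPU weight (default 100) so that, under
# contention, interactive processes get the CPU first
sudo systemctl set-property boinc-client.service CPUWeight=20

# Verify the setting
systemctl show boinc-client.service -p CPUWeight
```

Overcommitting then means letting the client run more task threads than physical cores; the low cgroup weight keeps the desktop responsive whenever it needs CPU.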
ID: 41406
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,149,324
RAC: 16,013
Message 41413 - Posted: 28 Jan 2020, 10:53:26 UTC - in response to Message 41398.  

... Anyone know how to tweak these balances?

The answer is very simple.
Neither the BOINC client nor the BOINC server provides a mechanism to select a particular task from the queue.
The server sends tasks out in the same order they were generated and stored in the server's shared memory.
Tasks from deselected apps are skipped.
If the client requests n seconds of work, the server sends out
- as many tasks as necessary to reach those n seconds, or
- fewer if a quota is set (server-side!).

In addition, LHC@home runs several servers concurrently for load balancing, and nobody knows which of them will serve the next request (the decision is made via random DNS name resolution).

The best method to balance work from LHC@home is to run multiple BOINC client instances.

If tasks are in the 'ready to send' queue in the order they were created, and a computer requests work but does not accept, for example, ATLAS tasks, then the next computer that requests work (if it accepts all types) is probably more likely to receive ATLAS work than other types. That could explain the fluctuation between the sub-projects.
ID: 41413
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 798
Credit: 644,684,135
RAC: 235,456
Message 41433 - Posted: 30 Jan 2020, 16:46:25 UTC
Last modified: 31 Jan 2020, 6:57:38 UTC

Interesting: today, when I set CMS to NNT, I never got any more than 2 tasks from ATLAS and Theory. Like others reported, it said there was no work when it looked like there was some.
ID: 41433
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 41434 - Posted: 30 Jan 2020, 22:43:00 UTC - in response to Message 41433.  

Interesting today, when I set CMS to NNT then I never got any more than 2 from ATLAS and Theory.

Pretty much the same for me. I was running both CMS and native ATLAS, but set CMS to NNT because of the errors.
After that, I could not get any more ATLAS; I just received the message that none were available. So they are coupled somehow.
ID: 41434
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 798
Credit: 644,684,135
RAC: 235,456
Message 41439 - Posted: 31 Jan 2020, 22:26:28 UTC

I found out what was blocking it: I had set 'Max # CPUs' to 1 so that ATLAS would use a single core. I set it to unlimited, and now the client gets more Theory and ATLAS.

I'll see if this now blocks CMS tasks.

I used app_config to limit the number of CPUs to 1 locally. I assume the working set will now be sized for an 8-core task, so it blocks many tasks if you have limited RAM.
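For reference, the local single-core override can be sketched like this in app_config.xml (the plan class native_mt and the --nthreads flag are assumptions taken from common LHC@home examples — verify the names against your client_state.xml):

```xml
<app_config>
    <!-- Force native ATLAS to run single-threaded locally -->
    <app_version>
        <app_name>ATLAS</app_name>
        <plan_class>native_mt</plan_class>
        <avg_ncpus>1</avg_ncpus>
        <cmdline>--nthreads 1</cmdline>
    </app_version>
</app_config>
```

The server still estimates memory for the full multi-core working set, so tasks can be refused on RAM-limited hosts.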
ID: 41439
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 41442 - Posted: 1 Feb 2020, 19:01:51 UTC - in response to Message 41439.  

Well I have another strange one, with no logical explanation at my end.

I just set up a new machine (no app_configs). My default location had CMS and native ATLAS selected (CPU=1), so it downloaded one ATLAS before I changed the location.

Then I changed the location to run only SixTrack and native Theory (no CMS or ATLAS). After downloading two Theory, it would not download any more until I allowed ATLAS also.
Now I have four native Theory and four native ATLAS (CPU=2) to fill up the 12 cores. So I will do ATLAS too, if that is what it wants (I had to install Singularity to get native ATLAS to run).

Maybe if it did what people asked, there would be more users.
ID: 41442



©2024 CERN