Message boards : Number crunching : How to run 4 ATLAS + 4 CMS tasks?

Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,374,011
RAC: 102,136
Message 30594 - Posted: 2 Jun 2017, 13:26:17 UTC
Last modified: 2 Jun 2017, 13:26:45 UTC

I am thinking about running 4 ATLAS (1-core) and 4 CMS tasks at the same time.

My guess is that, in theory, this should be possible with the following app_config.xml:

<app_config>
<app>
<name>ATLAS</name>
<max_concurrent>4</max_concurrent>
<fraction_done_exact/>
</app>
<app>
<name>CMS</name>
<max_concurrent>4</max_concurrent>
</app>
</app_config>

(If anyone spots an error in there, please let me know.)

However, I doubt that the two sub-projects will be downloaded in roughly equal numbers, even if I enter a number higher than 4 for each under "Max # of jobs for this Project" in the settings on the homepage. I might end up with only ATLAS tasks or only CMS tasks being downloaded, with the consequence that at some point only 4 ATLAS or 4 CMS tasks are being crunched.

Has anyone tried such a setting yet? If so, did it work the way it was intended to?
ID: 30594
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 798
Credit: 644,739,799
RAC: 233,718
Message 30597 - Posted: 2 Jun 2017, 18:09:17 UTC - in response to Message 30594.  

You would need to set "Max # of jobs" to 8 (2 x 4), or to unlimited.

I haven't tried this, but it should do what you expect.
ID: 30597
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,933,971
RAC: 137,701
Message 30600 - Posted: 2 Jun 2017, 19:06:31 UTC - in response to Message 30594.  

1. You should set "Max # CPUs" to 1 on the project's preferences page to get the correct <rsc_memory_bound> setting from the server.

2. Your app_config.xml should look like this (remove the comments):

<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>4</max_concurrent>
    <fraction_done_exact/>              <!-- not necessary, but it does not hurt -->
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>1.0</avg_ncpus>          <!-- can be removed if the project sends the correct value -->
    <!-- the nthreads option can be removed if the project sends the correct value -->
    <!-- memory_size_mb: 4600 MB or more; the more the better; check your stderr.txt -->
    <!-- both options are given in one cmdline element -->
    <cmdline>--nthreads 1 --memory_size_mb 4600</cmdline>
  </app_version>
  <app>
    <name>CMS</name>
    <max_concurrent>4</max_concurrent>
  </app>
  <app_version>
    <app_name>CMS</app_name>
    <plan_class>vbox64</plan_class>
    <avg_ncpus>1.0</avg_ncpus>      <!-- the project will send the correct value, so the whole app_version section for CMS can be removed -->
  </app_version>
  <project_max_concurrent>8</project_max_concurrent>    <!-- not necessary, but it does not hurt -->
</app_config>



3. As you wrote, there is no guarantee that the server sends enough WUs from each subproject.

(4. Think about using a proxy, e.g. Squid.) ;-)
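
Following the remarks in the comments above, a stripped-down variant might look like the sketch below. It assumes the project already sends the correct values for CMS, so the CMS app_version section and the optional entries are dropped. Putting both ATLAS options into a single cmdline element is an assumption about how the client reads the file, not something stated in this thread, and the 4600 MB figure is simply the value from the post; check it against your stderr.txt.

```xml
<app_config>
  <!-- run at most 4 ATLAS tasks at once -->
  <app>
    <name>ATLAS</name>
    <max_concurrent>4</max_concurrent>
  </app>
  <!-- force single-core ATLAS tasks with a fixed VM memory size -->
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>1.0</avg_ncpus>
    <cmdline>--nthreads 1 --memory_size_mb 4600</cmdline>
  </app_version>
  <!-- run at most 4 CMS tasks at once -->
  <app>
    <name>CMS</name>
    <max_concurrent>4</max_concurrent>
  </app>
</app_config>
```

After saving the file in the project's directory, the client has to re-read its config files (or be restarted) before the limits take effect.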
ID: 30600
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,374,011
RAC: 102,136
Message 30602 - Posted: 2 Jun 2017, 19:55:05 UTC - in response to Message 30600.  

computezrmle wrote:
"3. As you wrote, there is no guarantee that the server sends enough WUs from each subproject."

The main question, I guess, will be the following:

If, on the project preferences page, I set "Max # jobs" to, say, 10 (or any higher number), will the download reliably contain at least 4 ATLAS tasks and 4 CMS tasks, or could it happen that only ATLAS or only CMS tasks come in (even if jobs for both sub-projects are available on the server)?

Or, later on, after finished tasks have been uploaded, will they always be replaced with a task from the same sub-project, so that a workload of 4 ATLAS and 4 CMS tasks is guaranteed (at least as long as tasks for both are available on the download server)?
This is what I have my doubts about. It would therefore be nice to hear from someone who has already tried this setting and can tell whether it works or not.
ID: 30602
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 30603 - Posted: 2 Jun 2017, 20:06:18 UTC
Last modified: 2 Jun 2017, 20:06:49 UTC

Once you start using app_config files, BOINC has trouble keeping all cores busy the way you want.

The scheduler doesn't know about your local config, so, yes, you will see times with idle cores.

And yes, I have tried this in the past.

Perhaps you should consider running a second instance of BOINC on the same machine; that would make things easier.

If you read German, I have a good explanation of how easy it is to install a second instance: https://www.rechenkraft.net/forum/viewtopic.php?f=92&t=16614&sid=0ed7bfeb3f7ab850e15086f1872e261e

And no, I can't translate this into English, sorry.
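
For readers who cannot follow the German guide, the core of the idea can be sketched roughly as follows. This is not taken from the linked article. It assumes the standard BOINC client option allow_multiple_clients in cc_config.xml, and that the second client is started with its own data directory and its own GUI RPC port (the client's --dir and --gui_rpc_port command-line options). Directory names and port numbers are your own choice.

```xml
<cc_config>
  <options>
    <!-- let more than one BOINC client run on this host;
         each client needs its own data directory -->
    <allow_multiple_clients>1</allow_multiple_clients>
  </options>
</cc_config>
```

Each instance then keeps its own projects folder, so each one can have its own app_config.xml.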


Supporting BOINC, a great concept !
ID: 30603
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 798
Credit: 644,739,799
RAC: 233,718
Message 30605 - Posted: 2 Jun 2017, 20:34:18 UTC

With max # jobs = unlimited, I get a work buffer.

Otherwise, with max # jobs set to a number n, I get a work buffer of n tasks.
ID: 30605
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,933,971
RAC: 137,701
Message 30606 - Posted: 2 Jun 2017, 20:46:26 UTC - in response to Message 30603.  

How does your 2nd (3rd...) instance appear on the project website?
As an additional host?
ID: 30606
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 30610 - Posted: 2 Jun 2017, 21:16:31 UTC - in response to Message 30606.  

computezrmle wrote:
"How does your 2nd (3rd...) instance appear on the project website?
As an additional host?"

Yes, but it has the same name as the first instance.


ID: 30610
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,374,011
RAC: 102,136
Message 30628 - Posted: 4 Jun 2017, 15:49:28 UTC - in response to Message 30605.  

Yeti wrote:
"Once you start using app_config files, BOINC has trouble keeping all cores busy the way you want.

"The scheduler doesn't know about your local config, so, yes, you will see times with idle cores.

"And yes, I have tried this in the past.

"Perhaps you should consider running a second instance of BOINC on the same machine; that would make things easier.

"If you read German, I have a good explanation of how easy it is to install a second instance: https://www.rechenkraft.net/forum/viewtopic.php?f=92&t=16614&sid=0ed7bfeb3f7ab850e15086f1872e261e"


@Yeti, your guess was right: I do read German and could therefore easily follow the rechenkraft article you linked. However, for a different reason, I tried to install a second instance of BOINC about a year ago and did not get it to work (I remember some comments in the BOINC forum saying that it may or may not work out).

So, yesterday I started a trial with the app_config I posted above, and in principle all went well. Interestingly enough, after I had set "Max # jobs" to 14, the initial download was exactly 7 ATLAS tasks and 7 CMS tasks, of which 4 each were then running simultaneously (as per my app_config.xml).
However, as predicted by Yeti (and as I had expected myself), later downloads were no longer equal.

However, I think a good workaround is what Toby is suggesting:

Toby Broom wrote:
"With max # jobs = unlimited, I get a work buffer.

"Otherwise, with max # jobs set to a number n, I get a work buffer of n tasks."

What I will do is first uncheck "ATLAS" on the project preferences page and download a certain number of CMS tasks, and then uncheck "CMS" and download a certain number of ATLAS tasks (more ATLAS than CMS, since the ATLAS tasks have a shorter runtime).

If I build a work buffer for about 1 day this way, I need to do this only once a day, which is not much extra effort.
So I'll see how it works out.
ID: 30628
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,933,971
RAC: 137,701
Message 30630 - Posted: 4 Jun 2017, 20:24:07 UTC - in response to Message 30610.  

computezrmle wrote:
"How does your 2nd (3rd...) instance appear on the project website?
As an additional host?"

Yeti wrote:
"Yes, but it has the same name as the first instance."

It works with 2 instances.
If I try to set up a 3rd instance, the hosts get messed up at the server.

Even if more instances worked, there are not enough venues to cover all subprojects.
ID: 30630
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 30659 - Posted: 6 Jun 2017, 12:23:23 UTC - in response to Message 30630.  

computezrmle wrote:
"How does your 2nd (3rd...) instance appear on the project website?
As an additional host?"

Yeti wrote:
"Yes, but it has the same name as the first instance."

computezrmle wrote:
"It works with 2 instances.
If I try to set up a 3rd instance, the hosts get messed up at the server."

Hm, for races I run up to 20 instances on one machine, and it works fine.


ID: 30659
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,933,971
RAC: 137,701
Message 30664 - Posted: 6 Jun 2017, 14:33:10 UTC - in response to Message 30659.  

Got it now.
Another proof for the thesis:
"Most errors are caused by the biological device sitting in front of the computer."
:-)

Each venue now covers one of the subprojects (ATLAS, CMS, LHCb, Theory).
To run additional subprojects, e.g. SixTrack, more venues would be necessary.
ID: 30664
PHILIPPE
Joined: 24 Jul 16
Posts: 88
Credit: 239,917
RAC: 0
Message 30810 - Posted: 16 Jun 2017, 20:38:34 UTC - in response to Message 30664.  

Here is another site (overclock.net) for people who want information on how to set up multiple BOINC instances on the same local or remote host, in order to keep better control and have more options in their preferences.

It's a little tricky, but why not... (interesting mainly for medium and big hosts)

It's an English guide for both Windows and Linux, and the BoincTasks creator, Efmer, makes a brief appearance at the end of the thread, giving advice and information about his site and his cloud site.
ID: 30810
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,374,011
RAC: 102,136
Message 31874 - Posted: 7 Aug 2017, 14:09:09 UTC

After 4 CMS and 4 ATLAS tasks had run concurrently without problems for several months, I have now had to change the ratio to 2 ATLAS and 6 CMS, due to the recent marked increase in ATLAS's RAM requirement.

In order to save RAM with the ATLAS tasks, my idea was to run 2 two-core ATLAS tasks + 4 CMS tasks.
However, I have no idea what to set on the web page under "Max # of CPUs for this Project". For CMS, I would still use 1 core (multicore would not make sense for CMS, as I understand it), whereas for ATLAS the value would be set to 2 (or even to 3, should the best ratio turn out to be 1 or 2 three-core ATLAS tasks alongside a number of CMS tasks).

Does anyone have an idea how my plan could be realised? Most probably by an app_config, if at all?
ID: 31874
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,933,971
RAC: 137,701
Message 31875 - Posted: 7 Aug 2017, 14:21:00 UTC - in response to Message 31874.  

Erich56 wrote:
"After 4 CMS and 4 ATLAS tasks had run concurrently without problems for several months, I have now had to change the ratio to 2 ATLAS and 6 CMS, due to the recent marked increase in ATLAS's RAM requirement.

"In order to save RAM with the ATLAS tasks, my idea was to run 2 two-core ATLAS tasks + 4 CMS tasks.
However, I have no idea what to set on the web page under "Max # of CPUs for this Project". For CMS, I would still use 1 core (multicore would not make sense for CMS, as I understand it), whereas for ATLAS the value would be set to 2 (or even to 3, should the best ratio turn out to be 1 or 2 three-core ATLAS tasks alongside a number of CMS tasks).

"Does anyone have an idea how my plan could be realised? Most probably by an app_config, if at all?"

You may configure additional BOINC instances on your local computer and connect them to different venues on the project server.
One of the venues can be configured to run 1-core tasks like CMS, another one to run 2-core tasks for ATLAS.
It's also possible to use individual app_config.xml files, as the instances run in different data directories.

One pitfall is the new instance's first contact with the project server: a major system parameter, e.g. the number of cores, MUST differ from the existing host entry.
Otherwise the server tries to merge the new instance into the existing one.
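
As a rough illustration of the per-instance files mentioned above, the two sketches below split the "2 two-core ATLAS + 4 CMS" plan across two instances. They are assumptions, not tested configurations: the task counts come from the question, the plan class name from the app_config posted earlier in this thread, and the memory setting is deliberately left to the server (via the venue's "Max # CPUs" value) rather than guessed here.

app_config.xml in the data directory of the ATLAS instance (venue set to 2 CPUs):

```xml
<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>2</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>2.0</avg_ncpus>
    <cmdline>--nthreads 2</cmdline>
  </app_version>
</app_config>
```

app_config.xml in the data directory of the CMS instance (venue set to 1 CPU):

```xml
<app_config>
  <app>
    <name>CMS</name>
    <max_concurrent>4</max_concurrent>
  </app>
</app_config>
```

Because each instance only ever receives one subproject through its venue, neither file has to balance ATLAS against CMS.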
ID: 31875
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,374,011
RAC: 102,136
Message 31876 - Posted: 7 Aug 2017, 15:47:36 UTC - in response to Message 31875.  

computezrmle wrote:
"You may configure additional BOINC instances on your local computer and connect them to different venues on the project server."

Thanks for your quick reply.
However, one thing that I forgot to mention in my posting above is that an additional BOINC instance is not an option.

For a different reason, I tried this last year, without success. Several experts in the BOINC forum tried to help me, but I simply did not get it to work.
So I do not intend to retry it; it's most probably a waste of time (and nerves :-).

I am aware, though, that this might be the only way to solve my problem, so I may not get it solved anyway :-(
(although I was hoping that some of the experts here might be able to come up with a specific app_config.xml)
ID: 31876
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 31877 - Posted: 7 Aug 2017, 15:52:31 UTC - in response to Message 31876.  

Erich56 wrote:
"However, one thing that I forgot to mention in my posting above is that an additional BOINC instance is not an option.

"For a different reason, I tried this last year, without success. Several experts in the BOINC forum tried to help me, but I simply did not get it to work.
So I do not intend to retry it; it's most probably a waste of time (and nerves :-)."

I run 180 or more BOINC instances across my network, and I offer to install 2 instances on your PC via remote-control software (TeamViewer).

It is really not complicated.

Send me a PM if you are interested.


ID: 31877
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,933,971
RAC: 137,701
Message 31878 - Posted: 7 Aug 2017, 16:47:11 UTC - in response to Message 31876.  

Erich56 wrote:
... an additional BOINC instance is not an option. ...

Once installed ... you would love it.
It works much better and is much easier on the nerves than any solution based on a single app_config.xml.

I would accept Yeti's offer.
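
For comparison, a single-file attempt at the same plan might look roughly like the sketch below. It is only an assumption-laden illustration, not a recommendation: it reuses the element names and plan class from the app_config posted earlier in this thread, takes the task counts from the question, and leaves the ATLAS memory size to whatever the server sends, which is exactly the part that a single "Max # CPUs" web setting makes awkward.

```xml
<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>2</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>2.0</avg_ncpus>
    <cmdline>--nthreads 2</cmdline>
  </app_version>
  <app>
    <name>CMS</name>
    <max_concurrent>4</max_concurrent>
  </app>
  <!-- 2 ATLAS + 4 CMS tasks = 6 tasks at once -->
  <project_max_concurrent>6</project_max_concurrent>
</app_config>
```

Even with such a file, the server still decides which subproject's tasks it sends, so idle cores remain possible.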
ID: 31878
Erich56
Joined: 18 Dec 15
Posts: 1686
Credit: 100,374,011
RAC: 102,136
Message 31881 - Posted: 7 Aug 2017, 19:12:50 UTC

Okay, folks, let me think about it.
Yeti, many thanks anyway for your offer! In case I'd want to make use of it, I'll be glad to contact you.
ID: 31881
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1114
Credit: 49,501,728
RAC: 4,157
Message 31887 - Posted: 8 Aug 2017, 0:20:43 UTC

Yes, they have a free version for you to try.

https://www.teamviewer.com/en/
Volunteer Mad Scientist For Life
ID: 31887



©2024 CERN