How to run 4 ATLAS + 4 CMS tasks?
Author | Message |
---|---|
Joined: 18 Dec 15 Posts: 1686 Credit: 100,374,011 RAC: 102,136 |
I am thinking about running 4 ATLAS (1-core) and 4 CMS tasks at the same time. My guess is that, in theory, this should be possible with the following app_config.xml: <app_config> <app> <name>ATLAS</name> <max_concurrent>4</max_concurrent> <fraction_done_exact/> </app> <app> <name>CMS</name> <max_concurrent>4</max_concurrent> </app> </app_config> (If anyone spots an error in there, please let me know.) My doubt, however, is this: even if I enter a number higher than 4 under "Max # of jobs for this Project" in the settings on the homepage, the two sub-projects may not be downloaded in roughly equal numbers. I might end up with only ATLAS tasks or only CMS tasks, so that at some point only 4 ATLAS or only 4 CMS tasks are being crunched. Has anyone tried such a setting yet? If so, did it work as intended? |
Joined: 27 Sep 08 Posts: 798 Credit: 644,739,799 RAC: 233,718 |
You would need to set the Max # to 8 for 2x4 (or to unlimited). I haven't tried this, but it should do what you expect. |
Joined: 15 Jun 08 Posts: 2386 Credit: 222,933,971 RAC: 137,701 |
1. You should set "Max # CPUs = 1" on the project's preference page to get the correct <rsc_memory_bound> setting from the server. 2. Your app_config.xml should look like this (remove the comments, i.e. everything from a # to the end of that entry): <app_config> <app> <name>ATLAS</name> <max_concurrent>4</max_concurrent> <fraction_done_exact/> # not necessary but it does not hurt </app> <app_version> <app_name>ATLAS</app_name> <plan_class>vbox64_mt_mcore_atlas</plan_class> <avg_ncpus>1.0</avg_ncpus> # can be removed if the project sends the correct value <cmdline>--nthreads 1</cmdline> # can be removed if the project sends the correct value <cmdline>--memory_size_mb 4600</cmdline> # >4600 MB; the more the better; check your stderr.txt </app_version> <app> <name>CMS</name> <max_concurrent>4</max_concurrent> </app> <app_version> <app_name>CMS</app_name> <plan_class>vbox64</plan_class> <avg_ncpus>1.0</avg_ncpus> # the project will send the correct value, so the whole <app_version> section for CMS can be removed </app_version> <project_max_concurrent>8</project_max_concurrent> # not necessary but it does not hurt </app_config> 3. As you wrote below, there is no guarantee that the server sends enough WUs from each subproject. (4. Think about using a proxy, e.g. Squid.) ;-) |
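With the `#` annotations stripped out as the post asks, the file might look roughly like the sketch below. It keeps the names and values from the post (`vbox64_mt_mcore_atlas`, `--nthreads 1`, `--memory_size_mb 4600`) and drops the CMS `<app_version>` section, which the post says can be removed. Two things are assumptions and not from the post: the two `<cmdline>` entries are merged into a single element, on the assumption that the client keeps only one `<cmdline>` per `<app_version>`, and the notes are written as ordinary XML comments, which can be deleted if your client complains about them.

```xml
<app_config>
  <!-- At most 4 ATLAS tasks run at the same time -->
  <app>
    <name>ATLAS</name>
    <max_concurrent>4</max_concurrent>
    <fraction_done_exact/>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>1.0</avg_ncpus>
    <!-- Assumption: one combined cmdline instead of the two separate
         cmdline elements shown in the post -->
    <cmdline>--nthreads 1 --memory_size_mb 4600</cmdline>
  </app_version>
  <!-- At most 4 CMS tasks run at the same time -->
  <app>
    <name>CMS</name>
    <max_concurrent>4</max_concurrent>
  </app>
  <!-- Upper limit across both sub-projects: 4 + 4 -->
  <project_max_concurrent>8</project_max_concurrent>
</app_config>
```

The file belongs in the project's folder inside the BOINC data directory, and the client has to re-read its config files (or be restarted) before the limits take effect. As the posts note, these limits only cap what runs locally; they do not influence which sub-project the server sends work for.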
Joined: 18 Dec 15 Posts: 1686 Credit: 100,374,011 RAC: 102,136 |
Quote: "3. As you wrote below there is no guarantee that the server sends enough WUs from each subproject." The main question, I guess, is the following: if, on the project preference page, I set "max # Jobs" to, say, 10 (or any higher number), will the download reliably contain at least 4 ATLAS tasks and 4 CMS tasks, or could it happen that only ATLAS or only CMS tasks come in (even if jobs for both sub-projects are available on the server)? And later on, after finished tasks have been uploaded, will they always be replaced with a task from the appropriate sub-project, so that a workload of 4 ATLAS and 4 CMS tasks is guaranteed (at least as long as tasks for both are available on the download server)? This is what I have my doubts about. It would therefore be nice to hear from someone who has already tried this setting and can tell whether it works or not. |
Joined: 2 Sep 04 Posts: 453 Credit: 193,369,412 RAC: 10,065 |
Once you start using app_configs, BOINC has trouble keeping all cores busy the way you want. The scheduler doesn't know about your local config, so yes, you will see times with idle cores. And yes, I have tried this in the past. Perhaps you should consider running a second instance of BOINC on the same machine; that would make things easier. If you read German, I have a good explanation of how easy it is to install a second instance: https://www.rechenkraft.net/forum/viewtopic.php?f=92&t=16614&sid=0ed7bfeb3f7ab850e15086f1872e261e And no, I can't translate it into English, sorry. Supporting BOINC, a great concept! |
Joined: 27 Sep 08 Posts: 798 Credit: 644,739,799 RAC: 233,718 |
With max # Jobs = unlimited, I get a work buffer. Otherwise, with max # Jobs = a number (n), I get a work buffer of n tasks. |
Joined: 15 Jun 08 Posts: 2386 Credit: 222,933,971 RAC: 137,701 |
How does your 2nd (3rd...) instance appear at the project website? As an additional host? |
Joined: 2 Sep 04 Posts: 453 Credit: 193,369,412 RAC: 10,065 |
|
Joined: 18 Dec 15 Posts: 1686 Credit: 100,374,011 RAC: 102,136 |
Quote: "As you start using app_configs, BOINC has problems to keep all Cores busy as you want." @Yeti, your guess was right: I do speak German and could therefore easily read the rechenkraft article you linked. However, for a different reason, I had tried to install a second instance of BOINC about a year ago and did not get it to work (I remember some comments in the BOINC forum saying that it may or may not work out). So yesterday I started a trial with the app_config I posted above, and in principle all went well. Interestingly enough, after I had set "Max # Jobs" to 14, the initial download was exactly 7 ATLAS tasks and 7 CMS tasks, of which 4 each were then running simultaneously (as per my app_config.xml). However, as Yeti predicted (and as I had expected myself), later downloads were no longer balanced. I think a good workaround builds on what Toby suggests: "With max # Jobs = unlimited, I get a work buffer." What I will do is first uncheck "ATLAS" on the project preference page and download a certain number of CMS tasks, and then uncheck "CMS" and download a certain number of ATLAS tasks (more ATLAS than CMS, since the ATLAS tasks have a shorter runtime). If this builds a work buffer for about 1 day, I need to do it only once a day, which is not much extra effort. So I'll see how it works out. |
Joined: 15 Jun 08 Posts: 2386 Credit: 222,933,971 RAC: 137,701 |
Quote: "How does your 2nd (3rd...) instance appear at the project website?" It works with 2 instances. If I try to set up a 3rd instance, the hosts get messed up on the server. Even if more instances worked, there are not enough venues to cover all subprojects. |
Joined: 2 Sep 04 Posts: 453 Credit: 193,369,412 RAC: 10,065 |
Quote: "How does your 2nd (3rd...) instance appear at the project website?" Hm, for races I run up to 20 instances on one machine, and it works fine. Supporting BOINC, a great concept! |
Joined: 15 Jun 08 Posts: 2386 Credit: 222,933,971 RAC: 137,701 |
Got it now. Another proof for the thesis: "Most errors are caused by the biological device sitting in front of the computer." :-) Each venue now covers one of the subprojects (ATLAS, CMS, LHCb, Theory). To run additional subprojects, e.g. SixTrack, more venues would be necessary. |
Joined: 24 Jul 16 Posts: 88 Credit: 239,917 RAC: 0 |
Here is another site (overclock.net) for people who want information on how to set up multiple BOINC instances on the same local or remote host, in order to have better control and more options in their preferences. It's a little tricky, but why not ... (interesting mainly for medium and big hosts). It's an English guide for both Windows and Linux, and the BoincTasks creator, Efmer, makes a brief appearance at the end of the thread, where he gives advice and mentions his site and his cloud site. |
Joined: 18 Dec 15 Posts: 1686 Credit: 100,374,011 RAC: 102,136 |
After 4 CMS and 4 ATLAS tasks ran concurrently very well for several months, I have now had to change the ratio to 2 ATLAS and 6 CMS because of the recent marked increase in ATLAS' RAM requirement. To save RAM on the ATLAS tasks, my idea was to run 2 2-core ATLAS tasks + 4 CMS tasks. However, I have no idea what to set on the web page under "Max # of CPUs for this Project": for CMS I would still use 1 core (multicore would not make sense for CMS, as I understand it), whereas for ATLAS the value would be 2 (or even 3, should the best ratio turn out to be 1 or 2 3-core ATLAS tasks alongside a number of CMS tasks). Does anyone have an idea how my plan could be realised? Most probably with an app_config, if at all? |
Joined: 15 Jun 08 Posts: 2386 Credit: 222,933,971 RAC: 137,701 |
Quote: "after 4 CMS and 4 ATLAS tasks have run concurrently very well during several months, I now had to change the relation to 2 ATLAS and 6 CMS, due to the recent marked increase in ATLAS' RAM requirement." You may configure additional BOINC instances on your local computer and connect them to different venues on the project server. One of the venues can be configured to run 1-core tasks like CMS, another one to run 2-core tasks for ATLAS. It's also possible to use individual app_config.xml files, as the instances run in different data directories. A pitfall is the new instance's first contact with the project server: a major system parameter, e.g. the number of cores, MUST differ from the existing host entry. Otherwise the server tries to merge the new instance into the existing one. |
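As a rough sketch of the "individual app_config.xml" idea, the file in the data directory of the instance attached to the 2-core ATLAS venue could look something like this. The plan class name is taken from the earlier post in this thread, and the values of 2 come from the plan to run two 2-core ATLAS tasks. Whether the `<app_version>` section is needed at all is an assumption: if the venue is set to "Max # CPUs = 2", the project may already send the correct values. No `--memory_size_mb` value is given here, because the thread only mentions a figure for 1-core tasks.

```xml
<app_config>
  <!-- ATLAS-only instance: at most 2 tasks at the same time -->
  <app>
    <name>ATLAS</name>
    <max_concurrent>2</max_concurrent>
  </app>
  <!-- Assumption: may be unnecessary if the venue already sends
       2-core settings for this plan class -->
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>2.0</avg_ncpus>
    <cmdline>--nthreads 2</cmdline>
  </app_version>
</app_config>
```

The instance attached to the 1-core venue would then carry its own file with only the CMS `<app>` entry and `<max_concurrent>4</max_concurrent>`, so neither instance needs to know about the other's limits.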
Joined: 18 Dec 15 Posts: 1686 Credit: 100,374,011 RAC: 102,136 |
Quote: "You may configure additional BOINC instances on your local computer and connect them to different venues on the project server." Thanks for your quick reply. However, one thing I forgot to mention in my posting above is that an additional BOINC instance is not an option. For a different reason, I tried this last year, without success. Several experts in the BOINC forum tried to help me, but I simply did not get it to work. So I do not intend to retry it; it's most probably a waste of time (and nerves :-). I am aware, though, that this might be the only way to solve my problem, so I may not get it solved at all :-( (although I was hoping that some of the experts here might be able to come up with a specific app_config.xml). |
Joined: 2 Sep 04 Posts: 453 Credit: 193,369,412 RAC: 10,065 |
Quote: "However, one thing that I forgot to mention in my posting above is that an additional BOINC instance is not an option." I run 180 or more BOINC instances across my network, and I offer to install 2 instances on your PC via remote-control software (TeamViewer). It is really not complicated. Send me a PM if you are interested. Supporting BOINC, a great concept! |
Joined: 15 Jun 08 Posts: 2386 Credit: 222,933,971 RAC: 137,701 |
Erich56 wrote: "... an additional BOINC instance is not an option. ..." Once installed ... you would love it. It works much better and is much easier on the nerves than any solution based on a single app_config.xml. I would accept Yeti's offer. |
Joined: 18 Dec 15 Posts: 1686 Credit: 100,374,011 RAC: 102,136 |
Okay, folks, let me think about it. Yeti, many thanks anyway for your offer! In case I'd want to make use of it, I'll be glad to contact you. |
Joined: 24 Oct 04 Posts: 1114 Credit: 49,501,728 RAC: 4,157 |
Yes, they have a free version for you to try: https://www.teamviewer.com/en/ Volunteer Mad Scientist For Life |
©2024 CERN