Message boards : ATLAS application : Tasks (2-core) failing after 12-13 minutes
Erich56

Joined: 18 Dec 15
Posts: 1688
Credit: 103,124,320
RAC: 123,506
Message 29446 - Posted: 20 Mar 2017, 17:15:33 UTC
Last modified: 20 Mar 2017, 17:18:15 UTC

This afternoon, several 2-core ATLAS tasks failed after about 12-13 minutes.

As an example, please see here:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=127234375

Can anyone tell me what's going wrong?

FYI, over the past year I have been crunching numerous 3-core ATLAS tasks on this machine without any problems, including a few new ones under the LHC@home roof.
So why would 3-core tasks work well when 2-core tasks don't?

For what it's worth: several 1-core CMS tasks also failed this afternoon after about the same time.
gyllic

Joined: 9 Dec 14
Posts: 202
Credit: 2,533,875
RAC: 0
Message 29447 - Posted: 20 Mar 2017, 17:44:41 UTC - in response to Message 29446.  
Last modified: 20 Mar 2017, 17:48:19 UTC

Your log shows:
2017-03-20 17:53:45 (6816): Setting Memory Size for VM. (3600MB)
2017-03-20 17:53:45 (6816): Setting CPU Count for VM. (2)

According to this post
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4164&postid=29445
you have to increase the memory for the VM to at least 4400 MB for a multicore task (for example via app_config.xml).
Erich56

Joined: 18 Dec 15
Posts: 1688
Credit: 103,124,320
RAC: 123,506
Message 29448 - Posted: 20 Mar 2017, 17:49:17 UTC - in response to Message 29447.  
Last modified: 20 Mar 2017, 17:56:58 UTC

... you have to increase the memory for the VM to at least 4400MB for a multicore (by app_config for example)

Thanks for the quick information.
What surprises me, though, is that on this machine with 32 GB RAM I never had to increase the VM memory manually before.
Wouldn't the VM take as much memory as a given task needs (as long as enough RAM is left)?

Even 3-core ATLAS tasks (from the former ATLAS@home server as well as, recently, from the LHC@home server) did NOT require any manual adjustment of the VM memory size.
So why would this suddenly be necessary for 2-core tasks?
Erich56

Joined: 18 Dec 15
Posts: 1688
Credit: 103,124,320
RAC: 123,506
Message 29451 - Posted: 20 Mar 2017, 19:26:17 UTC - in response to Message 29448.  

Out of interest, I checked older stderr files for the VM memory size that was set for 1-core through 3-core ATLAS tasks:

for 1-core: 2600 MB
for 2-core: 3600 MB
for 3-core: 4600 MB
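
For anyone who wants to pull these values out of their own stderr files, here is a minimal sketch. It assumes the log lines look exactly like the two quoted earlier in this thread; the function name `vm_settings` is only an illustration and not part of BOINC or vboxwrapper.

```python
import re

# Patterns for the two log lines quoted earlier in this thread.
# Assumption: other stderr files use the same wording.
MEMORY_RE = re.compile(r"Setting Memory Size for VM\. \((\d+)MB\)")
CPU_RE = re.compile(r"Setting CPU Count for VM\. \((\d+)\)")


def vm_settings(stderr_text):
    """Return (cpu_count, memory_mb); a value is None if its line is missing."""
    cpu = CPU_RE.search(stderr_text)
    mem = MEMORY_RE.search(stderr_text)
    return (
        int(cpu.group(1)) if cpu else None,
        int(mem.group(1)) if mem else None,
    )


sample = """2017-03-20 17:53:45 (6816): Setting Memory Size for VM. (3600MB)
2017-03-20 17:53:45 (6816): Setting CPU Count for VM. (2)"""

print(vm_settings(sample))
```

For the sample above, this reports 2 cores and 3600 MB, which is the 2-core case from the list.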

As said above, I would guess that the VM knows what it's doing. Or is there a bug somewhere in the VM, and the 3600 MB for 2-core is indeed not enough?

With 32 GB of RAM in total, I would have no problem at all manually setting the VM memory (via app_config, if I can find it somewhere) to a value higher than 3600 MB.
gyllic

Joined: 9 Dec 14
Posts: 202
Credit: 2,533,875
RAC: 0
Message 29453 - Posted: 20 Mar 2017, 19:47:06 UTC - in response to Message 29451.  

Look here:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4164&postid=29450

For your app config, place a file called app_config.xml in the project directory with the content below. It runs 6 ATLAS tasks at the same time with 2 cores each (if you have a 12-core processor) and gives each task 4400 MB of memory (make sure to save it in the right format):

<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>6</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <avg_ncpus>2.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <cmdline>--memory_size_mb 4400</cmdline>
  </app_version>
</app_config>

Restart BOINC and you should be good to go.
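
If you are unsure whether the file was saved correctly, a quick well-formedness check can help before restarting. The following is only a sketch using Python's standard XML parser; `check_app_config` is a made-up helper name, and it only checks that the XML parses and echoes the settings back. It does not know which tags BOINC actually accepts.

```python
import xml.etree.ElementTree as ET

# The app_config.xml content from this post, embedded for a self-contained check.
EXAMPLE = """<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>6</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <avg_ncpus>2.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <cmdline>--memory_size_mb 4400</cmdline>
  </app_version>
</app_config>"""


def check_app_config(xml_text):
    """Parse the text and return the <app_version> settings as a list of dicts.

    Raises xml.etree.ElementTree.ParseError if the XML is not well-formed.
    """
    root = ET.fromstring(xml_text)
    if root.tag != "app_config":
        raise ValueError("root element should be <app_config>")
    settings = []
    for version in root.findall("app_version"):
        settings.append({
            "app_name": version.findtext("app_name"),
            "plan_class": version.findtext("plan_class"),
            "avg_ncpus": float(version.findtext("avg_ncpus", "0")),
            "cmdline": version.findtext("cmdline", ""),
        })
    return settings


for entry in check_app_config(EXAMPLE):
    print(entry)
```

To check a real file, read it with `open(path).read()` and pass the text in; a missing closing tag will show up as a ParseError with a line number.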
Erich56

Joined: 18 Dec 15
Posts: 1688
Credit: 103,124,320
RAC: 123,506
Message 29454 - Posted: 20 Mar 2017, 20:05:11 UTC

@gyllic, many thanks for your help!

Once the one 3-core task and the single-core tasks that are still running have finished, I will make the change and see what happens :-)
HerveUAE

Joined: 18 Dec 16
Posts: 123
Credit: 37,495,365
RAC: 0
Message 29459 - Posted: 20 Mar 2017, 21:08:11 UTC

On ATLAS@Home, the default rule for setting the VirtualBox memory was 2.5 GB + 0.8 GB * ncores (see Yeti's checklist).
On LHC@Home, after the introduction of the new, more efficient simulator version 1.01, the rule was changed to 1.6 GB + 1 GB * ncores (see https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4146&postid=29260#29260). This explains the figures you found in your stderr files, Erich: 2600, 3600 and 4600 MB for 1, 2 and 3 cores.

Tests with the new version showed that the rule was OK for 3 cores and above, but not for 2 cores. So if you want to run 2-core tasks, you need to configure an app_config.xml file with 4400 MB to override the default 3600 MB, as mentioned by gyllic.
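
As a quick illustration of the two rules, here is a small sketch of the arithmetic. The function names are made up for this example, and it assumes 1 GB is counted as 1000 MB, which is what the figures in the stderr files suggest.

```python
def atlas_home_default_mb(ncores):
    """Old ATLAS@Home rule: 2.5 GB + 0.8 GB per core, in MB."""
    return 2500 + 800 * ncores


def lhc_home_default_mb(ncores):
    """LHC@Home rule after simulator version 1.01: 1.6 GB + 1 GB per core, in MB."""
    return 1600 + 1000 * ncores


# Value recommended in this thread for 2-core tasks, set via app_config.xml.
RECOMMENDED_2CORE_MB = 4400

for ncores in (1, 2, 3):
    print(ncores, atlas_home_default_mb(ncores), lhc_home_default_mb(ncores))
```

For 2 cores, the new rule gives 3600 MB, which is 800 MB below the 4400 MB recommended here.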
We are the product of random evolution.
