1) Message boards : Sixtrack Application : Tasks available / tasks not available (Message 30412)
Posted 18 May 2017 by BRG
Post:
Exactly the same issue here. No amount of updating, resetting the project, or anything else seems to fix it. It's been like this for a few days; sometimes resetting the project worked, but for the last 24 hours it's been pot luck whether I get a WU.
2) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30134)
Posted 30 Apr 2017 by BRG
Post:
You got trapped in 2 pitfalls.

1. Sixtrack is also a subproject of LHC and is therefore affected by the <project_max_concurrent> setting, although it is not mentioned in your app_config.xml.

But you handled it by removing the <project_max_concurrent> entry.


2. You may check more than one subproject in your preferences, but you cannot influence which of them sends WUs to your host.
Together with the "unlimited" setting, your host is now flooded with WUs.

Until you have a clear picture of how many multicore ATLAS WUs you can run concurrently, you should not check another subproject.


1. I have never used max_concurrent until this post.

2. Ah, that's a bit of a shame :( I can honestly say I have never had so many tasks download at once! It seems that with the following config I can run 4 tasks; a 5th would need slightly more RAM :(

<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>4</max_concurrent>
    <fraction_done_exact/>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>2.0</avg_ncpus>
    <cmdline>--memory_size_mb 4800</cmdline>
  </app_version>
</app_config>
3) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30119)
Posted 30 Apr 2017 by BRG
Post:
On my PC with 12 cores / 24 threads, I can max out 64 GB if there are too many ATLAS tasks.

I've seen memory usage get very high on my 10-core / 20-thread machine too.

On my other PCs with more RAM I haven't seen as many concurrent ATLAS tasks.

I have the number of tasks set to 10 concurrent on the 64 GB machine to see if that is a bit better, as 12 made the maxed-out one slow.


I have set CPUs and tasks to no limit to try to fix this issue I'm having... doesn't seem to be working :(


Removed <project_max_concurrent>1</project_max_concurrent>; all fixed :)


Is there a magic way to make BOINC prioritise the ATLAS units over Sixtrack?
4) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30118)
Posted 29 Apr 2017 by BRG
Post:
On my PC with 12 cores / 24 threads, I can max out 64 GB if there are too many ATLAS tasks.

I've seen memory usage get very high on my 10-core / 20-thread machine too.

On my other PCs with more RAM I haven't seen as many concurrent ATLAS tasks.

I have the number of tasks set to 10 concurrent on the 64 GB machine to see if that is a bit better, as 12 made the maxed-out one slow.


I have set CPUs and tasks to no limit to try to fix this issue I'm having... doesn't seem to be working :(


Removed <project_max_concurrent>1</project_max_concurrent>; all fixed :)
5) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30117)
Posted 29 Apr 2017 by BRG
Post:
On my PC with 12 cores / 24 threads, I can max out 64 GB if there are too many ATLAS tasks.

I've seen memory usage get very high on my 10-core / 20-thread machine too.

On my other PCs with more RAM I haven't seen as many concurrent ATLAS tasks.

I have the number of tasks set to 10 concurrent on the 64 GB machine to see if that is a bit better, as 12 made the maxed-out one slow.


I have set CPUs and tasks to no limit to try to fix this issue I'm having... doesn't seem to be working :(
6) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30116)
Posted 29 Apr 2017 by BRG
Post:
So, back to problems again... I cannot run more than 2 ATLAS tasks now, and only 3 Sixtrack tasks are running... plenty of cores free, and Sixtrack isn't bothered about RAM...
7) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30111)
Posted 29 Apr 2017 by BRG
Post:
OK, so far, so good!

Changed <max_concurrent> and <project_max_concurrent> to 4: with 5400 MB of RAM and 2 cores per WU, that's the most I can do, and it gives me a little headroom too!
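
For reference, the change described above corresponds to an app_config.xml along these lines (a sketch only; the app name, plan class, and <fraction_done_exact/> tag are carried over from the other configs quoted in this thread):

<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>4</max_concurrent>
    <fraction_done_exact/>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>2.0</avg_ncpus>
    <cmdline>--memory_size_mb 5400</cmdline>
  </app_version>
  <project_max_concurrent>4</project_max_concurrent>
</app_config>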

Is there a tried and tested rule of thumb for x cores and x RAM? I was always told 2 cores and 4800 MB of RAM...

In answer to your questions, computezrmle:

- faulty WUs -> check the message boards
- other projects also need resources
- a saturated internet connection -> how fast is it?
- a saturated disk IO -> a lot of users don't check/believe this point
- not enough RAM -> test another combination of #WUs / cores per WU / RAM per WU
- not enough CPUs -> unlikely in your case :-)

1. I have been; there are a few around. On the 27th I, like many, had problem WUs.
2. At the moment I only have one GPU slot running, so no risk there!
3. My connection isn't great: 1.5 Mb/s down and around 0.1 Mb/s up.
4. I'm not sure exactly what that is, so I will Google it. The drive isn't very old: a Samsung 850 EVO 500 GB.
5. RAM is my issue. I can go to 48 GB in total, I think... currently 32 GB fitted.
6. Cores are not an issue currently; I do, however, need more RAM to support those cores :(
8) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30108)
Posted 29 Apr 2017 by BRG
Post:
Quick update: once this WU has finished I will start step 3 and report back, but so far, so good :)
9) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30094)
Posted 27 Apr 2017 by BRG
Post:
Some suggestions for possible next steps.


1. Check the logfile

After your WU is reported, check the result on the LHC webserver (it includes a copy of your stderr.txt).
- The WU should be marked as "successful"
- the logfile should include lines like
Guest Log: <metadata att_name="fsize" att_value="54070367"/>
Guest Log: -rw------- 1 root root 54070367 Apr 27 14:01 HITS.10995533._009865.pool.root.1


If this is successful, go to step 2



2. Try 1 multicore WU

Leave "Max # jobs = 1", set "Max # CPUs = 2", set <avg_ncpus>2.0</avg_ncpus> and "read config files" in your client

OR

Leave "Max # jobs = 1", set "Max # CPUs = 4", set <avg_ncpus>4.0</avg_ncpus>, set <cmdline>--memory_size_mb 6000</cmdline> and "read config files" in your client

If this is successful, go to step 3



3. Try several multicore WUs concurrently

Increase "Max # jobs" step by step either with "Max # CPUs = 2" or "Max # CPUs = 4" and set your app_config.xml accordingly.
<max_concurrent>x
<avg_ncpus>y
<cmdline>--memory_size_mb zzzz
<project_max_concurrent>x

Don't forget to "read config files" before the next WU download.
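
As a filled-in illustration of step 3 (a sketch, not part of the original advice), two concurrent 4-core WUs using the 6000 MB per-WU value from step 2 would look roughly like this, with the app name and plan class taken from the conservative config quoted further down:

<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>2</max_concurrent>
    <fraction_done_exact/>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>4.0</avg_ncpus>
    <cmdline>--memory_size_mb 6000</cmdline>
  </app_version>
  <project_max_concurrent>2</project_max_concurrent>
</app_config>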



Always check the logfiles before you go from one step to the next.


At a certain point your host will start to produce errors because of
- faulty WUs -> check the message boards
- other projects also need resources
- a saturated internet connection -> how fast is it?
- a saturated disk IO -> a lot of users don't check/believe this point
- not enough RAM -> test another combination of #WUs / cores per WU / RAM per WU
- not enough CPUs -> unlikely in your case :-)


Thanks very much :) It's got around an hour to go, so it will be a tomorrow job, I would guess.

I will edit the app_config file with the changes you suggested and then go from there via the steps :)

Thanks
10) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30091)
Posted 27 Apr 2017 by BRG
Post:
This happens after every project reset until ATLAS (in this case) is known to your host through the first server response.
Nothing to worry about if you managed to load the app_config.xml before BOINC started the WU.
See number 8 of my list.

You may check the stderr.txt in the slots dir of the running WU.
If "Setting Memory Size for VM. (xxxxMB)" corresponds to your app_config.xml everything is fine.


All sorted now :)

EDIT: found the stderr file:

2017-04-27 16:11:08 (16424): Setting Memory Size for VM. (5000MB)

The WU is 49% complete. If I can sort out which app_config file to run from now on, I will try it, change whichever settings you guys recommended within LHC, and see what happens :)
11) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30088)
Posted 27 Apr 2017 by BRG
Post:
LHC@home: Notice from BOINC
Your app_config.xml file refers to an unknown application 'ATLAS'. Known applications: None
27/04/2017 3:59:09 PM


Had this come up; however, it's gone now...

I will run this task through, pause everything, and post again.
12) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30085)
Posted 27 Apr 2017 by BRG
Post:
Have you ever worked through Yeti's checklist?
Fine.

Besides that, you may restart the project with conservative settings.

1. Let your local WU cache get empty
2. Reset the project in BOINC
3. Update your VirtualBox software to the most recent version
4. Reboot your host
5. Set "Max # jobs = 1" and "Max # CPUs = 1" on the LHC website
6. Create the following app_config.xml

<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>1</max_concurrent>
    <fraction_done_exact/>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>1.0</avg_ncpus>
    <cmdline>--memory_size_mb 5000</cmdline>
  </app_version>
  <project_max_concurrent>1</project_max_concurrent>
</app_config> 


7. Request a new WU from the project
8. Reload your configuration (must be done after you got the first WU and before the WU starts)
9. Check the result before you change your settings and request a new WU


I did go through his checklist last night; it was his checklist that made me check the preferences within the LHC computing preferences :)

I have set it to not allow more tasks. I will complete these 2 tasks, follow your list, and then post back. Thanks :)
13) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30078)
Posted 27 Apr 2017 by BRG
Post:
So, I've just finished 1 task and deleted 4... removed the app config file and just downloaded 2 WUs, both now running for 1 minute; before, they would run for seconds and then stop... without jumping to conclusions, it must be an app data file error?!
14) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30077)
Posted 27 Apr 2017 by BRG
Post:
Deleted the app data file and still no change... closed and opened BOINC, etc.

I suppose you still have tasks in the queue that you got before your changes.


I did delete them; however, I left the ones that were running to finish. Back in work this morning, still only one is running and the others say "waiting for memory".
15) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30071)
Posted 26 Apr 2017 by BRG
Post:
Deleted the app data file and still no change... closed and opened BOINC, etc.
16) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30070)
Posted 26 Apr 2017 by BRG
Post:
In your preferences, also set the # of CPUs to 2 when you have 2 in your app_config.xml.


Now done :) thanks


Your hosts are hidden.
Expert users can't check your logs.
You may change your preferences and make your hosts visible.

Your app_config.xml looks strange.
Is it due to the copy/paste or are there really lines like:
<?xml version="1.0"?>

-<app_config>


-<app>



Your setting
<avg_ncpus>2.000000</avg_ncpus>

overrules the website preference "Max # of CPUs = 24", except for the server's working set size calculation, which is now 9000 MB per WU.
Reduce the website preferences to not more than the value that you use in your app_config.xml.

A 24-core host would be able to run three 8-core WUs (3 × 9000 MB = 27000 MB).
If you configure 4-core WUs, 5800 MB would be required per WU.
This would use 20 CPUs.
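
To illustrate the 4-core arrangement described above (a sketch only: the 5-WU count is implied by the 20-CPU figure, and the layout and plan class are reused from the app_config.xml shown below; note that 5 × 5800 MB = 29000 MB leaves little headroom for the OS on a 32 GB host):

<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>5</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>4.0</avg_ncpus>
    <cmdline>--memory_size_mb 5800</cmdline>
  </app_version>
</app_config>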


I will allow my computers to show now :)

It may be due to the copy and paste... hmmm, this is via edit:

<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>6</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <avg_ncpus>2.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <cmdline>--memory_size_mb 4800</cmdline>
  </app_version>
</app_config>

What do you advise I do? I was told 2 cores per work unit, using the config settings above.
17) Message boards : ATLAS application : issues with app config/running multiple tasks (Message 30066)
Posted 26 Apr 2017 by BRG
Post:
Evening all,

For the last few days BOINC has only allowed me to run a couple of ATLAS tasks at once, rather than the maximum of 6 I have set (normally 4 due to RAM)...

I have everything set to use 100% (CPU and RAM) within BOINC, and I've checked the settings on the LHC side of things too: that's all at max, with jobs set to no limit. For CPUs I have tried everything from no limit to 24; it's now at 24 and it's only allowing one task.

24 cores and 32 GB of RAM.

app_config:

<?xml version="1.0"?>

-<app_config>


-<app>

<name>ATLAS</name>

<max_concurrent>6</max_concurrent>

</app>


-<app_version>

<app_name>ATLAS</app_name>

<avg_ncpus>2.000000</avg_ncpus>

<plan_class>vbox64_mt_mcore_atlas</plan_class>

<cmdline>--memory_size_mb 4800</cmdline>

</app_version>

</app_config>
18) Message boards : Sixtrack Application : How to stack 100s of tasks? (Message 30031)
Posted 24 Apr 2017 by BRG
Post:
I have an i7 chip in my machine that is a 4-core processor and utilizes 8 threads, 2 for each core. Each thread is capable of receiving 14 work units, thus allowing me to receive 112 tasks when running a full load. Not bad for a 4-core processor!


Not quite following how this works, Robert?!

4 cores... a thread for each core, so that's 2 threads per core, so that's 8 usable cores??
19) Message boards : Sixtrack Application : How to stack 100s of tasks? (Message 29814)
Posted 4 Apr 2017 by BRG
Post:
Clearly I'm missing a trick here...

Checking some users via the stats system, it's clear that some members have, for example, 8 cores yet have 100-plus tasks in progress... so how are they doing it?

The maximum number of WUs someone can get from LHC is 24 unless they have more than 24 cores, so how have these users with fewer than 24 cores got over 100 tasks?
20) Message boards : ATLAS application : Some Validate errors (Message 29775)
Posted 2 Apr 2017 by BRG
Post:
Yeah, seems to be some kind of faulty WUs ...

Have round about 40 or more of them


Yup, you and me both... did the config file thing yesterday and all was working well... this morning I come into the shop and find pages and pages of error tasks :(

