1) Message boards : Theory Application : strange discrepency in credit points for Theory (Message 38900)
Posted 19 May 2019 by HerveUAE
Post:
I observed a sudden increase in the credits given to Theory jobs on 9 May 2019. This increase occurred on only 2 of my 3 rigs.

The increase does not seem to accurately reflect CPU usage: if you look at Yeti's top RAC computers, you will find the same spec in the top 3.
Interestingly, the first one (doing mainly Theory jobs) has 4 times more RAC than the next 2, which are doing mainly ATLAS jobs.

See here: https://lhcathome.cern.ch/lhcathome/hosts_user.php?sort=expavg_credit&rev=0&show_all=0&userid=555
2) Message boards : ATLAS application : Download failures (Message 37554)
Posted 5 Dec 2018 by HerveUAE
Post:
On my side, of the 4 files that are downloaded for each WU, all download OK except the big one (200 MB or more).
3) Message boards : Number crunching : Postponed: VM job unmanageble, restarting later ????? (Message 36758)
Posted 18 Sep 2018 by HerveUAE
Post:
Hi,

I had the same issue on one of my computers. I was using:
- BOINC 7.10.2.
- VirtualBox 5.2.8.
- SSD for ProgramData/BOINC
I tried various tricks to solve the problem, with no success. After reading this thread, I finally decided to uninstall BOINC and VirtualBox and install the following configuration:
- BOINC 7.12.1.
- VirtualBox 5.2.18.
- Hard Disk for ProgramData/BOINC
I am not sure which of those changes actually fixed the problem of those unmanageable VMs, but the new install works fine now for ATLAS.
Hoping it helps.
Herve
4) Message boards : ATLAS application : 4 CPUS tasks will not start on 4x Thread Processor (HyperThreading related) (Message 33076)
Posted 18 Nov 2017 by HerveUAE
Post:
Hi,

I am running 4-core tasks with 2 cores, HyperThreading and 8 GB of RAM with no problems on this host: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10420599
My settings at LHC@Home preferences are:
- Max # jobs: 6
- Max # CPUs: 4
No specific app_config.xml file.
5) Message boards : ATLAS application : ATLAS jobs failing after longer suspension (Message 32836)
Posted 15 Oct 2017 by HerveUAE
Post:
Same for me, and I have given up trying to run CMS, LHCb and Theory tasks on my machines. But yes, those are matters for other discussion threads.
6) Message boards : ATLAS application : Huge input file! (Message 32743)
Posted 10 Oct 2017 by HerveUAE
Post:
Both tasks are still running, more than 3 days now, athena.py CPU time at 4000 minutes.

Both tasks failed due to lack of disk space as well, but the failure occurred exactly when the task was resumed. If you do not suspend it, it seems that the task will continue forever without failing.
7) Message boards : ATLAS application : Atlas task running over 45 hours, 100% complete (Message 32741)
Posted 10 Oct 2017 by HerveUAE
Post:
<cmdline>--memory_size_mb 7000</cmdline>

I agree with you, Erich. If you have sufficient RAM, 7 GB is better.
8) Message boards : ATLAS application : Missing Output at Console 2 (Message 32666)
Posted 7 Oct 2017 by HerveUAE
Post:
Event processing information will appear here

Based on my own experience, those tasks never end. So I usually abort them if it is still like this after 1 hour.
9) Message boards : ATLAS application : Atlas task running over 45 hours, 100% complete (Message 32665)
Posted 7 Oct 2017 by HerveUAE
Post:
I have recently had one task that ran slower and slower to the point where it had almost stopped

I guess you are referring to the progress of the task, which does not increase steadily with time but increases more and more slowly. This does not mean that your task has stopped processing or is processing more slowly.
What it means is that the initial estimate of the time needed to complete was far lower than the actual time needed. Hence, so that the progress never goes above 100%, it increases more and more slowly as it gets near 100%.
This is normal behaviour with ATLAS tasks.
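
The effect described above can be illustrated with a small sketch. This is not the formula actually used by BOINC or by the ATLAS wrapper, which is not given in this thread. The function `displayed_progress` and its saturating form are assumptions, chosen only to show how a progress value can keep rising while never reaching 100% when the initial time estimate is too low.

```python
def displayed_progress(elapsed_s: float, estimated_s: float) -> float:
    """Hypothetical saturating progress model (an assumption, not BOINC's formula).

    The value grows quickly at first, then more and more slowly,
    and always stays below 1.0 (100%).
    """
    if estimated_s <= 0:
        raise ValueError("estimated_s must be positive")
    if elapsed_s < 0:
        raise ValueError("elapsed_s must not be negative")
    return elapsed_s / (elapsed_s + estimated_s)


if __name__ == "__main__":
    estimate = 3600.0  # assumed initial estimate: one hour
    for hours in (1, 5, 20, 45):
        fraction = displayed_progress(hours * 3600.0, estimate)
        print(f"{hours:>2} h elapsed -> {fraction:.1%} shown")
```

In this model the task does the same amount of work per hour throughout; only the displayed percentage slows down.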
10) Message boards : ATLAS application : Atlas task running over 45 hours, 100% complete (Message 32664)
Posted 7 Oct 2017 by HerveUAE
Post:
Here is an example of an app_config.xml that should work for you if you want to go back to 2 cores:
<?xml version="1.0"?>
<app_config>
  <project_max_concurrent>1</project_max_concurrent>
  <app>
    <name>ATLAS</name>
    <max_concurrent>1</max_concurrent>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <avg_ncpus>2.000000</avg_ncpus>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <cmdline>--memory_size_mb 5000</cmdline>
  </app_version>
</app_config>
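
If you want to try other core counts or memory sizes, a file like the one above can also be generated rather than edited by hand. The following is only a sketch: the helper name `build_app_config` and its parameters are my own invention, and the element names and the plan class are simply copied from the example above.

```python
import xml.etree.ElementTree as ET


def build_app_config(ncpus: int, memory_mb: int, max_concurrent: int = 1) -> str:
    """Return app_config.xml text mirroring the hand-written example above.

    Hypothetical helper: element names and plan class come from that example.
    """
    root = ET.Element("app_config")
    ET.SubElement(root, "project_max_concurrent").text = str(max_concurrent)

    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = "ATLAS"
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)

    version = ET.SubElement(root, "app_version")
    ET.SubElement(version, "app_name").text = "ATLAS"
    ET.SubElement(version, "avg_ncpus").text = f"{ncpus:.6f}"
    ET.SubElement(version, "plan_class").text = "vbox64_mt_mcore_atlas"
    ET.SubElement(version, "cmdline").text = f"--memory_size_mb {memory_mb}"

    # ET.indent only exists in Python 3.9+; without it the output is one line.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return '<?xml version="1.0"?>\n' + ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_app_config(ncpus=2, memory_mb=5000))
```

The resulting text would be saved as app_config.xml in the project's folder, as with the hand-written version.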
11) Message boards : ATLAS application : WOW 1000 / 5000 events in one WU ? ! (Message 32663)
Posted 7 Oct 2017 by HerveUAE
Post:
It looks like only the native application will succeed to finish those tasks or one has to manipulate the rsc_disk_bound in an early stage of such a task.

So should I abort the tasks that I have been running for more than 3 days? I use VirtualBox and not the native app.
12) Message boards : ATLAS application : Huge input file! (Message 32662)
Posted 7 Oct 2017 by HerveUAE
Post:
How many cores your ATLAS-VM have?
How do you know yours are over 600 after 2 days of running?

The lines showing the progress of the events appear to be sorted, so after one day one can only see the events that were calculated shortly before midnight.
In the window I saw an event number above 300, and since I have 2 cores on the task, I concluded that at least 600 events had been calculated.

Both tasks are still running, more than 3 days now, athena.py CPU time at 4000 minutes.
13) Message boards : ATLAS application : Huge input file! (Message 32641)
Posted 6 Oct 2017 by HerveUAE
Post:
I got 2 of those:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=158597383
https://lhcathome.cern.ch/lhcathome/result.php?resultid=158522054
They have been running for more than 2 days now, and each has processed more than 600 events so far.
14) Message boards : ATLAS application : 2-core ATLAS tasks running with 1 core only (Message 32571)
Posted 30 Sep 2017 by HerveUAE
Post:
Hi David,
The issue mentioned by Erich, and that I experienced myself as well, was apparently related to a batch of short tasks. This batch is over now I think.
I suggest creating a discussion thread dedicated to issues related to the number of athena.py processes within a task.
That thread could cover both cases like the one reported by Erich and the case I reported of duplicated and triplicated processes.
Regards,
Herve
15) Message boards : ATLAS application : 2-core ATLAS tasks running with 1 core only (Message 32555)
Posted 28 Sep 2017 by HerveUAE
Post:
Indeed I also have some 2-core tasks with 1 CPU core running, but only some of the tasks, not all.
16) Message boards : ATLAS application : 2-core tasks with process "athena.py" running 4 times (Message 32541)
Posted 27 Sep 2017 by HerveUAE
Post:
Another possible odd thing in the logs is the memory assigned to the VM. It looks like 9Gb is assigned for a 3 processor work unit. My 4 processor work units end up with 6,2Gb for the virtual machine. This follows the 2,6Gb + (0,9Gb * # processors)

That configuration is intentional, because I recently had a few tasks that did not get through the starting phase with 7 GB.
Could it be that, because I give it 9 GB, the ATLAS process ends up running multiple times because it finds a lot of memory available?
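
For reference, the rule quoted above (2.6 GB plus 0.9 GB per processor) can be written as a small sketch. The function name `default_vm_memory_mb` is hypothetical, and the rule itself is taken from the quote, not checked against the project's server configuration. Values are in MB to avoid floating-point rounding.

```python
BASE_MB = 2600      # 2.6 GB base, as quoted in this thread
PER_CPU_MB = 900    # 0.9 GB per processor, as quoted in this thread


def default_vm_memory_mb(ncpus: int) -> int:
    """VM memory in MB according to the rule quoted in the thread (unverified)."""
    if ncpus < 1:
        raise ValueError("ncpus must be at least 1")
    return BASE_MB + PER_CPU_MB * ncpus


if __name__ == "__main__":
    for ncpus in (2, 3, 4):
        print(f"{ncpus} processors -> {default_vm_memory_mb(ncpus)} MB")
```

With 4 processors this gives 6200 MB, matching the 6.2 GB in the quote. With 3 processors it gives 5300 MB, so my 9 GB setting is well above what the rule would assign.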
17) Message boards : ATLAS application : 2-core tasks with process "athena.py" running 4 times (Message 32522)
Posted 26 Sep 2017 by HerveUAE
Post:
Thanks David,
I think I have had this issue for quite some time, but just recently linked it to the duplicated or triplicated athena.py processes.
Are the results of the task valid for you guys when this is happening?

I just checked one of my computers. Out of 6 running tasks, 3 have triplicated log messages, 2 have duplicated log messages and one is OK. So right now this computer is spending the CPU time of 14 ATLAS tasks to actually run only 6 tasks. It would be great if the computer's crunching capacity could be better used.

Is there anything I can do to investigate?
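
The figure of 14 comes from simple counting, sketched below. The sketch assumes that each duplicated or triplicated set of athena.py processes costs as much CPU time as a normal task, which is my reading of the logs and not something confirmed by the project. The function name is hypothetical.

```python
def cpu_task_equivalents(copies_per_task):
    """Sum the number of copies each task runs (1 = normal, 2 = duplicated, ...)."""
    return sum(copies_per_task)


# 3 triplicated tasks, 2 duplicated tasks and 1 normal task, as observed above.
copies = [3, 3, 3, 2, 2, 1]
total = cpu_task_equivalents(copies)
wasted = total - len(copies)
print(f"{len(copies)} tasks are using the CPU time of {total}; {wasted} task-equivalents are wasted")
```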
18) Message boards : ATLAS application : 2-core tasks with process "athena.py" running 4 times (Message 32513)
Posted 25 Sep 2017 by HerveUAE
Post:
Another 3-core task with "athena.py" running 6 times within the VM:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=157867841
And this 3-core task has "athena.py" running 9 times within the VM:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=157867240
19) Message boards : ATLAS application : 2-core tasks with process "athena.py" running 4 times (Message 32511)
Posted 23 Sep 2017 by HerveUAE
Post:
This task is supposed to run with 3 cores only, but actually has "athena.py" running 6 times within the VM:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=157814626
20) Message boards : ATLAS application : 2-core tasks with process "athena.py" running 4 times (Message 32503)
Posted 23 Sep 2017 by HerveUAE
Post:
Examples of such tasks:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=157723308
https://lhcathome.cern.ch/lhcathome/result.php?resultid=157723749



