Message boards : ATLAS application : ATLAS native app

Juha

Joined: 22 Mar 17
Posts: 30
Credit: 360,676
RAC: 0
Message 32183 - Posted: 1 Sep 2017, 20:48:43 UTC

First we're told that there is a new app version to test, and then comes the correction that it's for SL6/7 only. Well, challenge accepted :P

To run it on Mint 17 (which is based on Ubuntu 14.04), a few things were needed:
- CVMFS from their repo
- Singularity from NeuroDebian
- A BOINC client that not only reports that it runs on Mint but carefully points out that it's not SL7 :)

From the configuration script I manually applied those CVMFS, FUSE and Singularity settings that were available with the versions from the repos. I had to change the Singularity settings below because they didn't work on my system, perhaps because the kernel is missing some module:

mount hostfs = no
enable overlay = no


The configuration script sets the FUSE option "user_allow_other", which may be a security issue on shared machines. It would be a good idea to warn users about that.
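
Whether the option is already active on a machine can be checked by looking at the FUSE configuration file. A minimal sketch, assuming the usual location /etc/fuse.conf; the function name has_user_allow_other is made up for illustration and is not part of the configuration script:

```python
def has_user_allow_other(conf_text):
    """Return True if an uncommented 'user_allow_other' line is present."""
    for line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        setting = line.split("#", 1)[0].strip()
        if setting == "user_allow_other":
            return True
    return False


if __name__ == "__main__":
    try:
        with open("/etc/fuse.conf") as conf:
            enabled = has_user_allow_other(conf.read())
    except OSError:
        enabled = False  # no readable config file, so the option is not set there
    print("user_allow_other enabled:", enabled)
```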

On Debian-based systems /bin/sh is dash, which doesn't like the way the output redirects are written in the ATLAS startup script, so a few changes were needed. I could have corrected them, but there was something weird going on and I decided to just drop them. I also had to change the script so that it would accept "--nthreads 1" to run only one worker instead of the default eight. The changes are below:

14,16c14,16
<   os.system("cvmfs_config probe 1>&2>/dev/null")
<   ret1=os.system("cvmfs_config stat atlas.cern.ch 1>&2>/dev/null")
<   ret2=os.system("cvmfs_config stat atlas-condb.cern.ch 1>&2>/dev/null")
---
>   os.system("cvmfs_config probe")
>   ret1=os.system("cvmfs_config stat atlas.cern.ch")
>   ret2=os.system("cvmfs_config stat atlas-condb.cern.ch")
28c28
<   ret=os.system("singularity --version 1>&2>/dev/null")
---
>   ret=os.system("singularity --version")
132,135c132,134
<   if int(THREADS)!=1:
<     prefix="export ATHENA_PROC_NUMBER=%s;"%THREADS
<     sys.stderr.write(prefix)
<     os.system("sed -i -e '/set -x/a\%s' start_atlas.sh"%prefix)
---
>   prefix="export ATHENA_PROC_NUMBER=%s;"%THREADS
>   sys.stderr.write(prefix)
>   os.system("sed -i -e '/set -x/a\%s' start_atlas.sh"%prefix)
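
Dropping the redirects means the command output is no longer hidden. One way to keep it hidden without depending on any particular shell is to let Python discard the output itself. This is only a sketch, assuming the script can use Python 3's subprocess module (the original uses os.system); run_quiet is a hypothetical helper and not part of the ATLAS wrapper. Note that it returns the plain exit code, whereas os.system returns a wait status:

```python
import subprocess


def run_quiet(cmd):
    """Run cmd (a list of arguments) with all output discarded; return its exit code."""
    try:
        return subprocess.call(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # Command not installed or not executable: treat it as a failure.
        return 127


if __name__ == "__main__":
    # The same checks as in the diff above, without shell redirects.
    ret1 = run_quiet(["cvmfs_config", "stat", "atlas.cern.ch"])
    ret2 = run_quiet(["cvmfs_config", "stat", "atlas-condb.cern.ch"])
    ret = run_quiet(["singularity", "--version"])
    print("cvmfs:", ret1, ret2, "singularity:", ret)
```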


After making the edits I updated client_state.xml: I corrected the file size, removed the <signature_required/> and <file_signature> tags, and also added <md5_cksum>, just so that the client has one less reason to reject the edited file.
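
The size and checksum for the edited file can be computed as follows. A minimal sketch; size_and_md5 is a hypothetical helper, and the path given on the command line is whichever file was edited:

```python
import hashlib
import os
import sys


def size_and_md5(path, chunk_size=1 << 20):
    """Return (size in bytes, MD5 hex digest) of the file at path."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


if __name__ == "__main__":
    # Usage: python size_and_md5.py FILE [FILE ...]
    for name in sys.argv[1:]:
        size, md5 = size_and_md5(name)
        print("%s: %d bytes, md5 %s" % (name, size, md5))
```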

After these changes my host (old and slow) has happily run the tasks, three in total so far.

About the memory usage: BOINC and the app itself have reported about 1.8 GB for all tasks. The first two tasks were run with eight workers. The highest memory usage for the entire system that I saw was about 4 GB. I think the file system cache size was negligible at that point, but I'm not absolutely certain of that. If I allow 1 GB for the OS and other programs (and that's being generous), then ATLAS used about 3 GB.

The third task was run with only one worker. The highest memory usage for the entire system was about 2.5 GB. Allowing 1 GB for the OS again leaves 1.5 GB for ATLAS, which seems more consistent with what BOINC and the app report.

This is on a 3 GB machine, by the way. I have it configured with plenty of swap, but the only time there was a significant amount of disk activity was when the workers were started. After that the swap usage increased slowly, not often enough to slow down computing.
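
To remove the doubt about the file system cache, the cache can be subtracted explicitly instead of estimated. On Linux, /proc/meminfo lists MemTotal, MemFree, Buffers and Cached (values in kB). The following is a sketch only; the function names are made up, and the 1 GB allowance for the OS used above would still have to be subtracted by hand:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo content into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def used_without_cache_kb(info):
    """Memory in use, not counting free memory, buffers and page cache."""
    return info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as handle:
            info = parse_meminfo(handle.read())
        used_gb = used_without_cache_kb(info) / 1024.0 / 1024.0
        print("used excluding cache: %.2f GB" % used_gb)
    except OSError:
        print("/proc/meminfo is not available on this system")
```

Running it while a task is at its peak and subtracting the idle value gives a rough figure for what the task itself adds.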
ID: 32183
tullio

Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 32243 - Posted: 5 Sep 2017, 11:12:45 UTC

I got two native Atlas tasks on my SuSE Linux Leap 42.2 but they do not seem to do anything. CPU usage is zero.
Tullio
ID: 32243
gyllic

Joined: 9 Dec 14
Posts: 202
Credit: 2,533,875
RAC: 0
Message 32244 - Posted: 5 Sep 2017, 11:59:46 UTC - in response to Message 32243.  

I got two native Atlas tasks on my SuSE Linux Leap 42.2 but they do not seem to do anything. CPU usage is zero.
Tullio

Does that mean that all Linux Distributions are now officially supported?
ID: 32244
AGLT2
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Project scientist

Joined: 23 Jun 14
Posts: 12
Credit: 6,352,734,461
RAC: 1,244,402
Message 32273 - Posted: 5 Sep 2017, 15:28:13 UTC - in response to Message 32183.  

Based on this, we released a new version, v2.51, so you do not need to hack the wrapper to run it on Ubuntu.




ID: 32273
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 32288 - Posted: 5 Sep 2017, 19:22:59 UTC
Last modified: 5 Sep 2017, 19:27:16 UTC

For me, the cobblestones are not so important; what matters is the work done for ATLAS.

https://lhcathome.cern.ch/lhcathome/results.php?hostid=10496403

This SL69 host got 100 points; another SL69 host of mine got more than 1,000 for the same CPU time and run time.

https://lhcathome.cern.ch/lhcathome/results.php?hostid=10495075
ID: 32288
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 32297 - Posted: 5 Sep 2017, 23:08:52 UTC - in response to Message 32273.  

Based on this, we released a new version, v2.51, so you do not need to hack the wrapper to run it on Ubuntu.

Very good. It would be nice to have this user-selectable, as has been suggested. I will just run it on a second machine (Ubuntu 16.10) without VirtualBox installed to see if I can get it to work.
ID: 32297
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 32335 - Posted: 7 Sep 2017, 16:53:56 UTC

Are the native apps available for Ubuntu yet? I can't seem to get any on two machines. The ATLAS simulation and the "Run test applications" are enabled on both machines.

Ubuntu 17.04 machine: VirtualBox 5.1.26 is installed on this machine, and all I get are the 1.01 ATLAS simulations.

Ubuntu 16.10 machine: VirtualBox is not installed on this machine, and I get no ATLAS at all.

Both machines check out OK on CVMFS, and both have Singularity 2.3.1-dist installed. Is there anything else I need to do?
ID: 32335
Juha

Joined: 22 Mar 17
Posts: 30
Credit: 360,676
RAC: 0
Message 32337 - Posted: 7 Sep 2017, 18:57:16 UTC - in response to Message 32273.  
Last modified: 7 Sep 2017, 18:57:29 UTC

Based on this, we released a new version, v2.51, so you do not need to hack the wrapper to run it on Ubuntu.


Thanks. One 2.51 task is running fine without hacking. It's running super long, though. The previous task ran for 15 hours and this one looks like it's going to run at least twice as long. Is there normally that much variability in run times? Both tasks have 50 events, so it's not that.



Unfortunately I have to report a bug in the app. Knowing that this app uses a ton of memory I had set task switch time to 20 hours so that only one task is in memory at a time.

Once the 20 hours were up, BOINC suspended the app and started a task for another project. But athena.py is still running, now sharing a core with the other task.
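
Why athena.py keeps running is not established in this thread. One plausible cause, stated here purely as an assumption, is that the suspend request reaches only the top-level process and not the workers it started. A sketch of how a wrapper could stop and resume a whole process group on Linux (Python 3); the helper names are hypothetical and this is not the actual ATLAS or BOINC code:

```python
import os
import signal
import subprocess


def start_in_own_group(cmd):
    """Start cmd in a new session, so it leads its own process group."""
    return subprocess.Popen(cmd, start_new_session=True)


def suspend_group(proc):
    """Stop the process and everything still in its process group."""
    os.killpg(os.getpgid(proc.pid), signal.SIGSTOP)


def resume_group(proc):
    """Let the stopped process group continue."""
    os.killpg(os.getpgid(proc.pid), signal.SIGCONT)
```

Children that move themselves into a different process group or session would not be covered by this approach.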
ID: 32337
gyllic

Joined: 9 Dec 14
Posts: 202
Credit: 2,533,875
RAC: 0
Message 32348 - Posted: 8 Sep 2017, 11:20:27 UTC - in response to Message 32337.  

... It's running super long, though. The previous task ran for 15 hours and this one looks like it's going to run at least twice as long. Is there normally that much variability in run times? Both tasks have 50 events, so it's not that.

Since there are different tasks in the queue right now, it is normal that run times vary. I assume that they all process 50 events, but the physics behind them is different, which leads to different run times. I see the same behaviour on my computer as well, with differences of up to a factor of 3.
An overview of which tasks are in the queue can be seen here:
https://lhcathome.cern.ch/ATLAS/
More details about them can be seen here for example:
https://bigpanda.cern.ch/task/12065690/
You can see, for example, from the HITS.xxx file which task (or which part of it) was processed on your PC.

Unfortunately I have to report a bug in the app. Knowing that this app uses a ton of memory I had set task switch time to 20 hours so that only one task is in memory at a time.

What do you mean by "tons of memory"? Compared to the VM-based one, the native app uses way less memory. For example, I run the native app on a 6 GB RAM machine without any problem, and there are still a couple of GB of RAM free. With the VM-based one this would not be possible (at least with 4 cores).
But top shows 4 athena.py processes (on a 4-core processor), each with about 30% memory usage (?). This is a little strange, because swap is shown as unused and 2-3 GB of RAM are free.
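
One possible explanation, not confirmed in this thread, is that the athena.py workers share a large part of their memory, in which case top counts the shared pages once per process. On Linux, /proc/<pid>/smaps reports both Rss (shared pages counted in full) and Pss (shared pages divided among the processes using them). A sketch that totals both from smaps text; the function name is made up for illustration:

```python
import sys


def sum_rss_pss_kb(smaps_text):
    """Return (total Rss, total Pss) in kB from /proc/<pid>/smaps content."""
    totals = {"Rss": 0, "Pss": 0}
    for line in smaps_text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        # Only exact "Rss" and "Pss" keys count, not e.g. "SwapPss".
        if key in totals and fields:
            totals[key] += int(fields[0])
    return totals["Rss"], totals["Pss"]


if __name__ == "__main__":
    # Usage: python rss_pss.py PID [PID ...]
    for pid in sys.argv[1:]:
        with open("/proc/%s/smaps" % pid) as handle:
            rss, pss = sum_rss_pss_kb(handle.read())
        print("%s: Rss %d kB, Pss %d kB" % (pid, rss, pss))
```

If the Pss totals of the four processes add up to far less than their Rss totals, the 30% figures overlap rather than add.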
ID: 32348
Juha

Joined: 22 Mar 17
Posts: 30
Credit: 360,676
RAC: 0
Message 32355 - Posted: 8 Sep 2017, 17:51:52 UTC - in response to Message 32348.  

Since there are different tasks in the queue right now, it is normal that there is a variability in run times.


Thanks. I haven't run enough ATLAS tasks to know all these details.

What do you mean by "tons of memory"?


The host has only 3 GB of RAM. 1.7 GB of it is, well, maybe not a ton, but still quite a large fraction.
ID: 32355
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 32364 - Posted: 9 Sep 2017, 7:00:23 UTC

ID: 32364
gyllic

Joined: 9 Dec 14
Posts: 202
Credit: 2,533,875
RAC: 0
Message 32457 - Posted: 18 Sep 2017, 7:33:14 UTC
Last modified: 18 Sep 2017, 7:33:39 UTC

After a lot of successful WUs, this one had an "EXIT_CHILD_FAILED" error:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=157554366
ID: 32457
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 32467 - Posted: 18 Sep 2017, 17:01:01 UTC

This WU was started with the native app but finished with the Windows ATLAS app.
Is it possible to synchronize these tasks for better performance?
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=75462506
ID: 32467
tullio

Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 32490 - Posted: 21 Sep 2017, 13:14:31 UTC

I am getting Atlas native tasks on my SuSE 42.3 Linux box which obviously can't run them.
Tullio
ID: 32490
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 32495 - Posted: 21 Sep 2017, 15:48:24 UTC - in response to Message 32490.  

I am getting Atlas native tasks on my SuSE 42.3 Linux box which obviously can't run them.
Tullio

That is better than my two Ubuntu machines, which can't get them at all. By enabling both ATLAS and "Run test applications" I should be getting something, unless they have put a block on some machines.
Whether that is by accident or on purpose is my concern at this point.
ID: 32495
gyllic

Joined: 9 Dec 14
Posts: 202
Credit: 2,533,875
RAC: 0
Message 32497 - Posted: 22 Sep 2017, 4:03:02 UTC - in response to Message 32495.  
Last modified: 22 Sep 2017, 4:05:38 UTC

...That is better than my two Ubuntu machines, which can't get them at all. By enabling both ATLAS and "Run test applications" I should be getting something...

Yes, according to this: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4396&postid=32273#32273

I had another "195 (0x000000C3) EXIT_CHILD_FAILED": https://lhcathome.cern.ch/lhcathome/result.php?resultid=157720743

Why does it say "cvmfs not found" although this PC has crunched hundreds of native tasks successfully before (hence CVMFS clearly is installed and working)?
ID: 32497
tullio

Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 32534 - Posted: 27 Sep 2017, 12:07:57 UTC
Last modified: 27 Sep 2017, 12:13:12 UTC

How can I avoid getting native tasks? I have SuSE Linux.
Tullio
Latest single core task ended in 93 hours and got 562 credits, producing a HITS file.
ID: 32534
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 32536 - Posted: 27 Sep 2017, 14:50:59 UTC

I have openSUSE for WCG and SixTrack.
SL69 is on every PC for the native app.
See this message from David:

https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4396&postid=31980#31980
ID: 32536
tullio

Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 32540 - Posted: 27 Sep 2017, 17:27:18 UTC - in response to Message 32536.  

2-core tasks on the Windows 10 PC with 22 GB RAM are completed and validated but I see few HITS files.
Tullio
ID: 32540
tullio

Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 32550 - Posted: 28 Sep 2017, 3:07:47 UTC
Last modified: 28 Sep 2017, 3:20:41 UTC

I had to set No New Tasks (NNT) on a SuSE Linux box in order not to be swamped with native tasks.
Tullio
All LHC tasks are on a Windows 10 PC, with mixed results despite its 22 GB RAM.
ID: 32550