Message boards :
ATLAS application :
ATLAS native app
Author | Message |
---|---|
Send message Joined: 22 Mar 17 Posts: 30 Credit: 360,676 RAC: 0 |
First telling us that there is a new app version to test and then correcting that it's for SL6/7 only. Well, challenge accepted :P

To run it on Mint 17 (= Ubuntu 14.04) a few things were needed:
- CVMFS from their repo
- Singularity from NeuroDebian
- A BOINC client that not only reports that it runs on Mint but carefully points out that it's not SL7 :)

From the configuration script I manually applied those CVMFS, FUSE and Singularity settings that were available with the versions from the repos. I had to change the Singularity settings below because they didn't work on my system, perhaps because the kernel is missing some module:

    mount hostfs = no
    enable overlay = no

The configuration script sets the FUSE option "user_allow_other", which may be a security issue on shared machines. It would be a good idea to warn users about that.

On Debian-based systems /bin/sh is Dash, which doesn't like the way the output redirects are written in the Atlas startup script, so a few changes were needed. I could have corrected them, but there was something weird going on and I decided to just drop them. I also had to change the script so that it would accept "--nthreads 1" to run only one worker instead of the default eight.
The changes are below:

    14,16c14,16
    < os.system("cvmfs_config probe 1>&2>/dev/null")
    < ret1=os.system("cvmfs_config stat atlas.cern.ch 1>&2>/dev/null")
    < ret2=os.system("cvmfs_config stat atlas-condb.cern.ch 1>&2>/dev/null")
    ---
    > os.system("cvmfs_config probe")
    > ret1=os.system("cvmfs_config stat atlas.cern.ch")
    > ret2=os.system("cvmfs_config stat atlas-condb.cern.ch")
    28c28
    < ret=os.system("singularity --version 1>&2>/dev/null")
    ---
    > ret=os.system("singularity --version")
    132,135c132,134
    < if int(THREADS)!=1:
    <     prefix="export ATHENA_PROC_NUMBER=%s;"%THREADS
    <     sys.stderr.write(prefix)
    <     os.system("sed -i -e '/set -x/a\%s' start_atlas.sh"%prefix)
    ---
    > prefix="export ATHENA_PROC_NUMBER=%s;"%THREADS
    > sys.stderr.write(prefix)
    > os.system("sed -i -e '/set -x/a\%s' start_atlas.sh"%prefix)

After making the edits I edited client_state.xml: I corrected the file size, removed the <signature_required/> and <file_signature> tags, and also added <md5_cksum> just so that the client has one less reason to reject the edited file.

After these changes my host (old and slow) has happily run the tasks, three in total so far.

About the memory usage: BOINC and the app itself have reported about 1,8 GB for all tasks.

The first two tasks were run with eight workers. The highest memory usage for the entire system that I saw was about 4 GB. I think the file system cache size was negligible at that point, but I'm not absolutely certain of that. If I allow 1 GB for the OS and other programs (and that's being generous), then Atlas used about 3 GB.

The third task was run with only one worker. The highest memory usage for the entire system was about 2,5 GB. Allowing 1 GB for the OS again leaves 1,5 GB for Atlas, which seems more consistent with what BOINC and the app report.

This is on a 3 GB machine, by the way. I have it configured with plenty of swap, but the only time there was a significant amount of disk activity was when the workers were started. After that the swap usage increased slowly, not often enough to slow down computing. |
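The redirects dropped in the diff above were only there to silence the output of the checks. A minimal sketch of one alternative, assuming the wrapper were free to use Python's `subprocess` module instead of `os.system`: each command runs without a shell, so no /bin/sh redirect syntax is involved at all. The helper names `quiet_status` and `run_checks` are hypothetical and not part of the ATLAS wrapper; the command lines are the ones shown in the diff.

```python
import subprocess


def quiet_status(argv):
    """Run argv without a shell, discard its output, and return the exit code.

    Returns 127 if the program cannot be found. Note that this is the plain
    exit code, not the encoded wait status that os.system() returns.
    """
    try:
        completed = subprocess.run(
            argv,
            stdout=subprocess.DEVNULL,  # replaces the shell redirect to /dev/null
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        return 127
    return completed.returncode


# Commands taken from the diff above; the dictionary keys are illustrative only.
CHECKS = {
    "cvmfs_probe": ["cvmfs_config", "probe"],
    "cvmfs_atlas": ["cvmfs_config", "stat", "atlas.cern.ch"],
    "cvmfs_condb": ["cvmfs_config", "stat", "atlas-condb.cern.ch"],
    "singularity": ["singularity", "--version"],
}


def run_checks(checks=CHECKS):
    """Return a mapping of check name to exit code (0 means the check passed)."""
    return {name: quiet_status(argv) for name, argv in checks.items()}


if __name__ == "__main__":
    for name, code in run_checks().items():
        print(name, "OK" if code == 0 else "failed (%d)" % code)
```

Because nothing goes through /bin/sh, the behaviour does not depend on whether the system shell is Bash or Dash.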
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
I got two native Atlas tasks on my SuSE Linux Leap 42.2 but they do not seem to do anything. CPU usage is zero. Tullio |
Send message Joined: 9 Dec 14 Posts: 202 Credit: 2,533,875 RAC: 0 |
> I got two native Atlas tasks on my SuSE Linux Leap 42.2 but they do not seem to do anything. CPU usage is zero.

Does that mean that all Linux distributions are now officially supported? |
Send message Joined: 23 Jun 14 Posts: 12 Credit: 6,342,744,399 RAC: 2,957,681 |
Based on this, we released a new version, v2.51, so you do not need to hack the wrapper to run it on Ubuntu.
|
Send message Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,355 |
For me, the cobblestones are not as important as the work done for Atlas. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10496403 This SL69 host got 100 points, while another of my SL69 hosts got more than 1,000 for the same CPU time and run duration. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10495075 |
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
> Based on this, we released a new version v2.51, so you do not need to hack the wrapper to run it on ubuntu.

Very good. It would be nice to have this user-selectable, as has been suggested. I will just run it on a second machine (Ubuntu 16.10) without VirtualBox installed to see if I can get it to work. |
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
Are the native apps available for Ubuntu yet? I can't seem to get any on two machines. The ATLAS simulation and the "Run test applications" are enabled on both machines.

- Ubuntu 17.04 machine: VirtualBox 5.1.26 is installed on this machine, and all I get are the 1.01 ATLAS simulations.
- Ubuntu 16.10 machine: VirtualBox is not installed on this machine, and I get no ATLAS at all.

Both machines check out OK on CVMFS, and both have Singularity 2.3.1-dist installed. Is there anything else I need to do? |
Send message Joined: 22 Mar 17 Posts: 30 Credit: 360,676 RAC: 0 |
> Based on this, we released a new version v2.51, so you do not need to hack the wrapper to run it on ubuntu..

Thanks. One 2.51 task is running fine without hacking. It's running super long though. The previous task ran for 15 hours and this one looks like it's going to run at least twice as long. Is there normally that much variability in run times? Both tasks have 50 events, so it's not that.

Unfortunately I have to report a bug in the app. Knowing that this app uses a ton of memory, I had set the task switch time to 20 hours so that only one task is in memory at a time. Once the 20 hours were up, BOINC suspended the app and started a task for another project. But athena.py is still running, now sharing a core with the other task. |
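One general way for a wrapper to make sure that worker processes stop together with the main task is to start the payload in its own process group and signal the whole group. The sketch below only illustrates that POSIX technique; it is not a description of how BOINC or the ATLAS wrapper actually suspend tasks, and the function names are hypothetical.

```python
import os
import signal
import subprocess
import sys


def start_group(argv):
    """Start argv in a new session, so the child leads its own process group."""
    return subprocess.Popen(argv, start_new_session=True)


def suspend_group(proc):
    """Stop the child and every process that stayed in its process group."""
    os.killpg(os.getpgid(proc.pid), signal.SIGSTOP)


def resume_group(proc):
    """Let the whole process group continue running."""
    os.killpg(os.getpgid(proc.pid), signal.SIGCONT)


if __name__ == "__main__":
    # Stand-in payload for illustration: a Python process that just sleeps.
    payload = start_group([sys.executable, "-c", "import time; time.sleep(5)"])
    suspend_group(payload)   # payload and its descendants in the group are stopped
    resume_group(payload)    # and continue from where they left off
    payload.wait()
```

Processes that move themselves into a different process group would not be reached this way, so this is a starting point rather than a complete fix.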
Send message Joined: 9 Dec 14 Posts: 202 Credit: 2,533,875 RAC: 0 |
> ... It's running super long though. The previous task run for 15 hours and this one looks like it's going to run at least twice as long. Is there normally that much variability in run times? Both tasks have 50 events so it's not that.

Since there are different tasks in the queue right now, it is normal that run times vary. I assume that they all process 50 events, but the underlying physics is different, which leads to different running times. I see the same behaviour on my computer as well, with differences of up to a factor of 3.

An overview of which tasks are in the queue can be seen here: https://lhcathome.cern.ch/ATLAS/
More details about them can be seen here, for example: https://bigpanda.cern.ch/task/12065690/
From the HITS.xxx file you can see which task (or which part of it) was processed on your PC.

> Unfortunately I have to report a bug in the app. Knowing that this app uses a ton of memory I had set task switch time to 20 hours so that only one task is in memory at a time.

What do you mean by "tons of memory"? Compared to the VM-based one, the native app uses far less memory. For example, I run the native app on a 6 GB RAM machine without any problem, and there are still a couple of GB of RAM free. With the VM-based one this would not be possible (at least with 4 cores).

However, the top console shows 4 athena.py processes (on a 4-core processor), each with about 30% memory usage(?). This is a little strange, because swap is also shown as unused and there are 2-3 GB of free RAM. |
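One possible explanation for the four processes at about 30% each, offered as an assumption rather than a diagnosis: top computes its memory percentage from each process's resident memory, and pages shared between processes are counted in full for every process that maps them, so the per-process figures can add up to more than what is really in use. On Linux 4.14 and later, /proc/<pid>/smaps_rollup also reports Pss, which splits shared pages among the processes sharing them. A minimal sketch for comparing the two values, with hypothetical helper names:

```python
def parse_kb_fields(text, fields=("Rss", "Pss")):
    """Extract the selected 'Name: <number> kB' fields from smaps_rollup text."""
    result = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in fields:
            parts = rest.split()
            if len(parts) == 2 and parts[1] == "kB":
                result[key] = int(parts[0])
    return result


def rollup_for_pid(pid):
    """Read Rss and Pss (in kB) for one process; needs Linux 4.14 or later."""
    with open("/proc/%d/smaps_rollup" % pid) as handle:
        return parse_kb_fields(handle.read())


def total_pss_kb(pids):
    """Sum Pss over several processes, counting shared pages only once overall."""
    return sum(rollup_for_pid(pid).get("Pss", 0) for pid in pids)
```

If the summed Pss of the athena.py processes is far below the sum of their Rss values, the large percentages in top mostly reflect shared pages.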
Send message Joined: 22 Mar 17 Posts: 30 Credit: 360,676 RAC: 0 |
> Since there are different tasks in the queue right now, it is normal that there is a variability in run times.

Thanks. I haven't run enough ATLAS tasks to know all these details.

> What do you mean with "tons of memory"?

The host has only 3 GB of RAM. 1,7 GB of it is, well, maybe not a ton but still quite a large fraction. |
Send message Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,355 |
WLCG-Cluster, can someone take a look, because CVMFS is missing: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10454806&offset=0&show_names=0&state=6&appid= |
Send message Joined: 9 Dec 14 Posts: 202 Credit: 2,533,875 RAC: 0 |
After a lot of successful WUs, this one had an "EXIT_CHILD_FAILED" error: https://lhcathome.cern.ch/lhcathome/result.php?resultid=157554366 |
Send message Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,355 |
This WU was started with the native app but finished with the Windows Atlas app. Is it possible to synchronize these tasks for better performance? https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=75462506 |
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
I am getting Atlas native tasks on my SuSE 42.3 Linux box which obviously can't run them. Tullio |
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
> I am getting Atlas native tasks on my SuSE 42.3 Linux box which obviously can't run them.

That is better than my two Ubuntu machines, which can't get them at all. By enabling both ATLAS and "Run test applications" I should be getting something, unless they have put a block on some machines. Whether that is by accident or on purpose is my concern at this point. |
Send message Joined: 9 Dec 14 Posts: 202 Credit: 2,533,875 RAC: 0 |
> ...That is better than my two Ubuntu machines, which can't get them at all. By enabling both ATLAS and "Run test applications" I should be getting something...

Yes, according to this: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4396&postid=32273#32273

I had another "195 (0x000000C3) EXIT_CHILD_FAILED": https://lhcathome.cern.ch/lhcathome/result.php?resultid=157720743
Why does it say "cvmfs not found" although this PC has crunched hundreds of native tasks successfully before (hence CVMFS clearly is installed and working)? |
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
How can I avoid getting native tasks? I have SuSE Linux. Tullio Latest single core task ended in 93 hours and got 562 credits, producing a HITS file. |
Send message Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,355 |
Have OpenSuse for WCG and sixtrack. SL69 on every PC for native app. See this message from David: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4396&postid=31980#31980 |
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
2-core tasks on the Windows 10 PC with 22 GB RAM are completed and validated but I see few HITS files. Tullio |
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
I had to put NNT on a SuSE Linux box in order not to be swamped with native tasks. Tullio All LHC task on a Windows 10 PC, with mixed results, despite its 22 GB RAM. |
©2024 CERN