One Year native-Linux (SL69 and CentOS)
Author | Message |
---|---|
Joined: 2 May 07 Posts: 1729 Credit: 130,853,399 RAC: 281,979 |
In a few days, native Linux for ATLAS will have been running for ONE year. Is it time to leave test mode? |
Joined: 15 Jun 08 Posts: 2148 Credit: 175,904,265 RAC: 110,269 |
"... Is it time to leave test mode?" Yes: it would make the project selection more transparent, and statistics would appear on the apps page. No: suspend/resume still doesn't work. |
Joined: 13 May 14 Posts: 384 Credit: 15,310,589 RAC: 7,325 |
Thank you for reminding us of this anniversary :) The main reason for keeping it in test is the extra steps required to run in native mode, i.e. installing and configuring CVMFS and Singularity. In VirtualBox mode the BOINC client checks whether VirtualBox is installed and, if not, it will not download any tasks. But there are no such checks for CVMFS and Singularity, so if you don't have them you still download native jobs but they all fail immediately. We think it's better to avoid this behaviour, so making people take the extra step of enabling test applications makes it more likely they will set up their hosts correctly. |
Joined: 7 Jan 07 Posts: 39 Credit: 15,700,127 RAC: 460 |
Hello, in the BOINC Manager I checked the box "Leave non-GPU tasks in memory while suspended" (menu Options / Computing preferences..., tab Disk and memory) and it seems to work. |
Joined: 15 Jun 08 Posts: 2148 Credit: 175,904,265 RAC: 110,269 |
Do your tasks really suspend? I use the same setting but it's only the BOINC client that reports the tasks as suspended. The scientific app always continues running in the background. |
Joined: 7 Jan 07 Posts: 39 Credit: 15,700,127 RAC: 460 |
"Do your tasks really suspend?" You are right: monitoring with htop, I see the task python -tt etc ... still running. The big difference is that the iteration number doesn't restart from zero after a while. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
"The scientific app always continues running in the background." By "scientific app" I assume you mean athena.py? When I suspend native tasks, all athena.py processes become zombies and then disappear from top within a few seconds, usually. cvmfs runs for a few secs and then it disappears too. Those are the happy times :) On rare occasions cvmfs jumps from its normal low CPU and mem usage to much higher usage, the athena.py processes drop from their normal ~98% CPU to about 80%, and they continue running. Sometimes it goes on like that for 15 minutes. Sometimes it's still that way after 30 minutes, at which point I just shake my head, walk away and try to think of something else. Again, that's very rare. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
"You are right, I monitor with htop the task python -tt etc ... still running." I have not observed with htop. I will try it too. |
Joined: 2 May 07 Posts: 1729 Credit: 130,853,399 RAC: 281,979 |
"Thank you for reminding us of this anniversary :)" What about 1,000 collisions for native Linux instead of 200? Fast PCs can do this work in less than one day! Computezrmle's argument for "No" in this thread must therefore be addressed. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
"What about 1,000 collisions for native Linux instead of 200?" +.5 (500 collisions) |
Joined: 2 May 07 Posts: 1729 Credit: 130,853,399 RAC: 281,979 |
For us volunteers they need to convert from 1,000 (the default) to 200 (in the past only 50). Edit: With fewer than four cores it is not possible (250 per core against 50 at the moment). If there are problems in the infrastructure or elsewhere on the volunteer side, then we waste a lot of energy! |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
"If there are problems in the infrastructure or elsewhere on the volunteer side, then we waste a lot of energy!" Sometimes many small steps work better than big steps. Just an idea.... Plug 'n Crunch... a complete "live Linux with BOINC for ATLAS native" distro in an .iso that can be burned to DVD or USB stick. Live, so install to HD is not required. It auto-runs a setup script that analyses system resources, then creates an appropriate app_config.xml and makes recommendations to the user for website settings. KISS... just ATLAS native, no other sub-projects, no other projects, prevent users from playing with the settings via BOINC manager and boinccmd, all the cores get used, it's a dedicated 24/7 ATLAS cruncher, no options, no frustrations, no decisions. No hassles with Linux installation. No VBox. No frustration with settings and options and docs spread all over the place. Power off, remove the media, reboot ---> returns to whatever you had. Make the ISO a free download but offer bootable DVD and USB stick for cost of media plus shipping. |
Joined: 2 May 07 Posts: 1729 Credit: 130,853,399 RAC: 281,979 |
Native Linux SL69 is running very well, BUT... when rebooting or starting for the first time, these probes: probing of cvmfs/atlas.cern.ch... ok; probing of cvmfs/atlas-condb.cern.ch... ok; probing of cvmfs/grid.cern.ch... ok; together need about 2-3 minutes to succeed! Has anyone else had the same experience? BTW: all hosts have openhtc.io and the new kernel 2.6.32-754.3.5.el6.x86_64 from 18/8/15 |
Joined: 15 Jun 08 Posts: 2148 Credit: 175,904,265 RAC: 110,269 |
After a reboot you may run "cvmfs_config wipecache" before the first WU starts. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
'After a reboot you may run "cvmfs_config wipecache" before the first WU starts.' Is it a good idea to wipe the cache every time the client starts? If so, and if the client is auto-started via a SystemV init script (the way installing BOINC from most distros' repositories configures the client), then maybe it's a good idea to put a "cvmfs_config wipecache" statement in /etc/init.d/boinc-client? On my Ubuntu rigs /etc/init.d/boinc-client has the function start() as below. I am thinking maybe add a wipecache statement as shown by the red text (the 'cvmfs_config wipecache' line). Should it be enclosed in single quotes, double quotes or not quoted? Sorry, I don't know bash very well. start() { log_begin_msg "Starting $DESC: $NAME" if is_running; then log_progress_msg "already running" else 'cvmfs_config wipecache' if [ -n "$DISPLAY" -a -x /usr/bin/xhost ]; then # grant the boinc client to perform GPU computing xhost +si:localuser:$BOINC_USER || echo -n "xhost error ignored, GPU computing may not be possible" fi if [ -n "$VALGRIND_OPTIONS" ]; then start-stop-daemon --start --quiet --background --pidfile $PIDFILE \ --make-pidfile --user $BOINC_USER --chuid $BOINC_USER \ --chdir $BOINC_DIR --exec /usr/bin/valgrind -- $VALGRIND_OPTIONS $BOINC_CLIENT $BOINC_OPTS else start-stop-daemon --start --quiet --background --pidfile $PIDFILE \ --make-pidfile --user $BOINC_USER --chuid $BOINC_USER \ --chdir $BOINC_DIR --exec $BOINC_CLIENT -- $BOINC_OPTS fi fi log_end_msg 0 if [ "$SCHEDULE" = "1" ]; then schedule fi } |
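
On the quoting question in the last post: in a shell script a command and its arguments are normally written as separate, unquoted words. If the whole string is wrapped in single or double quotes, the shell looks for one command whose name contains the space, and that lookup fails. The sketch below illustrates only this quoting behaviour. It replaces the real `cvmfs_config` with a stub function so it can run on a machine without CVMFS; the stub and the variable names are assumptions made for the demonstration and are not part of the init script.

```shell
# Stub standing in for the real cvmfs_config (assumption: for demonstration
# only; on a real host the installed cvmfs_config binary would be used).
cvmfs_config() {
    echo "cvmfs_config called with: $*"
}

# Unquoted: the shell sees the command name "cvmfs_config" and one
# argument "wipecache".
unquoted_output=$(cvmfs_config wipecache)

# Quoted: the shell searches for a single command literally named
# "cvmfs_config wipecache" (including the space) and reports
# "command not found", which yields exit status 127.
quoted_status=0
'cvmfs_config wipecache' 2>/dev/null || quoted_status=$?

echo "unquoted: $unquoted_output"
echo "quoted exit status: $quoted_status"
```

Inside `start()` the statement would therefore be written without quotes. Whether wiping the cache on every client start is a good idea is a separate question that this sketch does not settle.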
©2023 CERN