Message boards :
ATLAS application :
ATLAS issues
Author | Message |
---|---|
Joined: 29 Mar 18 Posts: 8 Credit: 270,958 RAC: 0 |
Hello people, for a few days now I have had issues with ATLAS projects. Do you guys know what's going on? https://imgur.com/a/EKfooOs |
Joined: 27 Sep 08 Posts: 850 Credit: 692,828,532 RAC: 38,232 |
It happens every now and again for me. I just abort them and it goes back to normal. |
Joined: 29 Mar 18 Posts: 8 Credit: 270,958 RAC: 0 |
Every ATLAS project ends up in the same "unmanageable, restarting later" state. No other projects get loaded either, so the PC sits idle all night. |
Joined: 27 Sep 08 Posts: 850 Credit: 692,828,532 RAC: 38,232 |
I agree it's irritating. You could try downgrading to the 5.1.x branch of VirtualBox, which seems more reliable. |
Joined: 1 Nov 05 Posts: 1 Credit: 291,028 RAC: 0 |
ATLAS is simply not able to download anything. It worked earlier. Error: 2018.06.06. 17:44:47 | ATLAS@home | [error] No scheduler URLs found in master file What can I do? |
Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 15,673 |
"ATLAS is simply not able to download anything. It worked earlier." You used a retired URL. Reconnect to this one: https://lhcathome.cern.ch/lhcathome/ |
Joined: 4 Feb 18 Posts: 1 Credit: 552,252 RAC: 0 |
I am seeing the same issue. Did you find a resolution at all? |
Joined: 24 Jul 16 Posts: 88 Credit: 239,917 RAC: 0 |
Hello, this may be a communication error between vboxwrapper and VirtualBox. ERROR: Vboxwrapper lost communication with VirtualBox, rescheduling task for a later time BOINC will be notified that it needs to clean up the environment. This is a temporary problem and so this job will be rescheduled for another time. To solve it: open the VirtualBox Manager, go to File / Virtual Media Manager, and look for leftover vdi's that need to be removed, since they can interfere with new tasks trying to get a slot. Erase all the vdis that have a yellow triangle and keep only the one with a green triangle. Don't touch the others. This cleans up your environment. Sometimes you have to delete some of them manually in the VirtualBox Manager, when BOINC fails to delete the vdis. |
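For anyone who prefers the command line to clicking through the Media Manager, here is a rough Python sketch of the same check. It assumes that `VBoxManage list hdds` prints one block of `Key: value` lines per disk, separated by blank lines, and that the yellow-triangle entries are the ones whose `State` is reported as `inaccessible`. Both points are assumptions on my side, so compare against your own output before deleting anything. The function names are made up for this example.

```python
import subprocess


def parse_hdds(text):
    """Split `VBoxManage list hdds`-style output into one dict per disk."""
    disks, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current disk block.
            if current:
                disks.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        disks.append(current)
    return disks


def inaccessible_disks(text):
    """Return the disk entries whose State field reads 'inaccessible'."""
    return [d for d in parse_hdds(text)
            if d.get("State", "").lower() == "inaccessible"]


def list_hdds():
    """Run VBoxManage as the current user (use the same user that runs BOINC)."""
    result = subprocess.run(["VBoxManage", "list", "hdds"],
                            capture_output=True, text=True, check=True)
    return result.stdout
```

Calling `inaccessible_disks(list_hdds())` only lists candidates; the actual removal is still best done by hand in the Media Manager as described above.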
Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 15,673 |
Some basic thoughts. Your computer page shows that you run a 4-core system with 2 GPUs (NVIDIA + INTEL). If your GPUs run at full load, your system needs 2 CPU cores to support the GPUs. Thus you will have only 2 CPU cores left for other work. Your task overview shows that you run ATLAS using a 4-core setup. This may sooner or later cause timing problems. I suggest you try a 2-core setup for ATLAS. If you do so, you should also use an app_config.xml to avoid errors due to a too-low default RAM setting. <app_config> <app> <name>ATLAS</name> <max_concurrent>1</max_concurrent> <report_results_immediately/> </app> <app_version> <app_name>ATLAS</app_name> <plan_class>vbox64_mt_mcore_atlas</plan_class> <avg_ncpus>2.0</avg_ncpus> <cmdline>--nthreads 2 --memory_size_mb 4800</cmdline> </app_version> </app_config> Error 1: "Vboxwrapper lost communication with VirtualBox" "BOINC will be notified that it needs to clean up the environment" This is mostly caused by: - a crash - an unclean shutdown - ... Any of these leaves old files behind in a "slot" folder, or leaves links to those files in the VirtualBox control files. You may: 1. stop your BOINC client 2. run your VirtualBox GUI (use the same user that usually runs BOINC!) 3. remove all VMs that are located in a BOINC slot but not "running" or "paused" 4. restart your BOINC client (better: restart your computer) If these errors appear again, consider repeating the cleanup and downgrading to the most recent VirtualBox 5.1.x. Error 2: Image files marked with a yellow triangle. This is annoying (I also have lots of them) but not responsible for the errors you see. Just clean up the list from time to time as described by PHILIPPE. |
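Since the forum flattens the XML above onto one line, here is the same app_config.xml laid out readably inside a small Python sketch that sanity-checks it. The check itself is my own assumption, not something the project requires: it only verifies that the `--nthreads` value in `<cmdline>` agrees with `<avg_ncpus>`, as it does in the posted config (both are 2). The function name `check_app_config` is made up for this example.

```python
import re
import xml.etree.ElementTree as ET

# The app_config.xml from the post above, one element per line.
APP_CONFIG = """<app_config>
  <app>
    <name>ATLAS</name>
    <max_concurrent>1</max_concurrent>
    <report_results_immediately/>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>2.0</avg_ncpus>
    <cmdline>--nthreads 2 --memory_size_mb 4800</cmdline>
  </app_version>
</app_config>"""


def check_app_config(xml_text):
    """Return a list of human-readable problems (empty if none were found)."""
    root = ET.fromstring(xml_text)  # raises ParseError on malformed XML
    problems = []
    for version in root.findall("app_version"):
        app = version.findtext("app_name", default="?")
        ncpus = float(version.findtext("avg_ncpus", default="0"))
        cmdline = version.findtext("cmdline", default="")
        match = re.search(r"--nthreads\s+(\d+)", cmdline)
        if match is None:
            problems.append(f"{app}: no --nthreads in cmdline")
        elif int(match.group(1)) != int(ncpus):
            problems.append(
                f"{app}: --nthreads {match.group(1)} != avg_ncpus {ncpus}")
    return problems
```

Running `check_app_config(APP_CONFIG)` on the config as posted returns an empty list. If you later change the core count, change both values together.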
Joined: 11 May 07 Posts: 23 Credit: 3,631,975 RAC: 0 |
Guys, why are 100% of tasks ending with errors on this host? |
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
"Guys, why are 100% of tasks ending with errors on this host?" Do you have CVMFS installed and configured? Checking for CVMFS ls: cannot access '/cvmfs/atlas.cern.ch/repo/sw': No such file or directory cvmfs_config doesn't exist, check cvmfs with cmd ls /cvmfs/atlas.cern.ch/repo/sw ls /cvmfs/atlas.cern.ch/repo/sw failed,aborting the jobs |
Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 15,673 |
Looks like a CVMFS error: Checking for CVMFS cvmfs_config doesn't exist, check cvmfs with cmd ls /cvmfs/atlas.cern.ch/repo/sw ls /cvmfs/atlas.cern.ch/repo/sw failed,aborting the jobs Check your CVMFS installation first. Then run: "sudo cvmfs_config wipecache" "cvmfs_config probe" What is the output of "cvmfs_config probe"? |
Joined: 11 May 07 Posts: 23 Credit: 3,631,975 RAC: 0 |
"Guys, why are 100% of tasks ending with errors on this host?" This is what I get now: deti@DetiPC ~ $ sudo cvmfs_config chksetup OK deti@DetiPC ~ $ cvmfs_config probe Probing /cvmfs/atlas.cern.ch... OK Probing /cvmfs/atlas-condb.cern.ch... OK Probing /cvmfs/grid.cern.ch... OK deti@DetiPC ~ $ Let's see if this helps... |
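If you want to script this check, a small Python sketch along these lines could read the probe output. It assumes each line has the form `Probing <repository>... <status>`, exactly as in the output pasted above, and treats any status other than `OK` as a failure. The function names are made up for this example, and the failure wording on a broken setup may differ from what you see in tests of this sketch.

```python
import re
import subprocess

# Matches lines such as "Probing /cvmfs/atlas.cern.ch... OK"
PROBE_LINE = re.compile(r"^Probing\s+(\S+?)\.\.\.\s+(.*)$")


def parse_probe(text):
    """Map each probed repository path to the status string after '...'."""
    results = {}
    for line in text.splitlines():
        match = PROBE_LINE.match(line.strip())
        if match:
            results[match.group(1)] = match.group(2).strip()
    return results


def failed_repos(text):
    """Repositories whose status is anything other than 'OK'."""
    return [repo for repo, status in parse_probe(text).items()
            if status != "OK"]


def run_probe():
    """Run `cvmfs_config probe` and return its stdout (needs CVMFS installed)."""
    result = subprocess.run(["cvmfs_config", "probe"],
                            capture_output=True, text=True)
    return result.stdout
```

With the output shown above, `failed_repos(...)` comes back empty, which is what you want before requesting new native ATLAS tasks.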
Joined: 9 Dec 14 Posts: 202 Credit: 2,533,875 RAC: 0 |
CVMFS should work now. Your most recent tasks show that Singularity is not installed. If your OS is not SLC6 (which is obviously the case), you also have to install Singularity: https://singularity.lbl.gov/ Once Singularity is working you should finally be good to go. |
Joined: 11 May 07 Posts: 23 Credit: 3,631,975 RAC: 0 |
Finally it works... What a complicated project to participate in! I highly doubt that there will be many volunteers ready to go through this whole quest with CVMFS and Singularity just to help scientists for free... What is SLC6? |
Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 206 |
SLC6 = Scientific Linux (CERN) version 6.9. Like SL 6.9, CentOS also needs no Singularity installation. |
Joined: 13 Apr 18 Posts: 9 Credit: 35,148 RAC: 0 |
Hi everyone! Sorry to bother you, but I can't understand what's happening here. Lately I've been receiving only a small number of tasks from LHC, and all of them, after a few hours of computing, gave the same exact result: error while computing. I can't really understand what's wrong. I'm using BOINC for other projects, and none of them has this problem. Is there a way you can help me fix it? Thanks a lot! |
Joined: 24 Oct 04 Posts: 1180 Credit: 54,887,670 RAC: 2,609 |
"Hi everyone!" Your tasks say you ran out of disk space, so maybe try adjusting the settings in BOINC Manager under Options - Computing Preferences - Disk and Memory. Try that and see if it helps. |
Joined: 13 Apr 18 Posts: 9 Credit: 35,148 RAC: 0 |
Thanks a lot, as soon as I get a new ATLAS request I'll let you know if it worked. |
Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0 |
"Your tasks are saying you run out of disc space so maybe try setting your Boinc Manager" Yes, it is a disk space problem, but not the kind of disk space problem you are thinking of. It's the "196 (0x000000C4) EXIT_DISK_LIMIT_EXCEEDED" error, which seems to fool a lot of people. The solution you offered might work, but that is not likely. Harri Liljeroos explains the cause of this error thoroughly in https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4773&postid=36371#36371. Note that it is not a problem of BOINC not having enough disk space assigned in user preferences. The solution comes from working out why <rsc_disk_bound>xxx</rsc_disk_bound> is being exceeded. Possible reasons include (but are not limited to): 1) ATLAS tasks are being pre-empted by other projects/tasks, causing an ultra-large snapshot file to be saved in the slot folder 2) old snapshots or other garbage left behind by previous tasks are not being deleted 3) a combination of 1) and 2) I would try the following: 1) set "no new tasks" for all projects and drain the cache completely 2) delete all the slot folders in the BOINC data folder 3) set "switch between tasks every __ minutes" to a very large value to ensure that ATLAS tasks are not pre-empted 4) do not allow the OS to install updates and reboot the system whenever it wishes 5) install updates manually and give VBox ample time to shut down running tasks before rebooting |
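To see whether leftover snapshots are what pushes a slot over the limit, something like the following Python sketch can report how much disk each slot folder uses. It assumes only that the BOINC data folder contains a `slots` subfolder with one directory per slot. The data folder path is something you pass in yourself, and the function names are made up for this example. It reads sizes only and deletes nothing.

```python
import os


def dir_size_bytes(path):
    """Total size of all regular files below path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def slot_sizes(boinc_data_dir):
    """Map each slot directory name to its size in bytes."""
    slots = os.path.join(boinc_data_dir, "slots")
    sizes = {}
    if not os.path.isdir(slots):
        return sizes  # no slots folder found under the given path
    for entry in sorted(os.listdir(slots)):
        full = os.path.join(slots, entry)
        if os.path.isdir(full):
            sizes[entry] = dir_size_bytes(full)
    return sizes
```

Run it once with the cache drained and no tasks running: any slot that still holds a lot of data at that point is a good candidate for cause 2) above.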
©2025 CERN