Message boards :
ATLAS application :
ATLAS tasks fail after 10 min
Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0
computezrmle, thanks as always. Using my monkey-see monkey-do powers I made these changes for my US location, but I'm still getting nothing but "Validate errors".

sudo xed /etc/cvmfs/default.local

CVMFS_REPOSITORIES="atlas,atlas-condb,grid,cernvm-prod,sft,alice"
CVMFS_SEND_INFO_HEADER=yes
CVMFS_QUOTA_LIMIT=4096
CVMFS_CACHE_BASE=/scratch/cvmfs
CVMFS_HTTP_PROXY=DIRECT

sudo xed /etc/cvmfs/domain.d/cern.ch.local

CVMFS_SERVER_URL="http://s1unl-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1fnal-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1ral-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1asgc-cvmfs.openhtc.io:8080/cvmfs/@fqrn@;http://s1ihep-cvmfs.openhtc.io/cvmfs/@fqrn@"
CVMFS_USE_GEOAPI=no

sudo xed /etc/cvmfs/config.d/atlas-nightlies.cern.ch.local

CVMFS_SERVER_URL="http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@"
CVMFS_USE_GEOAPI=no

sudo cvmfs_config reload

Is there something else wrong? This is on a new-build computer where I configured ATLAS as before, from my notes:

wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb ; \
sudo dpkg -i cvmfs-release-latest_all.deb ; \
rm -f cvmfs-release-latest_all.deb ; \
sudo apt-get update ; \
sudo apt-get install cvmfs ; \
sudo apt install glibc-doc open-iscsi watchdog

sudo wget https://lhcathomedev.cern.ch/lhcathome-dev/download/default.local -O /etc/cvmfs/default.local ; \
sudo cvmfs_config setup ; \
sudo echo "/cvmfs /etc/auto.cvmfs" > /etc/auto.master.d/cvmfs.autofs ; \
sudo systemctl restart autofs ; \
cvmfs_config probe

sudo cvmfs_config reload
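[Editor's note] One line in the recipe above is a classic shell pitfall: in `sudo echo "..." > /etc/auto.master.d/cvmfs.autofs`, the redirection is performed by the unprivileged calling shell, not by sudo, so it fails with "Permission denied" unless the shell already has write access to that directory. The usual fix is to let a root-owned `tee` do the writing. A minimal sketch of the pattern, demonstrated against a temp file so it runs without root (the real command would be `echo "/cvmfs /etc/auto.cvmfs" | sudo tee /etc/auto.master.d/cvmfs.autofs`):

```shell
# `sudo cmd > file` redirects as the calling user; piping into tee
# lets the (potentially privileged) tee process open the file instead.
conf=$(mktemp)                                  # stand-in for /etc/auto.master.d/cvmfs.autofs
echo "/cvmfs /etc/auto.cvmfs" | tee "$conf" >/dev/null
cat "$conf"                                     # prints: /cvmfs /etc/auto.cvmfs
```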
Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0
aurum@Rig-32:~$ cvmfs_config showconfig -s atlas.cern.ch
CVMFS_REPOSITORY_NAME=atlas.cern.ch
CVMFS_BACKOFF_INIT=2 # from /etc/cvmfs/default.conf
CVMFS_BACKOFF_MAX=10 # from /etc/cvmfs/default.conf
CVMFS_BASE_ENV=1 # from /etc/cvmfs/default.conf
CVMFS_CACHE_BASE=/scratch/cvmfs # from /etc/cvmfs/default.local
CVMFS_CACHE_DIR=/scratch/cvmfs/shared
CVMFS_CHECK_PERMISSIONS=yes # from /etc/cvmfs/default.conf
CVMFS_CLAIM_OWNERSHIP=yes # from /etc/cvmfs/default.conf
CVMFS_CONFIG_REPOSITORY=cvmfs-config.cern.ch # from /etc/cvmfs/default.d/50-cern-debian.conf
CVMFS_DEFAULT_DOMAIN=cern.ch # from /etc/cvmfs/default.d/50-cern-debian.conf
CVMFS_HOST_RESET_AFTER=1800 # from /etc/cvmfs/default.conf
CVMFS_HTTP_PROXY=DIRECT # from /etc/cvmfs/default.local
CVMFS_IGNORE_SIGNATURE=no # from /etc/cvmfs/default.conf
CVMFS_KEYS_DIR=/etc/cvmfs/keys/cern.ch # from /etc/cvmfs/domain.d/cern.ch.conf
CVMFS_LOW_SPEED_LIMIT=1024 # from /etc/cvmfs/default.conf
CVMFS_MAX_RETRIES=1 # from /etc/cvmfs/default.conf
CVMFS_MOUNT_DIR=/cvmfs # from /etc/cvmfs/default.conf
CVMFS_MOUNT_RW=no # from /etc/cvmfs/default.conf
CVMFS_NFILES=65536 # from /etc/cvmfs/default.conf
CVMFS_NFS_SOURCE=no # from /etc/cvmfs/default.conf
CVMFS_PAC_URLS=http://wpad/wpad.dat # from /etc/cvmfs/default.conf
CVMFS_PROXY_RESET_AFTER=300 # from /etc/cvmfs/default.conf
CVMFS_QUOTA_LIMIT=4096 # from /etc/cvmfs/default.local
CVMFS_RELOAD_SOCKETS=/var/run/cvmfs # from /etc/cvmfs/default.conf
CVMFS_REPOSITORIES=atlas,atlas-condb,grid,cernvm-prod,sft,alice # from /etc/cvmfs/default.local
CVMFS_SEND_INFO_HEADER=yes # from /etc/cvmfs/default.local
CVMFS_SERVER_URL='http://s1unl-cvmfs.openhtc.io/cvmfs/atlas.cern.ch;http://s1fnal-cvmfs.openhtc.io/cvmfs/atlas.cern.ch;http://s1bnl-cvmfs.openhtc.io/cvmfs/atlas.cern.ch;http://s1cern-cvmfs.openhtc.io/cvmfs/atlas.cern.ch;http://s1ral-cvmfs.openhtc.io/cvmfs/atlas.cern.ch;http://s1asgc-cvmfs.openhtc.io:8080/cvmfs/atlas.cern.ch;http://s1ihep-cvmfs.openhtc.io/cvmfs/atlas.cern.ch' # from /etc/cvmfs/domain.d/cern.ch.local
CVMFS_SHARED_CACHE=yes # from /etc/cvmfs/default.conf
CVMFS_STRICT_MOUNT=no # from /etc/cvmfs/default.conf
CVMFS_TIMEOUT=5 # from /etc/cvmfs/default.conf
CVMFS_TIMEOUT_DIRECT=10 # from /etc/cvmfs/default.conf
CVMFS_USE_GEOAPI=no # from /etc/cvmfs/domain.d/cern.ch.local
CVMFS_USER=cvmfs # from /etc/cvmfs/default.conf
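[Editor's note] The per-repository URLs in the showconfig output above are produced by CVMFS substituting the fully qualified repository name for the `@fqrn@` placeholder in the `CVMFS_SERVER_URL` template from domain.d/cern.ch.local. A minimal sketch of that substitution in plain shell (abbreviated to a two-mirror template for illustration):

```shell
# CVMFS expands @fqrn@ in CVMFS_SERVER_URL to the repository's
# fully qualified name; mimic that with sed.
template='http://s1fnal-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@'
fqrn='atlas.cern.ch'
expanded=$(printf '%s' "$template" | sed "s/@fqrn@/$fqrn/g")
printf '%s\n' "$expanded" | tr ';' '\n'   # one stratum-1 mirror URL per line
```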
Joined: 18 Dec 15 Posts: 1749 Credit: 115,501,579 RAC: 88,270
So, I wanted to give it another try. However, still negative: the task again errored out after about 10 minutes, with a CPU time of only 1 min 48 sec. When looking at the stderr, my eye caught one strange thing 43 seconds after start:

2020-06-14 13:31:54 (2632): Guest Log: 00:00:00.003737 main Error: Service 'control' failed to initialize: VERR_INVALID_PARAMETER

So I guess that from this time on the task was lost. For the complete information, see here: https://lhcathome.cern.ch/lhcathome/result.php?resultid=277703088

Does no one among the experts here have any idea what's going wrong? As said before, the same thing has been happening on two other systems for the past two days, whereas before, ATLAS crunching under the same settings was no problem at all. For me, that's even more annoying as I recently bought some additional RAM to upgrade another of my machines for ATLAS crunching :-(
Joined: 30 Aug 14 Posts: 145 Credit: 10,847,070 RAC: 0
Does no one among the experts here have any idea what's going wrong?

I'm not really sure that they are aware of the current situation. Normally when things go wrong, David Cameron informs us what's going on pretty fast. The problems occurred Thursday, and since then there has not been a single word from the experts or official moderators.

Why mine when you can research? - GRIDCOIN - Real cryptocurrency without wasting hashes! https://gridcoin.us
Joined: 18 Dec 15 Posts: 1749 Credit: 115,501,579 RAC: 88,270
The problems occurred Thursday, and since then there has not been a single word from the experts or official moderators.

You say it; this is somewhat strange :-( At any rate, for the time being I am no longer trying ATLAS; I've switched to CMS.
Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0
Normally when things go wrong, David Cameron informs us what's going on pretty fast.

Sorry about this mess. I made the mistake of taking a day off right after a major update of one of the ATLAS systems, and this update seemed to break BOINC tasks... I have just reverted the BOINC tasks back to the previous version of this particular software, so I hope new tasks will succeed. I'll investigate in the next few days what the problem was.
Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0
I turned ATLAS back on hoping to find beautiful bosons, but so far all I get are Validation Errors. Do we need to crank through the degenerates first???
Joined: 18 Dec 15 Posts: 1749 Credit: 115,501,579 RAC: 88,270
David, thanks for the explanation. So I downloaded a new ATLAS task, but again it errored out, after 14 minutes - see here: https://lhcathome.cern.ch/lhcathome/result.php?resultid=277712354

Still, something doesn't seem to work the way it's supposed to.
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0
Native ATLAS is not working for me either, but I am assuming that they are still on restricted staffing at CERN and won't get to it until the middle of the week at the earliest. CMS is fine.
Joined: 18 Dec 15 Posts: 1749 Credit: 115,501,579 RAC: 88,270
BTW, I just noticed some interesting figures regarding ATLAS on the Server Status Page: 4,496 tasks in progress, but only 42 users within the past 24 hours - ???
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0
I find it strange that the average runtime is still around an hour. It should be close to zero. But maybe they just don't count the invalids?
Joined: 18 Dec 15 Posts: 1749 Credit: 115,501,579 RAC: 88,270
I find it strange that the average runtime is still around an hour.

Something seems rather wrong with the entries for ATLAS.
Joined: 15 Jun 08 Posts: 2500 Credit: 248,615,718 RAC: 126,629
@Aurum

On Sat you posted this: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5445&postid=42864
It makes me guess you are running a local squid.

On Sun you posted your CVMFS configuration: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5368&postid=42872
This configuration bypasses your local squid:

CVMFS_HTTP_PROXY=DIRECT # from /etc/cvmfs/default.local

You may edit /etc/cvmfs/default.local and set:

# if you have a reliable local hostname resolution
# replace "hostname_of_your_squid" with the hostname of your squid box :-)
# replace 3128 with the TCP port your squid is listening to (3128 is the default)
CVMFS_HTTP_PROXY="http://hostname_of_your_squid:3128"

# as an option use the IP of your squid box
# replace the example IP with the one you are using
CVMFS_HTTP_PROXY="http://203.0.113.77:3128"

In addition, your local squid has to be set in your BOINC client, as this is used to configure all LHC vbox tasks as well as ATLAS (native)'s Frontier client.
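[Editor's note] A detail worth adding to the advice above, taken from the CVMFS client documentation rather than from this thread: `CVMFS_HTTP_PROXY` accepts semicolon-separated fallback proxy groups, tried left to right, so the client can keep `DIRECT` as a last resort if the local squid goes down. An illustrative fragment (the IP/port are the example values used elsewhere in this thread):

```shell
# /etc/cvmfs/default.local (illustrative fragment)
# Proxy groups are separated by ";" and tried in order; listing DIRECT
# last makes CVMFS fall back to proxy-less access only when the squid
# is unreachable.
CVMFS_HTTP_PROXY="http://192.168.1.227:3128;DIRECT"
```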
Joined: 13 May 14 Posts: 387 Credit: 15,314,184 RAC: 0
The new tasks submitted since yesterday are working OK; however, it takes some time to flush out the bad tasks, so you will still see a mixture of success and failure at the moment.
Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0
computezrmle,

Yes, you talked me into installing a squid :-) You had me put this in the squid.conf:

# INSERT YOUR OWN RULE(S) HERE TO ALLOW ACCESS FROM YOUR CLIENTS
# see ACL definition above
# Examples:
# http_access allow crunchers
# http_access allow localnet
http_access allow localhost
http_access deny all
http_access allow crunch

# http_port
# Don't bind it to an IP accessible from outside unless you know what you're doing. E.g.,
http_port 192.168.1.227:3128

And trust me, I don't know what I'm doing. So I assume my /etc/cvmfs/default.local should be this on all my computers:

CVMFS_REPOSITORIES="atlas,atlas-condb,grid,cernvm-prod,sft,alice"
CVMFS_SEND_INFO_HEADER=yes
CVMFS_QUOTA_LIMIT=4096
CVMFS_CACHE_BASE=/scratch/cvmfs
CVMFS_HTTP_PROXY="http://192.168.1.227:3128"

I do not run vbox and have these two lines in my BOINC cc_config (same on all computers):

<dont_use_vbox>1</dont_use_vbox>
<vbox_window>0</vbox_window>

In addition your local squid has to be set in your BOINC client as this is used to configure all LHC vbox tasks as well as ATLAS (native)'s Frontier client.

I don't see any line in my BOINC cc_config file that might do this. How do I do this??? (In thinking about squids I'm reminded of what my physics professors used to say a century ago, "Don't worry, the exam will be conceptual." :-)
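[Editor's note] On the cc_config question above: the client's HTTP proxy is normally set through BOINC Manager (Options dialog, HTTP proxy tab) rather than by hand-editing cc_config.xml; the client then stores it in a proxy_info block. As a sketch only, with the thread's squid address filled in - the element names follow the proxy_info structure the BOINC client writes, but verify them against your client's documentation before relying on them:

```xml
<!-- Illustrative proxy_info fragment (assumed element names; check
     your BOINC client's docs). 192.168.1.227:3128 is the local squid
     discussed in this thread. -->
<proxy_info>
    <use_http_proxy/>
    <http_server_name>192.168.1.227</http_server_name>
    <http_server_port>3128</http_server_port>
</proxy_info>
```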
Joined: 18 Dec 15 Posts: 1749 Credit: 115,501,579 RAC: 88,270
David Cameron wrote:
The new tasks submitted since yesterday are working OK; however, it takes some time to flush out the bad tasks, so you will still see a mixture of success and failure at the moment.

Thanks, David, for the information. I now got tasks which worked well. Something seems to have happened to the credit calculation though: I got between 10 and 12 points per task :-(
Joined: 15 Jun 08 Posts: 2500 Credit: 248,615,718 RAC: 126,629
Just stumbled over your unanswered question: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5368&postid=42886
Sorry for the delay, the last weeks were very busy.

http_access deny all
http_access allow crunch

The 2nd line will never be evaluated. Since order matters, evaluation will stop at the 1st line, as it becomes true in all cases. To give "crunch" a chance you may at least switch both lines:

http_access allow crunch
http_access deny all

A better idea would be to check your squid.conf against the revised version here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5473
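[Editor's note] The first-match behaviour described above can be sketched with a toy evaluator. This is a drastically simplified model of squid's http_access processing, written for illustration only; real squid ACLs are far richer:

```shell
# Toy first-match rule evaluator: rules are "allow:<acl>" or "deny:<acl>",
# where the ACL "all" matches every client. Like squid, the first rule
# whose ACL matches decides the outcome.
decide() {
  client=$1; shift
  for rule in "$@"; do
    action=${rule%%:*}   # "allow" or "deny"
    acl=${rule#*:}       # ACL name, or "all"
    if [ "$acl" = all ] || [ "$acl" = "$client" ]; then
      printf '%s\n' "$action"
      return
    fi
  done
}

# As posted: "deny all" sits before "allow crunch", so the second
# rule is never reached and a "crunch" client is denied.
decide crunch deny:all allow:crunch     # prints "deny"

# Switched order: "crunch" matches first and is allowed.
decide crunch allow:crunch deny:all     # prints "allow"
```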
©2024 CERN