41) Message boards : ATLAS application : ATLAS tasks fail after 10 min (Message 42878)
Posted 14 Jun 2020 by Aurum
Post:
I turned ATLAS back on hoping to find beautiful bosons but so far all I get are Validation Errors. Do we need to crank through the degenerates first???
42) Message boards : ATLAS application : ATLAS tasks fail after 10 min (Message 42872)
Posted 14 Jun 2020 by Aurum
Post:
aurum@Rig-32:~$ cvmfs_config showconfig -s atlas.cern.ch
CVMFS_REPOSITORY_NAME=atlas.cern.ch
CVMFS_BACKOFF_INIT=2    # from /etc/cvmfs/default.conf
CVMFS_BACKOFF_MAX=10    # from /etc/cvmfs/default.conf
CVMFS_BASE_ENV=1    # from /etc/cvmfs/default.conf
CVMFS_CACHE_BASE=/scratch/cvmfs    # from /etc/cvmfs/default.local
CVMFS_CACHE_DIR=/scratch/cvmfs/shared
CVMFS_CHECK_PERMISSIONS=yes    # from /etc/cvmfs/default.conf
CVMFS_CLAIM_OWNERSHIP=yes    # from /etc/cvmfs/default.conf
CVMFS_CONFIG_REPOSITORY=cvmfs-config.cern.ch    # from /etc/cvmfs/default.d/50-cern-debian.conf
CVMFS_DEFAULT_DOMAIN=cern.ch    # from /etc/cvmfs/default.d/50-cern-debian.conf
CVMFS_HOST_RESET_AFTER=1800    # from /etc/cvmfs/default.conf
CVMFS_HTTP_PROXY=DIRECT    # from /etc/cvmfs/default.local
CVMFS_IGNORE_SIGNATURE=no    # from /etc/cvmfs/default.conf
CVMFS_KEYS_DIR=/etc/cvmfs/keys/cern.ch    # from /etc/cvmfs/domain.d/cern.ch.conf
CVMFS_LOW_SPEED_LIMIT=1024    # from /etc/cvmfs/default.conf
CVMFS_MAX_RETRIES=1    # from /etc/cvmfs/default.conf
CVMFS_MOUNT_DIR=/cvmfs    # from /etc/cvmfs/default.conf
CVMFS_MOUNT_RW=no    # from /etc/cvmfs/default.conf
CVMFS_NFILES=65536    # from /etc/cvmfs/default.conf
CVMFS_NFS_SOURCE=no    # from /etc/cvmfs/default.conf
CVMFS_PAC_URLS=http://wpad/wpad.dat    # from /etc/cvmfs/default.conf
CVMFS_PROXY_RESET_AFTER=300    # from /etc/cvmfs/default.conf
CVMFS_QUOTA_LIMIT=4096    # from /etc/cvmfs/default.local
CVMFS_RELOAD_SOCKETS=/var/run/cvmfs    # from /etc/cvmfs/default.conf
CVMFS_REPOSITORIES=atlas,atlas-condb,grid,cernvm-prod,sft,alice    # from /etc/cvmfs/default.local
CVMFS_SEND_INFO_HEADER=yes    # from /etc/cvmfs/default.local
CVMFS_SERVER_URL='http://s1unl-cvmfs.openhtc.io/cvmfs/atlas.cern.ch;http://s1fnal-cvmfs.openhtc.io/cvmfs/atlas.cern.ch;http://s1bnl-cvmfs.openhtc.io/cvmfs/atlas.cern.ch;http://s1cern-cvmfs.openhtc.io/cvmfs/atlas.cern.ch;http://s1ral-cvmfs.openhtc.io/cvmfs/atlas.cern.ch;http://s1asgc-cvmfs.openhtc.io:8080/cvmfs/atlas.cern.ch;http://s1ihep-cvmfs.openhtc.io/cvmfs/atlas.cern.ch'    # from /etc/cvmfs/domain.d/cern.ch.local
CVMFS_SHARED_CACHE=yes    # from /etc/cvmfs/default.conf
CVMFS_STRICT_MOUNT=no    # from /etc/cvmfs/default.conf
CVMFS_TIMEOUT=5    # from /etc/cvmfs/default.conf
CVMFS_TIMEOUT_DIRECT=10    # from /etc/cvmfs/default.conf
CVMFS_USE_GEOAPI=no    # from /etc/cvmfs/domain.d/cern.ch.local
CVMFS_USER=cvmfs    # from /etc/cvmfs/default.conf
43) Message boards : ATLAS application : ATLAS tasks fail after 10 min (Message 42871)
Posted 14 Jun 2020 by Aurum
Post:
computezrmle thanks as always. Using my monkey-see monkey-do powers I made these changes for my US location but I'm still getting nothing but "Validate errors."
sudo xed /etc/cvmfs/default.local
CVMFS_REPOSITORIES="atlas,atlas-condb,grid,cernvm-prod,sft,alice"
CVMFS_SEND_INFO_HEADER=yes
CVMFS_QUOTA_LIMIT=4096
CVMFS_CACHE_BASE=/scratch/cvmfs
CVMFS_HTTP_PROXY=DIRECT

sudo xed /etc/cvmfs/domain.d/cern.ch.local
CVMFS_SERVER_URL="http://s1unl-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1fnal-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1ral-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1asgc-cvmfs.openhtc.io:8080/cvmfs/@fqrn@;http://s1ihep-cvmfs.openhtc.io/cvmfs/@fqrn@"
CVMFS_USE_GEOAPI=no

sudo xed /etc/cvmfs/config.d/atlas-nightlies.cern.ch.local
CVMFS_SERVER_URL="http://s1cern-cvmfs.openhtc.io/cvmfs/@fqrn@;http://s1bnl-cvmfs.openhtc.io/cvmfs/@fqrn@"
CVMFS_USE_GEOAPI=no
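For what it's worth, CVMFS substitutes the repository's fully qualified name for the @fqrn@ placeholder when it reads CVMFS_SERVER_URL, which is why the showconfig output above already shows the expanded atlas.cern.ch URLs. A minimal sketch of that substitution, using one URL from the config above:

```shell
# Sketch of CVMFS's @fqrn@ expansion: the placeholder in CVMFS_SERVER_URL
# is replaced with the fully qualified repository name, e.g. atlas.cern.ch.
fqrn="atlas.cern.ch"
template="http://s1unl-cvmfs.openhtc.io/cvmfs/@fqrn@"
expanded=$(printf '%s' "$template" | sed "s/@fqrn@/$fqrn/")
echo "$expanded"   # http://s1unl-cvmfs.openhtc.io/cvmfs/atlas.cern.ch
```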

sudo cvmfs_config reload
Is there something else wrong?
This is on a new build computer where I configured ATLAS as before from my notes:
wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb ; \
 sudo dpkg -i cvmfs-release-latest_all.deb ; \
 rm -f cvmfs-release-latest_all.deb ; \
 sudo apt-get update ; \
 sudo apt-get install cvmfs ; \
 sudo apt install glibc-doc open-iscsi watchdog

sudo wget https://lhcathomedev.cern.ch/lhcathome-dev/download/default.local -O /etc/cvmfs/default.local ; \
 sudo cvmfs_config setup ; \
 echo "/cvmfs /etc/auto.cvmfs" | sudo tee /etc/auto.master.d/cvmfs.autofs ; \
 sudo systemctl restart autofs ; \
 cvmfs_config probe

sudo cvmfs_config reload
44) Message boards : ATLAS application : ATLAS tasks fail after 10 min (Message 42865)
Posted 13 Jun 2020 by Aurum
Post:
Searching for "strftime" at the top of this page will deliver the answer.
Hmm, was that answer retracted???
Sorry, couldn't find anything matching your search query.
45) Message boards : ATLAS application : Squid proxies may need restart (Message 42864)
Posted 13 Jun 2020 by Aurum
Post:
You may also insert the following line in your squid.conf and do a "squid -k reconfigure".
shutdown_lifetime 3 seconds
This avoids the 60-second default delay when you shut down/restart Squid, but I'm not 100% sure whether changing this timeout requires a "squid -k restart". At least Squid will be prepared for the next restart.
Not sure where to stick it but this spot felt oh so right:
# You don't believe this is enough? For sure, it is!
cache_mem 192 MB
maximum_object_size_in_memory 24 KB
memory_replacement_policy heap GDSF

shutdown_lifetime 3 seconds

No idea what I'm doing. Is there an LHC Squid Care & Feeding Guide anywhere?
46) Message boards : ATLAS application : Validate error on all tasks, and short run time with 1 core only (Message 42854)
Posted 12 Jun 2020 by Aurum
Post:
In addition to the validate errors on ATLAS, I now have trouble getting other LHC workunits.
I had trouble too until I read the LHC BOINC messages and saw that no CPU work was requested because the queue was full and none was needed. I suspended unstarted WUs and LHC immediately DLed a boatload. Hopefully this batch won't fail instantly.

Edit: Not looking good: Valids zero, Invalids 73. Validation error.
47) Message boards : Theory Application : New Native Theory Version 1.1 (Message 40834)
Posted 7 Dec 2019 by Aurum
Post:
The local BOINC client will simply ignore xml tags that are not defined for app_config.xml.
Among those ignored tags are:
<maintain>18</maintain>
<priority>1</priority>
Duh. So invent them.
48) Message boards : Theory Application : New version 300.00 (Message 40831)
Posted 7 Dec 2019 by Aurum
Post:
Set 'No Limit' and you will get as many tasks as you have cores, or the number of jobs if that is less.
None of my rigs have been supplied with more than ten Theory WUs, and all specify No Limit/No Limit in Prefs.
49) Message boards : Theory Application : New Native Theory Version 1.1 (Message 40830)
Posted 7 Dec 2019 by Aurum
Post:
On computers with lots of cores it might be worth setting up additional BOINC client instances.
This is more work than I'm willing to do.
I have a much better idea. Add BOINC commands that tell the server what to do:
<app_config>
<app>
    <name>ATLAS</name>
    <!-- Xeon E5-2699 v4  22c44t  32 GB RAM L3 Cache = 55 MB  -->
    <maintain>18</maintain>
    <max_concurrent>16</max_concurrent>
</app>
<app_version>
    <app_name>ATLAS</app_name>
    <plan_class>native_mt</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <cmdline>--nthreads 1</cmdline>
</app_version>
<app>
    <name>sixtrack</name>
    <maintain>9</maintain>
    <max_concurrent>6</max_concurrent>
</app>
<app>
    <name>Theory</name>
    <maintain>44</maintain>
</app>
<app>
    <name>CMS</name>
    <maintain>0</maintain>
</app>
</app_config>
And even better would be:
<app_config>
<app>
    <name>ATLAS</name>
    <!-- Xeon E5-2699 v4  22c44t  32 GB RAM L3 Cache = 55 MB  -->
    <priority>1</priority>
    <max_concurrent>16</max_concurrent>
</app>
<app_version>
    <app_name>ATLAS</app_name>
    <plan_class>native_mt</plan_class>
    <avg_ncpus>1</avg_ncpus>
    <cmdline>--nthreads 1</cmdline>
</app_version>
<app>
    <name>sixtrack</name>
    <priority>3</priority>
</app>
<app>
    <name>Theory</name>
    <priority>2</priority>
</app>
<app>
    <name>CMS</name>
    <priority>0</priority>
</app>
</app_config>
50) Message boards : Theory Application : New Native Theory Version 1.1 (Message 40826)
Posted 7 Dec 2019 by Aurum
Post:
I thought nT was in production but it's limited to a fearful 10 WUs per rig. Because of the RAM-hungry ATLAS WUs I have to run ST to fill out my threads. I think BOINC runs best with fewer projects, but I'm stuck running three now.
51) Message boards : Theory Application : New Native Theory Version 1.1 (Message 40824)
Posted 7 Dec 2019 by Aurum
Post:
Thanks Gunde, I checked the Preferences/Theory Simulation box and 300s started flowing down.
52) Message boards : Theory Application : New Native Theory Version 1.1 (Message 40820)
Posted 6 Dec 2019 by Aurum
Post:
But maeax appears to be the world record holder for longest running nTheory WU :-)
The point is there's only one left, as shown on Server Stats. Are you saying nT 1.1 is done?
Just wondering if we'll get more nT WUs.
53) Message boards : Theory Application : New Native Theory Version 1.1 (Message 40815)
Posted 6 Dec 2019 by Aurum
Post:
Now that maeax has the last nT WU running will nT 1.1 WUs be released to the public???
54) Message boards : Theory Application : Sherpa - longest runtime with Success - native (Message 40811)
Posted 6 Dec 2019 by Aurum
Post:
Ten days, you must be a very patient person :-)
55) Message boards : Sixtrack Application : Wrong Factor sent by Project Server (Message 40768)
Posted 3 Dec 2019 by Aurum
Post:
For a home-made fix, can we add this to our app_configs???
<app_config>
<app>
    <name>ATLAS</name>
    <!-- Xeon E5-2699 v4  22c44t  L3 Cache = 55 MB  -->
    <max_concurrent>6</max_concurrent>
</app>
<app_version>
    <app_name>ATLAS</app_name>
    <plan_class>native_mt</plan_class>
    <avg_ncpus>6</avg_ncpus>
    <cmdline>--nthreads 6</cmdline>
</app_version>
<app>
    <name>sixtrack</name>
    <max_concurrent>38</max_concurrent>
</app>
<app_version>
    <app_name>sixtrack</app_name>
    <plan_class>avx</plan_class>
    <avg_ncpus>1</avg_ncpus>
</app_version>
<app_version>
    <app_name>sixtrack</app_name>
    <plan_class>sse2</plan_class>
    <avg_ncpus>1</avg_ncpus>
</app_version>
</app_config>
And how do we handle the multiple plan_classes???
56) Message boards : Number crunching : Max # jobs and Max # CPUs (Message 40708)
Posted 27 Nov 2019 by Aurum
Post:
Max #tasks
- should act like <project_max_concurrent>
It would be even better if Max #tasks behaved like <max_concurrent> and there was a setting in Preferences for each project.
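For reference, a project-wide cap already exists as a client-side option: <project_max_concurrent> in app_config.xml limits how many of a project's tasks run at once, which approximates the first half of this wish locally. A minimal sketch (the limit value is just an example):

```xml
<!-- app_config.xml in the LHC@home project directory.
     Caps the number of LHC tasks running at once across all apps
     (example value; per-app <max_concurrent> can still be added). -->
<app_config>
    <project_max_concurrent>16</project_max_concurrent>
</app_config>
```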
57) Message boards : ATLAS application : ATLAS native version 2.73 (Message 40671)
Posted 25 Nov 2019 by Aurum
Post:
maeax, I don't think we're talking about the same dog's breakfast :-)

computezrmle, Thx! That's the behavior I've seen and now I know what's going on in the black box :-)
I think I'll go with:
<app_config>
<app>
    <name>ATLAS</name>
    <!-- Xeon E5-2699v4  22c44t  L3 Cache = 55 MB  -->
    <max_concurrent>2</max_concurrent>
</app>
<app_version>
    <app_name>ATLAS</app_name>
    <plan_class>native_mt</plan_class>
    <avg_ncpus>6</avg_ncpus>
    <cmdline>--nthreads 6</cmdline>
</app_version>
</app_config>
and leave No Limit/No Limit with ST & nT checked for a few days to see how it shakes out.
58) Message boards : ATLAS application : ATLAS native version 2.73 (Message 40666)
Posted 25 Nov 2019 by Aurum
Post:
I wish I understood the impact of changing the number of CPUs in an ATLAS WU. Does an ATLAS WU run faster with more CPUs?
Since the return file size remains the same whether I use 1, 2, 3, or 16 CPUs in a WU, I might as well just use 16c WUs and run only one at a time to reduce the number of files I have to upload.
Which means I'll get far too many 16c ATLAS WUs DLed and few or no nT and ST WUs.

Fixing Preferences so that one could specify the maximum number of WUs each project would send to a computer (like WCG does) would give crunchers flexibility. ATLAS would have the added field #CPUs. E.g., if I set the limit for ATLAS to 3 WUs and I have 4 on my computer, LHC@home would not send me another ATLAS WU until I got down to two, and then send just one.
59) Message boards : ATLAS application : ATLAS native version 2.73 (Message 40663)
Posted 25 Nov 2019 by Aurum
Post:
If CERN wants to maximize BOINC work then they should see if they can reduce the size of the largest return files.
ATLAS is optimized; you can find some news about it in the ATLAS folder.
I read the titles for last 2 years and found nothing relevant. Be glad to read it if I knew what you suggest I read.
I have 30 Mbit/s upload, and CERN connects at 1.5 Mbit/s for the moment under ATLAS.
And this is OK: 3 min for 240 MB!
How many ATLAS WUs are you uploading at once??? I'm trying to UL a couple hundred from the same IP.
By "ATLAS is optimized" do you mean the file size is as small as it humanly can be and it can never get smaller, or something else???
60) Message boards : ATLAS application : ATLAS native version 2.73 (Message 40658)
Posted 25 Nov 2019 by Aurum
Post:
Yea, shortly after I said that I saw four 4c WUs that were 250 MB. No rhyme or reason.
Since the return files are so large the slow upload speed of ADSL connections is easily swamped.
I have to cut my ATLAS work in half if I stand any chance of clearing my upload logjam.
If CERN wants to maximize BOINC work then they should see if they can reduce the size of the largest return files.

