1) Message boards : ATLAS application : ATLAS native version 2.72 (Message 40131)
Posted 4 days ago by Jim1348
Post:
At this point, switching between one singularity and another probably won't save it. The problem seems to be at a deeper level.
I will wipe out one of the machines and start over. That might fix it. Thanks
2) Message boards : ATLAS application : ATLAS native version 2.72 (Message 40118)
Posted 5 days ago by Jim1348
Post:
While 2.72 is running fine on my Ubuntu 16.04 machine now, the same cannot be said for my Ubuntu 18.04 machines.
The work units basically do not run, but error out in 10 minutes.

i7-8700: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10612434
Note that the stderr output says that "Singularity is installed, version singularity version 3.2.1-1".

i7-9700: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10607999
Note that the stderr output says that "Singularity is not installed, using version from CVMFS".

Also note that there is a high error rate on all of these work units on other machines too, but I see a few that work.
Something is amiss.
3) Message boards : ATLAS application : ATLAS native version 2.72 (Message 40116)
Posted 5 days ago by Jim1348
Post:
OK, this is a bit of a long story, but after removing various directories I never could get the CVMFS version to work, as far as I can see.
So I installed the one-line version https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5151&postid=40040, and at least got singularity version 3.2.1-1.
But even that did not appear to run, though I may not have waited long enough; it was slow starting.

So I just did "sudo apt install singularity", and got a large bunch of files I had never seen before, which are probably not necessary.
But it now works, so that will do.

Thanks to all.
4) Message boards : ATLAS application : ATLAS native version 2.72 (Message 40112)
Posted 6 days ago by Jim1348
Post:
To solve the problems on my hosts I removed the local singularity and fresh tasks now use singularity from CVMFS.

What commands did you use to remove it? I tried that on two other machines, and ended up with nothing that worked.
"singularity --version" then showed nothing.
5) Message boards : ATLAS application : ATLAS native version 2.72 (Message 40105)
Posted 6 days ago by Jim1348
Post:
The 2.72 are all failing for me after 10 minutes, with
195 (0x000000C3) EXIT_CHILD_FAILED.
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10609230&offset=0&show_names=0&state=6&appid=14

I had no problems with 2.71.
6) Message boards : ATLAS application : ATLAS native version 2.70 (Message 40068)
Posted 13 days ago by Jim1348
Post:
Not sure exactly why that doesn't work for you. If you have singularity locally installed then the CVMFS one is not used so I don't think there is an incompatibility. Maybe you could try the version from CVMFS - you can easily test without uninstalling your local version by putting /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin at the start of your PATH.
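The PATH test suggested above could be scripted roughly as follows. This is a minimal sketch, not an official procedure: the directory is the one quoted in the post, while the helper names prepend_path and use_cvmfs_singularity are made up for illustration. The change only affects the current shell session and leaves the locally installed singularity in place.

```shell
#!/bin/sh
# Directory quoted in the post; it only exists where CVMFS is mounted.
CVMFS_SING_BIN=/cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin

# prepend_path DIR: print a PATH value with DIR placed first (hypothetical helper).
prepend_path() {
    printf '%s:%s\n' "$1" "$PATH"
}

# use_cvmfs_singularity: put the CVMFS bin directory first in PATH for this
# session and print which singularity binary is now found (hypothetical helper).
use_cvmfs_singularity() {
    if [ ! -d "$CVMFS_SING_BIN" ]; then
        echo "not found: $CVMFS_SING_BIN" >&2
        return 1
    fi
    PATH=$(prepend_path "$CVMFS_SING_BIN")
    export PATH
    command -v singularity
}
```

Typical use would be "use_cvmfs_singularity && singularity --version"; opening a new shell restores the old PATH.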

Thanks, but I think the problem is deeper on this machine. I tried your change, and even uninstalled my previous version of singularity, but I still get the errors.
Looking back at my results, I see the errors go back to singularity 2.6.0, so they may not have been introduced by my upgrading to 3.2.1-1 (which is still what "singularity --version" shows, by the way).

I think I will give it a rest and try something else later. There is nothing you can do on your end. I may need to re-install the OS.

EDIT: Thinking back, I believe I tried to upgrade from 2.6.0 earlier by compiling a later version myself. That was doomed to fail, and undoubtedly borked the machine; I just had not noticed yet. Thanks for your input.
7) Message boards : ATLAS application : ATLAS native version 2.70 (Message 40059)
Posted 14 days ago by Jim1348
Post:
Sorry, I didn't explain very well what I rolled back in 2.71. I changed the singularity image that we use back to the one used with 2.60, because the unpacked image didn't work with older singularity releases.

The change to use singularity from CVMFS is still present in 2.71, so if you do not have singularity installed ("singularity" is not found in your PATH) then the CVMFS one will be used.
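The selection rule described above (local singularity if it is in PATH, otherwise the CVMFS copy) can be sketched like this. It is only an illustration of the rule as stated, not the actual ATLAS wrapper code; the function name pick_singularity is hypothetical, and the fallback path is passed in by the caller.

```shell
#!/bin/sh
# pick_singularity FALLBACK: print the singularity found in PATH, or FALLBACK
# if none is found (hypothetical helper mirroring the rule quoted above).
pick_singularity() {
    if command -v singularity >/dev/null 2>&1; then
        command -v singularity
    else
        printf '%s\n' "$1"
    fi
}
```

For example, "pick_singularity /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity" shows which binary would be chosen on a given host.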

OK, so 2.71 should work with singularity version 3.2.1-1, which I have installed as suggested here:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5151&postid=40040#40040

And CVMFS checks out OK:
$ cvmfs_config probe
Probing /cvmfs/atlas.cern.ch... OK
Probing /cvmfs/atlas-condb.cern.ch... OK
Probing /cvmfs/grid.cern.ch... OK
Probing /cvmfs/cernvm-prod.cern.ch... OK
Probing /cvmfs/sft.cern.ch... OK
Probing /cvmfs/alice.cern.ch... OK
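For anyone who wants to check probe output like the above in a script, here is a small sketch. It assumes only the line format shown here ("Probing <repository>... OK") and counts every probe line that does not end in OK as a failure; the function name count_failed_probes is made up.

```shell
#!/bin/sh
# count_failed_probes: read "cvmfs_config probe" output on stdin and print
# how many "Probing ..." lines do not end with the word OK.
count_failed_probes() {
    awk '/^Probing / && $NF != "OK" { bad++ } END { print bad + 0 }'
}
```

Usage would be "cvmfs_config probe | count_failed_probes", which prints 0 when every repository answers OK.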


But all I get are errors.
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10607999&offset=0&show_names=0&state=6&appid=14

If it were bad work units, I would expect a lot of complaints here. So maybe it is an incompatibility between my installed singularity and the one with CVMFS?
8) Message boards : ATLAS application : ATLAS native version 2.70 (Message 40042)
Posted 17 days ago by Jim1348
Post:
But the combination of 2.71 with singularity version 3.2.1-1 does not work for me. I can't tell whether this is the same problem previously reported, or a new one.
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10612434

Note 1: They don't work on any of the vbox machines that have tried them either.
Note 2: It is getting hard to edit this forum. Maybe the servers are going out.
Note 3: #2 would be useful to give them time to fix the already-identified problems.
9) Message boards : ATLAS application : ATLAS native version 2.70 (Message 40041)
Posted 17 days ago by Jim1348
Post:
This is untested but might be a workaround if all required libs are present on your system:
[sudo] ln -s /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity /usr/bin/singularity

I uninstalled singularity 2.6.1 and ran that, but got:
ln: failed to create symbolic link '/usr/bin/singularity': File exists

EDIT: I then removed the singularity link in /usr/bin/, and tried again. It worked then.
Now I have: singularity version 3.2.1-1
Very nice. Thanks.
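The "File exists" error is what ln reports when the link name is already taken. A cautious way to script the same steps is sketched below: it replaces the existing entry only if it is itself a symlink and refuses to touch a regular file. The function name link_singularity is hypothetical, and writing to /usr/bin would still need root.

```shell
#!/bin/sh
# link_singularity TARGET LINK: make LINK a symlink to TARGET.
# An existing symlink at LINK is replaced; any other existing file is left alone.
link_singularity() {
    target=$1
    link=$2
    if [ -L "$link" ]; then
        rm -- "$link"                     # stale symlink, safe to replace
    elif [ -e "$link" ]; then
        echo "refusing to overwrite non-symlink: $link" >&2
        return 1
    fi
    ln -s -- "$target" "$link"
}
```

Run from a root shell, the call for the case above would be "link_singularity /cvmfs/atlas.cern.ch/repo/containers/sw/singularity/x86_64-el7/current/bin/singularity /usr/bin/singularity".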
10) Message boards : ATLAS application : ATLAS native version 2.70 (Message 40029)
Posted 19 days ago by Jim1348
Post:
If you have singularity locally installed this will be preferred to the CVMFS version.
- The singularity image is now an "unpacked filesystem image" which speeds up the starting time of singularity.

Would it be beneficial to uninstall a locally installed singularity, and install the CVMFS version?
11) Message boards : ATLAS application : No tasks are available for ATLAS Simulation (native ATLAS) (Message 40023)
Posted 20 days ago by Jim1348
Post:
Thanks.
12) Message boards : ATLAS application : No tasks are available for ATLAS Simulation (native ATLAS) (Message 40021)
Posted 20 days ago by Jim1348
Post:
I have been running all 8 cores of my i7-9700 on native ATLAS for a week, and now can't get any.
I have four cores free.
13) Message boards : ATLAS application : Misconfigured Machine? (Message 40018)
Posted 21 days ago by Jim1348
Post:
I borrowed this title from CPDN. It seems to describe this situation: Invalid (4696)
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10587000

Perhaps there should be a system for alerting the owner, or banning such machines?
14) Message boards : Number crunching : computation errors (Message 40000)
Posted 23 days ago by Jim1348
Post:
You won't get much help until you show your computers. There are experts here (probably not me) who can then diagnose it.
15) Message boards : CMS Application : CMS ignores zero resource share? (Message 39981)
Posted 26 days ago by Jim1348
Post:
See for example this thread. Resource share is not an instantaneous setting, and it has its foibles.

OK, that explains it more or less. I have seen that happen in a limited way before - whenever I attach to Rosetta for the first time with zero resource share, it will send a single work unit immediately, but I had never seen it operate over several hours before. It will straighten itself out shortly, it seems.

You have a good memory.
16) Message boards : CMS Application : CMS ignores zero resource share? (Message 39976)
Posted 26 days ago by Jim1348
Post:
I am running BOINC 7.16.1 on an Ubuntu 18.04 machine (i7-8700 with 12 cores) and have set CMS as my only project on a given location with "zero resource share".
However, over the course of three hours, I have now picked up nine CMS work units, even though I am still running WCG on all cores, and won't finish for a few hours.

Is this a bug in LHC or BOINC?
Or an undocumented feature?
17) Message boards : Theory Application : Simple Bash script which sets everything up automatically to run native apps (Message 39916)
Posted 13 Sep 2019 by Jim1348
Post:
CVMFS is a local cache.

OK, thanks. I was not sure whether I had it already or not, since he lists it as a separate item.
(I have had to wipe out my machines since last doing a squid proxy, and will need to give it a go again.)
18) Message boards : Theory Application : Simple Bash script which sets everything up automatically to run native apps (Message 39912)
Posted 13 Sep 2019 by Jim1348
Post:
That is a lot of good work on your part. I already have CVMFS up and running on two machines, one for native Theory and the other for native ATLAS.
But I would like to implement a "local cache" for CVMFS.

Is there a way that I could add a local cache to an existing Ubuntu 18.04 installation?

Thanks.
19) Message boards : Theory Application : New version 263.90 (Message 39806)
Posted 2 Sep 2019 by Jim1348
Post:
It appears that it could affect different projects differently. I look mainly at the execution times, and see no obvious difference thus far for LHC on my i7-9700. But I could go to my Ryzens if necessary.
https://www.extremetech.com/computing/291649-intel-performance-amd-spectre-meltdown-mds-patches

It is another juggling act we have to do.
20) Message boards : Theory Application : New version 263.90 (Message 39804)
Posted 2 Sep 2019 by Jim1348
Post:
Recent Linux kernels usually activate a couple of mitigation settings against various malware at the expense of (much) lower benchmark results.
As the benchmark results are used for credit calculation it will result in lower credits.

I suppose that is the speculative execution problem (Spectre/Meltdown) affecting the Intel CPUs.
So do the benchmarks change for the AMD CPUs?
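One way to compare machines is to look at what the kernel itself reports. The sketch below assumes a Linux kernel recent enough to expose /sys/devices/system/cpu/vulnerabilities (older kernels do not have it); the function name list_mitigations is made up. It prints one line per known issue with the status for the CPU in that host.

```shell
#!/bin/sh
# Default location where recent Linux kernels report mitigation status.
VULN_DIR=/sys/devices/system/cpu/vulnerabilities

# list_mitigations [DIR]: print "name: status" for each file in DIR
# (defaults to VULN_DIR); prints nothing if the directory is absent.
list_mitigations() {
    dir=${1:-$VULN_DIR}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
    done
}
```

Running "list_mitigations" on an Intel host and on a Ryzen host side by side would show which mitigations are active on each.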



