Message boards : ATLAS application : Guide to Getting Quickly Started Running Native ATLAS (Ubuntu 20.04 on WSL2)
Brummig
Joined: 9 Feb 16
Posts: 48
Credit: 537,111
RAC: 0
Message 46032 - Posted: 10 Jan 2022, 10:50:35 UTC
Last modified: 10 Jan 2022, 10:53:30 UTC

I run WSL2, which means I can't run ATLAS tasks in VirtualBox. However, I have been running other BOINC tasks in WSL2 for some time now. So I looked at the sticky thread in this part of the forum for running ATLAS natively (https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4840), and my heart sank when I saw the very long list of instructions. However, these instructions are for someone who wants to build everything from scratch, including BOINC, so I looked to see if there was a more straightforward way, and there is.

The instructions below are based on information gathered from the instructions for ATLAS natively and the instructions for running Theory natively (https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4971). I've assumed you have BOINC installed and running, and that you are familiar with entering commands into a bash shell. I have written up a guide to getting up and running with BOINC on WSL2 on the Universe@Home forum, but at the time of writing that website is down for extensive repairs. These instructions should also work running Ubuntu and other Debian distributions natively, with minor adjustments (no need to keep starting CVMFS!).

    Install and set up the CERN Virtual Machine File System (CVMFS):
      Enter the following commands:
      wget https://ecsft.cern.ch/dist/cvmfs/cvmfs-release/cvmfs-release-latest_all.deb
      sudo dpkg -i cvmfs-release-latest_all.deb
      rm -f cvmfs-release-latest_all.deb
      sudo apt-get update
      sudo apt-get install cvmfs
      sudo cvmfs_config setup
      

      Edit the CVMFS configuration file...
      sudo nano /etc/cvmfs/default.local
      

      ...and add the following lines
      CVMFS_REPOSITORIES=atlas.cern.ch,atlas-condb.cern.ch,grid.cern.ch
      CVMFS_CACHE_BASE=/scratch/cvmfs
      CVMFS_QUOTA_LIMIT=4096
      CVMFS_HTTP_PROXY=DIRECT
      

      On WSL2 you will need to start CVMFS, now and every time the WSL2 virtual machine starts, using:
      sudo cvmfs_config wsl2_start
      

      You can check all is OK with:
      cvmfs_config probe
      


    Install Singularity:


      Enter the following into the command line:
      wget https://github.com/sylabs/singularity/releases/download/v3.9.1/singularity-ce_3.9.1+6-g38b50cbc5-focal_amd64.deb
      sudo dpkg -i singularity-ce_3.9.1+6-g38b50cbc5-focal_amd64.deb
      rm singularity-ce_3.9.1+6-g38b50cbc5-focal_amd64.deb
      

      Note that the filename for the Singularity package will change over time. You can find its current name by looking at https://github.com/sylabs/singularity/releases, from where you can also, of course, download it using your web browser if you prefer.


    Activate:


    Enjoy!

      You should find tasks crunch to successful completion, and you should find you can shut down both the BOINC client and WSL without blitzing your ATLAS tasks.
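
      For the WSL2 start step in the list above, here is a minimal sketch of a guard that only calls cvmfs_config wsl2_start when the repository is not already visible. The function names repo_available and ensure_cvmfs are made up for illustration, and the sketch assumes that a mounted repository shows up as a non-empty directory under /cvmfs. Treat it as a starting point, not a tested recipe.

```shell
#!/bin/sh
# Hypothetical helper names; only the cvmfs_config commands come from the guide.

# Succeeds if the given directory exists and is not empty.
repo_available() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Start CVMFS only if atlas.cern.ch is not visible yet, then probe it.
ensure_cvmfs() {
    if repo_available /cvmfs/atlas.cern.ch; then
        echo "CVMFS already available"
    else
        sudo cvmfs_config wsl2_start && cvmfs_config probe
    fi
}
```

      After sourcing the file, ensure_cvmfs can be called once per WSL2 session, before the BOINC client is started.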


ID: 46032
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2541
Credit: 254,608,838
RAC: 34,609
Message 46035 - Posted: 10 Jan 2022, 21:07:11 UTC - in response to Message 46032.  

Some comments


CVMFS_CACHE_BASE=/scratch/cvmfs
It's not wrong to change the base directory, but why do you change the default set in /etc/cvmfs/default.conf?

CVMFS_QUOTA_LIMIT=4096
May have to be set a bit higher (6-10GB).
Check "cvmfs_config stat atlas.cern.ch" once a day.
The long term value for HITRATE(%) should settle around 95-99%.
If the quota is set too low the hitrate drops.
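
The daily check can be scripted. The sketch below assumes that "cvmfs_config stat atlas.cern.ch" prints a header line of whitespace-separated column names followed by a line of values, and that one column is named HITRATE(%). The function names are hypothetical, and the 95 default is simply the lower end of the range mentioned above.

```shell
#!/bin/sh
# Print the HITRATE(%) value from `cvmfs_config stat`-style text on stdin.
# Assumption: a header line naming the columns, then one line of values.
hitrate_from_stat() {
    awk '
        col && !done { print $col; done = 1 }
        !col { for (i = 1; i <= NF; i++) if ($i == "HITRATE(%)") col = i }
    '
}

# Succeeds if the hit rate is below the threshold (default 95).
hitrate_is_low() {
    awk -v v="$1" -v min="${2:-95}" 'BEGIN { exit !(v + 0 < min + 0) }'
}

# Example use on a machine with CVMFS running:
#   rate=$(cvmfs_config stat atlas.cern.ch | hitrate_from_stat)
#   hitrate_is_low "$rate" && echo "consider raising CVMFS_QUOTA_LIMIT"
```

The column is looked up by name rather than by position, so the sketch should keep working if other columns are added or reordered.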


These options must be set for LHC@home:
CVMFS_HTTP_PROXY="auto;DIRECT"
CVMFS_USE_CDN=yes

Wrong or missing values cause
- wrong servers to be used, e.g. cernvmfs.gridpp.rl.ac.uk instead of ...openhtc.io
- the requests being sent via fallback proxies at CERN/Fermilab (http://<proxy_ip>:3126)
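
One way to apply the two options without hand-editing is sketched below. set_option is a hypothetical helper, not part of CVMFS. It replaces an existing KEY=... line or appends a new one, and the example works on a scratch copy so nothing under /etc is touched until the result has been reviewed.

```shell
#!/bin/sh
# set_option FILE KEY VALUE: replace any existing KEY=... line, or append one.
# (Hypothetical helper, not part of CVMFS.)
set_option() {
    so_file=$1
    so_key=$2
    so_value=$3
    so_tmp=$(mktemp) || return 1
    [ -f "$so_file" ] || : > "$so_file"
    # Copy every line except an existing assignment of this key.
    grep -v "^${so_key}=" "$so_file" > "$so_tmp" || true
    printf '%s=%s\n' "$so_key" "$so_value" >> "$so_tmp"
    cat "$so_tmp" > "$so_file"
    rm -f "$so_tmp"
}

# Work on a scratch copy rather than /etc/cvmfs/default.local itself.
scratch=$(mktemp)
if [ -r /etc/cvmfs/default.local ]; then
    cat /etc/cvmfs/default.local > "$scratch"
fi
set_option "$scratch" CVMFS_HTTP_PROXY '"auto;DIRECT"'
set_option "$scratch" CVMFS_USE_CDN yes
cat "$scratch"
```

If the printed result looks right, it can be copied over /etc/cvmfs/default.local with sudo, followed by another "cvmfs_config probe".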
ID: 46035
Brummig
Joined: 9 Feb 16
Posts: 48
Credit: 537,111
RAC: 0
Message 46036 - Posted: 11 Jan 2022, 9:33:29 UTC - in response to Message 46035.  

As I said in my original post, I took my information, including the settings for default.conf, from https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4840. So please address your questions to the author of that post, and let me know what the conclusion is.
ID: 46036
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2541
Credit: 254,608,838
RAC: 34,609
Message 46037 - Posted: 11 Jan 2022, 10:24:21 UTC - in response to Message 46036.  

The OP (https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4840) was written more than 3 years ago and the author is currently not active.
Unfortunately the BOINC forum software doesn't allow editing old posts.
Nonetheless, most of the suggestions in that post are still valid, but it's essential to follow the whole thread as well as other threads that describe how to correctly set up a CVMFS client:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5594
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5595

Why?
Just 1 example (out of many):
CVMFS_USE_CDN was not present in 2018 but it is now and makes it easy to switch to openhtc.io.


Sharing experience with other users is always appreciated but insisting on outdated settings is not helpful.
ID: 46037
maeax
Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46038 - Posted: 11 Jan 2022, 10:56:26 UTC - in response to Message 46036.  
Last modified: 11 Jan 2022, 10:58:17 UTC

Brummig,
you have finished an ATLAS task with WSL2 and received credit, but no HITS file was produced.
Ubuntu 20.04.3 LTS [5.10.60.1-microsoft-standard-WSL2|libc 2.31 (Ubuntu GLIBC 2.31-0ubuntu9.2)]
[2022-01-11 09:18:57] No HITS result produced
ID: 46038
Brummig
Joined: 9 Feb 16
Posts: 48
Credit: 537,111
RAC: 0
Message 46039 - Posted: 11 Jan 2022, 10:57:48 UTC - in response to Message 46037.  

I'm not insisting on anything. I was simply following a sticky thread, on the assumption that it was still appropriate. If it's out of date, then it needs to be unstuck and new instructions posted. It's not reasonable to expect people to search many threads and posts (including much discussion) to find all the information needed, not knowing whether those posts are outdated.

Since editing my post appears to be impossible, I'm happy to create a new thread (and I'm happy for others to replace it with updated instructions when that becomes necessary). Is there a download URL for the latest default.local, or will people forever be condemned to searching the forum to ensure they have the correct entries in it? I see no point in linking to https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5594, since that could become outdated and replaced with another thread, completely unknown to the person reading it.
ID: 46039
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2541
Credit: 254,608,838
RAC: 34,609
Message 46040 - Posted: 11 Jan 2022, 11:39:47 UTC - in response to Message 46039.  

As explained before, the OP still contains useful information that makes it worth keeping sticky.

Also mentioned:
The forum software is not perfect.
All of us are affected in the same way and have to deal with this.
It makes no sense that you complain about it here since it is part of the BOINC software.
Feel free to visit their website and develop a better forum suite.
https://github.com/BOINC/boinc

The default.local is your responsibility.
A download is not necessary since it usually contains just a few lines.
Best would indeed be to use the search function to locate the most recent post(s) about it.
Now, just in case that's not what you expect - you'll have to deal with it.


Meanwhile 2 tasks reported "No HITS result produced" which means they didn't return useful scientific data.
=> needs to be investigated whether this is caused by running the tasks under WSL(2) or not.

Both tasks also report "This job has been restarted, cleaning up previous attempt".
Since ATLAS doesn't use checkpoints all work from all starts except the last one is also lost.
=> there are lots of posts explaining this: ATLAS native must not be suspended
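
Since credit is granted either way, a saved copy of a task's log can be checked for the two messages quoted above. This is only a sketch: scan_atlas_log is a made-up name, and the log path is whatever file the output was saved to.

```shell
#!/bin/sh
# scan_atlas_log FILE: report the two warning signs quoted above.
# Returns 0 if neither message is present, 1 otherwise.
scan_atlas_log() {
    sal_status=0
    if grep -q "No HITS result produced" "$1"; then
        echo "$1: no HITS file, task returned no useful scientific data"
        sal_status=1
    fi
    if grep -q "This job has been restarted" "$1"; then
        echo "$1: task was restarted, work from earlier starts was lost"
        sal_status=1
    fi
    return "$sal_status"
}
```

A clean exit status only means that neither message was found; it does not prove the result was scientifically valid.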
ID: 46040
AndreyOR
Joined: 8 Dec 19
Posts: 37
Credit: 7,587,438
RAC: 0
Message 46041 - Posted: 11 Jan 2022, 11:58:43 UTC

Brummig, it's good to see someone else using WSL2 for BOINC and trying it for LHC. I've used it to run multiple projects. LHC is even harder to set up on WSL2 than on regular Linux, probably due to its custom kernel. I only recently figured out how to get native Theory to run on WSL2, still can't run multithread ATLAS (runs ok single thread only). The previous responses are correct and you have a point too. I should get a chance to post what I did to get things to work within a few days.

On a quick note...

-- maeax and computezrmle are right: your ATLAS tasks aren't working correctly. To me it looks like Singularity isn't working properly. Based on your instructions, I'm guessing it wasn't installed correctly. I'd suggest you carefully go through the SingularityCE "Quick Start" up to (but not including) the "Overview of the SingularityCE Interface" section. https://sylabs.io/guides/3.9/user-guide/quick_start.html#

-- As mentioned above, the following entries are important, I'd definitely modify the PROXY one and add the CDN one.
CVMFS_HTTP_PROXY="auto;DIRECT"
CVMFS_USE_CDN=yes
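
A quick first check before working through the Quick Start is to see whether the singularity binary is on the PATH at all and what version it reports. have_tool is a hypothetical helper name. Passing this check does not show that containers can actually run under WSL2; it only rules out a missing install.

```shell
#!/bin/sh
# have_tool NAME: succeed if NAME is found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool singularity; then
    singularity --version
else
    echo "singularity not found on PATH"
fi
```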
ID: 46041
Brummig
Joined: 9 Feb 16
Posts: 48
Credit: 537,111
RAC: 0
Message 46042 - Posted: 11 Jan 2022, 14:31:25 UTC - in response to Message 46041.  

Thank you, AndreyOR, that's helpful. I would be interested to see what you have done to get it working. If it's helpful, feel free to copy, paste, and edit my post to use as a template for the parts that are correct (it was very tedious and time-consuming getting it formatted clearly). I don't really care if I'm limited to single-threaded tasks; I just want a simple way to contribute something. I have no desire to spend days of time I don't have becoming an expert in somebody else's software.

I have searched the original guide for "default.local" and "cvmfs" to see if any of the many posts in the thread indicate that the instructions for default.local should not be followed, and they do not. How is anyone supposed to know which parts of that thread are currently applicable and which are not? I hope nobody is going to suggest searching the forum. How can you search for something if you don't know what you're looking for? Even if you do stumble across something, how will you know if it is currently valid? Or are contributors just expected to first read every post on the forum?

It would be really helpful if the software, like for other projects, returned an error code if something went awry, rather than granting credit and placing an obscure message in a log file that most crunchers will not look at (let alone understand), given that they have been granted credit. Even if I did take a read through the log, given that credit was granted I would have assumed that having no HITS result (whatever that means) is a good thing.
ID: 46042
maeax
Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46044 - Posted: 12 Jan 2022, 6:55:41 UTC

I tested a WSL2 installation on Windows 10 Pro.
Ubuntu 20.04 is the default; openSUSE Leap 42.2 can also be installed.
I installed SUSE. However, after uninstalling it, hardware acceleration is no longer available for VirtualBox 5.2.44 and BOINC 7.16.20.
The BIOS says it is enabled, but Windows no longer has it.
ID: 46044
AndreyOR
Joined: 8 Dec 19
Posts: 37
Credit: 7,587,438
RAC: 0
Message 46045 - Posted: 12 Jan 2022, 7:26:00 UTC - in response to Message 46044.  

maeax, yes, they're incompatible. WSL2 runs off of Hyper-V architecture. Hyper-V is a Type 1 hypervisor that comes as part of the Windows package. VirtualBox is a Type 2 hypervisor and it doesn't work with Hyper-V enabled. There's work being done on that and I believe the newest version of VB might be able to work with Hyper-V although I don't know how well. To get back to being able to use VB I think you'll have to disable all 3 of the following in Windows features: Hyper-V, Virtual Machine Platform, and Windows Subsystem for Linux. Restart and try VB again.

I don't believe there's a default distribution but it has, I believe, 11 distributions available, if you include the various versions.
ID: 46045
Brummig
Joined: 9 Feb 16
Posts: 48
Credit: 537,111
RAC: 0
Message 46047 - Posted: 12 Jan 2022, 9:49:18 UTC - in response to Message 46041.  

I had a look through those instructions for Singularity, AndreyOR, and unless I've missed something they are for people who want to build Singularity from the source code. What I did (I think) was install Singularity from a pre-compiled package, so all being well it should just work. It may prove necessary to install an earlier version, but the only way to find out is to try it.

In creating the instructions at the top of this thread, I'm trying to create the simplest possible way of being able to run Theory natively. Whilst rolling up your sleeves and building from the source code is sometimes the only way to get something that works, too often in the Linux world I have come across guides that would have you struggling to compile dozens of libraries and packages, when all you actually need to do is install with a single command a package that was built by someone who understood what they were doing.
ID: 46047
tullio
Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 46051 - Posted: 12 Jan 2022, 17:46:13 UTC

SuSE Leap 42.2 is very old. I have Leap 15.0 installed on a laptop and SuSE Tumbleweed, which is a development version, on a virtual machine on a Windows 10 host. Its kernel is 5.15.8 and is updated very frequently, so I have to reboot.
Tullio
ID: 46051
maeax
Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46057 - Posted: 14 Jan 2022, 8:55:50 UTC - in response to Message 46051.  

Yes, 42.2 is a test OS for WSL.
After uninstalling WSL (and Ubuntu and openSUSE as well), I get these messages from BOINC after rebooting Windows 10 Pro:
13.01.2022 22:36:39 | | Local time is UTC +1 hours
13.01.2022 22:36:39 | | WSL detected:
13.01.2022 22:36:39 | | [openSUSE-42] (default): W (W [W])
ID: 46057
AndreyOR
Joined: 8 Dec 19
Posts: 37
Credit: 7,587,438
RAC: 0
Message 46058 - Posted: 14 Jan 2022, 10:21:30 UTC - in response to Message 46057.  

There's some kind of cross-communication happening between WSL2 & Windows BOINCs. I noticed that if BOINC is running on WSL2 first and then I try to start BOINC in Windows I get a Connection Error: Invalid client RPC password. BOINC has to be started on Windows first and then on WSL2, no problems if done in this order. I haven't tried to figure out what's going on yet.
ID: 46058
Brummig
Joined: 9 Feb 16
Posts: 48
Credit: 537,111
RAC: 0
Message 46059 - Posted: 14 Jan 2022, 11:15:44 UTC - in response to Message 46058.  

@AndreyOR: I've found the same. Yesterday I ended up in an even bigger mess. I used my Windows BOINC Manager to check on one of my Pies, but when I switched back to the Windows client I got Connection Error: Invalid client RPC password. That sometimes happens. Normally it requires a reboot, but mindful that LHC@Home was running under WSL, I tried to sort it out without a reboot. The best I could achieve when selecting "localhost" was having the Windows BOINC Manager reporting the WSL client activity.
ID: 46059
AndreyOR
Joined: 8 Dec 19
Posts: 37
Credit: 7,587,438
RAC: 0
Message 46072 - Posted: 16 Jan 2022, 3:21:00 UTC - in response to Message 46059.  

I played around with WSL2 BOINC configurations, to be able to control it from Windows BOINC Manager, and it doesn't seem to work cleanly. Configurations seem to be criss-crossing somehow, and the WSL2 BOINC data directory seems to be getting changed or duplicated (still not sure which happens). Proxy settings on one affect the other. I'm still not entirely clear on what's going on, but it doesn't seem to function like I expected, where you can manage both independently without one affecting the other. Unlike you, I don't have anything else connected, so I can see how your setup can get even more complicated. Do you have your BOINC clients set up to listen to different ports? That's what I had to do to prevent the problem in the last post. I changed the WSL2 one as it was easier, but that made managing it through the command line more difficult.
ID: 46072
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2541
Credit: 254,608,838
RAC: 34,609
Message 46073 - Posted: 16 Jan 2022, 5:33:34 UTC - in response to Message 46072.  

Do you have your BOINC clients set up to listen to different ports?

This is a MUST since manager and client communicate via RPC.
Default client port is 31415.
If the Windows client and the WSL client on the same box both listen to the same port, how should they decide which one is meant when the manager sends a command to change anything, e.g. suspend/resume a task or setting the proxy?
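
A sketch of how a second port could be handled on the WSL2 side. The port 31416 and the function names are assumptions chosen for illustration. The sketch also assumes the WSL2 client was started with the client option --gui_rpc_port set to the same value, and it relies on boinccmd's --host option accepting host:port.

```shell
#!/bin/sh
# Hypothetical port for the WSL2 client; 31415 stays with the Windows client.
WSL_BOINC_PORT=${WSL_BOINC_PORT:-31416}

# rpc_target PORT: print host:port, refusing the default port.
rpc_target() {
    if [ "$1" = "31415" ]; then
        echo "port 31415 is the default; pick another for the second client" >&2
        return 1
    fi
    printf 'localhost:%s\n' "$1"
}

# Wrapper so command-line management of the WSL2 client stays short.
wsl_boinccmd() {
    wb_target=$(rpc_target "$WSL_BOINC_PORT") || return 1
    boinccmd --host "$wb_target" "$@"
}
```

With this in place, something like "wsl_boinccmd --get_state" reaches the WSL2 client. Depending on the directory it is run from, --passwd may also be needed.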


I changed the WSL2 one ... but that made managing it through command line more difficult.

Better this than messing up both clients' configurations.
ID: 46073
Brummig
Joined: 9 Feb 16
Posts: 48
Credit: 537,111
RAC: 0
Message 46081 - Posted: 17 Jan 2022, 9:17:29 UTC

WSL2 and Windows are on two different IP addresses. They're not even on the same subnet. Windows gets its IP address from the DHCP server, whilst WSL2 creates its own IP address.

I have two BOINC managers running, one under Windows and one under WSL2. Mostly they communicate with the appropriate client, but sometimes odd things happen. I think that is caused by the Windows BOINC Manager communicating via the "wrong" adapter. ipconfig on Windows reports two adapters, one for WSL2 and one for Windows, whilst ifconfig on WSL2 reports only the one adapter (so it can't see the Windows client).
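
To see the two addresses from inside WSL2, something like the sketch below may help. It assumes the default WSL2 networking, where /etc/resolv.conf is generated automatically and its first nameserver entry is the Windows host; that will not hold if resolv.conf generation has been disabled. first_nameserver is a made-up helper name.

```shell
#!/bin/sh
# first_nameserver FILE: print the first nameserver entry in a resolv.conf-style file.
first_nameserver() {
    awk '$1 == "nameserver" { print $2; exit }' "$1"
}

echo "WSL2 address(es): $(hostname -I 2>/dev/null)"
echo "Windows host (assumed): $(first_nameserver /etc/resolv.conf 2>/dev/null)"
```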

BTW, if you're wondering why I run the two clients, it's because the WSL2 client can't run GPU tasks.
ID: 46081
maeax
Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46082 - Posted: 17 Jan 2022, 9:44:56 UTC - in response to Message 46081.  

+1
ID: 46082


©2024 CERN