Message boards : ATLAS application : Guide to Getting Quickly Started Running Native ATLAS (Ubuntu 20.04 on WSL2)

computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2541
Credit: 254,608,838
RAC: 34,609
Message 46083 - Posted: 17 Jan 2022, 10:00:27 UTC - in response to Message 46081.  

Recent results
Some important lines from the task logs.

https://lhcathome.cern.ch/lhcathome/result.php?resultid=338724266

Good. CVMFS connects to an openhtc.io server.
[2022-01-14 09:25:39] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2022-01-14 09:25:39] 2.9.0.0 18586 0 28648 99637 3 1 1727673 4096001 0 130560 0 0 0.000 531 47 http://s1ral-cvmfs.openhtc.io/cvmfs/atlas.cern.ch DIRECT 1
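For anyone who wants to check this automatically, here is a minimal sketch that pairs the header line with the value line and looks at the HOST column. It assumes the two lines are copied from the task log exactly as shown above, with the timestamp prefix; the function names are made up for illustration and are not part of CVMFS or BOINC.

```python
import re
from urllib.parse import urlparse

# The two log lines quoted above, including the timestamp prefix.
HEADER = ("[2022-01-14 09:25:39] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) "
          "NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN "
          "HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE")
VALUES = ("[2022-01-14 09:25:39] 2.9.0.0 18586 0 28648 99637 3 1 1727673 4096001 "
          "0 130560 0 0 0.000 531 47 "
          "http://s1ral-cvmfs.openhtc.io/cvmfs/atlas.cern.ch DIRECT 1")


def strip_timestamp(line):
    """Remove a leading "[date time] " prefix if present."""
    return re.sub(r"^\[[^\]]*\]\s*", "", line)


def parse_cvmfs_stat(header_line, value_line):
    """Map each column name in the header to the value below it (hypothetical helper)."""
    names = strip_timestamp(header_line).split()
    values = strip_timestamp(value_line).split()
    if len(names) != len(values):
        raise ValueError("header and value line have different column counts")
    return dict(zip(names, values))


def uses_openhtc(stat):
    """True if the HOST column points at an openhtc.io server."""
    host = urlparse(stat["HOST"]).hostname or ""
    return host.endswith(".openhtc.io")


if __name__ == "__main__":
    stat = parse_cvmfs_stat(HEADER, VALUES)
    print(stat["HOST"], stat["PROXY"], uses_openhtc(stat))
```

The check only looks at the host name, so it says nothing about whether a local proxy is in use; that is what the PROXY column is for.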



09:25:13 (18382): wrapper (7.7.26015): starting
.
.
.
11:08:12 (14183): wrapper (7.7.26015): starting  ### Not good. All work before this line is lost.
.
.
.
[2022-01-15 13:54:07] No HITS result produced ### Not good. This task did not return scientific results.





https://lhcathome.cern.ch/lhcathome/result.php?resultid=338724363

Good. CVMFS connects to an openhtc.io server.
[2022-01-14 09:26:56] VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE
[2022-01-14 09:26:56] 2.9.0.0 18586 1 32364 99637 2 6 1727742 4096001 28 130560 0 2224 99.865 535 47 http://s1ral-cvmfs.openhtc.io/cvmfs/atlas.cern.ch DIRECT 1



09:26:54 (19776): wrapper (7.7.26015): starting
.
.
.
11:05:35 (26511): wrapper (7.7.26015): starting
.
.
.
13:54:06 (30268): wrapper (7.7.26015): starting
.
.
.
09:14:50 (23117): wrapper (7.7.26015): starting  ### Not good. All work before this line is lost.
.
.
.
[2022-01-17 08:53:30] No HITS result produced ### Not good. This task did not return scientific results.



If it can't be ensured that native ATLAS runs without restarts, it should not be run on this computer.
Besides that, even tasks that finish (and get credit) still don't return any valid scientific result.
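The two things to look for in a task log (more than one "wrapper ... starting" line, and "No HITS result produced") can be checked with a short script. This is only a sketch based on the log excerpts quoted above; the regular expression and the function name are assumptions of mine and not something the project provides.

```python
import re

# Matches lines such as "09:25:13 (18382): wrapper (7.7.26015): starting".
# Anything after "starting" (for example a "###" note) is ignored.
WRAPPER_START = re.compile(r"^\d{2}:\d{2}:\d{2} \(\d+\): wrapper \([\d.]+\): starting")


def summarize_task_log(text):
    """Count wrapper starts and note whether the log reports a missing HITS file."""
    starts = 0
    no_hits = False
    for line in text.splitlines():
        if WRAPPER_START.match(line.strip()):
            starts += 1
        if "No HITS result produced" in line:
            no_hits = True
    # Every start after the first one is a restart.
    return {
        "wrapper_starts": starts,
        "restarts": max(starts - 1, 0),
        "no_hits": no_hits,
    }


SAMPLE_LOG = """\
09:25:13 (18382): wrapper (7.7.26015): starting
11:08:12 (14183): wrapper (7.7.26015): starting
[2022-01-15 13:54:07] No HITS result produced
"""

if __name__ == "__main__":
    print(summarize_task_log(SAMPLE_LOG))
```

A result with restarts greater than 0 or no_hits set to True corresponds to the "Not good" cases marked in the logs above.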
ID: 46083 · Report as offensive     Reply Quote
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46084 - Posted: 17 Jan 2022, 11:08:46 UTC - in response to Message 46057.  

After uninstalling WSL (and also Ubuntu and openSUSE):

Hardware acceleration is enabled in the BIOS, but not active in Win10 Pro. Hyper-V is disabled.
ID: 46084 · Report as offensive     Reply Quote
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46087 - Posted: 18 Jan 2022, 9:21:55 UTC - in response to Message 46084.  
Last modified: 18 Jan 2022, 9:24:59 UTC

This is from a Theory task, shown only to work out whether AMD-V is active or not:
Waiting for VM "boinc_3b02640724ecbcc1" to power on...
VBoxManage.exe: error: Not in a hypervisor partition (HVP=0) (VERR_NEM_NOT_AVAILABLE).
VBoxManage.exe: error: AMD-V is disabled in the BIOS (or by the host OS) (VERR_SVM_DISABLED)
VBoxManage.exe: error: Details: code E_FAIL (0x80004005), component ConsoleWrap, interface IConsole

Disabled in the BIOS, yet the OS says it is active??
Hyper-V enabled, disabled in the BIOS, yet the OS says it is active??
So far there is no logic to it.
WSL needs hardware acceleration.
VirtualBox 6.1.30 was installed for these tests.
More testing is needed to see how stable WSL2 is, to confirm hardware acceleration, and to let other BOINC (7.16.20) tasks run as well.
The OS is Win10 Pro with all updates for the moment.
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10567798
ID: 46087 · Report as offensive     Reply Quote
Brummig
Joined: 9 Feb 16
Posts: 48
Credit: 537,111
RAC: 0
Message 46088 - Posted: 18 Jan 2022, 11:34:37 UTC - in response to Message 46083.  

The host isn't ordinarily restarting, but as far as I could tell one of the ATLAS tasks I received that day had a massive memory leak that crippled WSL2.

The current problem is that tasks are failing with "Looping job killed by pilot".
ID: 46088 · Report as offensive     Reply Quote
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46089 - Posted: 18 Jan 2022, 15:19:12 UTC - in response to Message 46088.  
Last modified: 18 Jan 2022, 16:09:41 UTC

After two new Win10 Pro updates an hour ago,
hardware acceleration is working for BOINC with VirtualBox (Theory and Cosmology) for the moment; nothing changed on my side.
2022-01-18 16:30:22 (2528):
NOTE: VirtualBox has reported an improperly configured virtual machine. It was configured to require
hardware acceleration for virtual machines, but your processor does not support the required feature.
Please report this issue to the project so that it can be addresssed.
Error Code: ERR_CPU_VM_EXTENSIONS_DISABLED
Hyper-V and WSL2 testing will follow in the next few days.
ID: 46089 · Report as offensive     Reply Quote
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46092 - Posted: 20 Jan 2022, 3:48:10 UTC - in response to Message 46089.  

I don't know why hardware acceleration isn't re-checked for the host and updated.
It is always shown as not active:
<p_ncpus>4</p_ncpus>
<p_vendor>AuthenticAMD</p_vendor>
<p_model>AMD A10-7850K Radeon R7, 12 Compute Cores 4C+8G [Family 21 Model 48 Stepping 1]</p_model>
<p_features>fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 popcnt aes f16c syscall nx lm avx sse4a osvw xop wdt fma4 topx page1gb rdtscp fsgsbase bmi1</p_features>
<p_fpops>3042417877.204644</p_fpops>
<p_iops>10181622946.708857</p_iops>
<p_membw>250000000.000000</p_membw>
<p_calculated>1641807555.935040</p_calculated>
<p_vm_extensions_disabled>0</p_vm_extensions_disabled>
<m_nbytes>33223143424.000000</m_nbytes>
<m_cache>2097152.000000</m_cache>
<m_swap>37417447424.000000</m_swap>
<d_total>2000377872384.000000</d_total>
<d_free>1568356171776.000000</d_free>
<os_name>Microsoft Windows 10</os_name>
<os_version>Professional x64 Edition, (10.00.19044.00)</os_version>
<n_usable_coprocs>1</n_usable_coprocs>
<wsl_available>0</wsl_available>
<virtualbox_version>6.1.32</virtualbox_version>
<coprocs>
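To read the virtualization-related fields out of a dump like this without scanning it by eye, something along these lines should work. It is a sketch under assumptions: the sample is a shortened, well-formed copy of the fields above wrapped in a `<host_info>` root that I added so it parses, and the reading of the flags (1 means true, 0 means false) is inferred from the field names rather than from BOINC documentation.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the fields posted above. The <host_info> root element
# is an assumption, added so the fragment is well-formed XML.
SAMPLE = """<host_info>
<p_ncpus>4</p_ncpus>
<p_vendor>AuthenticAMD</p_vendor>
<p_vm_extensions_disabled>0</p_vm_extensions_disabled>
<os_name>Microsoft Windows 10</os_name>
<wsl_available>0</wsl_available>
<virtualbox_version>6.1.32</virtualbox_version>
</host_info>"""


def virtualization_summary(xml_text):
    """Extract the virtualization-related fields from a host info fragment."""
    root = ET.fromstring(xml_text)

    def text_of(tag):
        node = root.find(tag)
        if node is None or node.text is None:
            return None
        return node.text.strip()

    def flag(tag):
        # Returns True/False for "1"/"0", or None if the field is missing.
        value = text_of(tag)
        return None if value is None else value == "1"

    return {
        "vm_extensions_disabled": flag("p_vm_extensions_disabled"),
        "wsl_available": flag("wsl_available"),
        "virtualbox_version": text_of("virtualbox_version"),
    }


if __name__ == "__main__":
    print(virtualization_summary(SAMPLE))
```

For the values posted above this reports that the VM extensions are not flagged as disabled and that WSL is not reported as available.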
ID: 46092 · Report as offensive     Reply Quote
AndreyOR

Joined: 8 Dec 19
Posts: 37
Credit: 7,587,438
RAC: 0
Message 46095 - Posted: 20 Jan 2022, 11:03:21 UTC

Brummig, I had the chance to try installing Singularity like you described, from a package, instead of from source like the Singularity documentation describes, and it seems to be working. So far I have 2 completed single core ATLAS tasks with "HITS file was successfully produced": https://lhcathome.cern.ch/lhcathome/result.php?resultid=340312764

It's good to see that there's something that can be simplified for a project that's anything but simple. I mean, besides BOINC, for (native) Theory and ATLAS you need a specific OS (Linux), a software distribution service (CVMFS), a container (Singularity for ATLAS, runc for Theory), and if you regularly contribute 5 or more CPU threads, a caching proxy (Squid). Each must be properly configured but even then things don't always work right. The projects tried to simplify things by bundling up the containers but that doesn't always work and one still has to install them separately (mostly a problem in ATLAS). WSL2 presents additional challenges as it has a custom kernel and uses init.d, not systemd. Single thread ATLAS usually runs fine but I only recently was able to figure out how to get Theory to run and still don't know how to get multi-thread ATLAS to run.

Something isn't working for you though. All of your ATLAS tasks are either errors or produce no HITS file. I don't know why tasks that should be errors sometimes show up as Completed and Validated; that's definitely misleading and makes things even more complicated. May I suggest you switch to Theory for now until you can troubleshoot and figure out ATLAS? Otherwise your CPU time is of no benefit to you or the project. Theory won't work on WSL2 as is; you need to make one modification to the WSL2 configuration (.wslconfig file). I assume you're familiar with it; if not, read here: https://docs.microsoft.com/en-us/windows/wsl/wsl-config. Add the following line at the end of .wslconfig (it belongs in the [wsl2] section):
kernelCommandLine = vsyscall=emulate

Exit and shut down (wsl --shutdown) Ubuntu. Wait until it completely shuts down; Microsoft recommends 8 seconds but I've seen it take longer. Restart Ubuntu and run the following command to make sure the configuration took:
grep -E 'vdso|vsyscall' /proc/self/maps

If the output looks something like the following two lines, the configuration took. If the output only has a line like the first one, shut down again and wait a little longer before restarting.

7fffe03fe000-7fffe0400000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
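If you'd rather script this check, the same test can be sketched in Python. It only looks at the last field of each line of the maps output for the `[vdso]` and `[vsyscall]` labels shown above; the function names are my own, and reading `/proc/self/maps` from Python assumes you run it inside the WSL2 distribution.

```python
def vsyscall_status(maps_text):
    """Report whether [vdso] and [vsyscall] regions appear in maps output."""
    regions = set()
    for line in maps_text.splitlines():
        fields = line.split()
        # The region label, if any, is the last field on the line.
        if fields and fields[-1] in ("[vdso]", "[vsyscall]"):
            regions.add(fields[-1])
    return {"vdso": "[vdso]" in regions, "vsyscall": "[vsyscall]" in regions}


def read_own_maps(path="/proc/self/maps"):
    """Read the memory map of the current process (Linux only)."""
    with open(path) as handle:
        return handle.read()


# The two example lines from above.
SAMPLE_OK = (
    "7fffe03fe000-7fffe0400000 r-xp 00000000 00:00 0 [vdso]\n"
    "ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]\n"
)

if __name__ == "__main__":
    print(vsyscall_status(SAMPLE_OK))
    # Inside WSL2 you could check the live system instead:
    # print(vsyscall_status(read_own_maps()))
```

If "vsyscall" comes back False on the live system, the setting has not taken effect yet, which matches the "only the first line" case described above.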

Make sure CVMFS is running, change your settings on the website to accept Theory and you should be good to go.

I wonder why single thread ATLAS isn't working for you. I installed SingularityCE 3.9.3; maybe try uninstalling your older one and installing this one. If it doesn't work, perhaps uninstall Singularity altogether and try running with the Singularity that comes bundled. Maybe one of the older versions will work; I'm considering trying various older versions in attempts to get multi-thread ATLAS to work.
ID: 46095 · Report as offensive     Reply Quote
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46100 - Posted: 22 Jan 2022, 18:36:33 UTC - in response to Message 46092.  

I did a fresh installation of Win10 Pro on the PC.
I'm wondering why AMD is not supported for WSL2, as seen in this documentation:
https://boxofcables.dev/trying-wsl2-on-hyper-v
ID: 46100 · Report as offensive     Reply Quote
AndreyOR

Joined: 8 Dec 19
Posts: 37
Credit: 7,587,438
RAC: 0
Message 46101 - Posted: 23 Jan 2022, 8:59:24 UTC - in response to Message 46100.  

I don't think nested virtualization is supported for AMD processors until Windows 11 (as a host OS), and I believe running WSL2 on a Hyper-V Windows VM is considered nested virtualization.
ID: 46101 · Report as offensive     Reply Quote
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46102 - Posted: 23 Jan 2022, 12:00:10 UTC - in response to Message 46101.  
Last modified: 23 Jan 2022, 12:11:19 UTC

Cosmology@home with Docker is now running without hardware acceleration,
but Theory, ATLAS and CMS aren't downloading any tasks.
The message says there are no new tasks, yet Task Manager says hardware acceleration is active?
BOINC 7.16.20, first with the VirtualBox bundled with BOINC (6.0.14; the BOINC homepage says 6.1.12), now with 5.2.44.
I'll test VirtualBox 6.1.32 in the next few hours.
ID: 46102 · Report as offensive     Reply Quote
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 456
Message 46103 - Posted: 23 Jan 2022, 14:13:26 UTC - in response to Message 46102.  

VirtualBox 6.1.32 is now installed and the first CMS task is running.
I'll come back to WSL2 when better information is available about which Windows version is suitable.
ID: 46103 · Report as offensive     Reply Quote
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 46106 - Posted: 24 Jan 2022, 16:03:13 UTC

Hi all,

Firstly, thank you Brummig for your excellent and helpful instructions. As you and others have said, it is unfortunate that BOINC forums do not allow editing previous posts. But in an attempt to provide clearer information on the CVMFS configuration I have created a new sticky post with what I think is the most appropriate default.local content. Please let me know if anyone sees anything that should be corrected there; as an admin I do have the rights to edit old posts, so I can keep this one up to date if anything changes in the future.
ID: 46106 · Report as offensive     Reply Quote
kotenok2000
Joined: 21 Feb 11
Posts: 72
Credit: 570,086
RAC: 1
Message 46534 - Posted: 26 Mar 2022, 16:24:57 UTC - in response to Message 46045.  

You don't have to disable Hyper-V, Virtual Machine Platform, and Windows Subsystem for Linux.
You just have to issue this command:

bcdedit /set {current} hypervisorlaunchtype off
You can also copy the boot entry, so you will have two boot entries: one with Hyper-V and the other without.
bcdedit /copy {current} /d "windows 11 hyper-v"
ID: 46534 · Report as offensive     Reply Quote
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 46535 - Posted: 26 Mar 2022, 19:39:27 UTC - in response to Message 46072.  

I played around with WSL2 BOINC configurations, to be able to control it from Windows BOINC Manager, and it doesn't seem to work cleanly. Configurations seem to be criss-crossing somehow, and the WSL2 BOINC data directory seems to be getting changed or duplicated (still not sure which happens). Proxy settings on one affect the other. I'm still not entirely clear on what's going on but it doesn't seem to function like I expected, where you can manage both independently without one affecting the other. Unlike you, I don't have anything else connected, so I can see how your setup can get even more complicated. Do you have your BOINC clients set up to listen on different ports? That's what I had to do to prevent a problem in the last post. I changed the WSL2 one as it was easier but that made managing it through the command line more difficult.

If this is still an issue for anyone, I used BoincTasks to control both the Windows and Linux (Ubuntu 20.04) side:
https://www.cpdn.org/cpdnboinc/forum_thread.php?id=9025&postid=63488#63488
https://www.cpdn.org/cpdnboinc/forum_thread.php?id=9025&postid=63489#63489

I am sure WSL2 has changed in the meantime, but that part of it should still be the same I expect.
ID: 46535 · Report as offensive     Reply Quote


©2024 CERN