1) Message boards : ATLAS application : ATLAS badges (Message 47052)
Posted 30 Jul 2022 by AndreyOR
Post:
One thing to note about ATLAS badges is that very few people have reached the 5 million points needed for the highest badge: as of now only 119 users, which is the top 1% at most. While it's true that there's a big variance among the 119, having 5 million as the minimum for the highest badge doesn't seem unreasonable. Even 1 million points (second highest badge) puts you in at least the top 5%; only 557 users have at least 1 million.

What I'd like to see is the badge system redone so that the total contribution to LHC@home, credit for all sub-projects combined, is counted. If I'm not mistaken, ATLAS badging is a holdover from the days when ATLAS@home was its own thing and I don't think it makes sense anymore. David, or other admin, would that be too much of an undertaking?
2) Message boards : ATLAS application : app_config.xml parameters question (Message 46805)
Posted 19 May 2022 by AndreyOR
Post:
It seems like you're using the VBox version of ATLAS, not native. According to a post above, that argument is valid in the VBox version but not in the native one.
3) Message boards : ATLAS application : app_config.xml parameters question (Message 46799)
Posted 19 May 2022 by AndreyOR
Post:
I checked that file and it seems like the only argument that's available is --nthreads. Some time ago, when I was trying to figure out how to set up native ATLAS, I thought I saw the --memory_size parameter used in the forums. I added it to the app_config file (can't remember what made me think I needed it) but it seems like it's invalid and is just ignored. I'll have to delete it then. Was it ever used in the past?
4) Message boards : ATLAS application : app_config.xml parameters question (Message 46798)
Posted 19 May 2022 by AndreyOR
Post:
captainjack, yes, I'd expect your app_config to work. There seem to be some redundant/unnecessary entries though. Since you're only trying to modify the native ATLAS version you'd only need the app_version portion of the app_config. In addition, <cmdline>--nthreads 5</cmdline> is redundant since you're already specifying that you want to use 5 CPUs via avg_ncpus. I also believe that the app section is ignored since it's incomplete. Try the following. It's cleaner and shorter, so if you want to change things you're less likely to make an accidental mistake.
<app_config>
    <app_version>
      <app_name>ATLAS</app_name>
      <plan_class>native_mt</plan_class>
      <avg_ncpus>5</avg_ncpus>
    </app_version>
</app_config>
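
If you want to see what an app_config.xml actually sets for thread counts, a short script can list it. This is only a rough Python sketch, not anything that ships with BOINC: the function name thread_settings and the sample file (which deliberately sets both avg_ncpus and --nthreads, to different values) are made up for illustration, and it only looks at the two entries discussed in this thread.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample: sets both avg_ncpus and --nthreads, with different values.
SAMPLE = """
<app_config>
    <app_version>
      <app_name>ATLAS</app_name>
      <plan_class>native_mt</plan_class>
      <avg_ncpus>5</avg_ncpus>
      <cmdline>--nthreads 4</cmdline>
    </app_version>
</app_config>
"""

def thread_settings(app_config_xml):
    """Return (app_name, plan_class, avg_ncpus, nthreads) for each <app_version>."""
    root = ET.fromstring(app_config_xml.strip())
    rows = []
    for version in root.findall("app_version"):
        name = version.findtext("app_name", default="").strip()
        plan = version.findtext("plan_class", default="").strip()
        ncpus_text = version.findtext("avg_ncpus")
        ncpus = float(ncpus_text) if ncpus_text else None
        # Look for "--nthreads N" inside <cmdline>, if that element is present.
        match = re.search(r"--nthreads\s+(\d+)", version.findtext("cmdline", default=""))
        nthreads = int(match.group(1)) if match else None
        rows.append((name, plan, ncpus, nthreads))
    return rows

for name, plan, ncpus, nthreads in thread_settings(SAMPLE):
    if ncpus is not None and nthreads is not None and ncpus != nthreads:
        print(f"{name} ({plan}): avg_ncpus={ncpus:g} but --nthreads={nthreads}")
```

An entry with only avg_ncpus, like the one suggested above, comes back with nthreads set to None and prints nothing.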
5) Message boards : ATLAS application : app_config.xml parameters question (Message 46793)
Posted 18 May 2022 by AndreyOR
Post:
Thank you for explaining some more. It seems my suspicion was right that you wouldn't use avg_ncpus and the --nthreads command line parameter for the same app. It'll either be redundant or detrimental (if different values are used). avg_ncpus would be the way to control thread usage of multithread apps. I was curious and was able to find a list of command line parameters for MilkyWay N-Body Simulation but couldn't find them for LHC ATLAS (native). Could you provide a link? Thank you.
6) Message boards : ATLAS application : app_config.xml parameters question (Message 46789)
Posted 18 May 2022 by AndreyOR
Post:
I'm familiar with that page and have read it before but it doesn't clarify things much. I understand the difference between cmdline and avg_ncpus in general. I'm specifically wondering about <cmdline>--nthreads x</cmdline>, not just cmdline in general. --nthreads in cmdline and avg_ncpus seem to specify the same thing and thus seem redundant. However, I've seen people post their app_config files with both entries and I couldn't see why. Would you ever use both in the same app_config and if so under what circumstances? Also, how does one know what cmdline parameters a given program understands?
7) Message boards : ATLAS application : app_config.xml parameters question (Message 46784)
Posted 18 May 2022 by AndreyOR
Post:
What is the difference between the following 2 parameters in the app_version section of the app_config.xml file, especially as it pertains to multithread apps (LHC ATLAS, MilkyWay N-Body Simulation)?
<app_version>
   <avg_ncpus>x</avg_ncpus>
   <cmdline>--nthreads x</cmdline>
</app_version>
8) Message boards : Number crunching : Tasks stuck at 99.99% with run time of 1 day+ (Message 46554)
Posted 29 Mar 2022 by AndreyOR
Post:
Disabling macOS time sync early in the process helped: I stopped getting those kinds of messages and the last batch of tasks just completed successfully. As I was looking into your suggestion I noticed that when setting up a VM in VBox there's an option under System/Motherboard to specify "Hardware Clock in UTC Time". It's checked by default so I unchecked it and turned time sync in macOS back on (which is the default anyway). I'm curious to see if this simpler solution will also work, as changing BIOS time, updating the Windows registry, and making sure that VMs are set up right is a bit more involved.
9) Message boards : Number crunching : Tasks stuck at 99.99% with run time of 1 day+ (Message 46523)
Posted 23 Mar 2022 by AndreyOR
Post:
Thanks for the suggestions. I found a VBox command in the manual to make the VM sync time with the host frequently but that didn't seem to make a difference. So I disabled time checking/syncing on macOS to see if that'll help. If it doesn't I'll try your suggestions. Is following your suggestions going to make my PC run on UTC time instead of local time?
10) Message boards : Number crunching : Tasks stuck at 99.99% with run time of 1 day+ (Message 46516)
Posted 22 Mar 2022 by AndreyOR
Post:
greg_be,
It seems like you've had time discrepancy issues with Rosetta on a VM. I've recently been dealing with this issue on a different project. I've recently started running macOS Mojave on VBox to process 32-bit tasks for climateprediction.net. I've been getting a message in the BOINC event log that reads (numbers in parentheses vary):
New system time (1647911207) < old system time (1648063920); clearing timeouts

Following this, task progress bars freeze but the time counting continues. I can get things going again by doing suspend/resume on each task but so far tasks error out at the very end, which sucks since these are very long running tasks that take days to weeks to run. I've never seen these kinds of issues before but I also don't use VBox much. I rarely run apps that are VBox only, since I use WSL2 to run Linux apps; WSL2 uses Hyper-V, and Hyper-V and VBox don't really work well together.
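
To get a feel for how big the clock jump in that message is, the two numbers can be subtracted. A minimal Python sketch, assuming the values in parentheses are Unix timestamps in seconds (the message itself doesn't say so); the function name backward_jump_seconds is made up for illustration.

```python
import re

# Matches the event-log message quoted above; the two groups are the timestamps.
PATTERN = re.compile(r"New system time \((\d+)\) < old system time \((\d+)\)")

def backward_jump_seconds(line):
    """Return how far the clock moved backwards, in seconds, or None if no match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    new_time, old_time = int(match.group(1)), int(match.group(2))
    return old_time - new_time

line = "New system time (1647911207) < old system time (1648063920); clearing timeouts"
jump = backward_jump_seconds(line)
print(f"clock moved back {jump} s, about {jump / 3600:.1f} hours")
```

Under that assumption the sample message corresponds to a backward jump of 152713 seconds, a bit over 42 hours.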
11) Message boards : ATLAS application : Error with 2 CPUs (Message 46364)
Posted 25 Feb 2022 by AndreyOR
Post:
Change the following line to a high number in global_prefs_override.xml to prevent BOINC from switching between tasks. If the line is not there, just add it:
<cpu_scheduling_period_minutes>10080.000000</cpu_scheduling_period_minutes>

I have mine set to 10080 minutes (1 week). I have the same setting on my Windows and Linux BOINC setups. I don't see a good reason to switch between tasks mid-task, just let a task finish before moving to the next one. For a project like ATLAS and maybe Theory this setting is necessary.
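
The "change it, or add it if it's missing" step could be scripted roughly like this. It's a Python sketch working on the file contents as a string, assuming global_prefs_override.xml has a single root element with the preference tags directly beneath it; the function name set_scheduling_period and the sample contents are made up for illustration.

```python
import xml.etree.ElementTree as ET

TAG = "cpu_scheduling_period_minutes"

def set_scheduling_period(prefs_xml, days):
    """Return prefs_xml with the task-switch interval set to `days` days."""
    root = ET.fromstring(prefs_xml.strip())
    element = root.find(TAG)
    if element is None:
        # The line is not there yet, so add it under the root element.
        element = ET.SubElement(root, TAG)
    # Days to minutes, written with six decimals like the line above.
    element.text = f"{days * 24 * 60:.6f}"
    return ET.tostring(root, encoding="unicode")

# Hypothetical file contents without the scheduling line.
before = "<global_preferences><max_ncpus_pct>100.000000</max_ncpus_pct></global_preferences>"
print(set_scheduling_period(before, 7))
```

Seven days works out to 7 x 24 x 60 = 10080 minutes, which is the value in the line above.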
12) Message boards : ATLAS application : Error with 2 CPUs (Message 46359)
Posted 25 Feb 2022 by AndreyOR
Post:
The reason is that you're running it on WSL2. WSL2 is not exactly the same as regular Linux (Ubuntu in your case) because it has a custom kernel and it uses init.d, not systemd.

It's good to see others use WSL2 for BOINC projects but it does have its quirks in LHC. One of them is that native ATLAS can only be run single core in WSL2 and fails when you try to run it multi-core. I've tried to figure out a solution to that a few times in the past but no success so far.

Another quirk is that native Theory doesn't run on WSL2 without a modification (it's an easy one though), which took me a while to figure out. If you're thinking of running Theory check out this post: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5777&postid=46031, it should be at the bottom of the thread.

I like WSL2 because you can run Linux projects on Windows machines with minimal resources compared to regular virtual machines. I used to use Hyper-V before learning about WSL2 and now very rarely use Hyper-V.
13) Message boards : ATLAS application : Guide to Getting Quickly Started Running Native ATLAS (Ubuntu 20.04 on WSL2) (Message 46101)
Posted 23 Jan 2022 by AndreyOR
Post:
I don't think nested virtualization is supported for AMD processors until Windows 11 (as a host OS), and I believe running WSL2 on a Hyper-V Windows VM is considered nested virtualization.
14) Message boards : ATLAS application : Guide to Getting Quickly Started Running Native ATLAS (Ubuntu 20.04 on WSL2) (Message 46095)
Posted 20 Jan 2022 by AndreyOR
Post:
Brummig, I had the chance to try installing Singularity like you described, from a package, instead of from source like the Singularity documentation describes, and it seems to be working. So far I have 2 completed single core ATLAS tasks with "HITS file was successfully produced": https://lhcathome.cern.ch/lhcathome/result.php?resultid=340312764.

It's good to see that there's something that can be simplified for a project that's anything but simple. I mean, besides BOINC, for (native) Theory and ATLAS you need a specific OS (Linux), a software distribution service (CVMFS), a container (Singularity for ATLAS, runc for Theory), and, if you regularly contribute 5 or more CPU threads, a caching proxy (Squid). Each must be properly configured but even then things don't always work right. The projects tried to simplify things by bundling up the containers but that doesn't always work and one still has to install them separately (mostly a problem in ATLAS). WSL2 presents additional challenges as it has a custom kernel and it uses init.d, not systemd. Single thread ATLAS usually runs fine but I only recently was able to figure out how to get Theory to run and still don't know how to get multi-thread ATLAS to run.

Something isn't working for you though. All of your ATLAS tasks are either errors or produce no HITS file. I don't know why tasks that should be errors sometimes show up as Completed and Validated; that's definitely misleading and makes things even more complicated. May I suggest you switch to Theory for now until you can troubleshoot and figure out ATLAS. Otherwise your CPU time is of no benefit to you or the project. Theory won't work on WSL2 as is; you need to make one modification to the WSL2 configuration (.wslconfig file). I assume you're familiar with it; if not, read here: https://docs.microsoft.com/en-us/windows/wsl/wsl-config. Add the following line at the end of .wslconfig:
kernelCommandLine = vsyscall=emulate

Exit and shut down (wsl --shutdown) Ubuntu. Wait until it completely shuts down, Microsoft recommends 8 seconds but I've seen it take longer. Restart Ubuntu and run the following command to make sure the configuration took:
cat /proc/self/maps | egrep 'vdso|vsyscall'

If the output looks something like the following two lines, the configuration took. If the output only has a line like the first one, shut down again and wait a little longer before restarting.

7fffe03fe000-7fffe0400000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

Make sure CVMFS is running, change your settings on the website to accept Theory and you should be good to go.
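
The vsyscall check above can also be done from a script. A minimal Python sketch of the same idea as the egrep command: it looks for a [vsyscall] entry in the text of /proc/self/maps. The function name has_vsyscall is made up for illustration, and the sample text is just the two lines shown above.

```python
def has_vsyscall(maps_text):
    """Return True if a [vsyscall] mapping appears in /proc/<pid>/maps text."""
    for line in maps_text.splitlines():
        fields = line.split()
        # The mapping name, when present, is the last field on the line.
        if fields and fields[-1] == "[vsyscall]":
            return True
    return False

# The two sample lines from the post above.
SAMPLE = """\
7fffe03fe000-7fffe0400000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
"""
print("sample has vsyscall:", has_vsyscall(SAMPLE))

# On Linux (including WSL2) this checks the running Python process itself.
try:
    with open("/proc/self/maps") as maps:
        print("vsyscall mapped here:", has_vsyscall(maps.read()))
except OSError:
    print("/proc/self/maps is not available on this system")
```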

I wonder why single thread ATLAS isn't working for you. I installed SingularityCE 3.9.3, maybe try uninstalling your older one and install this one. If it doesn't work, perhaps uninstall Singularity altogether and try running with the Singularity that comes bundled. Maybe one of the older versions will work; I'm considering trying various older versions in attempts to get multi-thread ATLAS to work.
15) Message boards : Number crunching : Setting up a local Squid to work with LHC@home - Comments and Questions (Message 46085)
Posted 17 Jan 2022 by AndreyOR
Post:
I'm trying to install a newer version (4.17) of Squid from source but ./configure has many different options and I'm not sure which to use (except 2). Which should be used for a 1 to 2 PC home network to run LHC? I've previously, on a different system, successfully installed an older version that's available prepackaged and used the configuration file with appropriate modifications found on the forum.
16) Message boards : ATLAS application : Guide to Getting Quickly Started Running Native ATLAS (Ubuntu 20.04 on WSL2) (Message 46072)
Posted 16 Jan 2022 by AndreyOR
Post:
I played around with WSL2 BOINC configurations, to be able to control it from the Windows BOINC Manager, and it doesn't seem to work cleanly. Configurations seem to be criss-crossing somehow; the WSL2 BOINC data directory seems to be getting changed or duplicated (still not sure which happens). Proxy settings on one affect the other. I'm still not all clear on what's going on but it doesn't seem to function like I expected, where you can manage both independently without one affecting the other. Unlike you, I don't have anything else connected, so I can see how your setup can get even more complicated. Do you have your BOINC clients set up to listen on different ports? That's what I had to do to prevent the problem in my last post. I changed the WSL2 one as it was easier but that made managing it through the command line more difficult.
17) Message boards : ATLAS application : Guide to Getting Quickly Started Running Native ATLAS (Ubuntu 20.04 on WSL2) (Message 46058)
Posted 14 Jan 2022 by AndreyOR
Post:
There's some kind of cross-communication happening between WSL2 & Windows BOINCs. I noticed that if BOINC is running on WSL2 first and then I try to start BOINC in Windows I get a Connection Error: Invalid client RPC password. BOINC has to be started on Windows first and then on WSL2, no problems if done in this order. I haven't tried to figure out what's going on yet.
18) Message boards : ATLAS application : Guide to Getting Quickly Started Running Native ATLAS (Ubuntu 20.04 on WSL2) (Message 46045)
Posted 12 Jan 2022 by AndreyOR
Post:
maeax, yes, they're incompatible. WSL2 runs off of the Hyper-V architecture. Hyper-V is a Type 1 hypervisor that comes as part of the Windows package. VirtualBox is a Type 2 hypervisor and it doesn't work with Hyper-V enabled. There's work being done on that and I believe the newest version of VB might be able to work with Hyper-V although I don't know how well. To get back to being able to use VB I think you'll have to disable all 3 of the following in Windows features: Hyper-V, Virtual Machine Platform, and Windows Subsystem for Linux. Restart and try VB again.

I don't believe there's a default distribution but it has, I believe, 11 distributions available, if you include the various versions.
19) Message boards : ATLAS application : Guide to Getting Quickly Started Running Native ATLAS (Ubuntu 20.04 on WSL2) (Message 46041)
Posted 11 Jan 2022 by AndreyOR
Post:
Brummig, it's good to see someone else using WSL2 for BOINC and trying it for LHC. I've used it to run multiple projects. LHC is even harder to set up on WSL2 than on regular Linux, probably due to its custom kernel. I only recently figured out how to get native Theory to run on WSL2, still can't run multithread ATLAS (runs ok single thread only). The previous responses are correct and you have a point too. I should get a chance to post what I did to get things to work within a few days.

On a quick note...

-- maeax and computezrmle are right: your ATLAS tasks aren't working correctly. To me it looks like Singularity isn't working properly. Based on your instructions, I'm guessing it wasn't installed correctly. I'd suggest you carefully go through the SingularityCE "Quick Start" up to (but not including) the "Overview of the SingularityCE Interface" section. https://sylabs.io/guides/3.9/user-guide/quick_start.html#

-- As mentioned above, the following entries are important, I'd definitely modify the PROXY one and add the CDN one.
CVMFS_HTTP_PROXY="auto;DIRECT"
CVMFS_USE_CDN=yes
20) Message boards : Theory Application : Unable to run native Theory in WSL2 Ubuntu 20.04 (Message 46031)
Posted 10 Jan 2022 by AndreyOR
Post:
With computezrmle's troubleshooting help I was able to find a solution. Basically one needs to add a line to the WSL2 configuration file (.wslconfig) to enable vsyscall emulation, which solves the memory access violation (exit code 139) problem. See https://docs.microsoft.com/en-us/windows/wsl/wsl-config for details on WSL2 configurations. A simple .wslconfig file that works for native Theory might look like this:

[wsl2]
memory=16GB
processors=8
kernelCommandLine = vsyscall=emulate
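
To double-check that a .wslconfig really carries the vsyscall setting, something like the following could be used. It's a rough Python sketch that treats the file as a plain INI file (an assumption about the format, based only on how the example above looks); the function name has_vsyscall_emulation is made up for illustration.

```python
import configparser

def has_vsyscall_emulation(wslconfig_text):
    """Return True if [wsl2] kernelCommandLine contains vsyscall=emulate."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. kernelCommandLine
    parser.read_string(wslconfig_text)
    # Only the first "=" separates key and value, so the value is "vsyscall=emulate".
    value = parser.get("wsl2", "kernelCommandLine", fallback="")
    return "vsyscall=emulate" in value.split()

# Same contents as the example file above.
SAMPLE = """\
[wsl2]
memory=16GB
processors=8
kernelCommandLine = vsyscall=emulate
"""
print(has_vsyscall_emulation(SAMPLE))
```

This only checks the file contents; whether the setting actually took effect is still best confirmed from inside WSL2 by looking at /proc/self/maps.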


©2022 CERN