1) Questions and Answers : Getting started : "Postponed: VM job unmanageable, restart later." Status Message (Message 37271)
Posted 7 Nov 2018 by Juha
Post:
Check that in BOINC Manager. Click Options -> Computing Preferences -> Disk and Memory, in the Memory box check settings for "When computer is in use, use at most__%" and "When computer is not in use, use at most__%". Set them both to at least 95%. Remember that even if those are set to 100% BOINC won't necessarily gobble up all the RAM and starve other apps.


That's actually bad advice. Those settings should be set so that there is enough memory left for the non-BOINC apps you wish to run. If you allow BOINC to use 95% of RAM and then try to use other programs, you'll drive the system to the swapping you like to talk about so much.

And if you run VM tasks, using 95% or more of RAM for BOINC is actually dangerous. As far as I can tell, the memory VirtualBox uses for virtual machines is locked in physical memory, so it won't be swapped out. If you use all the RAM on the host for virtual machines you'll end up crashing the host. Been there, done that.
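
To put rough numbers on that, here is a minimal Python sketch of the arithmetic. The function name and the 8 GB host are made up for the example; it only shows how much RAM a given percentage leaves for everything else, not how BOINC itself enforces the limit.

```python
def ram_left_for_other_apps(total_mb: int, boinc_percent: float) -> float:
    """Return the RAM (in MB) that remains if BOINC uses its full allowance."""
    if not 0 <= boinc_percent <= 100:
        raise ValueError("boinc_percent must be between 0 and 100")
    return total_mb * (100 - boinc_percent) / 100


# Hypothetical 8 GB host, used purely as an example.
for percent in (50, 75, 95):
    left = ram_left_for_other_apps(8192, percent)
    print(f"BOINC at {percent}%: {left:.0f} MB left for the OS and other programs")
```

If the VM memory really is locked in physical RAM, whatever is left over has to cover the OS and every other program without help from swap.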
2) Questions and Answers : Getting started : Broken Community / Website?! (Message 37183)
Posted 2 Nov 2018 by Juha
Post:
My authenticator has the same pattern as this one (all characters changed):
<authenticator>uhr43kref02d342f3424rgw25o4q42n</authenticator>


Excellent. The form with a number and an underscore in front is a weak authenticator and can't be used to log in to the website. But the kind of authenticator you have can be used to do that.

For projects that use the basic website code:

Set cookie with name=auth and value=your_authenticator. The name must be exactly auth.
Opera: Menu -> Developer -> Developer tools -> Application tab -> left side Storage -> Cookies -> domain. Double click on empty row and fill Name and Value.
Firefox: Hamburger menu -> Web Developer -> Storage Inspector -> left side Cookies -> domain. Click plus and fill Name and Value.

Or go to project_url/login_auth.php and fill in Authenticator.

For projects that use Drupal website code (Einstein):

Go to Login -> Request new password -> authenticator-based login (project_url/user/login/auth) and fill in Log in with authenticator.

If those don't work or have been taken away:
Go to project_url/account_finish.php?auth=your_authenticator .


Once you have logged in to the SU managed account you could try changing the email and password to what you like. SU will use the authenticator to manage the account and you can't change the authenticator. I don't know what happens if you decide to leave SU for good and delete the SU account. Will SU then delete all the accounts it created, or will it just forget about them?

The accounts SU creates are supposed to be sort of anonymous. The concept of anonymous accounts is not yet well defined and implemented but the idea is that those accounts won't store anything that GDPR regulates. I haven't got the foggiest idea what happens when you log in to an anonymous account and then make it less anonymous. You might need to set an email and password, log out and back in with the email and password so that you can accept terms of use to make the account non-anonymous. It's also possible that it doesn't work quite right.

Totally unsupported all of this. Break something and you own all the pieces. If somebody asks I didn't tell you anything.
3) Questions and Answers : Getting started : Broken Community / Website?! (Message 37174)
Posted 2 Nov 2018 by Juha
Post:
BOINC has a bit of a problem with spammers. Climate Prediction for instance just removed over 1,4 million (!!) spam accounts. There is really no way for a human to clean those up, so an automated solution is needed and used. Your account doesn't have credits and doesn't have any computers added. If that's not enough, throw in a post with a link or two and you will look like a spammer.

To prevent your latest account from being deleted add a computer to it and if possible get some credits. You may need to temporarily leave Science United, remove LHC and re-add LHC but with the account you now have. If you then want to go back to Science United you'll probably need to remove LHC first so that SU can then manage it again.

As for taking over the SU-created account: open account_lhcathome.cern.ch_lhcathome.xml in BOINC's data directory and tell me what kind of authenticator is there. Is it like

<authenticator>123abc</authenticator>

or like

<authenticator>456_123abc</authenticator>

Don't copy it here!
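
If you'd rather not eyeball it, a minimal Python sketch like the one below can tell the two forms apart without printing the value. It assumes the account file parses as well-formed XML; the function name, the <account> wrapper in the samples, and the label "regular" for the non-weak kind are only for illustration.

```python
import re
import xml.etree.ElementTree as ET


def authenticator_kind(account_xml: str) -> str:
    """Classify the authenticator in account-file text without echoing it."""
    root = ET.fromstring(account_xml)
    node = root if root.tag == "authenticator" else root.find(".//authenticator")
    if node is None or not (node.text or "").strip():
        return "missing"
    # A number followed by an underscore in front marks the second form.
    return "weak" if re.match(r"\d+_", node.text.strip()) else "regular"


# Made-up sample values with the same shapes as the two examples above.
print(authenticator_kind("<account><authenticator>123abc</authenticator></account>"))
print(authenticator_kind("<account><authenticator>456_123abc</authenticator></account>"))
```

To use it on the real file, read the file's text and pass it to authenticator_kind(); only the word "weak", "regular" or "missing" is printed.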
4) Message boards : ATLAS application : Atlas tasks "Postponed: VM job unmanageable, restarting later." (Message 34513)
Posted 28 Feb 2018 by Juha
Post:
Atlas uses an older version of vboxwrapper that doesn't support VirtualBox 5.2 through the VirtualBox COM API. So vboxwrapper falls back to using VBoxManage. But this old version doesn't work correctly with VBoxManage either.

Solution: If you are on Windows and want to run Atlas, stick to VirtualBox 5.1.
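
A quick way to check which side of that line an installation is on is to look at what "VBoxManage --version" prints. The Python sketch below only parses such a string; the function names and the sample version strings are illustrative, and treating "5.1 or older" as safe is my reading of the advice above, not something vboxwrapper checks this way.

```python
import re


def vbox_major_minor(version_output: str):
    """Extract (major, minor) from text such as '5.1.30r118389'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_output!r}")
    return int(match.group(1)), int(match.group(2))


def ok_for_old_vboxwrapper(version_output: str) -> bool:
    """True when the version is not newer than 5.1."""
    return vbox_major_minor(version_output) <= (5, 1)


# Illustrative strings in the format VBoxManage prints.
for sample in ("5.1.30r118389", "5.2.6r120293"):
    print(sample, "->", "fine" if ok_for_old_vboxwrapper(sample) else "too new")
```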
5) Questions and Answers : Getting started : Can't create LHC@home account (Message 34016)
Posted 21 Jan 2018 by Juha
Post:
I think this is something for admins to figure out.

From BOINC forums:

Well, I tried to create an account at LHC@home today, and failed. Attempts were made using both BOINC Manager and the LHC web site. Both said to check my email address and password and try again. I did, several times. Nothing seemed funny about the password (~20 char. {A-Z}[a-z][0-9]), and it's a new account, so I'm guessing the issue is the email address, which contains an underscore (_). Could that be the problem? It's an account that I've used for several years in general, and also used to set up all my other BOINC accounts. Is there something different about the LHC validity check that doesn't allow valid email addresses?

Since I don't have an account, I was not able to post to the LHC forum, so I'm hoping someone will see it here.

Ardis

Win10, BOINC Mgr 7.8.3


The simplest solution first, check that your email address isn't already registered: https://lhcathome.cern.ch/lhcathome/get_passwd.php


Yeah, I tried that, but I just tried it again to be sure. Email address not registered.

So I started playing with the password: 18 characters, 16... all the way down to six, same error message - "check your email and password and try again." Then I tried a 5 char. password (minimum is 6) and got a different error message about the password being too short. All this would suggest that it's the email address that is causing the problem, not the password.

I guess I could try to register with a different email account, but I'd rather not, and would I be able to change it later? No clue what the issue might be with the email address, other than the underscore. Why would LHC have different requirements than the other projects, and why would they reject a valid address?


http://boinc.berkeley.edu/dev/forum_thread.php?id=12254
6) Message boards : Sixtrack Application : AVX Sixtrack version (Message 33987)
Posted 20 Jan 2018 by Juha
Post:
@Erich56

Your hosts:

ID: 10388905
GenuineIntel Intel(R) Core(TM) i7-4930K CPU @ 3.40GHz [Family 6 Model 62 Stepping 4] (12 processors)
[2] NVIDIA GeForce GTX 980 Ti (4095MB) driver: 361.75
Microsoft Windows XP Professional x64 Edition, Service Pack 2, (05.02.3790.00)

Processor supports AVX but OS doesn't. IIRC Win 7 SP1 was the first one to support AVX.

ID: 10450564
GenuineIntel Intel(R) Core(TM)2 Quad CPU Q9550 @ 2.83GHz [Family 6 Model 23 Stepping 10] (4 processors)
NVIDIA GeForce GTX 750 Ti (2048MB) driver: 388.13 OpenCL: 1.2
Microsoft Windows 10 Professional x64 Edition, (10.00.16299.00)

OS supports AVX but CPU doesn't.

ID: 10452404
GenuineIntel Intel(R) Core(TM) i5 CPU M 480 @ 2.67GHz [Family 6 Model 37 Stepping 5] (4 processors)
AMD ATI Radeon HD 5400/R5 210 series (Cedar) (512MB) driver: 1.4.696
Microsoft Windows 7 Professional x64 Edition, Service Pack 1, (06.01.7601.00)

Again, OS supports AVX but processor doesn't.


@Others

Any Linux user having problems, check /proc/cpuinfo. That's where BOINC gets its information from.
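
For example, a minimal Python sketch that looks for the avx flag in /proc/cpuinfo-style text. The function name and the sample text are made up; on a real host you would pass in the contents of /proc/cpuinfo instead.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


# Shortened, made-up sample; a real flags line is much longer.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 avx\n"
print("AVX reported" if "avx" in cpu_flags(sample) else "no AVX reported")
```

On the host itself: cpu_flags(open("/proc/cpuinfo").read()).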


edit: Toby beat me by 49 seconds :)
7) Message boards : ATLAS application : ATLAS native app (Message 32937)
Posted 30 Oct 2017 by Juha
Post:
Back when I was running the native app on Mint I had changed BOINC to report kernel version as 3.13.0-123-generic.not.really.sl6. I changed the kernel version back to what it really is two or three weeks ago and haven't seen any native tasks since then.
8) Message boards : LHCb Application : LHCb application detects wrong Boinc version (Message 32762)
Posted 10 Oct 2017 by Juha
Post:
2017-10-09 10:22:46 (3196): Detected: BOINC client v7.7


It's a misunderstanding on the coder's part. While the message talks about the client, it's actually the BOINC API version used in vboxwrapper. It's already fixed but the fix has not made it to the vboxwrapper used here.

@Philippe, or everyone really

BOINC 7.8.2 has a bug in cleaning slot directories. Upgrade to 7.8.3. Technically 7.8.3 is still in testing but I think it will be promoted to recommended status soon.
9) Message boards : ATLAS application : 'drive limit' error (Message 32730)
Posted 9 Oct 2017 by Juha
Post:
@Thomas

Your issue is different and is discussed in the "Huge input file!" and "WOW 1000 / 5000 events in one WU ? !" threads.
10) Message boards : ATLAS application : 'drive limit' error (Message 32717)
Posted 9 Oct 2017 by Juha
Post:
Could you share stderr of a task that failed with this error?

The BOINC source code doesn't contain the string "drive limit reached" anywhere.
11) Message boards : ATLAS application : ATLAS native app (Message 32630)
Posted 4 Oct 2017 by Juha
Post:
The startup script runs these commands to check CVMFS:

cvmfs_config probe
cvmfs_config stat atlas.cern.ch
cvmfs_config stat atlas-condb.cern.ch


The result of the first command is not checked. If the second or third command fails for any reason you get the "CVMFS not found" error. I don't know exactly why the connection to atlas.cern.ch would fail while the connection to atlas-condb.cern.ch succeeds.
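
The logic described above can be sketched in Python roughly like this. It is a simplified illustration, not the actual startup script: the function names and the injectable run parameter are my own, added so the check can be tried without CVMFS installed.

```python
import os


def cvmfs_ok(run=os.system) -> bool:
    """Run the three commands; only the two stat results decide the outcome."""
    run("cvmfs_config probe")  # exit status deliberately not checked
    ret1 = run("cvmfs_config stat atlas.cern.ch")
    ret2 = run("cvmfs_config stat atlas-condb.cern.ch")
    return ret1 == 0 and ret2 == 0


def fake_run(command: str) -> int:
    """Stand-in for os.system that pretends only atlas.cern.ch fails."""
    return 1 if command.endswith("stat atlas.cern.ch") else 0


if not cvmfs_ok(fake_run):
    print("CVMFS not found")
```

Calling cvmfs_ok() with no argument runs the real commands through os.system.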

There could be some clues in syslog. I get messages like these for all CVMFS repositories the app accesses:

Oct  3 17:08:23 mint cvmfs2: (atlas.cern.ch) failed to resolve IP addresses for ca-proxy.cern.ch (4 - unknown host name)
Oct  3 17:08:23 mint cvmfs2: (atlas.cern.ch) geographic order of servers retrieved from cernvmfs.gridpp.rl.ac.uk
Oct  3 17:08:32 mint cvmfs2: (atlas.cern.ch) CernVM-FS: linking /cvmfs/atlas.cern.ch to repository atlas.cern.ch



For completeness, the other checks the script runs are:

singularity --version
singularity exec -B /cvmfs /cvmfs/atlas.cern.ch/repo/images/singularity/x86_64-slc6.img hostname


These are run on non SL6 hosts. If they fail the error message is "Singularity is not installed" for the first one and "Singularity isnt working..." for the second.
12) Message boards : LHC@home Science : Sixtrack apps - Bad "api_version" - BOINC 7.8.2 display grids failing (Message 32426)
Posted 14 Sep 2017 by Juha
Post:
Sorry to have kept you waiting for an answer.

<api_version> is supposed to contain only API version number:

<api_version>7.7.0</api_version>


So it's wrong in the 32-bit app version too.
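
A simple sanity check for that element could look like the Python sketch below. The three-part pattern is an assumption based on the 7.7.0 example; the function name is made up and this is not how the BOINC client or server validates the field.

```python
import re

# Assumed shape: three dot-separated numbers and nothing else.
API_VERSION_RE = re.compile(r"\d+\.\d+\.\d+")


def api_version_is_clean(value: str) -> bool:
    """True when the element text is just a dotted version number like 7.7.0."""
    return API_VERSION_RE.fullmatch(value) is not None


print(api_version_is_clean("7.7.0"))
print(api_version_is_clean("7.7.0\nAPI_VERSION\nLPAPI_VERSION"))
```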

This is now documented in #2121 and fixed in #2122

since we declared as deprecated the 45107 exes, then only the 4630 should be distributed, and I see this is happening; so I guess that the issue that you raise is less than a concern, isn't it?


It's not a critical issue. At some point someone is going to notice these in the log:

12-Sep-2017 02:53:45 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe
13-Sep-2017 03:15:26 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe
13-Sep-2017 14:15:32 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe
14-Sep-2017 02:47:09 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe
14-Sep-2017 15:40:46 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe


And then they wonder what's going on. That's all.
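
If you want to spot the pattern yourself, a small Python sketch like this counts how often each file shows up in "Started download" lines. The function name is made up and the sample is just two of the lines quoted above; feed it your own event log text instead.

```python
import re
from collections import Counter

DOWNLOAD_RE = re.compile(r"Started download of (\S+)")


def download_counts(log_text: str) -> Counter:
    """Count how many times each file name appears in 'Started download' lines."""
    return Counter(DOWNLOAD_RE.findall(log_text))


sample_log = (
    "12-Sep-2017 02:53:45 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe\n"
    "13-Sep-2017 03:15:26 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe\n"
)
for name, count in download_counts(sample_log).items():
    if count > 1:
        print(f"{name} was downloaded {count} times")
```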
13) Message boards : LHC@home Science : Sixtrack apps - Bad "api_version" - BOINC 7.8.2 display grids failing (Message 32398)
Posted 12 Sep 2017 by Juha
Post:
I got some SixTrack tasks and this is in sched_reply:

<app_version>
    <app_name>sixtrack</app_name>
    <version_num>4630</version_num>
    <api_version>7.7.0
API_VERSION
LPAPI_VERSION</api_version>
<file_ref>
    <file_name>sixtrack_win64_4630_sse2.exe</file_name>
    <main_program/>
</file_ref>
    <platform>windows_x86_64</platform>
    <plan_class>sse2</plan_class>
    <avg_ncpus>1.000000</avg_ncpus>
    <max_ncpus>1.000000</max_ncpus>
    <flops>11772905691.719425</flops>
</app_version>


The <api_version> garbage in client_state.xml is truncated because the client stores the API version in a string of at most 16 bytes.

IIRC, <app_version> is copied from the XML blob in the DB as is. Do you use update_version or something else to add app versions?



Not related to the api_version but I'll mention it anyway because you are going to get questions about it. The previous SixTrack version was 451.07 and the new version is 46.30, that is, lower than the previous one. The client keeps only the app version that has the highest version number. Ones with lower version numbers are deleted as soon as no task refers to them.

Because of that the client keeps re-downloading the app's files over and over again if at any moment it runs out of SixTrack tasks. The only way out of it is to reset the project so that the client forgets about the 451.07 version, or for you to re-release 46.30 with a higher version number.
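
The comparison comes down to plain integers. Here is a minimal Python sketch, assuming the version number is major * 100 + minor, which matches the 45107 and 4630 seen in this thread; the helper name is made up.

```python
def version_num(major: int, minor: int) -> int:
    """Combine major and minor into the single integer used for comparison."""
    return major * 100 + minor


old = version_num(451, 7)   # 451.07
new = version_num(46, 30)   # 46.30
print(f"old={old} new={new} kept={max(old, new)}")
```

Since 4630 is smaller than 45107, a client that still knows about the old version treats it as the one to keep.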
14) Message boards : LHC@home Science : Sixtrack apps - Bad "api_version" - BOINC 7.8.2 display grids failing (Message 32391)
Posted 11 Sep 2017 by Juha
Post:
Could you check how it's in sched_reply_lhcathome.cern.ch_lhcathome.xml? You need to have SixTrack tasks assigned in that reply for the app_version to be included.
15) Message boards : ATLAS application : Failed Tasks with cvmf_config not found error (Message 32375)
Posted 9 Sep 2017 by Juha
Post:
These are ATLAS tasks that run without VirtualBox. The app is still in testing so you probably have "Run test applications" selected in your project preferences. If you want to run only SixTrack and VM apps you can deselect test apps.

If you are interested in running these tasks then you need to install CVMFS and Singularity. These may be available in your distro's package manager. They may need some additional configuration besides just being installed.

See the discussion in the "New ATLAS app version released for Linux hosts" and "ATLAS native app" threads.
16) Message boards : ATLAS application : ATLAS native app (Message 32355)
Posted 8 Sep 2017 by Juha
Post:
Since there are different tasks in the queue right now, it is normal that there is a variability in run times.


Thanks. I haven't run enough ATLAS tasks to know all these details.

What do you mean with "tons of memory"?


The host has only 3 GB of RAM. 1,7 GB of it is, well, maybe not a ton but still quite a large fraction.
17) Message boards : News : Deadline change for ATLAS jobs (Message 32353)
Posted 8 Sep 2017 by Juha
Post:
I think admins have said that the VM and native tasks come from the same pool and therefore have the same deadlines.

edit: And in the first post it says one week. Guess they have really urgent tasks, then. :)
18) Message boards : ATLAS application : ATLAS native app (Message 32337)
Posted 7 Sep 2017 by Juha
Post:
Based on this, we released a new version v2.51, so you do not need to hack the wrapper to run it on ubuntu..


Thanks. One 2.51 running fine without hacking. It's running super long though. The previous task ran for 15 hours and this one looks like it's going to run at least twice as long. Is there normally that much variability in run times? Both tasks have 50 events so it's not that.



Unfortunately I have to report a bug in the app. Knowing that this app uses a ton of memory I had set task switch time to 20 hours so that only one task is in memory at a time.

Once the 20 hours were up, BOINC suspended the app and started a task for another project. But athena.py is still running, now sharing a core with the other task.
19) Message boards : ATLAS application : ATLAS native app (Message 32183)
Posted 1 Sep 2017 by Juha
Post:
First telling that there is a new app version to test and then correcting that it's for SL6/7 only. Well, challenge accepted :P

To run it on Mint 17 = Ubuntu 14.04 a few things were needed.
- CVMFS from their repo
- Singularity from NeuroDebian
- BOINC client that not only tells that it runs on Mint but carefully points out that it's not SL7 :)

From the configuration script I manually applied those CVMFS, FUSE and Singularity settings that were available with the versions from the repos. I had to change the Singularity settings below because they didn't work on my system, perhaps because the kernel is missing some module:

mount hostfs = no
enable overlay = no


The configuration script sets FUSE option "user_allow_other" which may be a security issue on shared machines. It would be a good idea to warn users of that.

On Debian-based systems /bin/sh is Dash, which doesn't like the way the output redirects are written in the Atlas startup script, so a few changes were needed. I could have corrected them but there was something weird going on and I decided to just drop them. I also had to change the script so that it would accept "--nthreads 1" to run only one worker instead of the default eight. The changes are below:

14,16c14,16
<   os.system("cvmfs_config probe 1>&2>/dev/null")
<   ret1=os.system("cvmfs_config stat atlas.cern.ch 1>&2>/dev/null")
<   ret2=os.system("cvmfs_config stat atlas-condb.cern.ch 1>&2>/dev/null")
---
>   os.system("cvmfs_config probe")
>   ret1=os.system("cvmfs_config stat atlas.cern.ch")
>   ret2=os.system("cvmfs_config stat atlas-condb.cern.ch")
28c28
<   ret=os.system("singularity --version 1>&2>/dev/null")
---
>   ret=os.system("singularity --version")
132,135c132,134
<   if int(THREADS)!=1:
<     prefix="export ATHENA_PROC_NUMBER=%s;"%THREADS
<     sys.stderr.write(prefix)
<     os.system("sed -i -e '/set -x/a\%s' start_atlas.sh"%prefix)
---
>   prefix="export ATHENA_PROC_NUMBER=%s;"%THREADS
>   sys.stderr.write(prefix)
>   os.system("sed -i -e '/set -x/a\%s' start_atlas.sh"%prefix)


After making the edits I edited client_state.xml, corrected the file size, removed <signature_required/> and <file_signature> tags and also added <md5_cksum> just so that the client has one less reason to reject the edited file.
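
The two values needed for that, the file size and the MD5 checksum, can be computed with a few lines of Python. This is a minimal sketch; the function name is made up and the sample bytes stand in for the contents of the edited script.

```python
import hashlib


def size_and_md5(data: bytes):
    """Return (size in bytes, MD5 hex digest) for the given file contents."""
    return len(data), hashlib.md5(data).hexdigest()


# Stand-in contents; read the edited file in binary mode for the real values.
size, digest = size_and_md5(b"hello")
print(size, digest)
```

For a real file: size_and_md5(open(path, "rb").read()), where path points at the edited file in the project directory.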

After these changes my host (old and slow) has happily run the tasks, total three so far.

About the memory usage. BOINC and the app itself have reported about 1,8 GB for all tasks. The first two tasks were run with eight workers. The highest memory usage for the entire system that I saw was about 4 GB. I think file system cache size was negligible at that point but I'm not absolutely certain of that. If I account 1 GB for the OS and other programs (and that's being generous) then Atlas used about 3 GB.

The third task was run with only one worker. The highest memory usage for the entire system was about 2,5 GB. Giving 1 GB for the OS again leaves 1,5 GB for Atlas which seems more consistent with what BOINC and the app report.

This is on a 3 GB machine by the way. I have it configured with plenty of swap but the only time there was a significant amount of disk activity was when the workers were started. After that the swap usage increased slowly, not often enough to slow down computing.
20) Message boards : ATLAS application : ATLAS native app (Message 32126)
Posted 27 Aug 2017 by Juha
Post:
The scratch directory is cvmfs' cache directory. See Cache Settings. You can check your current configuration with:

cvmfs_config showconfig atlas.cern.ch
...
CVMFS_CACHE_BASE=/scratch/cvmfs    # from /etc/cvmfs/default.local
CVMFS_QUOTA_LIMIT=4096    # from /etc/cvmfs/default.local
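
If you want to pull those two settings out of the output in a script, a minimal Python sketch could look like this. The function name is made up, and it assumes every setting is a KEY=value line with an optional "# from ..." note, as in the output above; a value that itself contains "#" would be cut short.

```python
def parse_showconfig(text: str) -> dict:
    """Map each KEY=value line to its value, dropping the '# from ...' note."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


sample = (
    "CVMFS_CACHE_BASE=/scratch/cvmfs    # from /etc/cvmfs/default.local\n"
    "CVMFS_QUOTA_LIMIT=4096    # from /etc/cvmfs/default.local\n"
)
config = parse_showconfig(sample)
print(config["CVMFS_CACHE_BASE"], config["CVMFS_QUOTA_LIMIT"])
```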


For me, the difference between empty cache and all needed files in cache is about 40 minutes. I definitely want the files cached.

