1) Message boards : ATLAS application : Set process priority of VBoxHeadless process (Message 44349)
Posted 20 Feb 2021 by MPI für Physik
Post:
BOINC can check whether distinct processes are running and pause its own tasks.

Yes, but boinc only counts processes with niceness < boinc job niceness, i.e. processes with higher priority. We want to use this boinc feature, but we have to make sure that boinc jobs run with niceness > htcondor job niceness, because we want to give our local users' htcondor jobs priority over boinc jobs.

Global BOINC niceness can be set via the systemd service options
https://www.freedesktop.org/software/systemd/man/systemd.exec.html

Yes, but this does not determine the boinc job niceness, in particular if the jobs run in vbox VMs. In this case vbox starts the VM with a niceness taken from the vbox configs. For us the boinc jobs run with niceness = 35. But since we don't know who sets this number, or where, we worry that some change in boinc or the ATLAS@home job wrapper could change it; then our local users' htcondor jobs would not run and we would get grilled for it.

Some BOINC priority options can be influenced via cc_config.xml
https://boinc.berkeley.edu/wiki/Client_configuration

Yes, but they are already by default at "lowest priority", so no need to change them.

Vboxwrapper can't be configured. Its options are hardwired.

skluth@atlas246:~/boinc$ less /local/scratch/boinc/slots/0/vbox_replay.txt
...
VBoxManage -q controlvm "boinc_50e1e3a46f263637" cpuexecutioncap 75
...

And that line is only there when I set 75% CPU load in my account settings. So this setting propagates to the vboxwrapper somehow.

vbox can do this: "VBoxManage controlvm <vm-name> vm-process-priority default|flat|low|normal|high", so we could possibly use that to control what vbox is doing with the niceness. But that would need work on the vboxwrapper, so it's not an option now.

Our solution for now is to set the htcondor job niceness < boinc niceness, to make sure htcondor jobs push boinc jobs away. htcondor has a config parameter for that.
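
The ordering of nice values described above can be checked on a node. The following is only a minimal sketch: it assumes procps `ps` (for the `-C` option), the helper names `boinc_yields_to_condor` and `nice_values` are made up for illustration, and the process name `condor_exec.exe` in the usage comment is an assumed placeholder for whatever the local htcondor jobs are called.

```shell
#!/bin/sh
# Succeeds (exit 0) when every BOINC nice value is strictly greater than
# every HTCondor nice value, i.e. HTCondor jobs always have higher priority.
# $1: whitespace-separated nice values of HTCondor jobs
# $2: whitespace-separated nice values of BOINC VM processes
boinc_yields_to_condor() {
    condor_max=-20   # lowest nice value on Linux
    for n in $1; do
        [ "$n" -gt "$condor_max" ] && condor_max=$n
    done
    for n in $2; do
        [ "$n" -gt "$condor_max" ] || return 1
    done
    return 0
}

# Print the nice value (NI column) of every process with the given
# command name. Requires procps ps.
nice_values() {
    ps -o ni= -C "$1"
}

# Possible use on a pool node (process names are assumptions):
#   boinc_yields_to_condor "$(nice_values condor_exec.exe)" \
#                          "$(nice_values VBoxHeadless)" \
#       || echo "WARNING: BOINC VM is not nicer than the HTCondor jobs"
```

Keeping the comparison separate from the `ps` call means it can be tried with hand-written values before it is pointed at real processes.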
2) Message boards : ATLAS application : Set process priority of VBoxHeadless process (Message 44326)
Posted 17 Feb 2021 by MPI für Physik
Post:
Hi,

Since we also have htcondor jobs by local users on our pool, we would like the boinc jobs to suspend when htcondor jobs start. In order for this to work we must make sure that the "nice" value of the boinc job (VBoxHeadless) is larger than the "nice" value of the condor jobs.

We currently observe that boinc runs VBoxHeadless with nice P=35, but we have no way of controlling it (i.e. setting a different value) through ATLAS@home or boinc configs.

Now vbox has https://docs.oracle.com/en/virtualization/virtualbox/6.0/user/vboxmanage-controlvm.html

...
vm-process-priority default|flat|low|normal|high: Changes the priority scheme of the VM process. See Section 7.12, “VBoxManage startvm”.
...

So something like

VBoxManage controlvm <vm-name> vm-process-priority low

could be added to the vboxwrapper on request from a config?
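
Until the wrapper supports this, the same command could be applied from outside. This is a rough sketch under several assumptions: it must run as the user that owns the VMs, it relies on `VBoxManage list runningvms` printing one `"name" {uuid}` line per VM, and it only touches VMs whose name starts with `boinc_`. The function names are hypothetical.

```shell
#!/bin/sh
# Read `VBoxManage list runningvms` output on stdin (one line per VM in
# the form: "name" {uuid}) and print the names that start with "boinc_".
boinc_vm_names() {
    sed -n 's/^"\(boinc_[^"]*\)" {.*}$/\1/p'
}

# Set the process priority scheme of every running BOINC VM.
# $1: one of default|flat|low|normal|high
set_boinc_vm_priority() {
    VBoxManage list runningvms | boinc_vm_names | while read -r vm; do
        VBoxManage controlvm "$vm" vm-process-priority "$1"
    done
}

# Possible use, run as the boinc user:
#   set_boinc_vm_priority low
```

A setting applied this way would have to be repeated for every newly started VM, which is why a wrapper option would be preferable.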

Cheers, Stefan
3) Message boards : ATLAS application : Use existing Tier-2 cvmfs squid for boinc ATLAS@home hosts (Message 44325)
Posted 17 Feb 2021 by MPI für Physik
Post:
Hi

before we ramp up our pool of boinc hosts for ATLAS we would like to set up the cvmfs squid.

I had a look at the threads dealing with this, but they are not very helpful for us, since they cover setting up your own squid instance. We already have a cvmfs squid for our local WLCG Tier2 cluster, and the obvious plan is to make our boinc hosts connect to it.

Does anybody have a working configuration for this scenario and could share it?

Thanks a lot, Stefan
4) Message boards : ATLAS application : Suse tumbleweed boinc can't start virtualbox VM (Message 44321)
Posted 16 Feb 2021 by MPI für Physik
Post:
It was simple: boinc needs to be a member of group vboxusers in order for the vbox handling to work.

A simple fix is to add user boinc to group vboxusers, and restart the boinc daemon.

For a large pool of machines that doesn't really scale, so we let systemd do the work with

skluth@atlas246:~$ cat /etc/systemd/system/boinc-client.service
...
[Service]
...
User=boinc
SupplementaryGroups=vboxusers
...
That makes systemd start boinc as a member of group vboxusers. After that it worked fine. It's a simple config file, and thus easily distributed.
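
An alternative to distributing a full unit file is a systemd drop-in that carries only the extra setting. The sketch below assumes the unit is named boinc-client.service as above. The drop-in file name `vboxusers.conf` and the function name are arbitrary choices for illustration.

```shell
#!/bin/sh
# Write a drop-in that adds the vboxusers group to the BOINC service.
# $1: systemd unit directory (normally /etc/systemd/system)
write_boinc_dropin() {
    dropin_dir="$1/boinc-client.service.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/vboxusers.conf" <<'EOF'
[Service]
SupplementaryGroups=vboxusers
EOF
}

# Possible use on a pool node, as root:
#   write_boinc_dropin /etc/systemd/system
#   systemctl daemon-reload
#   systemctl restart boinc-client.service
```

With a drop-in, the unit file shipped by the distribution stays untouched, so package updates do not overwrite the change.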
5) Message boards : ATLAS application : Suse tumbleweed boinc can't start virtualbox VM (Message 44304)
Posted 12 Feb 2021 by MPI für Physik
Post:
Hi,

we are setting up shop on our new machines with Suse tumbleweed. We install boinc and vbox as usual, and we get tasks after joining the machine to my account.

However, they all fail with

...
2021-02-12 16:57:13 (62116): Starting VM. (boinc_1ce17a18ec64a970, slot#0)
2021-02-12 16:57:13 (62116): Error in start VM for VM: -2135228411
Command:
VBoxManage -q startvm "boinc_1ce17a18ec64a970" --type headless
Output:
VBoxManage: error: Could not launch the VM process for the machine 'boinc_1ce17a18ec64a970' (VERR_ACCESS_DENIED)
VBoxManage: error: Details: code VBOX_E_IPRT_ERROR (0x80bb0005), component MachineWrap, interface IMachine, callee nsISupports
VBoxManage: error: Context: "LaunchVMProcess(a->session, sessionType.raw(), ComSafeArrayAsInParam(aBstrEnv), progress.asOutParam())" at line 726 of file VBoxManageMisc.cpp
...

This is AFAICT the same error as here

https://forums.virtualbox.org/viewtopic.php?f=3&t=99518

https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5499#43250

We tried

skluth@atlas246:~$ groups boinc
boinc : boinc vboxusers

but it does not fix the problem.
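
Note that `groups boinc` reads the group database, while a running process keeps the supplementary groups it was started with. One way to see what the running client actually has is the "Groups:" line in /proc/<pid>/status. The following is a sketch only: the function names are hypothetical, and the client process name `boinc` in the usage comment is an assumption that may differ between distributions.

```shell
#!/bin/sh
# Read a /proc/<pid>/status file on stdin and print the numeric
# supplementary group IDs from its "Groups:" line, space-separated.
status_groups() {
    awk '/^Groups:/ { $1 = ""; sub(/^ +/, ""); print }'
}

# Succeeds if the process with PID $1 carries group $2 as a supplementary
# group. Returns 2 if the group name cannot be resolved.
process_has_group() {
    gid=$(getent group "$2" | cut -d: -f3)
    [ -n "$gid" ] || return 2
    for g in $(status_groups < "/proc/$1/status"); do
        [ "$g" = "$gid" ] && return 0
    done
    return 1
}

# Possible use (process name "boinc" is an assumption):
#   process_has_group "$(pgrep -o -x boinc)" vboxusers \
#       || echo "running client is not in vboxusers"
```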

It looks like something Suse t-weed specific in how boinc and vbox are configured.

Any great ideas?

Cheers, Stefan
6) Message boards : ATLAS application : Console monitoring (Message 29723)
Posted 30 Mar 2017 by MPI für Physik
Post:
Thank you David!

Unfortunately I saw it during the whole task, on each PC, so there was a huge number of e-mails.
I will check whether it is fine now.
7) Message boards : ATLAS application : Console monitoring (Message 29709)
Posted 29 Mar 2017 by MPI für Physik
Post:
It seems that the new information output also produces a lot of mails.
Every time an event is processed you run a grep on the events, but the location is wrong, so the postmaster sends a mail every time.


Subject: Cron <root@localhost> grep -h "Event nr" /home/atlas01/RunAtlas/Panda_Pilot_*/PandaJob_*/athenaMP-workers-EVNTtoHITS-sim/worker_*/AthenaMP.log|sort > /dev/tty2
grep: /home/atlas01/RunAtlas/Panda_Pilot_*/PandaJob_*/athenaMP-workers-EVNTtoHITS-sim/worker_*/AthenaMP.log: No such file or directory

It would be great if that could be fixed!
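
Independent of correcting the path, the mails themselves could be avoided by keeping grep quiet about missing files, since cron mails whatever the job prints on stderr. A minimal sketch, assuming the cron job is free to call a small helper. The function name `event_lines` is made up for illustration.

```shell
#!/bin/sh
# Print the "Event nr" lines from the given log files, sorted.
# -h omits file names, -s suppresses error messages about nonexistent or
# unreadable files, so an unmatched glob leaves stderr empty.
event_lines() {
    grep -hs "Event nr" "$@" | sort
}

# Possible use in the cron job (path copied from the mail above):
#   event_lines /home/atlas01/RunAtlas/Panda_Pilot_*/PandaJob_*/athenaMP-workers-EVNTtoHITS-sim/worker_*/AthenaMP.log > /dev/tty2
```

This only hides the symptom. The grep still finds nothing until the location is fixed.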


