Message boards : Number crunching : Work unit hold a slot but doing nothing
pls

Joined: 22 Oct 07
Posts: 27
Credit: 808,821
RAC: 0
Message 37075 - Posted: 22 Oct 2018, 6:29:56 UTC

I just saw the notice on Sixtrack News about lack of work. Ok. I'd like to talk about the lack of work on LHCb and the vbox projects.

Lack of work on those projects has a greater impact than lack of work on Sixtrack, because the vbox projects hold a BOINC slot while waiting for work. Not only is the vbox project doing nothing, it's also preventing BOINC from running a different project. It also holds a slot during the very low-value tasks of uploading and downloading, which BOINC could do nicely without holding a slot.

I just looked at my BOINC processes. I am running 6 CPU slots, each occupied by an LHC vbox task, but only 2 are using CPU. 3 active is probably more common, but that's still not very efficient.
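To put a number on that, here is a minimal sketch of the arithmetic. The function name `slot_utilization` is made up for illustration and is not part of BOINC; it simply divides the slots doing real work by the slots being held.

```python
def slot_utilization(occupied_slots: int, busy_slots: int) -> float:
    """Fraction of occupied slots that are actually using CPU.

    Hypothetical helper for illustration only, not a BOINC API.
    """
    if occupied_slots <= 0:
        raise ValueError("occupied_slots must be positive")
    if not 0 <= busy_slots <= occupied_slots:
        raise ValueError("busy_slots must be between 0 and occupied_slots")
    return busy_slots / occupied_slots


# The case described above: 6 slots held, 2 using CPU.
print(f"{slot_utilization(6, 2):.0%}")  # 33%
# The "more common" case: 6 slots held, 3 using CPU.
print(f"{slot_utilization(6, 3):.0%}")  # 50%
```

So in the observed case about two thirds of the held slots are idle, and even in the more common case half of them are.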

I understand why you use vbox and Linux, and I have no complaint with that. But I wish the project were set up to work in a more typical BOINC fashion:
1. BOINC downloads input files.
2. Spin up a VM and process the input files.
3. Exit the VM.
4. BOINC uploads the results.
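
As a rough sketch of that flow, the Python below records when a slot would be held. Everything here is hypothetical (`held_slot`, `run_work_unit` and the log strings are invented for illustration and are not how BOINC or the vbox wrapper is implemented); the point is only that steps 1 and 4 happen outside the slot.

```python
from contextlib import contextmanager


@contextmanager
def held_slot(log):
    """Record when a (hypothetical) BOINC slot is taken and given back."""
    log.append("slot acquired")
    try:
        yield
    finally:
        log.append("slot released")


def run_work_unit(inputs, process, log):
    """Run the four steps, holding the slot only while the VM is alive."""
    log.append("download inputs")        # step 1, no slot held
    with held_slot(log):
        log.append("VM started")         # step 2
        results = [process(item) for item in inputs]
        log.append("VM exited")          # step 3
    log.append("upload results")         # step 4, no slot held
    return results


log = []
results = run_work_unit(["event-a", "event-b"], str.upper, log)
print(results)
print(log)
```

The log shows the slot being acquired after the download and released before the upload, which is the ordering I am asking for.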

I'm donating my machine resources on the assumption that they are being used productively. Holding slots while doing nothing or almost nothing isn't making good use of my resources.

++PLS
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1608
Credit: 94,645,489
RAC: 98,633
Message 37076 - Posted: 22 Oct 2018, 7:15:24 UTC - in response to Message 37075.  

ATLAS already behaves like you request it (1..4).

Theory's intermediate results are very small.
As long as you don't get a bad Sherpa job it should use nearly 100% of your CPU.

LHCb:
Very annoying!
I can't remember any sustained efficiency improvement or any comment from the LHCb team, although a couple of volunteers have asked for it again and again.
pls

Joined: 22 Oct 07
Posts: 27
Credit: 808,821
RAC: 0
Message 37147 - Posted: 1 Nov 2018, 4:16:38 UTC - in response to Message 37076.  

> ATLAS already behaves like you request it (1..4).
>
> Theory's intermediate results are very small.
> As long as you don't get a bad Sherpa job it should use nearly 100% of your CPU.



Not really.

I've had multi-CPU tasks for ATLAS and Theory running. Comparing CPU time with elapsed time, and checking on CPU used, both go long periods using zero CPU. I think both are holding resources while waiting for tasks to become available; in fact, they hold 4 or 6 slots while doing nothing but waiting.

My whole point was that this is not an efficient way to work. Let BOINC download the files, and don't spin up a 6-CPU VM unless at least 6 tasks are downloaded and waiting. And when those tasks are done, leave politely.
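
The rule I have in mind is simple enough to sketch. This assumes the number of downloaded, waiting tasks is known before the VM starts, which may not match how the projects actually fetch work; `should_start_vm` is a made-up name, not an existing BOINC function.

```python
def should_start_vm(waiting_tasks: int, vm_cpus: int) -> bool:
    """Start a VM only if there is enough queued work to fill every CPU.

    Hypothetical rule for illustration; assumes the count of waiting
    tasks is available before the VM is started.
    """
    if vm_cpus <= 0:
        raise ValueError("vm_cpus must be positive")
    return waiting_tasks >= vm_cpus


print(should_start_vm(waiting_tasks=6, vm_cpus=6))  # True
print(should_start_vm(waiting_tasks=2, vm_cpus=6))  # False
```

With only 2 tasks waiting, a 6-CPU VM would not be started, so the other 4 slots stay free for BOINC to give to another project.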

++PLS
bronco

Joined: 13 Apr 18
Posts: 443
Credit: 8,438,885
RAC: 0
Message 37150 - Posted: 1 Nov 2018, 4:57:00 UTC - in response to Message 37147.  

You're right. They do hold resources while waiting for sub-tasks to be made available. It's not efficient, but that's not going to change. Single-core tasks are said to be considerably more efficient than 6-core tasks.
Henry Nebrensky

Joined: 13 Jul 05
Posts: 144
Credit: 14,665,277
RAC: 3,025
Message 37152 - Posted: 1 Nov 2018, 9:54:26 UTC - in response to Message 37147.  
Last modified: 1 Nov 2018, 9:56:00 UTC

> And don't spin up a 6 CPU VM unless at least 6 tasks are downloaded and waiting.


But AIUI it's the stuff within the VM that requests the actual jobs/sub-tasks and downloads the payload.

This issue is why the advice that single-CPU VMs are more efficient is repeated often on these boards.

(In my experience, single-CPU VMs also avoid a bunch of spurious heartbeat errors, but that's another story.)
©2021 CERN