1) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49969)
Posted 21 hours ago by ivan
Post:
I see that one WU allocates 32 cores and then inside there are 6 cmsExternalGene processes using 1 core each.

What is the expected max number inside, as CP say seems like maybe 8?

As far as I know, the tasks that run the new 4-core jobs should run on 4 cores no matter how many above that number you have allowed in your locale preferences. My experience is that the main process, cmsRun, spawns four threads, each running cmsExternalGenerator, so in your "top" display (Alt-F3) you should see four cmsExternalGenerator processes running at nearly 100% each, with the occasional appearance of the cmsRun master process as it gets its share of the resources.
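
For anyone who wants to check this from a saved process listing rather than by eye, here is a minimal sketch. It assumes rows of the form "PID %CPU COMMAND"; the sample rows are made up for illustration and are not real output from a CMS VM. As far as I know, Linux keeps only the first 15 characters of a command name, which would explain why "top" shows cmsExternalGene, so the sketch matches on that prefix.

```python
# Hypothetical "PID %CPU COMMAND" rows, invented for illustration only.
SAMPLE = """\
 1201  99.3 cmsExternalGene
 1202  98.7 cmsExternalGene
 1203  99.0 cmsExternalGene
 1204  98.9 cmsExternalGene
 1100   3.1 cmsRun
"""


def count_busy(text, prefix="cmsExternalGene", min_cpu=90.0):
    """Count rows whose command starts with `prefix` and whose %CPU is at least `min_cpu`."""
    busy = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        cpu, command = float(fields[1]), fields[2]
        if command.startswith(prefix) and cpu >= min_cpu:
            busy += 1
    return busy


print(count_busy(SAMPLE))
```

On a healthy 4-core task the count should be four; a count of one would suggest the VM picked up a single-core job.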
2) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49961)
Posted 1 day ago by ivan
Post:
When you look at your computer's tasks on the website, are they vbox64 or vbox64_mt_mcore_cms? Ben's computer was showing one of each outstanding / in progress.

Just vbox64, I'm afraid.
3) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49957)
Posted 2 days ago by ivan
Post:
I'm wondering if we got the multi-core config right on the production CMS@Home. Of the machines I'm running in a locale where I've selected 4-CPU tasks, all are running just a single-core VM. A machine running a 4-core locale at CMS@Home-dev has started a 4-core VM, though.
4) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49956)
Posted 2 days ago by ivan
Post:
If the #cores must be configured at batch creation time, then please make a decision.

Either keep only the singlecore app
or drop the singlecore app and send out a multicore with a fixed #cores that is in sync with the backend.

Do not mix batches having different core settings as this would break BOINC's work fetch, runtime estimation, credit system ...

A fair point, I'll keep it in mind.
5) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49952)
Posted 2 days ago by ivan
Post:
Any word on release to production yet?

Yes, in fact. We activated multi-core in production this afternoon, and there are some 4-core jobs queued up ready to run. You can try setting your preferences for your favourite CMS@Home locale to using 4-core VMs, and see if you pick up a task. There will still be single-core jobs hanging around, so the current choices are just single- or quad-core tasks. Note that we haven't tuned the 4-core jobs yet, so you might run into bandwidth, memory or time-out problems.
6) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49880)
Posted 16 days ago by ivan
Post:
Yes, I am *clearly* a trouble maker after all. Utterly incorrigible. ;)
I'll take your word for it...

Thanks for that, I'm looking forward to seeing how these run in production.

You can see my poor underpowered machine's tasks here.
I think Laurence still has some holidays in his pocket, so we may not turn on multicore in production until next week.
7) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49877)
Posted 17 days ago by ivan
Post:
So do the multicore work units actually get 4 times the work done using four cores?
I'd check myself but when I asked to join -dev I was told I wasn't required.

Sorry about that, the decision was made by LHC@Home management.
Yes, since our jobs are mostly "embarrassingly parallel" the trend is to run multithreaded jobs so that each core runs over events individually. There are memory savings because of all the shared resources which only need to be loaded once.
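
To illustrate the pattern only, here is a minimal Python sketch of an "embarrassingly parallel" event loop. It is not how cmsRun is implemented: the event function and the shared dictionary are hypothetical stand-ins. Also note that Python threads will not show the real CPU speed-up because of the interpreter lock; the point is the structure, where each worker handles events independently and the shared data is loaded once.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for shared, read-only data that is loaded once
# and then used by every worker thread.
SHARED_CONDITIONS = {"scale": 2}


def process_event(event_id):
    """Process one event independently of all the others."""
    return event_id * SHARED_CONDITIONS["scale"]


def run(n_events, n_threads=4):
    """Spread the events over n_threads workers; results keep event order."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(process_event, range(n_events)))


print(run(8))
```

Because no event depends on another, the work divides cleanly across the workers, which is why four cores can get close to four times the throughput.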
8) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49854)
Posted 21 days ago by ivan
Post:
It needs to be clarified whether

1. a workflow batch at the backend must be configured to run on n cores before any work is sent out

2. a task on a volunteer VM can forward its own #cores to the CMS app and CMS uses this #cores.
Like:
2-core VM -> 2-core CMS
4-core VM -> 4-core CMS


Sending out something like fixed n-core CMS tasks to a VM not running n cores makes no sense.

Tja, the permutations become exponentially weird. Currently
o "Normal" single core jobs are available. These will run on "standard" LHC@Home machines, which cannot specify multicore VMs, and on LHC@Home-dev machines which specify NCPUs >=1 -- only using one core in the VM of course.
o A workflow specifying 4-core jobs is also available. These can run on LHC@Home-dev machines specifying NCPUs >=4.

I don't think we can specify workflows to "run on however many cores are available". The relevant parameter in the .json config file is "Multicore", which as far as I know takes an integer parameter at submission time.
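
As a rough sketch of what that implies: the core count is a plain integer fixed when the config is written, not something resolved later on the volunteer's VM. Only the "Multicore" key comes from the description above; the "_note" field and the helper name are hypothetical, and the real config has many other fields not shown here.

```python
import json


def make_config(multicore):
    """Build a config fragment with a fixed integer core count (hypothetical helper)."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(multicore, bool) or not isinstance(multicore, int) or multicore < 1:
        raise ValueError("Multicore must be a positive integer, fixed at submission time")
    return {
        "_note": "placeholder field for illustration only",
        "Multicore": multicore,
    }


print(json.dumps(make_config(4), indent=2))
```

A value such as "auto" is rejected here, mirroring the point that a workflow cannot ask for "however many cores are available".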
9) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49852)
Posted 22 days ago by ivan
Post:
Yes, we're working on it, but it takes time. Currently we have some two-core and some 4-core jobs in the queue. These will only run in -dev. Let us know how you get on. I'll put some single-core jobs up as well, so people not in -dev can get some work too.

Hmm, there is a little problem with that -- The workflow manager is holding that batch in acquired status while the 2- and 4-core batches run and probably won't start it running until those queues start running dry, which could take some time!

We still don't have any 1-core jobs available for CMS@Home "production". Federica is going to kill her 2-core workflow, so then hopefully the workflow-manager will notice that there aren't many jobs in the queue, and will move my workflow into the "running" state.

OK, aborting the rather large two-core workflow has allowed my latest single-core batch to get its foot in the door. Currently we are running a 4-core w/f, accessible only to users of CMS@Home-dev who have set their MaxCPUsPerTask to >=4, and a single-core w/f that will run on "production" CMS@Home VMs, and at reduced efficiency (for N>1) for -dev Volunteers running N-core VMs.
I'll try to keep the mix tuned over the holidays. We plan to allow multicore on production tasks after the break, I'll let you know when you can start experimenting with the number of cores as that happens. I think we'll concentrate mainly on 4-core, as that's where CMS seems to be focussing its efforts.
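
The current mix can be summarised in a small sketch. The matching rule (a job runs only on a VM with at least as many cores as it needs) follows the description above; the function names are hypothetical, and treating efficiency as the simple fraction of VM cores in use is a simplifying assumption, not a measured figure.

```python
def can_run(job_cores, vm_cores):
    """A job can run only if the VM has at least as many cores as the job needs."""
    return vm_cores >= job_cores


def core_utilisation(job_cores, vm_cores):
    """Fraction of the VM's cores the job keeps busy (0.0 if it cannot run)."""
    if not can_run(job_cores, vm_cores):
        return 0.0
    return job_cores / vm_cores


# Single-core and 4-core jobs against 1-, 2- and 4-core VMs.
for vm in (1, 2, 4):
    for job in (1, 4):
        print(f"{job}-core job on {vm}-core VM: "
              f"runs={can_run(job, vm)}, utilisation={core_utilisation(job, vm):.2f}")
```

Under this assumption a single-core job on a 4-core VM uses only a quarter of the allocated cores, which is the "reduced efficiency" case.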
10) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49851)
Posted 22 days ago by ivan
Post:
Yes, we're working on it, but it takes time. Currently we have some two-core and some 4-core jobs in the queue. These will only run in -dev. Let us know how you get on. I'll put some single-core jobs up as well, so people not in -dev can get some work too.

Hmm, there is a little problem with that -- The workflow manager is holding that batch in acquired status while the 2- and 4-core batches run and probably won't start it running until those queues start running dry, which could take some time!

We still don't have any 1-core jobs available for CMS@Home "production". Federica is going to kill her 2-core workflow, so then hopefully the workflow-manager will notice that there aren't many jobs in the queue, and will move my workflow into the "running" state.
11) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49845)
Posted 24 days ago by ivan
Post:
Yes, we're working on it, but it takes time. Currently we have some two-core and some 4-core jobs in the queue. These will only run in -dev. Let us know how you get on. I'll put some single-core jobs up as well, so people not in -dev can get some work too.

Hmm, there is a little problem with that -- The workflow manager is holding that batch in acquired status while the 2- and 4-core batches run and probably won't start it running until those queues start running dry, which could take some time!
12) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49842)
Posted 24 days ago by ivan
Post:
Yes, we're working on it, but it takes time. Currently we have some two-core and some 4-core jobs in the queue. These will only run in -dev. Let us know how you get on. I'll put some single-core jobs up as well, so people not in -dev can get some work too.
13) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49823)
Posted 25 days ago by ivan
Post:
We may have a breakthrough. A previous workflow that was supposed to run on 4-core(+) machines didn't start because it was assigned to a wrong "team". Federica has recently submitted a w/f that calls for two cores -- my home machine (running a 4-core -dev VM) has picked up one of her jobs and is currently running on two cores. Please let us know if you have a multicore -dev VM that is running jobs on more than one core. If my understanding is correct, single-core mainstream VMs won't run the two-core jobs but I have single-core jobs in the queues.
14) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49815)
Posted 27 days ago by ivan
Post:
OK, the new batch is "running" with 500 jobs pending. My -dev machine is running a 4-core VM but is only running a single-core job. There are ~400 single-core jobs still pending so we'll have to see what happens when that queue dries up.
15) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49814)
Posted 27 days ago by ivan
Post:
I've submitted a new batch of jobs, specifying Multicore=4 instead of 1. Unlike the "true" 4-core workflow that never got into "running" status, this batch has progressed that far and has 500 jobs "pending". If you have a CMS@Home-dev setup specifying 4-core VMs, please see if you get a multicore job while we investigate how HTCondor is coping with the new batch. Thanks.
16) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49787)
Posted 17 Mar 2024 by ivan
Post:
...the best-laid plans...
New single-core jobs on the way.
17) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49780)
Posted 16 Mar 2024 by ivan
Post:
I think multi-core has not yet arrived. Not sure what I could see, because the Consoles do not display useful info.
I created a dual core VM (not 4 cause other duties on that laptop), but I see only 1 cmsRun using 100% CPU and some other cpu-usage from other processes.
Total 102% CPU after 24 minutes.

Yes, the 4-core batch didn't make it past "acquired" and into "running". I'm submitting smaller single-core job batches for the next few days while we work out why the multicore jobs didn't start. There may be disruptions if I don't arrange my waking hours to coincide with the need to submit new workflows...
18) Message boards : CMS Application : no new WUs available (Message 49779)
Posted 16 Mar 2024 by ivan
Post:
new new tasks since last night :-(
sorry, should read "NO new tasks ..."

Tja, I was waiting to see if Daniele's multi-core workflow would spawn new jobs, in case my old one was holding it back. Turned out not to happen, so I'm submitting smaller job batches until we work out how to get the multi-core jobs into the system. There may be intermittent disruptions over the next few days if my sleep cycle disagrees with the job queues' needs for attention.
19) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49774)
Posted 15 Mar 2024 by ivan
Post:
Ivan wrote on Feb. 14:
We've been having some problems lately as we prepare to allow multi-core jobs to be run in CMS@Home (you've probably noticed...). Unfortunately some of the configurations are beyond our control, and we have to request changes as we find problems and determine a potential fix for them.
We ask for your patience at this time while we work through the difficulties, and would fully understand if you chose to pause your participation in the project while we try to get on top of things.
Ivan, any progress yet ?

We've got a workflow lined up for 4-core jobs, but it hasn't progressed to running status yet. I suspect the WMAgent is waiting for my current single-core jobs to run down, so I'm holding on to see if it does start later on today. If it does, people with LHC@Home-dev access can try to enable 4-core jobs in their computing preferences -- this option is not yet available for mainstream LHC@Home volunteers, so they will continue to run just single-core jobs if they are available. If you do have -dev membership and enable 4-core jobs, you will (at the moment) start a 4-core VM but it will only run a single-thread job.
I have my home PC already set up to run 4-core -dev jobs; when (if...) I see 4-core jobs in the queue I will try to acquire one and let you know if it runs. If that doesn't fly, I'll have to submit a new batch of single-core jobs -- there may be a period with no jobs available if I don't juggle the submissions just so.
Ah, Daniele has just submitted another 4-core workflow. It's currently in "staging" so it's just a matter of hurry up and wait.
20) Message boards : CMS Application : no new WUs available (Message 49762)
Posted 12 Mar 2024 by ivan
Post:
queue is empty :-(

Sorry, misjudged the job queue.




©2024 CERN