Message boards : CMS Application : no new WUs available
Author | Message |
---|---|
Joined: 18 Dec 15 Posts: 1827 Credit: 119,535,315 RAC: 42,794 |
"For 2.5 hours a CMS task has been waiting for a job inside (Win11pro)." Based on what you were saying, I tested a new task. The same thing happened as over the past 10 days or so: no jobs, no CPU use, and the task ended and was uploaded after about 27 minutes (without value for the science): https://lhcathome.cern.ch/lhcathome/result.php?resultid=417686337 |
Joined: 28 Dec 08 Posts: 341 Credit: 4,896,063 RAC: 2,401 |
It's the weekend. No one is monitoring or at the lab. They will see it Monday morning some time. Same for the other projects. |
Joined: 18 Dec 15 Posts: 1827 Credit: 119,535,315 RAC: 42,794 |
"It's the weekend. No one is monitoring or at the lab." No, it is NOT (only) the weekend. LHC@home has been in agony for almost 2 weeks. From what I remember, it's never been this bad in the 9 years I have been contributing. And, even worse, no one gives us volunteers any information whatsoever about what's going on - that's really sad :-( |
Joined: 24 May 23 Posts: 46 Credit: 2,628,250 RAC: 1,625 |
"No, it is NOT (only) the weekend. LHC@home has been in agony for almost 2 weeks. From what I remember, it's never been this bad in the 9 years I have been contributing. And, even worse, no one gives us volunteers any information whatsoever about what's going on - that's really sad :-(" The sad truth: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6250 ;-) God is dead. Marx is dead. And I don't feel so well myself. ;-) -- Bye, Lem |
Joined: 18 Dec 15 Posts: 1827 Credit: 119,535,315 RAC: 42,794 |
Some time this morning, I noticed that the server status page showed zero unsent CMS tasks. So I was pleased to see that someone had finally stopped the useless distribution of tasks while there are no jobs available. About half an hour ago, the server status page showed 200 "unsent" CMS tasks. So my logical thought was that someone had switched task distribution back on after having ensured the availability of jobs. However, to my surprise, there were still no jobs (as in the past ~10 days), so the task ended after about half an hour without value for the science. Can anyone explain to me the rationale behind what's happening over there? Is LHC@home falling apart? |
Joined: 28 Dec 08 Posts: 341 Credit: 4,896,063 RAC: 2,401 |
"It's the weekend. No one is monitoring or at the lab." "No, it is NOT (only) the weekend. LHC@home has been in agony for almost 2 weeks. From what I remember, it's never been this bad in the 9 years I have been contributing. And, even worse, no one gives us volunteers any information whatsoever about what's going on - that's really sad :-(" Try Rosetta. They don't say anything about anything on the pages anymore. They don't really support the BOINC version much anymore since they have a nice new computer with AI to do all the discovery work. BOINC gets scraps. If something goes wrong you're SOL. |
Joined: 22 Mar 17 Posts: 66 Credit: 14,580,818 RAC: 489 |
"It's the weekend. No one is monitoring or at the lab." "No, it is NOT (only) the weekend. LHC@home has been in agony for almost 2 weeks. From what I remember, it's never been this bad in the 9 years I have been contributing. And, even worse, no one gives us volunteers any information whatsoever about what's going on - that's really sad :-(" Yeah, occasionally a casual BOINC user on my home team will mention how long it's been since there was any LHC work (meaning only SixT work). I've said several times that LHC has pretty much always had some sort of CMS or Theory work. But not in the past few weeks. It's sad, as I have been wanting to complete some BOINC goals at LHC but now there's no work. |
Joined: 2 May 07 Posts: 2244 Credit: 173,988,818 RAC: 7,494 |
Today I migrated to Win11pro 24H2. A CMS task is running, BUT there is no job inside the task. Is there a timestamp for when the input will be filled by CERN IT? https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10821223 |
Joined: 28 Sep 04 Posts: 735 Credit: 49,844,204 RAC: 35,579 |
The queue has become empty again. |
Joined: 4 Sep 22 Posts: 95 Credit: 16,208,357 RAC: 17,011 |
I really do wish someone would explain to me just why it is so important whether or not there are any tasks in any particular queue. |
Joined: 29 Aug 05 Posts: 1065 Credit: 7,884,757 RAC: 11,261 |
"I really do wish someone would explain to me just why it is so important whether or not there are any tasks in any particular queue." Quick answer (subject to its being the end of a long day...):
- When I submit a new workflow, WMAgent creates appropriate jobs which are sent to the job queue (or pool) in the condor server
- A periodic cron job on the BOINC server monitors the available jobs in the condor pool
- If it finds available jobs, and the BOINC task queue is not full (i.e. < 200), it creates new tasks to replenish that queue
- When your BOINC client is allocated a new task from that queue, this creates a new virtual machine (under VirtualBox) on your host that becomes a condor client, sequentially extracting and running jobs from the condor pool until various time constraints are exceeded

Complications:
- the WMAgent has its own queues: created, pending, and running
- I believe that each of these has a limit of 2,000 jobs
- if (running < runlimit and pending > 0) pending jobs are moved to be available for running (i.e. into the condor pool)
- if (pending < pendlimit and created > 0) created jobs are moved to pending
- if (created < createlimit and number_created < workflow_target) new jobs are created and added to the created queue

I'll revisit this answer tomorrow after I've had some sleep! |
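The queue rules in the post above can be sketched in a few lines of Python. This is only an illustrative toy model, not the actual WMAgent or BOINC server code: the names `Queues`, `step`, and `tasks_to_create` are made up, the 2,000 limit is the poster's own "I believe" figure, and topping the BOINC queue up to exactly 200 is an assumption (the post only says tasks are created "to replenish that queue").

```python
from dataclasses import dataclass


@dataclass
class Queues:
    """Toy model of the three WMAgent queues (names are hypothetical)."""
    created: int = 0
    pending: int = 0
    running: int = 0          # jobs available in the condor pool
    number_created: int = 0   # total jobs created so far for the workflow


def step(q: Queues, workflow_target: int, limit: int = 2000) -> None:
    """Apply the three rules from the post once, in the order given there."""
    # if (running < runlimit and pending > 0): move pending jobs to running
    if q.running < limit and q.pending > 0:
        moved = min(limit - q.running, q.pending)
        q.pending -= moved
        q.running += moved
    # if (pending < pendlimit and created > 0): move created jobs to pending
    if q.pending < limit and q.created > 0:
        moved = min(limit - q.pending, q.created)
        q.created -= moved
        q.pending += moved
    # if (created < createlimit and number_created < workflow_target): create jobs
    if q.created < limit and q.number_created < workflow_target:
        new = min(limit - q.created, workflow_target - q.number_created)
        q.created += new
        q.number_created += new


def tasks_to_create(available_jobs: int, boinc_queue_len: int,
                    boinc_queue_limit: int = 200) -> int:
    """Cron-job rule: only add BOINC tasks when condor has jobs and the
    task queue is not full. Topping up to the limit is an assumption."""
    if available_jobs > 0 and boinc_queue_len < boinc_queue_limit:
        return boinc_queue_limit - boinc_queue_len
    return 0
```

In this model, a workflow needs three calls to `step` before any job reaches the condor pool, and `tasks_to_create` returns 0 whenever the pool is empty. Tasks that were already sitting in the BOINC queue when the pool emptied would still be sent out, which is one way to end up with a task that finds no job inside.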
Joined: 4 Sep 22 Posts: 95 Credit: 16,208,357 RAC: 17,011 |
"I really do wish someone would explain to me just why it is so important whether or not there are any tasks in any particular queue." Clearly I could have phrased my post a lot better than I did. :D I meant, why is it so important that it be reported in the forums every time one queue or another empties -- and then again when tasks are once more available in the queue? Hope you had a good sleep. |
Joined: 28 Sep 04 Posts: 735 Credit: 49,844,204 RAC: 35,579 |
Mainly this is to alert people who run the tasks one subproject at a time. Different subprojects may need different settings to run efficiently. Also the task feeding (like CMS) is a manual task and the queue becoming empty might be missed sometimes (or the process gets stuck for some reason). So this way the person(s) in charge can get an email alarm when someone posts to this thread. I don't know if there are any other alarms available to people working in the background if something goes wrong. |
Joined: 29 Aug 05 Posts: 1065 Credit: 7,884,757 RAC: 11,261 |
Sorry for misinterpreting. It's basically what Harri said upthread, to alert people to the possibility of adjusting their machines to maximise their satisfaction with what they accomplish (and also to cut down on the number of messages I get when there is an outage:-). Especially when some central facility (such as the WMAgent) is about to go down for maintenance, I like to warn volunteers that they might wish to divert their attentions elsewhere until the disruption is over. |
Joined: 4 Sep 22 Posts: 95 Credit: 16,208,357 RAC: 17,011 |
"Mainly this is to alert people who run the tasks one subproject at a time. Different subprojects may need different settings to run efficiently." Set all projects/subprojects with no maximum on the number of concurrent tasks, and let BOINC handle it. Occasionally, it may be convenient to restrict one particular subproject to 1 or 2 concurrent tasks, in which case one can create an app_config file. "Also the task feeding (like CMS) is a manual task and the queue becoming empty might be missed sometimes (or the process gets stuck for some reason). So this way the person(s) in charge can get an email alarm when someone posts to this thread. I don't know if there are any other alarms available to people working in the background if something goes wrong." Fine, IF the original post also contains enough information to point those in charge to a possible resolution -- AND if everyone else just stays out of the picture unless those in charge ask for additional info. |
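For anyone unsure what the app_config file mentioned above looks like, here is a minimal sketch that builds one in Python. The `<app_config>`, `<app>`, `<name>`, and `<max_concurrent>` elements are the ones the BOINC client reads; the app name "CMS" and the limit of 2 are assumptions for illustration, so check the real short app name in your client before using it. `build_app_config` is just a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return app_config.xml content limiting one app's concurrent tasks."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name               # assumed app name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# Example: restrict a subproject (assumed here to be named "CMS") to 2 tasks.
print(build_app_config("CMS", 2))
```

The resulting text would be saved as app_config.xml in the project's folder inside the BOINC data directory, after which the client has to be told to re-read its config files (or be restarted) for the limit to apply.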
©2025 CERN