New Version 50.00
Author | Message |
---|---|
Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
This new CMS version updates the configuration of CVMFS and refreshes the cached files. |
Joined: 15 Jun 08 Posts: 2528 Credit: 253,722,201 RAC: 62,755 |
Server restarted? The BOINC client downloads the old CMS_2019_03_25.vdi |
Joined: 2 May 07 Posts: 2240 Credit: 173,894,884 RAC: 3,757 |
We had no finished CMS task in -dev for version 50.00. Is it possible to upgrade Condor to CentOS 7? |
Joined: 1 Sep 04 Posts: 52 Credit: 11,767,629 RAC: 0 |
My v49 CMS tasks were consistently crashing recently, not 100% but a large majority. When I read here that v50 was available I aborted all the v49 tasks in my queue, but they were replaced by more of the same. I then changed my preferences to exclude CMS tasks (temporarily), aborted the new v49 tasks I'd been sent, and updated again. What I got was more CMS v49 tasks. 3 questions: 1) How long does it take for preference changes to take effect on the server? 2) When can I expect to be able to download v50 tasks to see if that helps? 3) ... or is there something else stupid I'm doing wrong? Thanks. |
Joined: 15 Jun 08 Posts: 2528 Credit: 253,722,201 RAC: 62,755 |
There's still no "GO" from Ivan: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5334&postid=41826 In addition something went wrong with this version update but nonetheless it would only be the envelope. https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5339&postid=41854 |
Joined: 1 Sep 04 Posts: 52 Credit: 11,767,629 RAC: 0 |
Quote: "There's still no 'GO' from Ivan." Thank you, I hadn't seen that. (The other problem, downloading CMS, was my error.) |
Joined: 18 Dec 15 Posts: 1810 Credit: 118,214,160 RAC: 26,862 |
any idea when CMS will be up and running again? |
Joined: 18 Dec 15 Posts: 1810 Credit: 118,214,160 RAC: 26,862 |
On March 17, I wrote: "any idea when CMS will be up and running again?" In the past few days, the server status page showed zero tasks available; today the queue was refilled. What does this mean? There has been no "go ahead" from Ivan so far, so I guess one would crunch CMS tasks at one's own risk only, right? |
Joined: 2 May 07 Posts: 2240 Credit: 173,894,884 RAC: 3,757 |
If CERN IT is testing, they also need data from the BOINC server for CMS. So be patient and wait. |
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
I am running a CMS now, with two more in the buffer. YMMV |
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
The next ten were empty, so I am back to native ATLAS only. |
Joined: 18 Dec 15 Posts: 1810 Credit: 118,214,160 RAC: 26,862 |
thanks, guys, for the information. So no CMS at this point of time :-( |
Joined: 12 Jun 18 Posts: 126 Credit: 53,906,164 RAC: 0 |
Is there a native CMS now? |
Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
Quote: "Is there a native CMS now?" That is asked from time to time. The answer is no, because it is too complicated. Apparently CMS covers a variety of experimental groups, all doing their own thing. It is hard enough to get it to work with VirtualBox, which is relatively easier since it packages them all up the same way. Beyond that, a real expert will need to answer. But I would hope that they could find a way too, preferably with "runc", which is what they use for Theory. It avoids the need for Singularity and just requires CVMFS. |
Joined: 5 Mar 06 Posts: 13 Credit: 30,894,768 RAC: 566 |
I've crunched a few dozen CMS version 50.00 WUs now and I've noticed they consume a lot of network bandwidth. In fact, when 15+ of them run at the same time, they completely use up the 5 Mbit/s upload limit of my home connection (download is fine). What is the total download and upload size each WU generates during the entire run? |
Joined: 15 Jun 08 Posts: 2528 Credit: 253,722,201 RAC: 62,755 |
Quote: "I've crunched a few dozen CMS version 50.00 WUs now and I've noticed they consume a lot of network bandwidth. In fact, when 15+ of them run at the same time, they completely use up 5 Mbit/s upload limit of my home connection (download is fine). What is the total download and upload size each WU generates during the entire run?" Each new CMS task downloads about 200 MB. Most of that can be served from a local squid proxy: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5473 A typical CMS subtask writes a result file of roughly 110 MB within 1-3 h (average 2 h). This file requires about 3 min to be uploaded on your line (5 Mbit/s). Based on that average, 15 concurrently running CMS tasks require about 38% of your total upload capacity. A proxy doesn't help to reduce those uploads. |
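The arithmetic behind those figures can be retraced with a minimal sketch. It uses only the numbers quoted in the post (110 MB per result, a 5 Mbit/s uplink, 2 h average job time, 15 tasks); the function names are illustrative and not part of any BOINC or CMS tool. Without rounding, one upload takes about 176 s and the average share comes out near 37%; the 38% in the post follows from rounding a single upload to 3 min.

```python
# Figures quoted in the post; the helper names are illustrative only.
RESULT_MB = 110        # size of one result file
UPLINK_MBIT_S = 5.0    # upload speed of the home connection
JOB_HOURS = 2.0        # average runtime of one job (subtask)
TASKS = 15             # concurrently running CMS tasks


def upload_seconds(size_mb, uplink_mbit_s):
    """Time to upload one file on an otherwise idle line (1 MB = 8 Mbit)."""
    return size_mb * 8 / uplink_mbit_s


def average_uplink_share(tasks, size_mb, job_hours, uplink_mbit_s):
    """Fraction of the time the uplink is busy if uploads never overlap."""
    busy_seconds = tasks * upload_seconds(size_mb, uplink_mbit_s)
    return busy_seconds / (job_hours * 3600)


one_upload = upload_seconds(RESULT_MB, UPLINK_MBIT_S)
share = average_uplink_share(TASKS, RESULT_MB, JOB_HOURS, UPLINK_MBIT_S)
print(f"one upload: {one_upload:.0f} s (~{one_upload / 60:.1f} min)")
print(f"average uplink share: {share:.0%}")
```

Machines that finish a job in 1 h instead of 2 h double the average share, so the figure is a guide rather than a guarantee.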
Joined: 5 Mar 06 Posts: 13 Credit: 30,894,768 RAC: 566 |
Quote: "A typical CMS subtask writes a result file of roughly 110 MB within 1-3 h (average 2 h)." Thanks for the info. It's not so rosy in practice, though: it seems the crunching stalls until the result file is completely uploaded. And when 15+ WUs upload at the same time, each upload speed fluctuates around 0.3 Mbit/s (or 30 kB/s, welcome to the dialup era), so the upload takes over an hour. In the meantime, other WUs complete another result and try to upload it. The end result is that only 3 or 4 WUs of the 15 actually crunch; the rest are waiting for upload. Or at least it will become stuck in this vicious cycle if some other program uses up the upload bandwidth for 20 minutes or so. I need to limit the number of CMS tasks via app_config.xml. Are you sure the CMS result data is only 55 MB/hour on average? |
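One way to pick such a limit is sketched below. It estimates how many tasks fit into a chosen fraction of the uplink and renders a BOINC `app_config.xml` with a `max_concurrent` entry. Several things here are assumptions: the app name `CMS` (check the `<name>` of the application in your `client_state.xml`), the 50% upload budget, and the pessimistic 1 h job time, which is the lower end of the 1-3 h range quoted above. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET


def max_tasks_for_uplink(uplink_mbit_s, result_mb, job_hours, budget=0.5):
    """Largest task count whose average upload load stays within `budget`.

    `budget` is the fraction of the uplink reserved for result uploads;
    the 0.5 default is an arbitrary safety margin, not a project value.
    """
    per_task_mbit_s = result_mb * 8 / (job_hours * 3600)
    return max(1, int(uplink_mbit_s * budget / per_task_mbit_s))


def render_app_config(app_name, max_concurrent):
    """Build the text of an app_config.xml limiting one application."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    ET.indent(root)  # pretty-print; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# Assumed inputs: 5 Mbit/s uplink, 110 MB per result, 1 h per job.
limit = max_tasks_for_uplink(5.0, 110, 1.0)
config_text = render_app_config("CMS", limit)  # "CMS" is an assumed app name
print(config_text)
```

The resulting file goes into the project's directory inside the BOINC data folder, and the client picks it up after re-reading its config files or restarting.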
Joined: 15 Jun 08 Posts: 2528 Credit: 253,722,201 RAC: 62,755 |
Quote: "it seems the crunching stalls until the result file is completely uploaded." Right. The CPU remains idle until the upload (~110 MB) has finished and a new job (3-4 MB) has been downloaded. This is how CMS tasks always work. Quote: "And when 15+ WUs upload at the same time, each upload speed fluctuates around 0.3 Mbit/s." Right. This happens all the time since the internet line is a shared medium and each active connection gets a fraction of the total bandwidth. Fortunately you normally don't notice it since most uploads are much smaller than the CMS result files. The 38% is an average value. While uploads are in progress, even just one, you should see 100% bandwidth usage. Quote: "... the upload takes over hour. In the meantime, other WUs complete another result and try to upload it. The end result is that only 3 or 4 WUs of the 15 actually crunch." Right. Very likely that this happens. Quote: "... the rest are waiting for upload." No. They are not waiting. Their uploads are slow but in progress. Quote: "Are you sure the CMS result data is only 55 MB/hour on average?" That's an average. Each job result is around 110 MB (+- a few MB). The fastest computers require about 1 h to complete a job (= subtask), slower ones may need up to 3 h. |
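The effect of sharing the line can be illustrated with a rough sketch. It assumes the uplink is split evenly between all active uploads, which is an idealisation (real connections fluctuate, as noted above), and the function name is hypothetical.

```python
def shared_upload_minutes(size_mb, uplink_mbit_s, concurrent_uploads):
    """Minutes per upload if the uplink is split evenly between uploads."""
    per_connection_mbit_s = uplink_mbit_s / concurrent_uploads
    return size_mb * 8 / per_connection_mbit_s / 60


# 110 MB result files on a 5 Mbit/s uplink, as in the posts above.
for uploads in (1, 5, 15):
    minutes = shared_upload_minutes(110, 5.0, uploads)
    print(f"{uploads:2d} concurrent uploads: about {minutes:.0f} min each")
```

With 15 uploads running at once, each one needs roughly 44 min even before any other traffic competes for the line, and the CPU slot of each task stays idle for that whole time.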
Joined: 29 Aug 05 Posts: 1060 Credit: 7,734,854 RAC: 2,594 |
You can see some data on job timings, etc., in the job graphs. I grabbed graphs that I felt were most useful, but you can play around with the parameters if you like (in particular, if you click on the back-arrow within a plot, you can see a whole lot of other plots that you can view in full by clicking on the plot title and selecting "View" on the drop-down menu). Note that not all of these graphs are properly populated; CMS@Home is not a high priority for the monitoring crew. My initial aim when this all started was to run jobs (or sub-tasks as some call them) that ran for 1-2 hours and returned up to 100 MB of results. This was mainly based on my connection at the time, which was 5-6 Mbps download and 1 Mbps upload, and the assumption that most people would only run one task at a time, or at least adjust the number of tasks to suit their connectivity. There has always been the problem of people being over-enthusiastic about their contribution and running into the sort of problem being discussed here. We also have to choose our tasks carefully; I could easily send you jobs that would tax a 100 Mbps link! |
Joined: 5 Mar 06 Posts: 13 Credit: 30,894,768 RAC: 566 |
Unfortunately, all VirtualBox apps have always been quite opaque when it comes to actual memory and bandwidth requirements. BOINC Manager is unable to display them and you can't google this information (I tried before I asked here). The VM console doesn't display what the workunit actually does, and the VirtualBox Manager has no graphs or statistics either. So it's very easy to become "over-enthusiastic" that way, because the average user is left to guesswork with Windows Task Manager or similar tools. :-/ |
©2024 CERN