Message boards : Number crunching : Imbalance between Subprojects

ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 997
Credit: 6,264,307
RAC: 71
Message 30670 - Posted: 6 Jun 2017, 19:09:40 UTC - in response to Message 30654.  

* Remember that your VMs download large packets after starting a WU. Not every user has enough bandwidth for this. And look at CMS: it needs so much upload speed that this seems to be an even bigger problem.

I apologise for the upload size; the workflow we've been given is marginal -- the lambda-zero decay we ran last year with CRAB was a much better fit. I'm trying to get a much more suitable workload, but I don't know how to create a WMAgent workflow myself. One of our major supporters has suggested a Higgs decay scheme that his analysis group wants to study, which should fit us much better, but he's so busy that he never gets around to having someone create the workflow.
ID: 30670
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,376,395
RAC: 102,054
Message 30673 - Posted: 6 Jun 2017, 19:43:31 UTC - in response to Message 30670.  

... Not every user has enough bandwidth for this

I am really kind of surprised to read over and over about problems with bandwidth and/or download/upload speed and transfer limits.

In times of cable modem and flat rates, all this should not be a problem at all.
ID: 30673
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 10,065
Message 30675 - Posted: 6 Jun 2017, 20:12:15 UTC - in response to Message 30673.  

In times of cable modem and flat rates, all this should not be a problem at all.

This depends on where you are; even in Germany there are places with really poor Internet lines.


Supporting BOINC, a great concept !
ID: 30675
Dave Peachey

Joined: 9 May 09
Posts: 17
Credit: 772,975
RAC: 0
Message 30677 - Posted: 6 Jun 2017, 21:01:07 UTC - in response to Message 30673.  

I am really kind of surprised to read over and over about problems with bandwidth and/or download/upload speed and transfer limits.

In times of cable modem and flat rates, all this should not be a problem at all.

Apologies but I'm afraid I'm going to take issue with you on this one ... whilst it may not be a problem where you are, this is part of the imbalance between (and sometimes within) many countries and also impacts what the average and non-average DC cruncher has (or is prepared) to pay to indulge in this hobby.

Yes, in the UK, cable modem is available as part of many ISPs' flat rate packages and the rates can be very good. However, I've seen comments elsewhere, on other projects and from other DC crunchers, e.g. in some parts of the US (where you wouldn't necessarily expect this to be an issue), who subsist on very low bandwidth and/or ridiculously low data rates, neither of which is conducive to running LHC sub-projects (other than SixTrack) with their high bandwidth requirements.

So whilst it's possible, in the UK, to get good quality cable and ADSL/VDSL (usually as part of a package including other services and provided you live in the heart of a major conurbation), the majority of UK ISPs are not known for their generosity when it comes to these things ... either they cost a lot of money or else the quality of service is so poor (especially outside the major population centres) that they are as good as useless for high bandwidth, always-on DC work.

However, and in spite of living in central London where these things are readily available, I can't use cable and I don't want the majority of spurious features or services which come as part of such packages. Nor do I want to be tied into a rigid contract with an ISP which has little flexibility and penalties for over-use.

Hence, I prefer to pay for a high quality, basic service (with only the features I need/want) but that comes with a monthly bandwidth limit (350GB per month download - which is double what I was using three months ago before I started crunching ATLAS WUs 24x7) and that is pretty much all eaten up by DC projects (LHC amongst them). The cost of this is three times what I used to pay for unlimited ADSL bandwidth just a few years ago but which the ISP I used at the time was not prepared to continue to supply.

On which basis, and as I said in my previous post, the choice to run DC full-time (especially projects with the high bandwidth requirements which LHC has on most of its sub-projects) is one which not many average crunchers would be prepared to indulge. Thus it is a limiting factor and contributes, I'm sure, to the less than spectacular take-up of the VM-based sub-projects on LHC.

Dave
ID: 30677
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 997
Credit: 6,264,307
RAC: 71
Message 30683 - Posted: 7 Jun 2017, 9:06:44 UTC - in response to Message 30677.  

I support Dave. I live within the M25, but my ADSL only gives me 6 Mbps download/1 Mbps upload. Yes, I *should* have fibre broadband -- the telecoms cabinet stands in front of my house -- but my ISP is yet to offer it to me. They were taken over by BT over a year ago and I expected a special upgrade offer as a result, but it hasn't happened yet. We also have Virgin in our street; theoretically I could have 49 Mbps from fibre and something approaching 100 Mbps from cable, were I to squander my pension pot on such things.
ID: 30683
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,376,395
RAC: 102,054
Message 30684 - Posted: 7 Jun 2017, 11:46:10 UTC - in response to Message 30677.  

Apologies but I'm afraid I'm going to take issue with you on this one ... whilst it may not be a problem where you are, this is part of the imbalance between (and sometimes within) many countries and also impacts what the average and non-average DC cruncher has (or is prepared) to pay to indulge in this hobby. ...

Thanks for your post, Dave.
I really was not aware that, in many places, the situation is so different from how it is here with me in Vienna, Austria.
My cable modem from UPC Austria provides me with 150 Mbit/s download / 15 Mbit/s upload, and I am paying some 53 Euro per month for unlimited traffic, plus TV with about 130 stations (many of them in HD quality), plus many radio stations, plus landline telephone (unlimited within Austria).

So, I seem to be rather lucky :-)

Hence, my limitations are dictated by the power and capacity of my PCs, not by my Internet line.
ID: 30684
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,936,497
RAC: 137,523
Message 30747 - Posted: 12 Jun 2017, 7:57:59 UTC

After the discussions from last week (see the links below) I modified my project setup to see how it behaves.

https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4297
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4293

The main objective was to balance the wall time usage between the different subprojects.
This can be achieved with the following measures:
1. Each subproject is mapped to its own venue at the project server
2. The host starts an extra client instance for each subproject
3. Switching between subprojects is done by a local script that uses a weighted random factor
3.1 The currently running subproject is locked for the next request
3.2 Weight factors are calculated from the individual wall time medians and then compared against the other candidates
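
The selection step in measures 3, 3.1 and 3.2 could be sketched roughly as follows. This is only an illustration, not the script actually used: the function name `pick_next_subproject` and the input format are hypothetical, and the post does not give the weight formula, so the sketch assumes a weight of 1 / median wall time (subprojects with short tasks get picked more often, which evens out the total wall time).

```python
import random
from statistics import median


def pick_next_subproject(wall_times, current, rng=random):
    """Pick the subproject that serves the next work request.

    wall_times: dict of subproject name -> list of recent task wall times (seconds)
    current:    name of the subproject running now; it is locked out (measure 3.1)
    rng:        random module or random.Random instance (handy for testing)
    """
    # Median wall time per candidate, skipping the locked subproject
    # and any subproject without recorded history.
    medians = {
        name: median(times)
        for name, times in wall_times.items()
        if name != current and times
    }
    medians = {name: m for name, m in medians.items() if m > 0}
    if not medians:
        # Nothing else to choose from, so stay with the current subproject.
        return current

    names = list(medians)
    # ASSUMPTION: weight = 1 / median wall time. The original post only says
    # the weights are derived from the wall time medians.
    weights = [1.0 / medians[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

Called as `pick_next_subproject({"CMS": [43000], "Theory": [3800], "ATLAS": [20000]}, current="ATLAS")`, it returns either "CMS" or "Theory", with "Theory" being the more likely pick under the assumed weighting.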

Advantages:
- Each subproject gets equal resources (time). It's now the developers' responsibility to send efficient applications.
- If a subproject sends errors it is automatically ignored in at least the next request.

Disadvantages:
- Monitoring of the local clients gets more complex
- The number of subprojects is limited to the number of venues (currently 4)

@Laurence
Is it possible to implement more venues (see: primegrid)?
ID: 30747
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,936,497
RAC: 137,523
Message 30802 - Posted: 16 Jun 2017, 9:29:50 UTC - in response to Message 30747.  

Reminder:
@Laurence
Is it possible to implement more venues (see: primegrid)?
ID: 30802
Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester

Joined: 15 Jul 05
Posts: 242
Credit: 5,800,306
RAC: 0
Message 30910 - Posted: 20 Jun 2017, 14:59:31 UTC

We have done some tuning of our scheduler and feeder parameters today, which should improve dispatching of tasks.
ID: 30910
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2386
Credit: 222,936,497
RAC: 137,523
Message 30911 - Posted: 20 Jun 2017, 15:42:15 UTC - in response to Message 30910.  

We have done some tuning of our scheduler and feeder parameters today, which should improve dispatching of tasks.

Would this mean that the following suggestion could work?

1. Collect the time-driven subprojects - currently CMS, LHCb, Theory - under the same client instance and connect it via a separate venue, e.g. work.
All 3 subprojects have the same target walltime (12 h).
The server has to ensure that all 3 get the same long-term share.

2. Run ATLAS under its own client instance - connected to another venue, e.g. home - where it can be configured as multicore or whatever.

3. Run SixTrack under a 3rd client instance and connect it to a venue that becomes available now.


The local scheduling between the client instances has to be done by a local script similar to what I'm doing right now.
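
The three-instance layout suggested above can be written down as a small data structure that a local script could check before switching instances. This is a hedged sketch only: the instance labels, the `school` venue for SixTrack, and the idea that only one instance at a time is allowed to fetch work are assumptions for illustration, not details from the suggestion. The limit of 4 venues is the figure quoted earlier in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientInstance:
    name: str           # hypothetical label for the local client instance
    venue: str          # venue the instance is attached to on the server
    subprojects: tuple  # subprojects served through that venue


PLAN = (
    ClientInstance("time-driven", "work", ("CMS", "LHCb", "Theory")),
    ClientInstance("atlas", "home", ("ATLAS",)),
    # ASSUMPTION: the post does not name the third venue; "school" is a placeholder.
    ClientInstance("sixtrack", "school", ("SixTrack",)),
)


def validate_plan(plan, max_venues=4):
    """Check the layout and return a dict of subproject -> instance name."""
    venues = [inst.venue for inst in plan]
    if len(set(venues)) != len(venues):
        raise ValueError("each client instance needs its own venue")
    if len(venues) > max_venues:
        raise ValueError("more client instances than available venues")
    owner = {}
    for inst in plan:
        for sub in inst.subprojects:
            if sub in owner:
                raise ValueError(f"{sub} is assigned to two instances")
            owner[sub] = inst.name
    return owner


def fetch_allowed(plan, active):
    """ASSUMPTION: only the chosen instance may request new work."""
    return {inst.name: inst.name == active for inst in plan}
```

`validate_plan(PLAN)` maps CMS, LHCb and Theory to the same instance, and `fetch_allowed(PLAN, "atlas")` marks only the ATLAS instance as allowed to request work. How the flags are then applied to the running clients is left to the local script.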
ID: 30911


©2024 CERN