can't download
Author | Message |
---|---|
Joined: 23 Dec 05 Posts: 2 Credit: 228 RAC: 0 |
I've been trying to connect to LHC@Home for over 2 days, and all I get is a "connection refused" error. Is it a server fault, or just my bad configuration? SETI, Rosetta, and climateprediction are working just fine; only LHC can't connect. |
Joined: 29 Sep 04 Posts: 196 Credit: 207,040 RAC: 0 |
Their servers appear fine: (Log)

12/27/2005 12:56:59 PM|LHC@home|Fetching master file
12/27/2005 12:57:04 PM|LHC@home|Master file download succeeded
12/27/2005 12:57:10 PM|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
12/27/2005 12:57:10 PM|LHC@home|Reason: Requested by user
12/27/2005 12:57:10 PM|LHC@home|Requesting 8640 seconds of new work
12/27/2005 12:57:15 PM|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
12/27/2005 12:57:25 PM|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
12/27/2005 12:57:25 PM|LHC@home|Reason: To fetch work
12/27/2005 12:57:25 PM|LHC@home|Requesting 8640 seconds of new work
12/27/2005 12:57:30 PM|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
12/27/2005 12:57:30 PM|LHC@home|No work from project

There's no work until after the new year, but you should still be able to connect. Perhaps you have a firewall application? Good luck. |
Joined: 23 Dec 05 Posts: 2 Credit: 228 RAC: 0 |
Problem solved:

28/12/2005 01:23:48|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
28/12/2005 01:23:48|LHC@home|Reason: To fetch work
28/12/2005 01:23:48|LHC@home|Requesting 8640 seconds of new work
28/12/2005 01:23:53|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
28/12/2005 01:23:53|LHC@home|No work from project |
Joined: 2 Sep 04 Posts: 545 Credit: 148,912 RAC: 0 |
> Problem solved:

Yea! There will be no work for some time ... I moved the share of LHC over to WCG till there is ... :) |
Joined: 27 Dec 05 Posts: 3 Credit: 674 RAC: 0 |
There isn't work for me either. Is it actually using the computer's resources even though there's no work? Aren't the computer's resources spread over to the other applications such as SETI until there is? |
Joined: 22 Oct 04 Posts: 39 Credit: 46,748 RAC: 0 |
> There isn't work for me either. Is it actually using the computer's resources even though there's no work? Aren't the computer's resources spread over to the other applications such as SETI until there is?

Yes, any resources from a project with no work are equally spread throughout the other projects with work. |
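The redistribution described in this reply can be illustrated with a minimal sketch. It assumes that CPU time is split among the projects that currently have work in proportion to their resource shares, which comes out as an equal split when all shares are equal. The function name `effective_shares` and the example figures are hypothetical and are not taken from the BOINC client.

```python
def effective_shares(resource_shares, has_work):
    """Return each project's fraction of CPU time, assuming (hypothetically)
    that only projects with work compete, pro rata to their resource share."""
    # Keep only the projects that currently have work queued.
    active = {p: s for p, s in resource_shares.items() if has_work.get(p, False)}
    total = sum(active.values())
    if total == 0:
        # Nothing to run anywhere: every project gets zero.
        return {p: 0.0 for p in resource_shares}
    # Projects without work get 0.0; the rest split the CPU pro rata.
    return {p: active.get(p, 0) / total for p in resource_shares}


# Illustrative figures only: three projects with equal shares, LHC has no work.
shares = {"LHC@home": 100, "SETI@home": 100, "CPDN": 100}
work = {"LHC@home": False, "SETI@home": True, "CPDN": True}
result = effective_shares(shares, work)
print(result)
```

With equal shares, the two projects that have work each end up with half of the CPU time, and the idle project gets none.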
Joined: 13 Jul 05 Posts: 23 Credit: 22,567 RAC: 0 |
> There isn't work for me either. Is it actually using the computer's resources even though there's no work? Aren't the computer's resources spread over to the other applications such as SETI until there is?
>
> Yes, any resources from a project with no work are equally spread throughout the other projects with work.

I've heard conflicting comments, so I'll ask simply: is the long-term debt (LTD) for a project without work meant to increase? If someone would care to explain in more detail the conditions under which LTD is and isn't meant to increase, including client settings such as "suspend" and "no new work", that would be most helpful :)

For example, when I suspend a project, I find that its LTD decreases. My assumption was that project "suspension" means just that: the LTD stays exactly where it is, so that when/if the project is resumed later, it's as if it was never suspended in the first place. That doesn't seem to be the case, so to achieve what I want, I find that "no new work" has the desired effect.

Also, have things changed since v4.45? From what I saw back then (I didn't use BoincView then, so my observations aren't as clear), "no new work" caused the LTD to continue increasing, so I can't help but wonder.

Thanks to anyone who can straighten this out for me :) For the record, I currently use BOINC CC v5.2.13. |
Joined: 2 Sep 04 Posts: 7 Credit: 30,712 RAC: 0 |
Les, my curiosity was recently drawn to the same topic. I manually set LTD to zero for all projects on one dual-core machine. I'm running CPDN and SETI, with other projects attached, all of them with the same share (100%). Out of 8 projects, 4 are set to "No new work" and their LTD is still zero. Orbit has no application for Windows, so it never got any work, and I assume that's why its LTD is also not increasing. LHC's LTD is increasing while CPDN's and SETI's are decreasing (as those two projects provide work on a regular basis).

This summary covers only one part of your question: the "no new work" scenario. For projects set to "no new work", LTD is not increasing, as one may expect.

I may try the same with the "suspend" scenario on another machine attached to 11 projects (now with "no new work" for all of them, running two CPDN SpinUp models (alpha/beta)) ... once I have time and am in the mood to do so. I would expect LTD to be increasing on a project in the 'suspend' state. I would also expect that STD will be increasing when particular WU(s) are suspended.

I believe most users have left this setting untouched, so once LHC actually has some work, the BOINC scheduler on their machines will start with the LHC project, download as many WUs as set in general settings (until limited by other projects' deadlines) and finish them quite soon. This may stress the LHC server to some degree ... but isn't that a general problem for all BOINC projects after 'recovery'? Hope it helps. |
Joined: 13 Jul 05 Posts: 23 Credit: 22,567 RAC: 0 |
Thanks Honza, you point out the other thing I was wondering about: does BOINC try to maintain your resource shares over the long term? That is, if a project doesn't have work for a long time (LHC), does it do LOADS when that project does have work later on, to almost "make up" for earlier, and thus keep "resources divided over projects" closer to what it should be? Or (as I've heard) does it only try to do that over the short term, so that if a project isn't worked on for a while, it's forgotten about, and the resource shares are honoured on a more short-term basis?

STD seems pretty simple to me, as it has fewer factors affecting it and is thus quite obvious, but LTD has been a bit mysterious to me for some time. Is there anything in the wiki (maybe an updated long-term debt page?) that would give an insight?

We really need JM7 here, lol. Maybe this would be better asked over at SETI, where it might get more attention from the likes of Paul B and John M. |
Joined: 2 Sep 04 Posts: 7 Credit: 30,712 RAC: 0 |
Well, that's another aspect of the BOINC scheduler system, and it's not easy to figure it all out with all its connections. I believe (but may be wrong) that resource share is a plain figure that divides CPU time among projects. So LTD is equally (or, better said, proportionally) affected by resource share. Statistically, I would say there is a close-to-perfect correlation between resource share and LTD, albeit I don't have enough data to provide a significance test :)

You may find some info about scheduling on the wiki. According to it, STD is involved in switching between WUs/projects (and should not affect, for example, the amount of work downloaded).

P.S. Seti is down [again], so we may ask there ... later. Resource share and STD are quite clear and simple concepts; LTD in connection with suspended projects/WUs is still a bit of an unknown land. |
Joined: 1 Sep 04 Posts: 275 Credit: 2,652,452 RAC: 0 |
LTD should increase whenever "the CPU scheduler" is not running that project for whatever reason. NNW ("no new work") and project suspend are user actions, no work from project is a server action, neither should affect LTD. LTD will drift since it is always adjusted so that the average for all projects is zero.

BOINC WIKI BOINCing since 2002/12/8 |
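The zero-average adjustment mentioned in this post can be sketched as follows. This is only an illustration under stated assumptions: debts are held in a plain dictionary of seconds, and `normalize_debts` is a hypothetical helper, not a function from the BOINC client. Subtracting the mean from every debt makes the average zero, which is the same condition as the debts summing to zero.

```python
def normalize_debts(debts):
    """Shift all debts by their mean so that they average (and sum) to zero.

    Hypothetical helper for illustration; not BOINC's actual code.
    """
    mean = sum(debts.values()) / len(debts)
    return {project: debt - mean for project, debt in debts.items()}


# Illustrative figures only (seconds of long-term debt per project).
debts = {"LHC@home": 3600.0, "SETI@home": -600.0, "CPDN": 0.0}
balanced = normalize_debts(debts)
print(balanced)
print(sum(balanced.values()))
```

Because every project is shifted by the same amount, the differences between projects are preserved; only the common offset changes, which is one way to read the "drift" described above.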
Joined: 13 Jul 05 Posts: 456 Credit: 75,142 RAC: 0 |
> LTD should increase whenever "the CPU scheduler" is not running that project for whatever reason.

If that is the design then it is working to spec. But I would like to ask for the spec to be changed. On this project, with intermittent work, it would be very useful to be able to set (say) a 25% resource share and know I would give 25% of my resources to the project whether it had work 25% of the time or 100% of the time. With the current LTD mechanism, if I set 25% and LHC is only available 40% of the time, I end up giving only 10% to LHC.

Obviously, if LHC was only up 5% of the time, no software could get me a 25% share; but if it is possible, my preference would be for the client to try to get me what I asked for. Again, if a server has downtime, I'd want my client to bring it back up to its resource share when it comes online again, not for that project permanently to lose the missed time.

John, are you part of / in contact with the client developers? If so, could you pass this request on please? If not, perhaps JM7 will pop in here and see this thread...

River~~ |
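The arithmetic in this post can be worked through with a small sketch. It assumes a simplified model that is not taken from the BOINC source: without catch-up, a project receives its resource share only while it has work, and with catch-up the client makes up missed time but can never give a project more CPU time than the project's availability allows. The function `long_run_share` is hypothetical.

```python
def long_run_share(target_share, availability, catch_up):
    """Long-run fraction of CPU time a project receives (simplified model).

    target_share: resource share set by the user, as a fraction (0..1).
    availability: fraction of time the project has work (0..1).
    catch_up:     whether missed time is made up when work returns.
    """
    if catch_up:
        # The project can be given at most 100% of the CPU while it is up.
        return min(target_share, availability)
    # No catch-up: the share is honoured only while work is available.
    return target_share * availability


current = long_run_share(0.25, 0.40, catch_up=False)      # the post's example
requested = long_run_share(0.25, 0.40, catch_up=True)     # what River~~ asks for
out_of_reach = long_run_share(0.25, 0.05, catch_up=True)  # up only 5% of the time
print(round(current, 4), round(requested, 4), round(out_of_reach, 4))
```

Under these assumptions, a 25% share with 40% availability yields 0.25 × 0.40 = 10% without catch-up, the full 25% with catch-up, and at most 5% when the project is only up 5% of the time.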
Joined: 2 Sep 04 Posts: 545 Credit: 148,912 RAC: 0 |
River~~, I doubt you will see this, at least not on any near- or mid-term horizon. The problem is the very complexity of trying to satisfy your goal. With projects that have more consistency this is obviously not a big issue, but for projects like LHC, well, I don't see it happening, as LHC is somewhat of an anomaly. With luck they will get the follow-on applications working and we could see a more consistent work load, which *I* would like, as I would much prefer to put much of my time to hard physics (which is why I am bent that I cannot specify AstroPulse only, or at least that is not in the plans) ... and at this time Einstein@Home is the only game in town for that ... |
Joined: 29 Aug 05 Posts: 42 Credit: 27,102 RAC: 0 |
1) If you want to shortcut the LTD figures, just detach, reboot and reattach. That should clear the LTD for the project. Obviously not something you want to do weekly, but is good enough for an extended outage like this.

2) Why do we always refer to debts as averaging zero? Wouldn't it be as correct (and more precise) to say they sum to zero? The two statements are identical, but I find myself preferring the latter. Is there a specific reason we do one or the other, or was it just preference and wiki-inertia?

8) I nitpick--I got nothin' else to do while I wait for my WUs. 8)

(j) James |
Joined: 13 Jul 05 Posts: 23 Credit: 22,567 RAC: 0 |
> 1) If you want to shortcut the LTD figures, just detach, reboot and reattach. That should clear the LTD for the project. Obviously not something you want to do weekly, but is good enough for an extended outage like this.

An easier method that saves you having to merge hosts is to set "no new work" and then "reset" the desired project multiple times (monitoring with BoincView or BoincDV to see what the LTD numbers are doing).

> 2) Why do we always refer to debts as averaging zero? Wouldn't it be as correct (and more precise) to say they sum to zero? The two statements are identical, but I find myself preferring the latter. Is there a specific reason we do one or the other, or was it just preference and wiki-inertia?

Agreed, and from my interpretation of things, "sum" seems more appropriate than "mean" (average). |
Joined: 2 Sep 04 Posts: 545 Credit: 148,912 RAC: 0 |
Or just use BOINC DV to reset the numbers - making sure to stop the BOINC daemon ... |
Joined: 1 Sep 04 Posts: 275 Credit: 2,652,452 RAC: 0 |
I suspect average is used instead of sum to account for differing resource shares. JM7 will have to pop in and verify that though.

It was originally designed to continue increasing debt any time a project did not have work. However, many complaints about projects dominating a host after an outage caused this to be changed. I do not think it will be possible politically to get it changed back. Technically it should be fairly easy: just comment out a few blocks of code.

BOINC WIKI BOINCing since 2002/12/8 |
Joined: 13 Jul 05 Posts: 23 Credit: 22,567 RAC: 0 |
> I suspect average is used instead of sum to account for differing resource shares. JM7 will have to pop in and verify that though.

And I imagine it's bad for the project that was down too, being hammered by everyone as soon as they're back up, ouch! |
Joined: 13 Jul 05 Posts: 456 Credit: 75,142 RAC: 0 |
> 1) If you want to shortcut the LTD figures,

You miss my point. I do *not* want to shortcut the LTD figs. I want them to reflect the non-work situation when a project is off-air. That means 'boosting' the LTDs, whereas reset reduces them.

R~~ |
Joined: 13 Jul 05 Posts: 456 Credit: 75,142 RAC: 0 |
> River~~,

Hi Paul,

what I suggest is *less* complex than the present code. A classic case of less is more.

At present the logic has to distinguish between three groups of projects:

- projects that have work queued locally
- projects that don't have work queued locally because of negative LTD
- projects that don't have work queued locally for other reasons (server failure, user intervention, etc)

Deciding between the latter two cases is quite complex, and the respective behaviour (involving some projects getting credit on their STD, some on their LTD, and some not at all) involves three different possibilities; moreover the amount of debt to be assigned varies depending on which projects 'count' in the mix.

My suggestion is to combine the latter two cases. When project X runs, all other projects get their pro-rata amount of debt given to them. If the project has work locally the credit goes to the STD; if not, the same credit goes to the LTD. After each award of credit the LTD and STD are separately re-balanced so that STD totals zero and so does LTD.

Less complexity, and the end result is that the LTD does what it says on the tin (*): it balances work load over time for all projects, whether running in the short term or not. The current coding, tho clever, actually adds complexity to achieve a less desirable and a less understandable outcome, in my opinion.

It is however misleading to say that running another project protects against downtime on the major project when the code has added complexity that ensures the lost time is never made up. Back to the 'what it says on the tin' issue. If, heaven forfend, Einstein went down for a week I'd like the software to do its best to rebalance my long term work shares. If it doesn't do that it is not much of a backup and, in my view, should not be promoted as one.

While we advocate (as both of us do) the use of backup projects, it is an issue. The importance of this issue is not how often the projects go down, but how important we think it is to do the right thing in the unlikely event that a project does go down. In that sense LHC is currently a good test bed for the 'backup' claims of the current LTD algorithm: a test that shows, I suggest, that it simply does not offer a backup at all at present.

R~~

(*) What it says on the tin: describes goods / services that do exactly what they lead you to believe they do. Originated by TV ads that claimed that Ronseal (a wood sealing product) 'does what it says on the tin'. |
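The simplified scheme proposed in this post can be sketched roughly as below. This is one reading of the proposal, not BOINC client code: the names `award_debt` and `rebalance`, the dictionary layout, and the example figures are all hypothetical. While one project runs, every other project is owed a pro-rata amount by resource share; the amount goes to STD if that project has work queued locally and to LTD otherwise, and each debt table is then re-balanced to total zero.

```python
def rebalance(debts):
    """Shift debts in place so that they total zero."""
    mean = sum(debts.values()) / len(debts)
    for project in debts:
        debts[project] -= mean


def award_debt(std, ltd, shares, has_local_work, running, seconds):
    """Credit debt to every project except the one that just ran.

    Hypothetical reading of the proposal: STD for projects with local
    work, LTD for projects without, both tables re-balanced afterwards.
    """
    total_share = sum(shares.values())
    for project, share in shares.items():
        if project == running:
            continue
        owed = seconds * share / total_share  # pro-rata amount of debt
        if has_local_work[project]:
            std[project] += owed
        else:
            ltd[project] += owed
    rebalance(std)
    rebalance(ltd)


# Illustrative figures only: SETI runs for an hour, LHC has no work queued.
projects = ["LHC@home", "SETI@home", "CPDN"]
shares = {p: 100 for p in projects}
std = {p: 0.0 for p in projects}
ltd = {p: 0.0 for p in projects}
has_local_work = {"LHC@home": False, "SETI@home": True, "CPDN": True}
award_debt(std, ltd, shares, has_local_work, running="SETI@home", seconds=3600)
print(std)
print(ltd)
```

In this sketch the project without work accumulates positive LTD for as long as it is absent, which is the "make up the lost time" behaviour the post asks for; the trade-off raised elsewhere in the thread, a returning project dominating the host, would apply to it as well.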
©2024 CERN