1) Message boards : Number crunching : my client will never connect (Message 18214)
Posted 16 Oct 2007 by uioped1
Post:

Me too!! I realize you can see the current deferment in the projects tab, but it is nice to see the history (like 1 min deferrals growing to 2 hours or more, etc) so you have a better idea of what's going on.



My MSG tab shows deferral amounts and why. (.10.20)

In fact, if you compare the entries for SETI and LHC in my log, you will see the problem. SETI says "deferring for x" and then on the next line it says "Reason: requested by project".

LHC says "deferring for y" but the reason is "no work from project"

What I'm trying to point out is that LHC does not seem to be communicating the required backoff time to the client. Maybe this is because doing so requires the server code upgrade they've been mentioning.
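To make the difference concrete, here is a minimal sketch of the two cases, not BOINC's actual code. The function name, the doubling rule, and the default limits are assumptions for illustration; the 1 minute and 2 hour figures come from the quoted post above. When the project supplies a delay the client uses it, otherwise the client has to guess and keeps growing its own backoff.

```python
def next_deferral(previous_seconds, project_requested=None,
                  minimum=60, maximum=2 * 60 * 60):
    """Return (seconds, reason) for the next deferral.

    Hypothetical helper: the doubling rule and the limits are
    assumptions, not values taken from the BOINC client.
    """
    if project_requested is not None:
        # The server told us how long to wait, so no guessing is needed.
        return project_requested, "requested by project"
    if previous_seconds <= 0:
        # First failed request: start at the minimum backoff.
        return minimum, "no work from project"
    # No guidance from the server: double the last wait, up to the cap.
    return min(previous_seconds * 2, maximum), "no work from project"


if __name__ == "__main__":
    wait = 0
    for _ in range(9):
        wait, reason = next_deferral(wait)
        print(f"deferring for {wait} s ({reason})")
```

Under these assumptions the wait climbs from 1 minute to the 2 hour cap after eight consecutive empty replies, which matches the kind of history described in the quote.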
2) Message boards : Number crunching : dear Neasan & Alex (Message 17985)
Posted 26 Sep 2007 by uioped1
Post:
Server SW upgrade would also allow "server side abortion", which would further reduce wasted CPU cycles.


Oh, so just because you think it's a good idea, now we all have to live with your immoral choices?

nevermind...
3) Message boards : Number crunching : Even if there's no "real work", may I suggest a "real test"? (Message 17522)
Posted 23 Jul 2007 by uioped1
Post:
I didn't get anything again

And mine run 24/7 since the beginning.


This is part of what it says.........


7/20/2007 5:56:50 AM|lhcathome|Checksum or signature error for bottomOverlay_1.02_.tga



Magic,
Have you tried the "Skip Image File Verification" setting in the general boinc preferences?
I know it's hard to test given the shortage of WUs, but it's worth a try.

What other projects are you connected to?
4) Message boards : Number crunching : Can anyone explain ...... (Message 15589)
Posted 20 Nov 2006 by uioped1
Post:
If you see an example where the duff result shows validate status "Valid" please post a link to the result here.

There is an issue in that the result should not be returned as a "Success", but the science and the credit are both correctly protected by the validator, at least in the cases I looked at.

Hope that reassures.

River~~


This unit shows an invalid result marked valid and granted credit.
This is the same host that started this thread, ID 88058.

I suspect that what happened here still doesn't affect the science.
-a.
5) Message boards : Number crunching : New computer database entry created on each connect (Message 15555)
Posted 18 Nov 2006 by uioped1
Post:
Problem is still happening! (and there is another thread about it)
This problem is making server load worse (host table in database is probably super-big by now) and also I think this is the reason why there are no stats. XML stats for hosts would be also extremely large.



Since there seems to be no post from someone _not_ experiencing the problem, I thought I'd point out that this is not happening to everyone. It might be interesting to see if there are any patterns to the host duplication. . .

I have never had any of my hosts duplicated for no reason. My main host is using client Version 5.4.11 on windows xp. All of my hosts since the problem started have been windows of various versions.
6) Message boards : Number crunching : Can anyone explain ...... (Message 15536)
Posted 18 Nov 2006 by uioped1
Post:
Host was working in February; that's a long time to not notice the error.
One good thing is that since they don't validate, William gets no credit for the results.

-a.
7) Message boards : Number crunching : Can anyone explain ...... (Message 15534)
Posted 18 Nov 2006 by uioped1
Post:
Thought that was the case; thanks for confirming :)

Shame we can't get this system excluded, seeing as it's screwing up so badly; be more units for the rest of us as well ;)

Strange that with those errors there, it's still not recognized as "Computing error" by the client. The validator would get rid of those right away.


I think maybe this William person is using the anon. platform mechanism to force the lhc workunits to be run by an optimized seti app. Most likely they were trying to use an optimized app for seti, and just screwed up. Who knows.
8) Message boards : Number crunching : Fairer distribuiton of work(Flame Fest 2007) (Message 15067)
Posted 11 Oct 2006 by uioped1
Post:
This time, I will try for succinctness, I promise :)

With the current batch of WUs, (It looked like there were about 10000) my estimate for the ideal time to completion is 32 hours, with statistical completion in 24. Of course, if everyone had gotten at least 1 wu, that would have been cut down to about 6... We'll see how far off reality comes. My suspicion is that there will be units in progress until the deadline.

If they had set the deadline to a conservative 2 days, we never would have seen the egregious abuses River pointed out.

He's also right that keeping the majority of the users happy should also be a priority, although I think that this project generally has a brick wall for ears.

I really want to see tighter deadlines for this project, and we don't even need to go as far as chrulle's adjustable ones.

9) Message boards : Number crunching : Boinc farms. (Message 15036)
Posted 10 Oct 2006 by uioped1
Post:
For those of you out there tired of 8-way processing, and with very high credit limits, SUN is offering a 60 day free trial of a SUNFIRE x4600 server. That's up to 16 dual-core 3 GHz Opterons. Won't help you much for LHC though (64-bit procs, and who knows if you'd catch 1 work release in 60 days :) Do Einstein or SETI have good x64 apps out yet?
10) Message boards : Number crunching : Fairer distribuiton of work(Flame Fest 2007) (Message 15035)
Posted 10 Oct 2006 by uioped1
Post:
I have ended up with a long-ish post, that might disguise the fact that I share your sentiment. I felt the need to clear up a perceived misconception.

It is important to note the distinction between having the work finished, and having all the results returned.

If you looked at the graphs shortly after the end of this run, you noticed that shortly after the 7th (I think) the remaining results plummeted as the last workunits missed their deadline. Because these results were not re-issued (the graph stayed down at 0) and also because of the small number of them (~1000, from a batch of results significantly greater than 5000) we can deduce that none of those results were needed to complete a quorum.

If we assume that most of the quorums are reached shortly after the project falls below 4/5 of the initial batch size, this batch was actually done after about 3 days. (This is all from memory, as the relevant parts of the curve have all fallen off the graphs by now. Forgive my approximations, please.)

If you looked at the rapidity with which the slope decreased, you can tell that the ideal return for the project would have been somewhere between 1 and 1.5 days shorter (that 4/5 target).

My point is that CERN doesn't have a lot to gain from decreasing the quotas. In fact, any quota system where very fast hosts are throttled will lengthen the initial part of the curve, even if it does provide some benefit at the tail end of the curve from choking off greedy hosts. Also, since we know that the trailing results are unnecessary, we cannot blame them for the delay in creating new work.

Furthermore, if Garfield does come on line, and if that app provides more steady work, any benefit that would have been provided by the quota would be rendered irrelevant, while the theoretical harm would remain.

Boinc has another method more suited for this problem, which is the use of deadlines to prevent waiting for slow caches. I believe that the scheduler does a calculation to make sure that work can be completed before the deadline before it gives you some. However this would cause other people annoyance as it throws the host's scheduler for a loop. (and everyone who runs this project really ought to be running other projects. I strongly recommend Rosetta@home.)
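The kind of calculation I have in mind could look roughly like the sketch below. This is only an illustration of the idea, not the real BOINC scheduler: the function name, its parameters, and the simple "fraction of time the host is crunching" model are all assumptions.

```python
def can_finish_before_deadline(estimated_cpu_hours, queued_cpu_hours,
                               hours_until_deadline, on_fraction=1.0):
    """Return True if a new result should fit before its deadline.

    Hypothetical check: the work already queued on the host plus the
    new result must fit into the time left, scaled by the fraction of
    wall-clock time the host actually spends crunching.
    """
    if on_fraction <= 0:
        # A host that never crunches cannot meet any deadline.
        return False
    wall_hours_needed = (queued_cpu_hours + estimated_cpu_hours) / on_fraction
    return wall_hours_needed <= hours_until_deadline


if __name__ == "__main__":
    # 30 h already cached, a 6 h result, 2 day deadline, host on 24/7.
    print(can_finish_before_deadline(6, 30, 48))
    # Same host, but only crunching half of the time.
    print(can_finish_before_deadline(6, 30, 48, on_fraction=0.5))
```

With a check like this, a tighter deadline automatically refuses work to hosts sitting on a large cache, which is the effect a quota is trying to get by other means.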

A third possibility would be to simply increase the initial replication; however, I don't think that anyone would like this as it would mean that much less value for the work you do return.

Personally, I would love to see tighter deadlines. I hate sitting around watching the outstanding results trickle down when I haven't had work for days.

Full disclosure: One of my hosts had a configuration error and downloaded far too much work this round. The last 1/4 were put to no good use, being the last results returned for their respective WUs.
11) Message boards : Number crunching : Not HAPPY people. (Message 13336)
Posted 12 Apr 2006 by uioped1
Post:
Just a note for those that suspend lhc when it has no work available: I believe that boinc does not modify the debt figures for suspended projects, so you may find that you crunch less lhc work than you would by allowing your host to poll the project (which really wastes very little cpu/network)

You also may find that occasionally you get other peoples errored or timed out work if your host is lucky while there is supposedly no work.
12) Message boards : Number crunching : Not HAPPY people. (Message 13168)
Posted 29 Mar 2006 by uioped1
Post:
Now, what really bothers me about this project are the days with 100, or even 50 workunits still in progress. I don't know if the scientists actually have to wait for those (I doubt it) but just the possibility irks me...
13) Message boards : Number crunching : Not HAPPY people. (Message 13161)
Posted 28 Mar 2006 by uioped1
Post:
I have only one point to add to River's last post: with the improvements that have been made in the last three months, this is all but a different project.

These improvements have come in:

  • Bug elimination,
  • Algorithm improvement, to the point where the newest algorithms are capable of returning scientifically valuable results on certain classes of proteins, in addition to the scientific goal of the project to better understand the protein folding problem.
  • Scientific Communication, which as River stated is some of the best in DC.
  • Technical Communication, meaning acknowledging issues, explaining their plans, and asking for and listening to suggestions.
  • User Acknowledgement, in the form of credit and "result of the day" Acks.
  • Feature innovation,


And last but not least, the quality and value of the results received.

I really think that if you were to attach to their project now, you wouldn't recognize it as the one that so offended you previously.

14) Message boards : Number crunching : Not HAPPY people. (Message 13143)
Posted 28 Mar 2006 by uioped1
Post:
Rosetta@home is begging for more participants. The URL is http://boinc.bakerlab.org

This is one of the better projects that I have seen for communication from the project scientists and for acknowledgement of their contributors. I urge you all to check them out as they're doing great things!

-uio
15) Message boards : Number crunching : going, Going, ........ , GONE. (Message 12901)
Posted 28 Feb 2006 by uioped1
Post:
Under 10000 WUs still active in the wild. . . 9500 . . .
16) Message boards : Number crunching : Units not validating (Message 12900)
Posted 28 Feb 2006 by uioped1
Post:

How long do the results (any results) stay on the page until they are deleted?


The project may choose a delay to apply before deleting assimilated workunits, or it may simply be due to the backlog on the servers. These workunits have not been assimilated, or there is another error that is preventing them from being dealt with, other than manually.

Mike is correct in that posting them here does little, other than possibly alerting the developers to the existence of these orphaned workunits. Due to the fact that this project periodically runs the work out, I suspect they notice these then anyway. (although they seem to have chosen not to correct them at this time.) Actually, because these studies build on each other, we can be sure that any lost WU whose configuration would be interesting would be noticed in the later study...

Anyway, they most likely would do any deleting not from the workunits that we post here, but from an analysis of the database, so a bunch more posts of WUs here isn't worth much.
17) Message boards : Number crunching : Units not validating (Message 12892)
Posted 27 Feb 2006 by uioped1
Post:
I don't think that it matters that this WU hasn't validated, but maybe it's time to remove it from the database, just as housekeeping.
18) Message boards : Number crunching : Unofficial BOINC Wiki closing 2006-03-31 (Message 12883)
Posted 26 Feb 2006 by uioped1
Post:
What will you do with your spare processing power now? If you have another idea that you find more appealing, I for one would like to know.



©2024 CERN