Message boards : Number crunching : Something is wrong managed!!!
littleBouncer
Joined: 23 Oct 04
Posts: 358
Credit: 1,439,205
RAC: 0
Message 20471 - Posted: 22 Sep 2008, 0:03:46 UTC

If a WU has 3 successful results, the server marks all other results as "aborted by project" (and later, when they are reported, as "redundant result").
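A minimal sketch of the rule described above, for illustration only. The class, field, and state names are hypothetical and are not taken from the BOINC server code; the only facts used are the quorum of 3 and the "aborted by project" marking.

```python
from dataclasses import dataclass, field

QUORUM = 3  # successful results needed per WU, as described in the post


@dataclass
class WorkUnit:
    wu_id: int
    # result id -> state: "in_progress", "success" or "aborted_by_project"
    results: dict = field(default_factory=dict)

    def report_success(self, result_id: int) -> None:
        """Record one successful result and abort the rest once quorum is met."""
        self.results[result_id] = "success"
        successes = sum(1 for state in self.results.values() if state == "success")
        if successes >= QUORUM:
            for rid in list(self.results):
                if self.results[rid] == "in_progress":
                    self.results[rid] = "aborted_by_project"


# Five results sent out, three come back successfully.
wu = WorkUnit(2884613, {rid: "in_progress" for rid in range(5)})
for rid in (0, 1, 2):
    wu.report_success(rid)
print(wu.results)
```

In this toy model the two outstanding results are marked as soon as the third success arrives, which is the behaviour the post complains about.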

Why do you send out 5 results to crunch instead of 3?

I got a lot that I didn't need to crunch.
Examples:
wu 2884613
wu 2884618
wu 2884886
wu 2885114
wu 2885306
etc....

This happened after I reported 10 finished WUs manually.

Next time I will set the manager to "no more work" for LHC once I have reached the daily quota, and contact the server only when all WUs from LHC are done (not earlier!), so I can crunch all the work that was sent to me...

greetz littleBouncer
ID: 20471
Simplex0
Joined: 26 Aug 05
Posts: 68
Credit: 545,660
RAC: 0
Message 20484 - Posted: 22 Sep 2008, 13:26:28 UTC

But compared with how it used to be this will definitely speed up the process.

Before, some guys downloaded loads of units and sat on them forever.

ID: 20484
Brian Silvers
Joined: 3 Jan 07
Posts: 124
Credit: 7,065
RAC: 0
Message 20490 - Posted: 22 Sep 2008, 16:56:17 UTC - in response to Message 20486.  
Last modified: 22 Sep 2008, 17:03:41 UTC

littleBouncer's plan is not progress, it is regression to the old, slow, wasteful way.


I think the walls of some structure somewhere just collapsed, because we agree that the "plan" put forth by littleBouncer defeats the purpose of not processing tasks that don't need to be processed...

FYI, if you look at my results list, I've been manually aborting tasks that already met quorum. I have one task to process that I may end up missing reporting by a couple of hours. Not sure yet. I also have 2 "ghost" tasks that I don't know what happened to. I haven't checked to see if they were downloaded or not, but they are definitely not on the box waiting to be processed (and they have met quorum too)...
ID: 20490
littleBouncer
Joined: 23 Oct 04
Posts: 358
Credit: 1,439,205
RAC: 0
Message 20491 - Posted: 22 Sep 2008, 17:20:30 UTC

Thx for all replies.

Sure, it is better to speed up the server side: when 3 successful results are returned, the other work is not necessary.

But you can do this by sending out only 3 WUs!
(or 4, for assurance, in case one fails)

My reasons (for this protest) were:
1. I want to reach 100,000 credits as fast as possible.
2. After a long time without seeing the LHC screensaver, I was happy at the prospect of seeing it for an estimated 4 days, but then only half of the work got done...

greetz littleBouncer
(no offence taken!)
ID: 20491
droople
Joined: 18 Aug 08
Posts: 11
Credit: 1,502,038
RAC: 7
Message 20534 - Posted: 25 Sep 2008, 22:18:50 UTC - in response to Message 20495.  
Last modified: 25 Sep 2008, 22:30:18 UTC

The problem with LHC tasks is that the first 3 Success returns frequently fail to match. Frequently, even a 4th Success result fails to match. And frequently 1 or more of the first 3 returns Compute Error. That happens so frequently that they find it is faster just to send 5 results at the beginning, in other words, to use Initial Replication = 5 (IR = 5).
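A rough worked example of why IR = 5 can reach quorum faster than IR = 3. It assumes, purely for illustration, that each returned result is independently "usable" (it succeeds and agrees with the others) with probability `p_usable`. That is a simplification of real validation, the values of `p_usable` are made up, and the function name is hypothetical.

```python
from math import comb


def quorum_probability(n_sent: int, p_usable: float, quorum: int = 3) -> float:
    """Probability that at least `quorum` of `n_sent` results are usable.

    Binomial model: every result is usable independently with probability
    `p_usable`. Real results must match each other, so this is only a sketch.
    """
    return sum(
        comb(n_sent, k) * p_usable**k * (1 - p_usable) ** (n_sent - k)
        for k in range(quorum, n_sent + 1)
    )


for p in (0.6, 0.8, 0.95):  # hypothetical per-result reliabilities
    ir3 = quorum_probability(3, p)
    ir5 = quorum_probability(5, p)
    print(f"p = {p:.2f}: IR=3 reaches quorum {ir3:.1%}, IR=5 reaches quorum {ir5:.1%}")
```

Under this model, with IR = 3 every single result has to be usable, so any failure forces a resend and another wait of up to one deadline. With IR = 5, two results can fail and the quorum is still met in the first round.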

More precisely, it was faster to use the IR = 5 strategy when they were using the old server code. In fact, that was the only speed-up strategy possible with the old server code. The new server code they recently installed has new features and capabilities that make new strategies possible, where maybe they can have IR = 3 and complete the batch of work just as quickly as with IR = 5. But that strategy also includes implementing some as yet undetermined combination of these new features:

1) limiting crunchers to having no more than X tasks per CPU in their cache, to spread the work out over more hosts
2) the server keeping a list of fast reliable hosts to which additional tasks will be sent if 1 or more of the first 3 tasks fails or the first 3 don't achieve quorum.
3) possibly reducing the deadline
4) making BOINC client version 5.8.16 a minimum requirement to participate in the project
5) other new features I have forgotten at the moment

Using those new features, other projects that were forced to use IR = 5 a few years ago have found they are now able to reduce IR to 3 and still finish the batch quickly but they likely had to rewrite some/all of their verifier/assimilator/feeder/transitioner scripts to accomplish all that. That takes time and money. We can only hope that LHC will find a way to incorporate some of those new features into their strategy, if they can. Canceling redundant tasks is a good start.




I just think it's unfair for the people who finish the WU before the deadline but only a bit later than other people. Not all the computers are running and connecting to Internet and crunching LHC@Home 24 hours.
ID: 20534
Leevis
Joined: 15 Sep 08
Posts: 1
Credit: 7,788
RAC: 0
Message 20535 - Posted: 26 Sep 2008, 0:01:24 UTC - in response to Message 20534.  
Last modified: 26 Sep 2008, 0:05:09 UTC


I just think it's unfair for the people who finish the WU before the deadline but only a bit later than other people. Not all the computers are running and connecting to Internet and crunching LHC@Home 24 hours.


It wasn't very long ago that I joined this project, and I must admit that initially I felt the same way you do. It just seemed totally unfair that, after waiting SO long for WUs to crunch, when I finally got some, most were aborted before my clients had barely started on them.

I don't feel that way any longer. Why, you might ask? Well, not long ago, while I was in the process of composing a post for a new thread (not unlike the first in this thread), I had an epiphany. I suddenly realized that it really shouldn't matter to me how many WUs I crunch and get credit for. The important thing is that the work just gets done, period. It doesn't matter if my total credit is 1 or 1 million. For me, it's not a contest, because the only "winner" should be the LHC and the scientists who have devoted their lives to the science.

By participating, I am offering my available resources to the project. If they are used, great! If they are not, that's OK too. Either way, I am still participating and feel like I am a part of something very important that's much bigger than me.

Leevis
ID: 20535
droople
Joined: 18 Aug 08
Posts: 11
Credit: 1,502,038
RAC: 7
Message 20536 - Posted: 26 Sep 2008, 4:35:35 UTC - in response to Message 20535.  


Hi Leevis

I understand what you mean: as long as you can contribute something to the project, you are satisfied.

But please don't forget that crunching needs electricity, and in Australia the power stations burn coal (85%). When LHC@home aborts results, some CO2 has been generated for nothing.

It's not sustainable.

Cheers
ID: 20536
droople
Joined: 18 Aug 08
Posts: 11
Credit: 1,502,038
RAC: 7
Message 20539 - Posted: 26 Sep 2008, 8:42:37 UTC - in response to Message 20537.  


They are canceling only redundant results that have not started crunching. If they haven't started then next to 0 CPU time and electricity has been spent on them. It's the tasks that are redundant but don't get canceled because the host has already started them... those are the results that are wasting CPU time and electricity. If hosts would contact the server more often then more redundant tasks would get canceled. The project managers can direct hosts to contact the server more frequently but they are not doing so.

There are BOINC options you can set on your end (the host end) which will tend to cause your computer to contact the server more often. More frequent contact increases the odds of your computer receiving a cancel order before it starts crunching a redundant task rather than after.
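A toy model of the point about contact frequency, under stated assumptions: the redundant task would start crunching `wait_hours` after its WU met quorum, and the host's next scheduler contact falls at a uniformly random moment within one contact interval. Neither the function nor its parameters correspond to actual BOINC settings; they are hypothetical names for illustration.

```python
def cancel_before_start_probability(wait_hours: float, contact_interval_hours: float) -> float:
    """Chance that the next scheduler contact happens before the task starts.

    Assumes the next contact is uniformly distributed over one contact
    interval, so the cancel order arrives in time with probability
    wait / interval, capped at 1.
    """
    if contact_interval_hours <= 0:
        raise ValueError("contact interval must be positive")
    return min(1.0, max(0.0, wait_hours) / contact_interval_hours)


# A redundant task that would sit in the cache for 6 more hours.
for interval in (24, 12, 6, 1):
    p = cancel_before_start_probability(wait_hours=6, contact_interval_hours=interval)
    print(f"contact every {interval:>2} h: cancelled in time with probability {p:.2f}")
```

The shorter the interval between contacts, the more likely the cancel order arrives while the task is still waiting in the cache.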


So you mean that if I start crunching a WU before BOINC connects to the server, the WU will run to completion without being cancelled, even if BOINC communicates with the server during the crunching?

Cheers
ID: 20539
Brian Silvers
Joined: 3 Jan 07
Posts: 124
Credit: 7,065
RAC: 0
Message 20542 - Posted: 26 Sep 2008, 14:41:37 UTC - in response to Message 20537.  
Last modified: 26 Sep 2008, 14:42:04 UTC

If hosts would contact the server more often then more redundant tasks would get canceled. The project managers can direct hosts to contact the server more frequently but they are not doing so.


A question in regards to this:

Would setting a project to "No New Tasks" (or whatever it is called in newer versions) end up preventing this from working?

The reason I'm asking is that I am pretty sure that if I do that (set to NNT) with 5.8.16 while I have a pending scheduler connect on a countdown, BOINC won't even attempt to connect when the countdown is over. It is either that, or it does the connect that time, but no more...
ID: 20542
Ingleside
Joined: 1 Sep 04
Posts: 36
Credit: 78,199
RAC: 0
Message 20543 - Posted: 26 Sep 2008, 16:37:19 UTC - in response to Message 20542.  
Last modified: 26 Sep 2008, 16:40:12 UTC

WCG routinely connects to the scheduling server once every 4 days, as this is the setting they're using, even though WCG is currently set to "No new work"... If set to "suspended", on the other hand, it should not connect, at least not if you're running v5.10.xx... Not sure, since it isn't connected just now, but BURP has been using a 1-hour delay before reconnecting, so it should be a good project for testing out this feature. ;)
"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."
ID: 20543



©2022 CERN