Message boards : Number crunching : Initial Replication

KAMasud

Joined: 7 Oct 06
Posts: 114
Credit: 23,192
RAC: 0
Message 17683 - Posted: 2 Aug 2007, 10:01:37 UTC
Last modified: 2 Aug 2007, 10:08:49 UTC

:-) if admin thinks it is important then they will implement your ideas ;-) but why should they go to all that hassle if their requirements are being met? B-)
;-) Now, heavy lift gear is rated to 250 percent of lifting load; what would you call that? A waste of material, power and manpower? :-)
Regards
Masud.
By the way, read and digest ;-)
Besides helping build the LHC, LHC@home is helping to provide some fundamental insights into the challenges of distributed computing. Both with LHC@home, and with an in-house precursor called CPSS, it was discovered that different processors can produce sometimes markedly different results. This is due to the ways that certain mathematical functions such as exponential and tangent handle rounding errors on different processors. The lack of an internationally accepted standard for this, combined with the often chaotic nature of the particle orbits in the SixTrack simulations, which tend to amplify even the smallest differences, can produce significantly different results on different computers. Although initially perceived as a setback, this problem can be dealt with using newly developed function libraries from a group at the Ecole Nationale Supérieure in Lyon. The result also provides a potentially important insight for those currently developing applications for Grid computing, since ultimately these will also have to run on a wide range of computers.
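The amplification effect the quoted passage describes can be illustrated with a toy chaotic system. This is only a sketch: the logistic map stands in for SixTrack, and the 1e-15 offset stands in for a rounding difference between two processors. The function name, the parameter r = 3.9, and the starting value are all assumptions chosen for illustration, not anything taken from LHC@home.

```python
def logistic_orbit(x0, r=3.9, steps=200):
    """Iterate the logistic map x -> r*x*(1-x) and return the whole orbit."""
    orbit = [x0]
    for _ in range(steps):
        x = orbit[-1]
        orbit.append(r * x * (1.0 - x))
    return orbit

# Two hypothetical hosts that disagree only in the 15th decimal place.
host_a = logistic_orbit(0.3)
host_b = logistic_orbit(0.3 + 1e-15)

# Gap between the two orbits at every step.
gaps = [abs(a - b) for a, b in zip(host_a, host_b)]
first_gap = gaps[1]    # still tiny after one step
max_gap = max(gaps)    # comparable to the values themselves later on

print(f"gap after 1 step: {first_gap:.3e}")
print(f"largest gap over {len(gaps) - 1} steps: {max_gap:.3f}")
```

The two orbits start out indistinguishable and end up unrelated, which is the kind of behaviour that makes bit-for-bit agreement between different computers hard to guarantee.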
ID: 17683
AstralWalker

Joined: 30 Nov 05
Posts: 14
Credit: 1,746,819
RAC: 0
Message 17703 - Posted: 4 Aug 2007, 2:38:48 UTC - in response to Message 17685.  
Last modified: 4 Aug 2007, 2:46:05 UTC

KAMasud said
:-) if admin thinks it is important then they will implement your ideas ;-) but why should they go to all that hassle if their requirements are being met? B-)

To demonstrate that they are responsible adults rather than selfish, wasteful sloths, just for openers.

Thanks for the article but if you want it to carry any weight then you need to quote the source. Anyway, I don't see anything in the article that says LHC needs initial replication of 5. Do you?

I don't see anything in the article that says they can't do their research into DC without wasting our resources. Do you?

If you look at any large construction project (which is what the LHC is), the need to meet deadlines always requires added resources. And accuracy naturally requires replication, for the reasons stated above many times.

Anyway, I don't know the specific article he quoted but the basic idea with the discrepancy between Intel and Intel compatible CPUs is repeated here, which is a paper written by a CERN scientist and some guy from Canada. While they do say they fixed this particular issue, what specifically are your reasons for saying that 3 replications is enough and 5 is just waste? Why is it that if we disagree with you we must be selfish wasteful sloths?

That article also says a replication of at least 3 is required for accuracy. I guess those folks at CERN (edit: and Canada) are selfish, wasteful sloths as well. ;)
ID: 17703
Betting Slip

Joined: 17 Sep 04
Posts: 41
Credit: 27,497
RAC: 0
Message 17705 - Posted: 4 Aug 2007, 10:50:02 UTC - in response to Message 17703.  

"While they do say they fixed this particular issue, what specifically are your reasons for saying that 3 replications is enough"

By accepting a quorum of 3, as LHC does, they, themselves, say that 3 similar results are accurate enough.

ID: 17705
Bob Guy

Joined: 28 Sep 05
Posts: 21
Credit: 11,715
RAC: 0
Message 17706 - Posted: 4 Aug 2007, 13:49:11 UTC

If you want to be too literal:

"At least three" might be construed to mean "not less than three", but it does not mean "not more than three".

So five is OK: it gives a safety margin, provides more work from a limited supply, ensures that a quorum is met sooner rather than later, and allows for evaluation of the possible "AMD not equal to Intel" issue.

The issue of wasting resources can be carried to extremes. If I wanted to be a dictator I might order that only recent AMD CPUs and C2Ds and C2Qs be used in a project because all the others are wasteful of resources due to operating inefficiency. That's not going to happen but it might be a reason for an individual to voluntarily retire their own old computer.
ID: 17706
[B^S] ShanerX

Joined: 14 Jul 05
Posts: 41
Credit: 1,788,341
RAC: 0
Message 17707 - Posted: 4 Aug 2007, 15:29:55 UTC

Good point(s) ... this thread could continue on forever. Everyone has a valid point, but the purpose of all BOINC projects is to accomplish a certain goal, whether that's testing something new, fixing existing problems, or real-time data crunching like LHC.

We've waited many, many months for work ... so I'm just happy we're crunching again and will contribute what I can. We got what we wanted, so let's just accept the fact that this project has an IR5 with a quorum of 3, done. No other projects would consider changing unless there was a big problem. Nanohive is the only project I know of that did just this. Anyway, my 2.55 cents, but let's be courteous here - all of us are contributing to one of the greatest creations of our time - - the collider!

ID: 17707
caspr

Joined: 26 Apr 06
Posts: 89
Credit: 309,235
RAC: 0
Message 17708 - Posted: 4 Aug 2007, 15:41:03 UTC
Last modified: 4 Aug 2007, 15:43:28 UTC

[B^S] ShanerX said
Good point(s) ... this thread could continue on forever. Everyone has a valid point, but the purpose of all BOINC projects is to accomplish a certain goal, whether that's testing something new, fixing existing problems, or real-time data crunching like LHC.

We've waited many, many months for work ... so I'm just happy we're crunching again and will contribute what I can. We got what we wanted, so let's just accept the fact that this project has an IR5 with a quorum of 3, done. No other projects would consider changing unless there was a big problem. Nanohive is the only project I know of that did just this. Anyway, my 2.55 cents, but let's be courteous here - all of us are contributing to one of the greatest creations of our time - - the collider!





This is the best post I've seen on this thread! HEAR! HEAR!
A clear conscience is usually the sign of a bad memory


ID: 17708
Betting Slip

Joined: 17 Sep 04
Posts: 41
Credit: 27,497
RAC: 0
Message 17709 - Posted: 4 Aug 2007, 16:28:55 UTC - in response to Message 17707.  

[B^S] ShanerX said
Good point(s) ... this thread could continue on forever. Everyone has a valid point, but the purpose of all BOINC projects is to accomplish a certain goal, whether that's testing something new, fixing existing problems, or real-time data crunching like LHC.

We've waited many, many months for work ... so I'm just happy we're crunching again and will contribute what I can. We got what we wanted, so let's just accept the fact that this project has an IR5 with a quorum of 3, done. No other projects would consider changing unless there was a big problem. Nanohive is the only project I know of that did just this. Anyway, my 2.55 cents, but let's be courteous here - all of us are contributing to one of the greatest creations of our time - - the collider!



You can, of course, accept whatever you want, and what you want, according to what you write, is just to crunch any numbers, pointless or not. Also, you're not concerned if all your computer is doing is "BUSY WORK".

That may be what you want, but it's certainly not what I want, and if this project continues to waste resources that other (just as deserving) projects could have used to actually accomplish something, then I will donate my humble resources to them. Maybe even make a difference.
ID: 17709
caspr

Joined: 26 Apr 06
Posts: 89
Credit: 309,235
RAC: 0
Message 17710 - Posted: 4 Aug 2007, 17:54:34 UTC
Last modified: 4 Aug 2007, 18:08:27 UTC


Betting Slip said
You can, of course, accept whatever you want, and what you want, according to what you write, is just to crunch any numbers, pointless or not. Also, you're not concerned if all your computer is doing is "BUSY WORK".

That may be what you want, but it's certainly not what I want, and if this project continues to waste resources that other (just as deserving) projects could have used to actually accomplish something, then I will donate my humble resources to them. Maybe even make a difference.






So you can determine, as a particle physicist, what they need to do as far as initial replication is concerned???
A clear conscience is usually the sign of a bad memory


ID: 17710
caspr

Joined: 26 Apr 06
Posts: 89
Credit: 309,235
RAC: 0
Message 17711 - Posted: 4 Aug 2007, 18:00:02 UTC

And in case I got in a hurry and misspelled anything/everything.....ARE U REALLY THAT SMART??
A clear conscience is usually the sign of a bad memory


ID: 17711
Betting Slip

Joined: 17 Sep 04
Posts: 41
Credit: 27,497
RAC: 0
Message 17712 - Posted: 4 Aug 2007, 22:02:23 UTC - in response to Message 17710.  


Betting Slip said
You can, of course, accept whatever you want, and what you want, according to what you write, is just to crunch any numbers, pointless or not. Also, you're not concerned if all your computer is doing is "BUSY WORK".

That may be what you want, but it's certainly not what I want, and if this project continues to waste resources that other (just as deserving) projects could have used to actually accomplish something, then I will donate my humble resources to them. Maybe even make a difference.






caspr said
So you can determine, as a particle physicist, what they need to do as far as initial replication is concerned???





I have no need to try to determine LHC's needs, as they have already done that themselves when they set the quorum to 3.

This means they have decided that they need 3 results. Since they are sending each WU to 5 computers, they are deliberately and knowingly wasting the efforts of 2 of those computers: the results those 2 send back will just be discarded, because a quorum will already have been reached and a canonical result sent to the database of valid results.
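The behaviour described here can be sketched as a toy model. This is not BOINC's validator code: the function name `validate`, the exact-equality comparison, and the sample result strings are assumptions made for illustration. It only shows the rule as stated in this thread, where the first `quorum` matching results fix the canonical result and anything arriving later is surplus.

```python
from collections import Counter

def validate(results, quorum=3):
    """Scan results in arrival order.

    Returns (canonical, results_used, surplus): the first value seen
    `quorum` times, how many results had arrived by then, and how many
    arrived afterwards. Returns (None, len(results), 0) if no value
    ever reaches the quorum.
    """
    counts = Counter()
    for arrived, value in enumerate(results, start=1):
        counts[value] += 1
        if counts[value] >= quorum:
            return value, arrived, len(results) - arrived
    return None, len(results), 0

# Five copies sent out, all agree: two arrive after the quorum is met.
print(validate(["a", "a", "a", "a", "a"]))
# One host disagrees: the fourth result completes the quorum, one is surplus.
print(validate(["a", "b", "a", "a", "a"]))
```

In this model the extra copies only matter when an early result is missing or disagrees, which is the case the rest of the thread argues about.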

Dagorath has already said in this thread that they would be better off shortening the deadline if they needed results back quickly, and I agree with him.

ID: 17712
Bob Guy

Joined: 28 Sep 05
Posts: 21
Credit: 11,715
RAC: 0
Message 17721 - Posted: 5 Aug 2007, 12:53:46 UTC

How's your blood pressure? Not too high, I hope.

I am not persuaded by arguments that resort to yelling and name calling.
ID: 17721
Irondog

Joined: 30 Aug 05
Posts: 11
Credit: 7,169,523
RAC: 117
Message 17722 - Posted: 5 Aug 2007, 16:47:25 UTC - in response to Message 17720.  
Last modified: 5 Aug 2007, 17:05:16 UTC

Free Ads said
Why waste computing time? If you only need a quorum of 3 why an initial replication of 5???
... if this project continues to waste resources that other (just as deserving projects) could have used to actually accomplish something then I will donate my humble resources to them.


Dagorath said
No, it does NOT give a safety margin. If you think it does then you obviously are not aware of how the quorum works. If the first 3 results match then they declare the canonical result. They do not wait for the 2 remaining results to return to see if they match with the canonical result. So how can you possibly say 5 provides a safety margin? Man, THINK about it.



If this were a perfect world, I could see your argument. But it's not. Sometimes the results don't match; what then? What if a host machine crashes and the work is lost? And the one I've experienced a few times: you upgrade BOINC and it wipes out the work. (Again, if this were a perfect world, I would ALWAYS back up the BOINC folder before upgrading.) Now you have to reissue the missing work, and if I remember correctly, that won't happen until the batch is complete. Now the scientists have to wait, wasting time. Somewhere, someone is going to waste time. I prefer it NOT be the scientists. I'm sure most people wouldn't want to pay the scientists to sit around waiting.

Let’s also not forget that this project has just moved to a new home with new admins. I think I read somewhere that this is not a typical BOINC installation. I'm sure time is needed to figure it out, a few jobs will need to be run to make sure things are running correctly.

I've also added four machines to this project. If you’re still upset about wasting time, feel free to leave the project as you've mentioned. I'll pick up your slack.

ID: 17722
Betting Slip

Joined: 17 Sep 04
Posts: 41
Credit: 27,497
RAC: 0
Message 17723 - Posted: 5 Aug 2007, 18:55:15 UTC - in response to Message 17722.  

Free Ads said
Why waste computing time? If you only need a quorum of 3 why an initial replication of 5???
... if this project continues to waste resources that other (just as deserving projects) could have used to actually accomplish something then I will donate my humble resources to them.


Dagorath said
No, it does NOT give a safety margin. If you think it does then you obviously are not aware of how the quorum works. If the first 3 results match then they declare the canonical result. They do not wait for the 2 remaining results to return to see if they match with the canonical result. So how can you possibly say 5 provides a safety margin? Man, THINK about it.



If this were a perfect world, I could see your argument. But it's not. Sometimes the results don't match; what then? What if a host machine crashes and the work is lost? And the one I've experienced a few times: you upgrade BOINC and it wipes out the work. (Again, if this were a perfect world, I would ALWAYS back up the BOINC folder before upgrading.) Now you have to reissue the missing work, and if I remember correctly, that won't happen until the batch is complete. Now the scientists have to wait, wasting time. Somewhere, someone is going to waste time. I prefer it NOT be the scientists. I'm sure most people wouldn't want to pay the scientists to sit around waiting.

Let’s also not forget that this project has just moved to a new home with new admins. I think I read somewhere that this is not a typical BOINC installation. I'm sure time is needed to figure it out, a few jobs will need to be run to make sure things are running correctly.

I've also added four machines to this project. If you’re still upset about wasting time, feel free to leave the project as you've mentioned. I'll pick up your slack.



Your post only demonstrates that you don't understand what this thread is about. I won't repeat the excellent arguments that Dagorath has already put forward.

As far as you taking up the slack is concerned, go ahead, you will have the pleasure of knowing that up to 40% of your contribution will be wasted.
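The "up to 40%" figure is the upper bound (5 - 3) / 5: with an initial replication of 5 and a quorum of 3, at most 2 of every 5 copies can arrive after the quorum is met. A minimal sketch of that arithmetic, with a hypothetical function name and assuming every copy sent out is eventually returned:

```python
def max_surplus_fraction(initial_replication, quorum):
    """Largest share of copies that can arrive after the quorum is reached."""
    return (initial_replication - quorum) / initial_replication

for ir in (3, 4, 5):
    share = max_surplus_fraction(ir, 3)
    print(f"IR={ir}, quorum=3: up to {share:.0%} of copies surplus")
```

It is a bound rather than an average: whenever a copy errors out, times out, or disagrees, fewer than 2 of the 5 are surplus.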
ID: 17723
Irondog

Joined: 30 Aug 05
Posts: 11
Credit: 7,169,523
RAC: 117
Message 17727 - Posted: 5 Aug 2007, 21:00:47 UTC

I agree that setting IR to 5 can be a waste, but I see my questions haven't been answered, so let's try again.

Sometimes the results don't match; what then? What if a host machine crashes and the work is lost?

So, with IR set to 3, what happens? Does a new batch have to be created for the missing/corrupted work? How long does this take (identifying what work is missing/corrupted, creating the new batch, waiting for the work to get completed, and hoping that it runs 100% error free)? Maybe 4 would be better?
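One way to put rough numbers on the IR 3 versus 4 versus 5 question is a simple binomial model. Everything here is an assumption for illustration: the function name, the idea that each copy independently comes back usable with the same probability, and above all the 90% success rate, which is made up and not a measured figure for this project.

```python
from math import comb

def p_quorum_first_round(copies, quorum, p_ok):
    """Chance that at least `quorum` of `copies` results come back usable,
    assuming each copy succeeds independently with probability p_ok."""
    return sum(
        comb(copies, k) * p_ok**k * (1 - p_ok) ** (copies - k)
        for k in range(quorum, copies + 1)
    )

P_OK = 0.9  # hypothetical per-copy success rate, not project data

for ir in (3, 4, 5):
    chance = p_quorum_first_round(ir, 3, P_OK)
    print(f"IR={ir}: quorum of 3 met without a reissue with probability {chance:.4f}")
```

Under that assumed rate the chance of reaching the quorum without any reissue is roughly 0.73 for IR 3, 0.95 for IR 4, and 0.99 for IR 5, so most of the gain in this model comes from the fourth copy.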


ID: 17727


©2024 CERN