Initial replication and missing workunit
Author | Message |
---|---|
Joined: 12 Oct 07 Posts: 5 Credit: 3,113 RAC: 0 |
Hello! Two questions in one thread: 1) Is an initial replication of 5 really useful when you need a quorum of 3? Aren't the last 2 results a waste of CPU time? 2) This workunit is supposed to be crunching on my computer right now: http://lhcathome.cern.ch/lhcathome/workunit.php?wuid=2310088 The only thing is, I have only 1 workunit on my computer, and it's not that one. Why is this so? Thanks, Duanra |
Joined: 31 Dec 05 Posts: 68 Credit: 8,691 RAC: 0 |
Hello! Rather than set short deadlines, this project tries to ensure fast turnarounds by using IR > Q (an initial replication larger than the quorum). This decision has caused some controversy amongst crunchers, but the admins have said they will look at it again when they upgrade the server code. "Aren't the last 2 results a waste of CPU time?" Yes. And because the server code is such an old version, the server can't ask clients to abort the unnecessary WUs. :-( |
Joined: 16 Oct 06 Posts: 15 Credit: 144,247 RAC: 0 |
Stuff happens: it might be a similar symptom to the orphaned WU I got at the beginning of the week. My BOINC Manager sent a request for work at 04:32 UTC on Mar 4, and a WU was issued at 04:34 UTC, but my log reported that communication failed at 04:37 and the request timed out without receiving the WU. I can't abort it or do anything about it; it will just have to 'fail to report' and possibly be re-issued if needed. /pg |
Joined: 18 Sep 04 Posts: 8 Credit: 1,181,841 RAC: 0 |
Hello! OTOH, if a workunit is already in progress, aborting it would be a waste, too. If the "faster" units did not error out, the additional result is not really helpful, I agree. But if the "fast" units error out, a "slow" workunit on a different architecture might still finish and give the needed data. So unless IR >> Q (say, more than two surplus workunits), a small surplus of IR makes sense, in my view. |
Joined: 14 Jul 05 Posts: 275 Credit: 49,291 RAC: 0 |
"Yes. And because the server code is such an old version, the server can't ask clients to abort the unnecessary WUs. :-(" The abort mechanism only aborts ready-to-run (not yet started) workunits. Running workunits only get aborted if there is no way for them to get credit (i.e. if they are WAY too late). |
Joined: 31 Dec 05 Posts: 68 Credit: 8,691 RAC: 0 |
"Otoh if a workunit is already in progress aborting it would be a waste, too. If the 'faster' units did not error out the additional result is not really helpful, that I agree." Indeed. Those who dislike IR > Q would rather have shorter deadlines and have the project re-issue a WU if a client encounters an error or fails to report in time. However, under certain circumstances IR > Q reaches quorum more quickly than IR = Q with re-issues does. This project is deadline-constrained rather than crunchtime-constrained, which is why the admins have it set up like this: they can afford to waste donated CPU cycles, but they can't afford to waste time. If the flow of WUs ever outstrips the power of the attached crunchers (or they have WUs that are not time-critical), I'm sure they will look at this again. |
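To put rough numbers on the IR > Q trade-off discussed above, here is a minimal sketch. It assumes each result comes back valid and on time independently with the same probability `p`; the value `p = 0.9` and the function name are illustrative assumptions, not figures from the project.

```python
from math import comb

def prob_quorum_first_round(ir: int, q: int, p: float) -> float:
    """Probability that at least q of the ir initial results are valid,
    assuming each one succeeds independently with probability p."""
    return sum(comb(ir, k) * p**k * (1 - p) ** (ir - k) for k in range(q, ir + 1))

# Hypothetical success rate of 90% per result, quorum of 3.
for ir in (3, 5):
    print(ir, round(prob_quorum_first_round(ir, 3, 0.9), 4))
# prints:
# 3 0.729
# 5 0.9914
```

Under these assumptions, IR = 3 reaches a quorum of 3 without any re-issue about 73% of the time, while IR = 5 does so about 99% of the time, at the cost of up to two surplus results per workunit.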
Joined: 31 Dec 05 Posts: 68 Credit: 8,691 RAC: 0 |
"The abort mechanism only aborts ready-to-run workunits. Running workunits only get aborted if there is no way for them to get credit (ie. if they are WAY too late)." Yes, it is a pity there isn't a client-side option to allow running WUs to be aborted once Q has been achieved (or for any other reason the server decides). This would allow crunchers who don't care about credits not to waste CPU time. Perhaps BOINC 6 will bring this... |
Joined: 3 Jan 07 Posts: 124 Credit: 7,065 RAC: 0 |
"The abort mechanism only aborts ready-to-run workunits. Running workunits only get aborted if there is no way for them to get credit (ie. if they are WAY too late)." I was not certain enough yesterday to contradict what povaddict said, and I'm still not certain enough today, since I haven't tracked down what is making me think this, but I thought the server-side aborts included a capability to abort unconditionally. The conditional "abort the task if it is not running" was the "user-friendly" option... I could be mistaken about this, though. IIRC, it was said over on the SETI message boards, and I think it was by either John Mcleod VII, Josef Segur, or Ingleside... |
Joined: 14 Jul 05 Posts: 275 Credit: 49,291 RAC: 0 |
"I was not certain enough yesterday to contradict what povaddict has said, and I'm still not certain enough today due to not tracking down what is tickling my brain into thinking what I'm thinking, but I thought that there was a capability within the server-side aborts to do the abort unconditionally. The conditional 'abort the task if it is not running' was the 'user-friendly' option... I could be mistaken about this though. IIRC, it was said over on the SETI message boards and I think it was either by John Mcleod VII, Josef Segur, or Ingleside..." Yes. If the server notices that the client has a workunit that was completely aborted by the admin, or a workunit that has already expired (the user didn't return it on time) and has even been validated from other results, it will send an "abort now, no matter if it's running or not". However, *users* can't choose whether they want an "abort even if running" instead of "abort if not started" in the common case where a workunit reaches quorum normally. |
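The two abort behaviours described in this thread can be summarised in a small sketch. This is only an illustration of the rules as stated by the posters above; the `Task` fields and the `should_abort` function are hypothetical names, not BOINC's actual server or client code.

```python
from dataclasses import dataclass

@dataclass
class Task:
    started: bool        # the client has begun crunching this task
    wu_cancelled: bool   # an admin cancelled the whole workunit
    past_deadline: bool  # the task was not returned on time
    wu_validated: bool   # the workunit already reached quorum and validated

def should_abort(task: Task) -> bool:
    """Return True if the client would be told to drop this task."""
    # Unconditional abort: the result can no longer be useful or earn credit,
    # so it is aborted even if it is already running.
    if task.wu_cancelled or (task.past_deadline and task.wu_validated):
        return True
    # "Polite" abort: quorum was reached normally, so only tasks that
    # have not started yet are dropped.
    if task.wu_validated and not task.started:
        return True
    return False
```

For example, a running task whose workunit reached quorum normally keeps crunching, while the same task sitting unstarted in the queue is dropped.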
Joined: 3 Jan 07 Posts: 124 Credit: 7,065 RAC: 0 |
"I was not certain enough yesterday to contradict what povaddict has said, and I'm still not certain enough today due to not tracking down what is tickling my brain into thinking what I'm thinking, but I thought that there was a capability within the server-side aborts to do the abort unconditionally. The conditional 'abort the task if it is not running' was the 'user-friendly' option... I could be mistaken about this though. IIRC, it was said over on the SETI message boards and I think it was either by John Mcleod VII, Josef Segur, or Ingleside..." Understood... It's just that a project can choose to do unconditional aborts as well as the "polite" version... ;-) Also, BOINC versions 5.8.16 and older do not support server-side aborts... |
Joined: 17 Sep 04 Posts: 41 Credit: 27,497 RAC: 0 |
For more reading on the initial replication discussion, go here: http://lhcathome.cern.ch/lhcathome/forum_thread.php?id=2537 but don't expect the admins to answer, and don't expect a logical answer to the question. |
©2025 CERN