Long WU's
Author | Message |
---|---|
Send message Joined: 20 Sep 05 Posts: 31 Credit: 1,238,566 RAC: 7 |
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=2504999 Update: 103 hours gone, still 79 hours to go and way past the expected return date of 2012-08-11 (Prescott P4@3200, running high priority all the time. OS: 32-bit WinXP, 2 GB of DDR2 RAM; the Intel D102GGC2 mobo won't support more). |
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
70+ hours gone on my AMD E-450 APU, 45.157% done, 80 hours to go, deadline passed yesterday morning. Tullio |
Send message Joined: 20 Sep 05 Posts: 31 Credit: 1,238,566 RAC: 7 |
I win! Mine is supposed to run for a grand total of 180 hours, yours only 150... Anyway, as long as the result gets declared valid I crunch on ;) |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Well I THINK they are synonymous.......... We shall see. Eric. |
Send message Joined: 27 Oct 07 Posts: 186 Credit: 3,297,640 RAC: 0 |
Well I THINK they are synonymous.......... It's SSE3 which should be synonymous with PNI. SSE2 might possibly be slower, but probably not by much. |
Send message Joined: 20 Sep 05 Posts: 31 Credit: 1,238,566 RAC: 7 |
Well I THINK they are synonymous..........We shall see. Eric. Wikipedia says they're synonymous, SSE3 and Prescott New Instructions (PNI) |
Send message Joined: 26 Sep 11 Posts: 37 Credit: 7,807,848 RAC: 8 |
So what happens when a task does run beyond its deadline? Apologies if this has been asked before. I have a task that is only 45% completed and close to its deadline. Should I abort it now and stop wasting time or should I let it run? I don't care about credits very much, although more credits are always better. |
Send message Joined: 17 Feb 07 Posts: 86 Credit: 968,855 RAC: 0 |
So what happens when a task does run beyond its deadline? Apologies if this has been asked before. Well, at other projects a task returned after the deadline is marked as invalid and gets no credit. I have aborted a long WU which would never have met the deadline. I now have a new one that still needs 58 hours with 6 days left, so that should be no problem. Greetings from, TJ |
Send message Joined: 27 Oct 07 Posts: 186 Credit: 3,297,640 RAC: 0 |
So what happens when a task does run beyond its deadline? Apologies if this has been asked before. Not quite true. After the deadline has passed, the WU remains 'live' while a replacement task is created, sent to another user, computed, and returned. Provided your task is returned before the replacement comes back, you are eligible for credit. Look at WU 2505053, especially the middle task. The deadline for that middle task was 11 Aug 2012 | 0:40:32 UTC, and the user missed it. The third task was sent to me - almost immediately, in that case, because no new work was being generated at the time: usually they hang around for several hours. But mine took a long time to run, so that the middle user returned his before me, while the WU was still incomplete. Finally, I returned my copy before my assigned deadline of 14 Aug 2012 | 18:40:32 UTC, and all three of us got credit. That's correct, according to the way BOINC is designed to work. |
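The rule described in the post above can be sketched in Python. This is a toy model, not BOINC server code: the function name `eligible_for_credit` and its parameters are hypothetical, and the whole rule is reduced to one comparison, namely that a late result still counts if it arrives while the workunit is incomplete. Only the middle task's deadline is taken from the post; the report and completion times are invented for illustration.

```python
from datetime import datetime
from typing import Optional


def eligible_for_credit(reported: datetime,
                        deadline: datetime,
                        wu_completed: Optional[datetime]) -> bool:
    """Toy model of the late-return rule described above (not BOINC code).

    wu_completed is the time the workunit was completed by other results,
    or None if it is still incomplete.
    """
    if reported <= deadline:
        return True  # returned on time
    # Returned late: still eligible only while the workunit is incomplete.
    return wu_completed is None or reported < wu_completed


# Deadline of the middle task, taken from the post.
deadline = datetime(2012, 8, 11, 0, 40, 32)
# Hypothetical times, invented for illustration only.
late_report = datetime(2012, 8, 12, 9, 0, 0)
wu_completed = datetime(2012, 8, 14, 12, 0, 0)

print(eligible_for_credit(late_report, deadline, wu_completed))  # True
print(eligible_for_credit(datetime(2012, 8, 15), deadline, wu_completed))  # False
```

Under this model a task that misses its deadline is worth finishing only if it can plausibly be returned before the replacement task comes back.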
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Frankly I don't know....... We shall be looking at all this tomorrow. If you are close, let it run, and we shall see. Eric. (I already have 98% of these very long tasks finished successfully.) |
Send message Joined: 26 Sep 11 Posts: 37 Credit: 7,807,848 RAC: 8 |
Thanks. I will let it run a bit more and see what happens. It's this task: http://lhcathomeclassic.cern.ch/sixtrack/result.php?resultid=5623859 It has the misfortune of running on my slowest machine, an Atom-powered netbook. I see 2 other people completed the WU for that task already, so I don't know if I will get any credit at all. Bummer if I don't, but fortunately that is not my motivation. Looking just now, the task has run for almost 142 hours, reports that it has another 114 hours to go, and has only 18 hours until the deadline. Previously I aborted several tasks that had extremely long running times, but then I noticed that after a very slow start they tended to accelerate and complete in a decent time anyway. Which is why I let this one run on too, in the hope it would have the same pattern. Unfortunately, it is accelerating, but only very, very slowly. Let's see what happens. If I'm better off aborting the task and starting on something else let me know. I don't like wasting CPU time, even if it's a slow CPU. |
Send message Joined: 29 Sep 04 Posts: 42 Credit: 11,505,632 RAC: 0 |
I also had such a monster: http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=2515279 Spending 110 hours (or more) on it and only getting 190.57 credits. That's not fair ;( |
Send message Joined: 22 Jul 05 Posts: 72 Credit: 3,962,626 RAC: 0 |
.... If I'm better off aborting the task and starting on something else let me know. I don't like wasting CPU time, even if it's a slow CPU. Because there are already two validated tasks for that workunit, the quorum is already complete and you should immediately abort your now unnecessary copy. If it could be completed before deadline, it could also receive credit. Once the deadline passes, you will not receive credit so (from what you say) you should abort it immediately and stop wasting time. Cheers, Gary. |
Send message Joined: 22 Jul 05 Posts: 72 Credit: 3,962,626 RAC: 0 |
I also had such a monster: Everybody who had one of those 10M turn tasks suffered the same fate and Eric has already accepted blame for the oversight - both for the limit on credit and for the inadequate deadline. Not much use complaining further. But what about this particular example. It's a completed and validated quorum where one host took 180Ksecs and the other took 303Ksecs and the credit award was 0.00 for both. It's not an isolated event. Here is a small list of completed and validated quorums where zero credit was given. There are lots more like these. 2566284 2566253 2566221 2566220 2566215 2566214 2561956 2560652 2560651 2560648 The common factor is that one of the hosts participating in all those quorums is running the sse3 version of the application under the anonymous platform mechanism (AP). When the various versions were first released, there were problems with the detection of CPU capabilities and all my hosts (even though sse3 capable) were being sent the much slower generic app. I solved that problem by forcing the use of the sse3 version with AP. At that time I didn't see any problem with credit awards. It's possible I wasn't paying close enough attention but I do believe all validated tasks were receiving normal credit. When the CPU detection was improved, I started removing AP from my hosts as caches drained. I wasn't in any particular hurry - I was making the transition when convenient. I still had quite a few machines to go when I started noticing the zero credit awards. Not every result gets zero credit. At least half or slightly more get normal credit. It seems to be a pretty random thing. I reported it to Igor and Eric over a week ago but the behaviour continues. The caches for the last couple of AP hosts should drain today so when AP is removed on those hosts, that will be the end of the problem for me at present. The problem should be investigated so that future use of AP is not compromised. Cheers, Gary. |
Send message Joined: 26 Sep 11 Posts: 37 Credit: 7,807,848 RAC: 8 |
.... If I'm better off aborting the task and starting on something else let me know. I don't like wasting CPU time, even if it's a slow CPU. You're right. I'm just seeing it listed as an Error result because it timed out. I'm not at home now, so it may still be crunching away. I'll abort it when I get home. Sad for the wasted week of computing time. Older, yes; wiser, maybe. |
Send message Joined: 27 Oct 07 Posts: 186 Credit: 3,297,640 RAC: 0 |
Everybody who had one of those 10M turn tasks suffered the same fate and Eric has already accepted blame for the oversight - both for the limit on credit and for the inadequate deadline. Not much use complaining further. I didn't check the whole list of zero-credit WUs, but every one I checked shared another common factor: the Anonymous Platform host has declared a run time of 0.00 seconds. I don't think that's related to AP as such: it's likely to be the result of using BOINC version 6.2.15 - I think the separation of runtime and CPU time in client reports came in round about BOINC v6.6.xx, with the introduction of GPU computing (where the two figures differ greatly). If the client is not making a runtime report, the server is defaulting the missing field to zero, and Igor has implemented 'credit from runtime', we would get the result you report, without invoking Anonymous Platform at all. This problem rings a vague bell in my mind, and I thought it had been addressed - perhaps by using a copy of the CPU time report as a surrogate runtime report, at the server end - but it will take some lengthy rummaging in the archives of the boinc_alpha mailing list to confirm that. If it helps, I could fairly easily switch one of my hosts to AP, using a later BOINC, so that we could check whether there is any specific problem with AP alone, separately from the BOINC version issue. |
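The hypothesis in the post above can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the function names, the flat `CREDIT_PER_HOUR` rate, and the idea that credit is simply proportional to runtime are invented, and the project's real credit formula is not given in this thread. The sketch only shows why a missing runtime field that defaults to zero would yield zero credit, and how substituting the CPU time report would avoid that. The 180,000 second figure is the run time mentioned earlier in the thread.

```python
from typing import Optional

CREDIT_PER_HOUR = 1.0  # made-up rate, purely illustrative


def credit_from_runtime(runtime_s: Optional[float]) -> float:
    """Credit proportional to runtime; a missing report defaults to zero."""
    runtime = runtime_s if runtime_s is not None else 0.0
    return runtime / 3600 * CREDIT_PER_HOUR


def credit_with_cpu_fallback(runtime_s: Optional[float],
                             cpu_time_s: float) -> float:
    """Same rule, but use CPU time when runtime is missing or zero."""
    runtime = runtime_s if runtime_s else cpu_time_s
    return runtime / 3600 * CREDIT_PER_HOUR


# An old client that reports CPU time but no separate runtime.
print(credit_from_runtime(None))                  # 0.0
print(credit_with_cpu_fallback(None, 180_000.0))  # 50.0
```

Whether the server actually behaves this way is exactly what the post proposes to test by running Anonymous Platform on a newer BOINC client.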
Send message Joined: 9 Jan 08 Posts: 66 Credit: 727,923 RAC: 0 |
Everybody who had one of those 10M turn tasks suffered the same fate and Eric has already accepted blame for the oversight - both for the limit on credit and for the inadequate deadline. Not much use complaining further. I also just checked and found 2 of these in my results: WU 2554378 WU 2554391 Both had the same wingman. He was using: Linux 2.6.26.8.tex3, BOINC version 6.2.15, CPU type: GenuineIntel, Intel(R) Core(TM)2 Quad CPU Q8400 @ 2.66GHz [Family 6 Model 23 Stepping 10] So the BOINC version and anonymous platform might be a bad combo for this project. Maybe the AP makes it unable to use the CPU report instead of the runtime report. Edit: I was just looking through his results. He has a lot of WUs that give 0 credit. But I can see now that he changed back to the normal platform again, and since he did that, he got full points for all of those. But he didn't update the BOINC version. So it is definitely something with AP, and probably also something with v6.2.x, but probably only when in combination with AP. |
Send message Joined: 22 Oct 08 Posts: 26 Credit: 75,214 RAC: 0 |
Richard, I've been running AP since the various executables came out (sse, sse2 etc...), for two reasons: 1: To stop BOINC downloading a different executable each time I run out of tasks, and to prevent it using something lower. 2: To stop BOINC downloading the same executable again and again once I run out of tasks and get some more after a while. So far I'm running BOINC 7.0.28 on AP and no tasks have reported zero run time. |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Yes, I take the blame but this is because I exceeded some limits in BOINC itself. Igor will explain all this shortly when he has finished investigating and correcting. Sometimes it is essential to get a new executable. While I hope we shall now remain numerically compatible (for ever? :-) there will be new physics in SixTrack, new elements in the ring. |
©2024 CERN