Message boards :
Number crunching :
CPU time OK... but zero credits granted
Send message Joined: 27 Sep 04 Posts: 282 Credit: 1,415,417 RAC: 0 |
I have multiple WUs that claimed OK but were granted zero... :-/ http://lhcathome.cern.ch/workunit.php?wuid=646 |
Send message Joined: 2 Sep 04 Posts: 165 Credit: 146,925 RAC: 0 |
The example given has had so many failures that nobody is getting credit, and it will not be sent again. BOINC WIKI |
Send message Joined: 2 Sep 04 Posts: 352 Credit: 1,748,908 RAC: 4,324 |
Personally I don't think LHC should even have gone live again until these problems were straightened out first; the v64lhc-type WUs are so buggy it's pathetic. I've gotten so tired of watching WU after WU report with 0 time that I've simply started to delete any v64lhc-type WUs I get; I refuse to waste my time on them anymore. The v64boince-type WUs seem okay, but since I can't monitor each of my PCs 24 hours a day I can't be sure of them either. It wouldn't surprise me at all if, after 9-11 hours of crunching on one, it just reports as 0 time also ... |
Send message Joined: 27 Sep 04 Posts: 282 Credit: 1,415,417 RAC: 0 |
> The example given has had so many failures that nobody is getting credit, and
> it will not be sent again.

If that is true, it sucks.... |
Send message Joined: 2 Sep 04 Posts: 121 Credit: 592,214 RAC: 0 |
> Personally I don't think LHC should have even gone Live again until these
> problems where straightened out first, the v64lhc type WU's are so buggy it's
> pathetic, I gotten so tired of watching WU after WU report with 0 Time...
>
> I've just simply started to delete them if I get any of the v64lhc type of WU,
> I refuse to waste my time on them anymore. The v64boince type WU seem okay but
> since I can't monitor each of my PC'c 24 hr's a day I can't be sure of them
> either. It wouldn't surprise me at all if after 9-11 hours of crunching on 1
> that it just reports as 0 Time also ...

Since they will certainly fix the problem, and the science is done regardless of credits, I'll keep going at full throttle. It brings the credits down a bit, but IMHO fixing that will be among their No. 1 priorities.

Scientific Network : 45000 MHz - 77824 MB - 1970 GB |
Send message Joined: 24 Oct 04 Posts: 1172 Credit: 54,685,889 RAC: 15,649 |
Well, just now is the first time I remember seeing the LHC "pending credit" page up and working. I have sent in a couple dozen WUs since the project came back to life, but only 2 are listed on the "pending credit" page. But now I have got 2 of those WUs that take over 24 hours to do. The one running now started out saying it would take 35 hours (as did the first one, but that one finished faster for some reason); this one looks like it will take the entire 35 hours. Has anyone else had their "Total credit" go any higher since the project came back to life? Mine hasn't, but the "Recent average credit" has increased. (I am running LHC on my old PIII 500 PC since the others are busy with SETI and Einstein.) Volunteer Mad Scientist For Life |
Send message Joined: 17 Sep 04 Posts: 69 Credit: 26,714 RAC: 0 |
> Has anyone else had their "Total credit" go any higher since the project came
> back to life?
>
> Mine hasn't.

I started with 3991; look below.

tony

Formerly mmciastro. Name and avatar changed for a change. The New Online Helpsystem: help is just a call away. |
Send message Joined: 2 Sep 04 Posts: 352 Credit: 1,748,908 RAC: 4,324 |
Since they will certainly fix the problem, and the science is done regardless of credits, I'll keep going at full throttle. It brings the credits down a bit, but IMHO fixing that will be among their No. 1 priorities.
=========
I may be wrong, but I think this problem with reporting 0 time is inherent to the v4.19 client. I had v4.24 installed on all my computers, but I couldn't get any WUs to download without errors, so I switched them all to the v4.19 client. I know that with the v4.19 client, if I exit the GUI, the WUs start back at 0:00 time when I restart it, but with the v4.24 client they start back up where they left off. I switched my computers back to the v4.24 client now that WUs can be downloaded successfully, ran one of the shorter WUs, and it reported the correct time. So I shut off network access on that computer, because it still has a bunch of the shorter-run WUs on it. I'll let it run overnight while I sleep and see how many (if any) report successfully in the morning ... |
Send message Joined: 30 Sep 04 Posts: 112 Credit: 104,059 RAC: 0 |
Seems like they've fixed the total credit issue, as I've gone up almost 1000 cobblestones over the last couple of days.... :) Regardless, the science was unaffected, and it's been a real joy to have the LHC project come back to life again.

> Since they will certainly fix the Problem and Science is done regarless of
> Credits, I'll keep at full throttle.
>
> Bugs the Credits down a bit, but IMHO that will be among their No.1 Priorities
> to fix.
> =========
>
> I may be wrong but I think this problem with reporting 0 Time is inherent with
> the v4.19 Client, I had v4.24 installed on all my computers but I couldn't get
> any WU's to download without error's so I switched them all to the v4.19
> Client.
>
> I know with the v4.19 Client if I exit the GUI when I restart it the WU's
> start back at 0:00 Time but with the v4.24 Client they start back up where
> they left off.
>
> I switched my computers back to the v4.24 Client now that WU's can be
> downloaded successfully and ran 1 of the shorter WU's and it reported the
> correct time. So I shut off the Network Access on that computer because it
> still has a bunch of shorter run WU's on it. I'll let it run overnight while I
> sleep and see if all or any or just how many report successfully in the
> morning ... |
Send message Joined: 1 Sep 04 Posts: 157 Credit: 82,604 RAC: 0 |
> I know with the v4.19 Client if I exit the GUI when I restart it the WU's
> start back at 0:00 Time but with the v4.24 Client they start back up where
> they left off.

I have a WU, v64boincexxx, that was in a paused state, running BOINC 4.19 and sixtrack 4.64 with "leave in memory" on. I exited BOINC and started it again. The paused WU still had the correct CPU time, and the % of progress was also still OK. I will try to see what happens with the v64lhcxxx once it starts crunching, and will do the same: shut down BOINC and start it again. Looking at the history inside BoincLogX, a lot of v64lhcxxx WUs have 00:00:00 CPU time, even without BOINC having been shut down, running 24/7 with "leave in memory" set on.

Best greetings from Belgium Thierry |
Send message Joined: 1 Sep 04 Posts: 157 Credit: 82,604 RAC: 0 |
> I have a WU v64boincexxx that was in a paused state, running Boinc 4.19 and
> sixtrack 4.64 with leave in memory.

This time I tried with a WU, v64boince6ib1-52s6_8615_1_sixvf_11185_0, that was running. I shut down BOINC. When starting it again, this WU was paused (left in memory). The CPU time is still correct, being 05:28:12, but the progress is reported as 0.00% and the time to completion is identical to that of another WU that still has to start crunching. Will keep an eye on this. |
Send message Joined: 1 Sep 04 Posts: 157 Credit: 82,604 RAC: 0 |
> When starting over again, this WU was paused (left in memory). The CPU time is
> still correct, being 05:28:12 but the progress is reported as 0.00% and the
> time to completion is identical to another WU that still has to start
> crunching.
>
> Will keep an eye on this.

When this WU started crunching again, coming out of the paused state, the CPU time was incrementing and the progress jumped back to the normal value, but only after some 4 to 5 seconds. Looking at the behavior with E@H and S@H, the % of progress after restarting BOINC comes directly back to what it was before shutting BOINC down, even if the WUs were in a paused state before closing BOINC.

But then I got another WU that started crunching:

LHC@home - 2005-02-27 14:00:18 - Starting result v64boince6ib1-38s18_20645_1_sixvf_11934_0 using sixtrack version 4.64

and then

LHC@home - 2005-02-27 14:01:09 - Computation for result v64boince6ib1-38s18_20645_1_sixvf_11934 finished
LHC@home - 2005-02-27 14:01:10 - Started upload of v64boince6ib1-38s18_20645_1_sixvf_11934_0_0
LHC@home - 2005-02-27 14:01:16 - Finished upload of v64boince6ib1-38s18_20645_1_sixvf_11934_0_0
LHC@home - 2005-02-27 14:01:16 - Throughput 11774 bytes/sec

The total CPU time for this WU is 00:00:51, and it has been reported as 00:00:00 in BoincLogX as well as on the server.

Best greetings from Belgium Thierry |
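[Editor's note] For anyone who wants to check their own message log for suspiciously short runs like the 51-second one above, here is a minimal sketch. It assumes the log line format quoted in this post ("LHC@home - <timestamp> - Starting result <name> ..." / "Computation for result <name> finished"); the function name and the 120-second threshold are invented for illustration, not part of BOINC.

```python
import re
from datetime import datetime

# Matches the two log line shapes quoted above: a timestamp, then either
# "Starting result <name>" or "Computation for result <name> finished".
LINE = re.compile(
    r"LHC@home - (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - "
    r"(Starting result (\S+)|Computation for result (\S+) finished)"
)

def short_results(log_lines, threshold_seconds=120):
    """Pair Starting/finished lines and flag results whose wall time is short."""
    started = {}   # result name -> start timestamp
    flagged = []   # (result name, wall seconds) for short runs
    for line in log_lines:
        m = LINE.search(line)
        if not m:
            continue
        ts = datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S")
        if m.group(3):                     # "Starting result <name>"
            started[m.group(3)] = ts
        else:                              # "Computation for result <name> finished"
            name = m.group(4)
            # Match on prefix, since the "finished" line in the log above
            # drops the trailing _0 from the result name.
            for key, t0 in started.items():
                if key.startswith(name) or name.startswith(key):
                    wall = (ts - t0).total_seconds()
                    if wall < threshold_seconds:
                        flagged.append((key, wall))
                    break
    return flagged
```

Feeding it the two quoted log lines would flag v64boince6ib1-38s18_20645_1_sixvf_11934_0 with a wall time of 51 seconds, matching the 00:00:51 reported here.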
Send message Joined: 23 Oct 04 Posts: 358 Credit: 1,439,205 RAC: 0 |
> > When starting over again, this WU was paused (left in memory). The CPU time is
> > still correct, being 05:28:12 but the progress is reported as 0.00% and the
> > time to completion is identical to another WU that still has to start
> > crunching.
> >
> > Will keep an eye on this.
>
> When starting over again, the CPU time was incrementing and the progress
> jumped to the normal value but only after some 4 to 5 seconds.
>
> Looking to the behavior with E@H and S@H, the % of progress after restarting
> Boinc comes directly back as it was before shutting Boinc.
>
> But, I got another WU starting crunching:
>
> LHC@home - 2005-02-27 14:00:18 - Starting result
> v64boince6ib1-38s18_20645_1_sixvf_11934_0 using sixtrack version 4.64
>
> and then
>
> LHC@home - 2005-02-27 14:01:09 - Computation for result
> v64boince6ib1-38s18_20645_1_sixvf_11934 finished
>
> LHC@home - 2005-02-27 14:01:10 - Started upload of
> v64boince6ib1-38s18_20645_1_sixvf_11934_0_0
> LHC@home - 2005-02-27 14:01:16 - Finished upload of
> v64boince6ib1-38s18_20645_1_sixvf_11934_0_0
> LHC@home - 2005-02-27 14:01:16 - Throughput 11774 bytes/sec
>
> Total CPU time is for this WU 00:00:51 and has been reported as 00:00:00 in
> BoincLogX as well as on the server.

I have watched the same thing you described here; I couldn't have described it better! I run LHC "solo" on this box, so the "switch between applications every ..." setting of 2 hours is not needed (IMO; or am I wrong?).

[EDIT 19:38] Thanks for the reply, and have a nice day too. (It's snowing here.) [/EDIT]

greetz from littleBouncer |
Send message Joined: 1 Sep 04 Posts: 157 Credit: 82,604 RAC: 0 |
> I run LHC "solo" on this box, so the "switch between appl. .." to every 2
> hours is not needed(IMO or : I'm wrong?).

If you are running only 1 application, this switch will indeed not occur.

Have a nice day. Thierry |
Send message Joined: 2 Sep 04 Posts: 352 Credit: 1,748,908 RAC: 4,324 |
I've crunched about two dozen of the shorter WUs since last night with v4.24, and all except for 1 WU reported some sort of time. The 1 that didn't could just have been one that did indeed take no time; I've seen them like that from time to time: as soon as they start, they stop and report no time ... I still have 32 shorter WUs on that PC and have suspended all the longer-running ones, so it will just crunch the shorter ones to see how they fare ... |
Send message Joined: 23 Oct 04 Posts: 358 Credit: 1,439,205 RAC: 0 |
The problem seems to be in (or with) the "normal" WUs (100,000 turns; 40-70 min CPU time, named "v64lhcxx...."). I observed one that took the full run: 2 minutes before ending (up to then, all indicators showed the right time), it finished with the mark "uploading" and 56 min of CPU time(!); then (after the upload) the mark switched to "ready to report" and the CPU time to "--.--".

IMO it has nothing to do with a simulation abort, where particles leave the collider or collide, because for such WUs I got credit. (I have also discovered that the results are sometimes validated in a "weird" way, but that is a different problem from the one in this "zero granted credits" thread; I will try to write up an example later, if I manage to explain it.)

greetz from Switzerland littleBouncer

To Thierry: thanks, and a good day to you too. |
Send message Joined: 1 Sep 04 Posts: 157 Credit: 82,604 RAC: 0 |
> The problem seems to be in(or with) the "normal" WU (100'000 turns; 40 - 70
> min. CPU time, named as " 64lhcxx....")

I think it is independent of the WU type, littleBouncer: I had the problem with a v64boince... WU, which is one of 1,000,000 turns. |
Send message Joined: 23 Oct 04 Posts: 358 Credit: 1,439,205 RAC: 0 |
> I think it is independant of the WU littleBouncer:
> I had the problem with a v64boince... WU which is one of 1.000.000 turns.

I saw this on only one host, and only with these "normal" WUs; hence my suggestion. But another thought: is it possible that it has to do with the use of the L2 cache (when it is > 1024)? |
Send message Joined: 1 Sep 04 Posts: 157 Credit: 82,604 RAC: 0 |
> But another suggestion:
> Is it possible it has do to with the use of L2 Cache (when it is > 1024).

I don't think so, as my CPU cache is only 512 KB. |
Send message Joined: 23 Oct 04 Posts: 358 Credit: 1,439,205 RAC: 0 |
|
©2024 CERN