no progress indication
Author | Message |
---|---|
Joined: 18 Sep 04 Posts: 163 Credit: 1,682,370 RAC: 0 |
Hi, I know this is a minor issue at the moment, but there is no progress indication. See this excerpt from client_state:
(result)
(name)v64boince6ib1-40s10_12630_1_sixvf_1697_8(/name)
(final_cpu_time)0.000000(/final_cpu_time)
(exit_status)0(/exit_status)
(state)2(/state)
(wu_name)v64boince6ib1-40s10_12630_1_sixvf_1697(/wu_name)
(report_deadline)1110213396(/report_deadline)
(active_task)
(project_master_url)http://lhcathome.cern.ch/(/project_master_url)
(result_name)v64boince6ib1-40s10_12630_1_sixvf_1697_8(/result_name)
(app_version_num)463(/app_version_num)
(slot)1(/slot)
(scheduler_state)2(/scheduler_state)
(checkpoint_cpu_time)0.000000(/checkpoint_cpu_time)
(fraction_done)0.000000(/fraction_done)
(current_cpu_time)594.550000(/current_cpu_time)
(/active_task)
(/result)
This is BOINC 4.19 on Linux 2.4.19. [Edit: substituted tags] |
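For anyone who wants to check their own client_state for the same symptom, here is a minimal Python sketch. It assumes the excerpt is in the parenthesised form used in the post above (the real file uses angle brackets), and the helper names `read_field` and `progress_is_missing` are made up for illustration; they are not part of BOINC.

```python
import re

# Shortened copy of the excerpt from the post, in its parenthesised form.
EXCERPT = """
(active_task)
(result_name)v64boince6ib1-40s10_12630_1_sixvf_1697_8(/result_name)
(checkpoint_cpu_time)0.000000(/checkpoint_cpu_time)
(fraction_done)0.000000(/fraction_done)
(current_cpu_time)594.550000(/current_cpu_time)
(/active_task)
"""

def read_field(text, name):
    """Return the text of the first (name)...(/name) pair, or None if absent."""
    tag = re.escape(name)
    match = re.search(r"\(%s\)(.*?)\(/%s\)" % (tag, tag), text, re.S)
    return match.group(1).strip() if match else None

def progress_is_missing(text):
    """True when CPU time has accumulated but fraction_done is still zero."""
    done = float(read_field(text, "fraction_done") or 0.0)
    cpu = float(read_field(text, "current_cpu_time") or 0.0)
    return cpu > 0.0 and done == 0.0

print(progress_is_missing(EXCERPT))
```

For the excerpt above this reports the symptom: almost 600 seconds of CPU time, but fraction_done still at zero.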
Joined: 2 Sep 04 Posts: 4 Credit: 22,828 RAC: 0 |
My WU just restarted and it seems to start at the beginning! |
Joined: 18 Sep 04 Posts: 47 Credit: 1,886,234 RAC: 0 |
Same problem with the PP Linux cruncher (not sure about Einstein, but CP seems OK). The SETI Linux cruncher hasn't been updated in quite some time, so I wonder whether a common header or source file shared by the crunchers doesn't do the right thing on Linux. |
Joined: 18 Sep 04 Posts: 163 Credit: 1,682,370 RAC: 0 |
> My WU just restarted and it seems to start at the beginning!

This only seems to be the case. After each restart, computation resumes at the point where it last stopped. You can monitor this by watching the files fort.91 and fort.93 in the slots directory. But at least final_cpu_time is calculated incorrectly; compare current_cpu_time and final_cpu_time for this result:
(result)
(name)v64boince6ib1-53s4_6615_1_sixvf_2201_10(/name)
(final_cpu_time)0.000000(/final_cpu_time)
(exit_status)0(/exit_status)
(state)2(/state)
(wu_name)v64boince6ib1-53s4_6615_1_sixvf_2201(/wu_name)
(report_deadline)1110219990(/report_deadline)
(active_task)
(project_master_url)http://lhcathome.cern.ch/(/project_master_url)
(result_name)v64boince6ib1-53s4_6615_1_sixvf_2201_10(/result_name)
(app_version_num)463(/app_version_num)
(slot)1(/slot)
(scheduler_state)2(/scheduler_state)
(checkpoint_cpu_time)0.000000(/checkpoint_cpu_time)
(fraction_done)0.000000(/fraction_done)
(current_cpu_time)2346.400000(/current_cpu_time)
(/active_task)
(current_cpu_time)7036.600000(/current_cpu_time) ## next restart
(current_cpu_time)35315.050000(/current_cpu_time) ## next restart
(current_cpu_time)2772.470000(/current_cpu_time) ## next restart
(result)
(name)v64boince6ib1-53s4_6615_1_sixvf_2201_10(/name)
(final_cpu_time)7742.090000(/final_cpu_time)
(exit_status)0(/exit_status)
(state)4(/state)
(wu_name)v64boince6ib1-53s4_6615_1_sixvf_2201(/wu_name)
(report_deadline)1110219990(/report_deadline)
(/result)
(result)
(name)v64boince6ib1-53s4_6615_1_sixvf_2201_10(/name)
(final_cpu_time)7742.090000(/final_cpu_time)
(exit_status)0(/exit_status)
(state)5(/state)
(ready_to_report/)
(wu_name)v64boince6ib1-53s4_6615_1_sixvf_2201(/wu_name)
(report_deadline)1110219990(/report_deadline)
(/result)
Hopefully this does not affect the claimed credit calculation. Now it would be helpful to have a look at the results :) ... and there is still the fraction_done issue. Michael |
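Watching fort.91 and fort.93 by hand gets tedious, so here is a rough Python sketch of one way to do it: take two snapshots of file size and modification time and compare them. The function names are hypothetical, and the exact location of the slot directory depends on your BOINC installation, so treat the path in the usage note as an assumption.

```python
import os

def snapshot(paths):
    """Map each existing path to (size in bytes, modification time in ns)."""
    state = {}
    for path in paths:
        try:
            st = os.stat(path)
        except OSError:
            continue  # file is not there (yet), so leave it out
        state[path] = (st.st_size, st.st_mtime_ns)
    return state

def changed(before, after):
    """Sorted list of paths that are new or whose size or mtime differ."""
    return sorted(p for p in after if before.get(p) != after[p])
```

A possible use, assuming the task runs in slot 1 as in the excerpts above: call `snapshot(["slots/1/fort.91", "slots/1/fort.93"])`, wait a minute, call it again, and pass both results to `changed`. An empty list while the task is using CPU would suggest the files are not being updated.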
Joined: 18 Sep 04 Posts: 71 Credit: 28,399 RAC: 0 |
What I see here is sixtrack version 4.63 only indicates progress in CPU time. It does not display progress in percent completed. |
Joined: 18 Sep 04 Posts: 163 Credit: 1,682,370 RAC: 0 |
> What I see here is sixtrack version 4.63 only indicates progress in CPU time.
> It does not display progress in percent completed.

Not really. The WU mentioned above was suspended 3 times, with CPU times of 2346, 7036, 35315, 2772 and 7742 seconds, so the total CPU time would be 55211 seconds. But only 7742 seconds are reported (the CPU time from the last interval). Unfortunately, I also got a WU which does indeed restart from the beginning. See fort.93:
SIXTRACR MAINCR
SIXTRACR starts on: 23rd of February 2005, 44 minutes after 20.
SIXTRACR CRCHECK CALLED lout= 92 restart F rerun F checkp F
SIXTRACR CRCHECK no restart possible checkp= F
SIXTRACR CRCHECK giving up on LOUT
SIXTRACR MAINCR
SIXTRACR reruns on: 23rd of February 2005, 02 minutes after 21.
SIXTRACR CRCHECK CALLED lout= 92 restart F rerun T checkp F
SIXTRACR CRCHECK no restart possible checkp= F
SIXTRACR CRCHECK overwriting fort.6
SIXTRACR CRCHECK giving up on LOUT
SIXTRACR MAINCR
SIXTRACR reruns on: 24th of February 2005, 38 minutes after 07.
SIXTRACR CRCHECK CALLED lout= 92 restart F rerun T checkp F
SIXTRACR CRCHECK no restart possible checkp= F
SIXTRACR CRCHECK overwriting fort.6
SIXTRACR CRCHECK giving up on LOUT
SIXTRACR MAINCR
SIXTRACR reruns on: 24th of February 2005, 02 minutes after 11.
SIXTRACR CRCHECK CALLED lout= 92 restart F rerun T checkp F
SIXTRACR CRCHECK no restart possible checkp= F
SIXTRACR CRCHECK overwriting fort.6
SIXTRACR CRCHECK giving up on LOUT
This is WU: v64lhc87-34s10_12515_1_sixvf_996_1
Hope this helps to track down the problem. Michael |
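The two observations in this post can be checked with a few lines of Python. This is only a sketch: the interval values are the truncated seconds quoted above, the regular expression is fitted to the "CRCHECK CALLED" lines exactly as they appear in this fort.93 (other SixTrack versions may print them differently), and reading checkp F as "no checkpoint was found" is an assumption based on the "no restart possible" line that follows each call.

```python
import re

# CPU seconds per run interval, truncated as quoted in the post.
intervals = [2346, 7036, 35315, 2772, 7742]
total = sum(intervals)    # what the sum over all intervals comes to
reported = intervals[-1]  # what final_cpu_time actually showed

# Matches e.g. "CRCHECK CALLED lout= 92 restart F rerun T checkp F"
CRCHECK = re.compile(
    r"CRCHECK CALLED lout=\s*(\d+) restart (\w) rerun (\w) checkp (\w)"
)

def crcheck_calls(log_text):
    """Return one dict per 'CRCHECK CALLED' line, with T/F flags as booleans."""
    calls = []
    for m in CRCHECK.finditer(log_text):
        calls.append({
            "lout": int(m.group(1)),
            "restart": m.group(2) == "T",
            "rerun": m.group(3) == "T",
            "checkp": m.group(4) == "T",
        })
    return calls

# First two runs from the fort.93 listing above.
LOG = """\
SIXTRACR CRCHECK CALLED lout= 92 restart F rerun F checkp F
SIXTRACR CRCHECK no restart possible checkp= F
SIXTRACR CRCHECK CALLED lout= 92 restart F rerun T checkp F
SIXTRACR CRCHECK no restart possible checkp= F
"""

calls = crcheck_calls(LOG)
never_checkpointed = all(not c["checkp"] for c in calls)
print(total, reported, len(calls), never_checkpointed)
```

Feeding the full fort.93 of a slot into `crcheck_calls` shows at a glance whether any rerun ever saw a checkpoint; in the listing above none of the four runs did.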
Joined: 18 Sep 04 Posts: 71 Credit: 28,399 RAC: 0 |
> > What I see here is sixtrack version 4.63 only indicates progress in CPU
> > time. It does not display progress in percent completed.
>
> Not really.

Well, yes and no. There's no doubt that different systems can (and often do) see different behaviour. The latest issue under the no-progress banner is that it appeared that a WU had "stalled" -- 1.0 load but no progress -- so I killed it. The INSTANT I sent it the kill signal, the progress updated from 0.000 to 0.57-something hours. Something is definitely unwell and it seems that there are several different-yet-related issues. |