
SixTrack Tasks NOT being distributed



Message boards : Sixtrack Application : SixTrack Tasks NOT being distributed

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30902 - Posted: 20 Jun 2017, 12:37:47 UTC

This thread replaces "Will not WorkUnits", "no new WUs",
"All tasks are pending" and "260.000 WUs to send, but no handed out".
It is for SixTrack. This is, as a minimum, to let you know we
are taking this issue rather seriously and working on it to the
best of our abilities.

I have managed to reproduce this problem on my own home Windows 10
computer. So far I have been unable to identify the precise
problem from the available log information. However while awaiting
some expert help, I found that:

A project reset does not help as already tried and reported by several volunteers.

In my BOINC Manager I removed the project.
NOTA BENE: this DESTROYS any unsent results or active tasks.
=============================================================
It should really only be done when the Client is idle.
I then re-installed the client/BOINC Manager
from the web installer and/or desktop icon, choosing the
"repair" rather than the delete option.
To my surprise the project was still there!!!
AND I got a bunch of new tasks immediately. :-)

I also tried simply removing the project and then adding it again, BUT I
then got password problems! I even see a password problem
when I re-install. I have opened a ticket for that.
Eric
____________

MAGIC Quantum Mechanic
Joined: 24 Oct 04
Posts: 494
Credit: 14,290,993
RAC: 12,147
Message 30908 - Posted: 20 Jun 2017, 14:52:04 UTC

Hi Eric,

Well I have 16 in progress right now and 11 still pending and 3 longer ones are *Validation inconclusive*

https://lhcathome.cern.ch/lhcathome/results.php?userid=5472&offset=0&show_names=0&state=3&appid=1
____________
Volunteer Mad Scientist For Life

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30914 - Posted: 21 Jun 2017, 7:59:16 UTC

I now have more than 50 tasks at home (4 active on my 4-thread system).
I normally get just 4. We shall see later whether I get new work, but for now
this is much better, given that we have over 1.6 million queued. Eric.
____________

Orange Kid
Joined: 9 May 17
Posts: 1
Credit: 54,767
RAC: 11
Message 30921 - Posted: 22 Jun 2017, 1:24:38 UTC

This project is a joke 1.8 million tasks available and I can't get any.

Wake up.

Get your shit together.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30922 - Posted: 22 Jun 2017, 3:51:53 UTC - in response to Message 30921.

Thank you for your feedback (even if I find your language somewhat
forthright). I see you have 4 Windows 10 systems, of which 3 are indeed
idle, and not getting SixTrack tasks, while the 4th system has 24! in progress,
although it has just 4 processors! These 24 seem to have been sent/received
at 21:55 UTC yesterday.

While I consider this a very strange situation, I see a very marginal improvement.
The details of this situation are valuable. Your (edited) message will be passed
to my colleagues. Eric.
____________

computezrmle
Joined: 15 Jun 08
Posts: 347
Credit: 3,397,624
RAC: 3,695
Message 30924 - Posted: 22 Jun 2017, 4:44:52 UTC

On a host that successfully runs vbox WUs from CERN via separate client instances I attached a new instance to sixtrack yesterday evening.
Unfortunately I ran into the same problem as many others, as the project server reports:

No tasks are available for SixTrack


Is there still a problem on the server or is it a client misconfiguration?

Host:
https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10486310

computezrmle
Joined: 15 Jun 08
Posts: 347
Credit: 3,397,624
RAC: 3,695
Message 30925 - Posted: 22 Jun 2017, 6:19:36 UTC - in response to Message 30924.

Finally I got 2 WUs that were finished after 8 s and 12 s.
And with the next request a few minutes later:

Server error: feeder not running

Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 15 Jul 05
Posts: 103
Credit: 788,939
RAC: 4,818
Message 30926 - Posted: 22 Jun 2017, 6:39:36 UTC - in response to Message 30925.

Thanks for the update.

We just restarted the daemons with increased weight for the Sixtrack application.

The feeder needs some time to fill the shared memory buffer used by the scheduler with tasks. Hence, for 5-10 minutes after a BOINC service restart, you will get the message "feeder not running" from the scheduler, in spite of the fact that the feeder is indeed running.

On Tuesday we increased the memory buffer size and Sixtrack weight to avoid exhausting the buffer of Sixtrack tasks, but in spite of this, the very short tasks are immediately sucked from the queue.

We have just tried to further increase the number of Sixtrack tasks in the buffer, but either the tasks are going out in seconds, or we have another problem.

The bottom line is that you might get a few more of these messages "feeder not running" while we're working on this.
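The feeder/scheduler interaction Nils describes can be sketched as a producer topping up a bounded buffer that the scheduler drains. This is a toy illustration only; the class and method names are hypothetical, not the actual BOINC server internals.

```python
import collections
import threading

BUFFER_SIZE = 2000  # assumed slot count, matching the figure quoted later in the thread


class SharedBuffer:
    """Toy stand-in for the scheduler's shared-memory task buffer."""

    def __init__(self, size):
        self.size = size
        self.tasks = collections.deque()
        self.lock = threading.Lock()

    def refill(self, db_tasks):
        # Feeder side: a DB query result is used to top the buffer up to its
        # fixed size; returns how many tasks were actually added.
        with self.lock:
            free = self.size - len(self.tasks)
            added = db_tasks[:free]
            self.tasks.extend(added)
            return len(added)

    def take(self, n):
        # Scheduler side: hand out up to n tasks; an empty buffer is what a
        # client sees as "No tasks are available".
        with self.lock:
            return [self.tasks.popleft()
                    for _ in range(min(n, len(self.tasks)))]


buf = SharedBuffer(BUFFER_SIZE)
buf.refill([f"task-{i}" for i in range(3000)])  # feeder tops it up to 2000
print(len(buf.take(4)))                         # a 4-core host gets 4 tasks; prints 4
```

With very short tasks, `take` calls arrive faster than `refill` cycles, and requests that hit an empty buffer get nothing even though the database queue is huge.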

computezrmle
Joined: 15 Jun 08
Posts: 347
Credit: 3,397,624
RAC: 3,695
Message 30927 - Posted: 22 Jun 2017, 6:44:18 UTC - in response to Message 30926.

OK.
I got some WUs with the most recent request and they seem to run longer than a few seconds.
Thank you.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30928 - Posted: 22 Jun 2017, 6:54:49 UTC - in response to Message 30925.

Thanks for the input. Can you give the IDs of the tasks, to save time?
I shall find them in the database shortly though. Eric.
____________

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30929 - Posted: 22 Jun 2017, 7:10:58 UTC - in response to Message 30928.

OK, I found your Linux computer, ID 10486310, and a "short task".
It is WU 71441414 and Task 147737902. The "sidekick" task
in progress is 147737901.
While waiting for the Task copy to finish and be validated
or not, I am trying to run it myself.

This is a big help. Thanks a lot and more news soonest. Eric.
____________

computezrmle
Joined: 15 Jun 08
Posts: 347
Credit: 3,397,624
RAC: 3,695
Message 30930 - Posted: 22 Jun 2017, 7:22:01 UTC - in response to Message 30928.
Last modified: 22 Jun 2017, 7:23:40 UTC

<edit> Sorry. I was too slow. :-) </edit>

I'm not sure which ID you refer to.
Therefore both lists.

The very short WUs from the first request:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=147737882
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147737902

w-c6_n4_lhc2016_40_MD-120-16-476-2.5-1.1282__43__s__64.31_59.32__14_15__6__46.5_1_sixvf_boinc38206_1
w-c6_n4_lhc2016_40_MD-120-16-476-2.5-1.1282__43__s__64.31_59.32__14_15__6__61.5_1_sixvf_boinc38216_1

App:
SixTrack v451.07 (sse2) i686-pc-linux-gnu



Next ones, some of them ran only a few minutes:


https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744261
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744263
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744265
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744281
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744283
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744285
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744287
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744342
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744344
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744346
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744349
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147744351


w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__7_8__6__70.5_1_sixvf_boinc25351_1
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__7_8__6__72_1_sixvf_boinc25352_1
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__7_8__6__73.5_1_sixvf_boinc25353_1
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__7_8__6__85.5_1_sixvf_boinc25361_1
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__7_8__6__87_1_sixvf_boinc25362_1
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__7_8__6__88.5_1_sixvf_boinc25363_1
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__8_9__6__1.5_1_sixvf_boinc25364_1
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__8_9__6__43.5_1_sixvf_boinc25392_0
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__8_9__6__45_1_sixvf_boinc25393_0
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__8_9__6__46.5_1_sixvf_boinc25394_0
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__8_9__6__48_1_sixvf_boinc25395_1
w-c2_n4_lhc2016_40_MD-140-16-476-2.5-1.2077__32__s__64.31_59.32__8_9__6__49.5_1_sixvf_boinc25396_1

App:
SixTrack v451.07 (sse2) x86_64-pc-linux-gnu

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 328
Credit: 2,768,172
RAC: 2,981
Message 30931 - Posted: 22 Jun 2017, 8:13:38 UTC
Last modified: 22 Jun 2017, 8:14:45 UTC

With only SixTrack selected in prefs, no VirtualBox installed, and over a million tasks in the server queue, a request on machine 10362384:

132091 LHC@home 22 Jun 10:08:00 Requesting new tasks for CPU
132092 LHC@home 22 Jun 10:08:01 Scheduler request completed: got 0 new tasks
132093 LHC@home 22 Jun 10:08:01 No tasks sent
132094 LHC@home 22 Jun 10:08:01 No tasks are available for SixTrack
132095 LHC@home 22 Jun 10:08:01 No tasks are available for sixtracktest
132096 LHC@home 22 Jun 10:08:01 Message from server: VirtualBox is not installed

computezrmle
Joined: 15 Jun 08
Posts: 347
Credit: 3,397,624
RAC: 3,695
Message 30933 - Posted: 22 Jun 2017, 8:25:01 UTC

A similar result to the one already described when I attached my second host (also via an extra client instance).

The first 2 WUs had walltimes of 9 s and 15 s.

hostID: 10486393


https://lhcathome.cern.ch/lhcathome/result.php?resultid=147758603
https://lhcathome.cern.ch/lhcathome/result.php?resultid=147758604

computezrmle
Joined: 15 Jun 08
Posts: 347
Credit: 3,397,624
RAC: 3,695
Message 30934 - Posted: 22 Jun 2017, 8:31:21 UTC

And again the feeder ...

I know, I know, you're trying your hardest to get that solved.

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 328
Credit: 2,768,172
RAC: 2,981
Message 30935 - Posted: 22 Jun 2017, 8:42:20 UTC

@10:39:37 CEST:

132943 LHC@home 22 Jun 10:39:37 Scheduler request completed: got 30 new tasks

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 836
Credit: 1,420,242
RAC: 1,081
Message 30940 - Posted: 22 Jun 2017, 10:18:08 UTC - in response to Message 30933.

Many thanks; I have managed to run the WU

w-c6_n4_lhc2016_40_MD-120-16-476-2.5-1.1282__43__s__64.31_59.32__14_15__6__61.5_1_sixvf_boinc38216.zip

which was still in the download database table.
It appears to be a genuine result, with apparently
very unstable particles which are lost after fewer than
500 turns. Unlucky.

I have looked at another 60,000 or so cases in the same study,
and I found 20,000 with > 1000 sec of CPU, 30,000 with > 100 sec,
but 441 with less than 1 sec of CPU tracking.
This last number, 441, is a bit strange and requires further investigation, but
otherwise the results seem consistent with a typical beam
physics application. Eric.
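The tally Eric describes amounts to counting tasks by CPU-time band. A minimal sketch, using his thresholds but invented durations (the real data lives in the project database):

```python
# Count tasks in a study by CPU-time band. The thresholds (1000 s, 100 s,
# 1 s) are the ones quoted above; the sample durations are made up.
def tally(cpu_seconds):
    return {
        "> 1000 s": sum(1 for t in cpu_seconds if t > 1000),
        "> 100 s": sum(1 for t in cpu_seconds if t > 100),
        "< 1 s": sum(1 for t in cpu_seconds if t < 1),
    }


sample = [2500.0, 1800.0, 450.0, 120.0, 0.4, 0.7, 3600.0]
print(tally(sample))  # {'> 1000 s': 3, '> 100 s': 5, '< 1 s': 2}
```

Note that the bands overlap by design, mirroring Eric's counts: every task over 1000 s is also counted in the over-100 s figure.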
____________

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 328
Credit: 2,768,172
RAC: 2,981
Message 30945 - Posted: 22 Jun 2017, 14:02:23 UTC - in response to Message 30935.

132943 LHC@home 22 Jun 10:39:37 Scheduler request completed: got 30 new tasks

That was a one-time experience, with 1,747,102 SixTrack tasks unsent. Again:

141806 LHC@home 22 Jun 15:58:46 Sending scheduler request: To fetch work.
141807 LHC@home 22 Jun 15:58:46 Requesting new tasks for CPU
141808 LHC@home 22 Jun 15:58:47 update requested by user
141809 LHC@home 22 Jun 15:58:49 Scheduler request completed: got 0 new tasks
141810 LHC@home 22 Jun 15:58:49 No tasks sent
141811 LHC@home 22 Jun 15:58:49 No tasks are available for SixTrack
141812 LHC@home 22 Jun 15:58:49 No tasks are available for sixtracktest
141813 LHC@home 22 Jun 15:58:49 Message from server: VirtualBox is not installed

Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 15 Jul 05
Posts: 103
Credit: 788,939
RAC: 4,818
Message 30946 - Posted: 22 Jun 2017, 14:10:46 UTC - in response to Message 30945.

I am afraid that you will get this message occasionally, even now that we have a scheduler buffer of 2000 tasks.

The problem frequently occurs when there are batches of tasks with very short duration of a few seconds. Then the time for the feeder to query the DB and fill the buffer is simply too long.

During normal operations with average task length, the buffer should now be large enough to avoid running out of Sixtrack tasks.

We're looking into both how to optimize the BOINC server code and Sixtrack pre-processing, but both tasks are more long term than short term.
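Why a 2000-slot buffer is not enough for very short tasks comes down to simple drain-rate arithmetic. A back-of-the-envelope sketch, where every number is an illustrative assumption rather than a measured value:

```python
# How long a full scheduler buffer lasts if the feeder never tops it up.
# All inputs below are illustrative assumptions, not measured project values.
def seconds_until_empty(buffer_size, requests_per_sec, tasks_per_request):
    drain_per_sec = requests_per_sec * tasks_per_request
    return buffer_size / drain_per_sec


# e.g. a 2000-task buffer, 50 host requests per second, 4 tasks per request:
print(seconds_until_empty(2000, 50, 4))  # prints 10.0
```

If tasks finish in seconds, hosts come back for more almost immediately, so the request rate stays high and the buffer can empty faster than a DB refill cycle completes.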

computezrmle
Joined: 15 Jun 08
Posts: 347
Credit: 3,397,624
RAC: 3,695
Message 30947 - Posted: 22 Jun 2017, 14:20:36 UTC - in response to Message 30946.

Sounds good.

Can you give some advice on which task durations should be reported on the message board to get the admin team's special attention?
There are long runners as well as short runners.
