Highly Variable Runtimes
Author | Message |
---|---|
Joined: 7 May 17 Posts: 5 Credit: 145,820 RAC: 0 |
I just joined the project for the BOINC Pentathlon, but am hoping to continue afterwards and would like to know if the highly variable runtimes (from 3 minutes to tens of hours) I have seen so far are normal. Specifically, are the extra long running tasks normal or were they created especially for the pentathlon to keep the server load down? |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Well, what is "normal"? Within a study, WU run times can be highly variable over the range of amplitudes and angles. The maximum run time typically varies from about an hour (100,000 turns) to about 10 hours (one million turns). I myself am running some one-million-turn studies, but at high amplitudes particles may be lost rather quickly. At the moment we are just trying to make sure enough WUs are available to give all competitors a fair chance. Not very helpful perhaps, but I hope you will stay with LHC@home SixTrack. Eric. |
Joined: 7 May 17 Posts: 5 Credit: 145,820 RAC: 0 |
Thanks for the quick reply--this is helpful. Is there any indicator in the task name that tells how many turns a WU contains? |
Joined: 29 Sep 04 Posts: 281 Credit: 11,859,285 RAC: 0 |
Yes, there is. Picking 2 of my recent tasks at random: wz3_jtbb2cm1__15__s__62.31_60.32__2_2.5__6__25.5_1_sixvf_boinc556535_0 ran for 15 hrs and w8_hllhc10_inj_a4b5_20_w8d__4__s__62.2801_60.31__2_4__5__82.5_1_sixvf_boinc81877_1 ran for 77 mins. The __6__ and __5__ fields near the end of the names show 10^6 and 10^5 turns respectively. Although a 6 might be expected to run 10 times longer than a 5, any task may end early if the beam it simulates is unstable, so even a 6 may end after only a few mins. Even tasks that finish early are useful, as they have simulated a beam configuration that was unstable and would not have worked (or might even have caused damage) in the real machine. Much better for them to fail in testing than in reality. |
Joined: 7 May 17 Posts: 5 Credit: 145,820 RAC: 0 |
Thank you, this is great to know. Takes the mystery out of the runtimes. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Thanks Ray, I should probably put this in a FAQ. To take one of your cases/WUs, wz3_jtbb2cm1__15__s__62.31_60.32__2_2.5__6__25.5_1_sixvf_boinc556535_0:
wz3 is the workspace name
jtbb2cm1 is the study name
15 is the seed number, typically between 1 and 60, representing a particular set of initial conditions
s for simul, pretty much constant right now
62.31_60.32 the "tune"
2_2.5 the amplitude range (2-20)
6 the number of turns as a power of 10
25.5 angle in phase space (0-90)
Eric. |
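The field breakdown above can be sketched as a small parser. This is a minimal illustration, not project code: the function name `parse_sixtrack_task_name` and the dictionary keys are hypothetical, and it assumes that fields are separated by double underscores and that the workspace is everything before the first single underscore of the leading field. Only the three task names quoted in this thread were used to check that layout.

```python
def parse_sixtrack_task_name(name: str) -> dict:
    """Split a SixTrack task name into the fields described in this thread."""
    parts = name.split("__")
    if len(parts) != 7:
        raise ValueError(f"expected 7 '__'-separated fields, got {len(parts)}")
    prefix, seed, kind, tune, amplitude, turns_exp, tail = parts

    # Assumption: the workspace ends at the first single underscore,
    # and the rest of the leading field is the study name.
    workspace, _, study = prefix.partition("_")

    return {
        "workspace": workspace,
        "study": study,
        "seed": int(seed),
        "kind": kind,  # "s" for simul
        "tune": tuple(float(t) for t in tune.split("_")),
        "amplitude_range": tuple(float(a) for a in amplitude.split("_")),
        "max_turns": 10 ** int(turns_exp),  # field is a power of 10
        "angle": float(tail.split("_")[0]),  # first token of the trailing field
    }


if __name__ == "__main__":
    for task in (
        "wz3_jtbb2cm1__15__s__62.31_60.32__2_2.5__6__25.5_1_sixvf_boinc556535_0",
        "w8_hllhc10_inj_a4b5_20_w8d__4__s__62.2801_60.31__2_4__5__82.5_1_sixvf_boinc81877_1",
    ):
        info = parse_sixtrack_task_name(task)
        print(info["study"], info["max_turns"], info["angle"])
```

Note that `max_turns` is only an upper bound on the work: as said earlier in the thread, a task can stop long before that if the simulated beam is unstable.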
Joined: 29 Sep 04 Posts: 281 Credit: 11,859,285 RAC: 0 |
This WU has the longest runtime estimate of any SixTrack I've ever seen. Currently 1.220% (and rising) after 7+ hrs and an estimate of 24+ days (and rising!). 1 wingman failed to return it by deadline (it may still be running after 8 days?) but the other returned it in 3 hrs, on a faster host than mine, but not THAT much faster. I'll let it run just for the novelty factor. [Update] Seems it was hardly using any CPU, hence the slow progress, so I restarted BOINC, which has sparked it into normal processing. No idea why it was going so slowly but it looks fine now. Expecting it to finish overnight. Whether it will validate against the wingman or not is another question. [Morning Update] Other wingman and mine both returned valid and credited overnight. |
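For what it's worth, the 24-day figure is roughly what a straight-line extrapolation from the reported progress gives. The sketch below is only that arithmetic; it assumes the estimate scales linearly with the fraction done, which is not necessarily how the BOINC client computes its remaining-time estimate, and the function name is hypothetical.

```python
def extrapolate_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation: total time = elapsed time / fraction completed."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


if __name__ == "__main__":
    # Figures from the post: 1.220% done after a bit over 7 hours.
    total_hours = extrapolate_total_hours(7.0, 0.01220)
    print(f"{total_hours:.0f} hours, about {total_hours / 24:.1f} days")
```

With exactly 7 hours this comes out just under 24 days, so "7+ hrs" and "24+ days" are consistent with each other under that assumption.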
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Hi Ray, which Work Unit? I don't see it... Very strange. Eric. |
Joined: 25 Jan 17 Posts: 27 Credit: 3,258,853 RAC: 0 |
Workunit 69564441 Jone0_TSL_jta_bbo_2222_1.1_0.75__1__s__62.2965_60.3065__6_8__6__31.5_1_sixvf_boinc4320 |
Joined: 29 Sep 04 Posts: 281 Credit: 11,859,285 RAC: 0 |
Yes, that's the one. 48,027.00 seconds wallclock but only 15,858.33 seconds CPU time. About 9 hrs to get to 1.4%, then normal progress after the BOINC restart, so possibly some miscommunication within BOINC, although 2 Theory tasks running at the same time didn't have the slow-down. Don't know if there'll be anything noteworthy in the returned logs. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Well SixTrack makes regular calls to BOINC to report progress and to query/report checkpoints. I don't know about Theory. I shall try and rerun the Task myself, if possible. Eric. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Hello again Ray, please find below times for results for this WU. Your result is indeed strange, but WU appears normal. The fact that the server was restarted implies some problem, but I do not have any details. Eric.
mysql> source results.src;
+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144251805 | 14749.69 | 14864.921875 |
+-----------+----------+--------------+
1 row in set (0.00 sec)
+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144251806 | 8100.555 |  8108.781187 |
+-----------+----------+--------------+
1 row in set (0.01 sec)
+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144086903 | 11805.66 | 11949.363195 |
+-----------+----------+--------------+
1 row in set (0.00 sec)
+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144259444 | 15858.33 | 48027.004255 |
+-----------+----------+--------------+
1 row in set (0.00 sec)
+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144086902 | 18083.67 | 20462.942898 |
+-----------+----------+--------------+
1 row in set (0.00 sec)
+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144086902 | 18083.67 | 20462.942898 |
+-----------+----------+--------------+
1 row in set (0.00 sec)
mysql> quit |
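One way to read the figures above is as a ratio of CPU time to elapsed time per result. The sketch below uses the values from the query output (the repeated row for 144086902 is listed once). The 0.5 cut-off, the constant `LOW_EFFICIENCY` and the function names are hypothetical choices for illustration, not anything the project uses.

```python
# (result id, cpu_time in seconds, elapsed_time in seconds), from the post above
RESULTS = [
    (144251805, 14749.69, 14864.921875),
    (144251806, 8100.555, 8108.781187),
    (144086903, 11805.66, 11949.363195),
    (144259444, 15858.33, 48027.004255),
    (144086902, 18083.67, 20462.942898),
]

LOW_EFFICIENCY = 0.5  # hypothetical threshold, chosen only for this example


def cpu_efficiency(cpu_time: float, elapsed_time: float) -> float:
    """Fraction of wallclock time during which the task was actually on the CPU."""
    if elapsed_time <= 0:
        raise ValueError("elapsed_time must be positive")
    return cpu_time / elapsed_time


def flag_starved(results, threshold: float = LOW_EFFICIENCY) -> list:
    """Return the ids of results whose CPU efficiency falls below the threshold."""
    return [
        result_id
        for result_id, cpu, elapsed in results
        if cpu_efficiency(cpu, elapsed) < threshold
    ]


if __name__ == "__main__":
    for result_id, cpu, elapsed in RESULTS:
        print(result_id, f"{cpu_efficiency(cpu, elapsed):.2f}")
    print("low CPU efficiency:", flag_starved(RESULTS))
```

Under that cut-off only result 144259444 is flagged, at roughly a third of its wallclock time spent on the CPU, which fits the observation that the task was hardly using any CPU before the restart.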
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
"I just joined the project for the BOINC Pentathlon, but am hoping to continue afterwards and would like to know if the highly variable runtimes (from 3 minutes to tens of hours) I have seen so far are normal. Specifically, are the extra long running tasks normal or were they created especially for the pentathlon to keep the server load down?"
We are hot on the trail of a BOINC infrastructure problem which caused a (very) large number of undetected failures with very short run times. Eric. |
©2024 CERN