Highly Variable Runtimes


Message boards : Sixtrack Application : Highly Variable Runtimes

supdood
Joined: 7 May 17
Posts: 3
Credit: 37,372
RAC: 0
Message 30302 - Posted: 12 May 2017, 14:26:25 UTC

I just joined the project for the BOINC Pentathlon, but am hoping to continue afterwards and would like to know if the highly variable runtimes (from 3 minutes to tens of hours) I have seen so far are normal. Specifically, are the extra long running tasks normal or were they created especially for the pentathlon to keep the server load down?

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 843
Credit: 1,446,391
RAC: 115
Message 30303 - Posted: 12 May 2017, 16:32:32 UTC - in response to Message 30302.

Well, what is "normal"? Within a study, WU run times can be highly variable
over the range of amplitudes and angles. The maximum run time will typically
vary from about an hour (100,000 turns) to 10 hours (one million turns).
I myself am running some one-million-turn studies, but at high amplitudes particles may be lost rather quickly.
At the moment we are just trying to make sure enough WUs
are available to give all competitors a fair chance.
Not very helpful perhaps, but I hope you will stay with LHC@home SixTrack.
Eric.
____________

supdood
Joined: 7 May 17
Posts: 3
Credit: 37,372
RAC: 0
Message 30308 - Posted: 12 May 2017, 18:52:09 UTC - in response to Message 30303.

Thanks for the quick reply--this is helpful. Is there any indicator in the task name that tells how many turns a WU contains?

Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 146
Credit: 5,068,673
RAC: 3,664
Message 30314 - Posted: 12 May 2017, 20:38:07 UTC - in response to Message 30308.
Last modified: 12 May 2017, 20:51:38 UTC

Yes, there is.
Picking 2 of my recent tasks at random:

wz3_jtbb2cm1__15__s__62.31_60.32__2_2.5__6__25.5_1_sixvf_boinc556535_0 ran for 15 hrs and
w8_hllhc10_inj_a4b5_20_w8d__4__s__62.2801_60.31__2_4__5__82.5_1_sixvf_boinc81877_1 ran for 77 mins.

The lone 6 and 5 between double underscores (__6__ in the first name, __5__ in the second) show 10^6 and 10^5 turns respectively. However, although a 6 might be expected to run 10 times longer than a 5, any task may end early if the beam it simulates is unstable, so even a 6 may end after only a few minutes.

Even tasks that finish early are useful, as they have simulated a beam configuration that was unstable and would not have worked (or might even have caused damage) in the real machine. Much better for them to fail in testing than in reality.

supdood
Joined: 7 May 17
Posts: 3
Credit: 37,372
RAC: 0
Message 30316 - Posted: 12 May 2017, 22:08:46 UTC - in response to Message 30314.

Thank you, this is great to know. Takes the mystery out of the runtimes.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 843
Credit: 1,446,391
RAC: 115
Message 30320 - Posted: 13 May 2017, 4:23:41 UTC

Thanks Ray, I should probably put this in a FAQ. To take one of your cases/WUs,
wz3_jtbb2cm1__15__s__62.31_60.32__2_2.5__6__25.5_1_sixvf_boinc556535_0
wz3 is the workspace name
jtbb2cm1 is the study name
15 is the seed number, typically between 1 and 60, representing a particular set of initial conditions
s is for simul, pretty much constant right now
62.31_60.32 is the "tune"
2_2.5 is the amplitude range (2-20)
6 is the number of turns as a power of 10 (here 10^6)
25.5 is the angle in phase space (0-90)
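
For anyone who wants to pull these fields out of a task name automatically, here is a minimal Python sketch of the breakdown above. It is only an illustration, not project code: the names TaskFields and parse_task_name are made up, and it assumes that the fields are always separated by double underscores, that the workspace is whatever precedes the first single underscore, and that everything after the angle is a server-side suffix that can be ignored.

```python
from dataclasses import dataclass


@dataclass
class TaskFields:
    """Hypothetical container for the fields described in the post."""
    workspace: str
    study: str
    seed: int
    mode: str                       # "s" for simul
    tune: tuple[float, float]
    amplitude: tuple[float, float]  # amplitude range (low, high)
    turns: int                      # 10 ** exponent from the name
    angle: float                    # angle in phase space


def parse_task_name(name: str) -> TaskFields:
    # Split from the right so any "__" inside the study name is left alone.
    head, seed, mode, tune, amp, exponent, tail = name.rsplit("__", 6)
    # Assumption: the workspace has no underscore of its own.
    workspace, study = head.split("_", 1)
    tune_a, tune_b = (float(v) for v in tune.split("_"))
    amp_lo, amp_hi = (float(v) for v in amp.split("_"))
    # The angle is the first item of the tail; the rest is ignored here.
    angle = float(tail.split("_", 1)[0])
    return TaskFields(
        workspace=workspace,
        study=study,
        seed=int(seed),
        mode=mode,
        tune=(tune_a, tune_b),
        amplitude=(amp_lo, amp_hi),
        turns=10 ** int(exponent),
        angle=angle,
    )


if __name__ == "__main__":
    task = parse_task_name(
        "wz3_jtbb2cm1__15__s__62.31_60.32__2_2.5__6__25.5_1_sixvf_boinc556535_0"
    )
    print(task)
```

Remember that the turn count only gives an upper bound on the work: a task may still end early if the simulated beam is unstable.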

Eric.
____________

Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 146
Credit: 5,068,673
RAC: 3,664
Message 30599 - Posted: 2 Jun 2017, 18:43:15 UTC
Last modified: 3 Jun 2017, 7:41:31 UTC

This WU has the longest runtime estimate of any SixTrack task I've ever seen. Currently 1.220% (and rising) after 7+ hrs, with an estimate of 24+ days (and rising!). One wingman failed to return it by the deadline (it may still be running after 8 days?) but the other returned it in 3 hrs, on a faster host than mine, but not THAT much faster.
I'll let it run just for the novelty factor.
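
As a rough sanity check on those numbers, a plain linear extrapolation from the progress so far gives a similar figure. This is only a sketch: the function name is made up, the inputs are rounded from the post (7 hours, 1.22%), and the BOINC client's own estimate is not necessarily computed this way.

```python
def remaining_days(elapsed_hours: float, fraction_done: float) -> float:
    """Linear extrapolation: assume the rest runs at the same rate as so far."""
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    total_hours = elapsed_hours / fraction_done
    return (total_hours - elapsed_hours) / 24.0


if __name__ == "__main__":
    # 7 hours elapsed at 1.22% done, taken from the post above
    print(f"{remaining_days(7.0, 0.0122):.1f} days remaining")
```

With exactly 7 hours at 1.22% this comes to a little under 24 days, which matches the order of magnitude of the estimate shown by the client.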

[Update]
Seems it was hardly using any CPU, hence the slow progress, so I restarted BOINC, which has sparked it into normal processing. No idea why it was going so slowly, but it looks fine now. Expecting it to finish overnight. Whether it will validate against the wingman or not is another question.

[Morning Update]
Other wingman and mine both returned valid and credited overnight.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 843
Credit: 1,446,391
RAC: 115
Message 30616 - Posted: 3 Jun 2017, 8:36:55 UTC - in response to Message 30599.

Hi Ray, which Work Unit? I don't see it.
Very strange. Eric.
____________

James Molson
Project administrator
Project developer
Project tester
Project scientist
Joined: 25 Jan 17
Posts: 17
Credit: 919,483
RAC: 352
Message 30618 - Posted: 3 Jun 2017, 10:42:09 UTC - in response to Message 30616.

Workunit 69564441

Jone0_TSL_jta_bbo_2222_1.1_0.75__1__s__62.2965_60.3065__6_8__6__31.5_1_sixvf_boinc4320

Ray Murray
Volunteer moderator
Joined: 29 Sep 04
Posts: 146
Credit: 5,068,673
RAC: 3,664
Message 30620 - Posted: 3 Jun 2017, 17:55:55 UTC - in response to Message 30618.

Yes, that's the one.
48,027.00 seconds wallclock but only 15,858.33 seconds CPU time.
About 9 hrs to get to 1.4%, then normal progress after the BOINC restart, so possibly some miscommunication within BOINC, although 2 Theory tasks running at the same time didn't have the slow-down.
Don't know if there'll be anything noteworthy in the returned logs.

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 843
Credit: 1,446,391
RAC: 115
Message 30622 - Posted: 4 Jun 2017, 5:26:00 UTC - in response to Message 30620.

Well, SixTrack makes regular calls to BOINC to report progress
and to query/report checkpoints. I don't know about Theory.
I shall try and rerun the Task myself, if possible. Eric.
____________

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 843
Credit: 1,446,391
RAC: 115
Message 30623 - Posted: 4 Jun 2017, 6:38:02 UTC - in response to Message 30622.

Hello again Ray, please find below times for results for this WU.
Your result is indeed strange, but the WU appears normal.
The fact that the server was restarted implies some problem,
but I do not have any details. Eric.

mysql> source results.src;
+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144251805 | 14749.69 | 14864.921875 |
+-----------+----------+--------------+
1 row in set (0.00 sec)

+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144251806 | 8100.555 |  8108.781187 |
+-----------+----------+--------------+
1 row in set (0.01 sec)

+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144086903 | 11805.66 | 11949.363195 |
+-----------+----------+--------------+
1 row in set (0.00 sec)

+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144259444 | 15858.33 | 48027.004255 |
+-----------+----------+--------------+
1 row in set (0.00 sec)

+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144086902 | 18083.67 | 20462.942898 |
+-----------+----------+--------------+
1 row in set (0.00 sec)

+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144086902 | 18083.67 | 20462.942898 |
+-----------+----------+--------------+
1 row in set (0.00 sec)

mysql> quit
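
One quick way to see which result stands out is to compare CPU time with elapsed time for each row. The Python sketch below does that with the figures copied from the query output; the function names and the 0.5 threshold are arbitrary choices for illustration, not anything used by the project.

```python
# result id -> (cpu_time, elapsed_time) in seconds, copied from the output above
RESULTS = {
    144251805: (14749.69, 14864.921875),
    144251806: (8100.555, 8108.781187),
    144086903: (11805.66, 11949.363195),
    144259444: (15858.33, 48027.004255),
    144086902: (18083.67, 20462.942898),
}


def cpu_efficiency(cpu_time: float, elapsed_time: float) -> float:
    """Fraction of the wall-clock time the task actually spent on the CPU."""
    return cpu_time / elapsed_time


def low_efficiency_results(results: dict, threshold: float = 0.5) -> list:
    """Return the ids whose CPU efficiency is below the (arbitrary) threshold."""
    return [
        result_id
        for result_id, (cpu, wall) in results.items()
        if cpu_efficiency(cpu, wall) < threshold
    ]


if __name__ == "__main__":
    for result_id, (cpu, wall) in RESULTS.items():
        print(f"{result_id}: {cpu_efficiency(cpu, wall):.1%}")
    print("Low efficiency:", low_efficiency_results(RESULTS))
```

Only result 144259444 falls below the threshold: it spent roughly a third of its elapsed time on the CPU, while the other results are close to or above 90%.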
____________

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 843
Credit: 1,446,391
RAC: 115
Message 30625 - Posted: 4 Jun 2017, 11:34:25 UTC - in response to Message 30302.

We are hot on the trail of a BOINC infrastructure problem which caused a
(very) large number of undetected failures with very short run times. Eric.
____________
