Message boards : Sixtrack Application : Highly Variable Runtimes
supdood

Joined: 7 May 17
Posts: 5
Credit: 145,820
RAC: 0
Message 30302 - Posted: 12 May 2017, 14:26:25 UTC

I just joined the project for the BOINC Pentathlon, but am hoping to continue afterwards and would like to know if the highly variable runtimes (from 3 minutes to tens of hours) I have seen so far are normal. Specifically, are the extra long running tasks normal or were they created especially for the pentathlon to keep the server load down?
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 30303 - Posted: 12 May 2017, 16:32:32 UTC - in response to Message 30302.  

Well, what is "normal"? Within a study, WU run times can vary widely
over the range of amplitudes and angles. The maximum run time typically
ranges from about an hour (100,000 turns) to 10 hours (one million turns).
I myself am running some one million turn studies, but at high amplitudes particles may be lost rather quickly.
At the moment we are just trying to make sure enough WUs
are available to give all competitors a fair chance.
Not very helpful perhaps, but I hope you will stay with LHC@home SixTrack.
Eric.
supdood

Joined: 7 May 17
Posts: 5
Credit: 145,820
RAC: 0
Message 30308 - Posted: 12 May 2017, 18:52:09 UTC - in response to Message 30303.  

Thanks for the quick reply--this is helpful. Is there any indicator in the task name that tells how many turns a WU contains?
Ray Murray
Volunteer moderator

Joined: 29 Sep 04
Posts: 281
Credit: 11,859,285
RAC: 1
Message 30314 - Posted: 12 May 2017, 20:38:07 UTC - in response to Message 30308.  
Last modified: 12 May 2017, 20:51:38 UTC

Yes, there is.
Picking 2 of my recent tasks at random:

wz3_jtbb2cm1__15__s__62.31_60.32__2_2.5__6__25.5_1_sixvf_boinc556535_0 ran for 15 hrs and
w8_hllhc10_inj_a4b5_20_w8d__4__s__62.2801_60.31__2_4__5__82.5_1_sixvf_boinc81877_1 ran for 77 mins.

The 6 and the 5 just before the angle field (25.5 and 82.5) show 10^6 and 10^5 turns respectively. Although a 6 might be expected to run 10 times longer than a 5, any task may end early if the beam it simulates is unstable, so even a 6 may end after only a few minutes.

Even tasks that finish early are useful, as they have simulated a beam configuration that was unstable and would not have worked (or might even have caused damage) in the real machine. Much better for them to fail in testing than in reality.
supdood

Joined: 7 May 17
Posts: 5
Credit: 145,820
RAC: 0
Message 30316 - Posted: 12 May 2017, 22:08:46 UTC - in response to Message 30314.  

Thank you, this is great to know. Takes the mystery out of the runtimes.
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 30320 - Posted: 13 May 2017, 4:23:41 UTC

Thanks Ray, I should probably put this in a FAQ. To take one of your WUs:
wz3_jtbb2cm1__15__s__62.31_60.32__2_2.5__6__25.5_1_sixvf_boinc556535_0
wz3 is the workspace name
jtbb2cm1 is the study name
15 is the seed number, typically between 1 and 60, representing a particular set of initial conditions
s is for simul, pretty much constant right now
62.31_60.32 is the "tune"
2_2.5 is the amplitude range (2-20)
6 is the number of turns as a power of 10
25.5 is the angle in phase space (0-90)
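
For anyone who wants to read these fields programmatically, here is a minimal Python sketch based only on the breakdown above. The names `parse_task_name` and `SixTrackTask` and the field names are made up for illustration. Splitting the first field into workspace and study at the first underscore is an assumption, and everything after the angle is kept unparsed as `suffix` because it is not explained here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SixTrackTask:
    """Fields of a SixTrack task name (hypothetical container, not project code)."""
    workspace: str
    study: str
    seed: int
    run_type: str                         # "s" for simul
    tune: Tuple[float, float]
    amplitude_range: Tuple[float, float]
    turns: int                            # 10 ** exponent from the name
    angle: float                          # angle in phase space
    suffix: str                           # trailing part, left uninterpreted


def parse_task_name(name: str) -> SixTrackTask:
    # Fields are separated by double underscores. Splitting from the right
    # keeps any extra "__" inside the leading workspace/study part.
    parts = name.rsplit("__", 6)
    if len(parts) != 7:
        raise ValueError(f"unexpected task name layout: {name!r}")
    head, seed, run_type, tune, amps, exponent, tail = parts

    # Assumption: the workspace ends at the first single underscore.
    workspace, _, study = head.partition("_")
    tune_a, tune_b = (float(v) for v in tune.split("_"))
    amp_lo, amp_hi = (float(v) for v in amps.split("_"))
    # The angle is the first token of the last field.
    angle, _, suffix = tail.partition("_")

    return SixTrackTask(
        workspace=workspace,
        study=study,
        seed=int(seed),
        run_type=run_type,
        tune=(tune_a, tune_b),
        amplitude_range=(amp_lo, amp_hi),
        turns=10 ** int(exponent),
        angle=float(angle),
        suffix=suffix,
    )


task = parse_task_name(
    "wz3_jtbb2cm1__15__s__62.31_60.32__2_2.5__6__25.5_1_sixvf_boinc556535_0"
)
print(task.study, task.seed, task.turns, task.angle)
```

Keep in mind that the turn count is only an upper bound on the work: a task may still end early if the simulated beam is unstable.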

Eric.
Ray Murray
Volunteer moderator

Joined: 29 Sep 04
Posts: 281
Credit: 11,859,285
RAC: 1
Message 30599 - Posted: 2 Jun 2017, 18:43:15 UTC
Last modified: 3 Jun 2017, 7:41:31 UTC

This WU has the longest runtime estimate of any SixTrack task I've ever seen: currently 1.220% (and rising) after 7+ hrs, with an estimate of 24+ days (and rising!). One wingman failed to return it by the deadline (it may still be running after 8 days?) but the other returned it in 3 hrs, on a faster host than mine, but not THAT much faster.
I'll let it run just for the novelty factor.

[Update]
Seems it was hardly using any CPU, hence the slow progress, so I restarted Boinc, which has sparked it into normal processing. No idea why it was going so slowly but looks fine now. Expecting it to finish overnight. Whether it will validate against the wingman or not is another question.

[Morning Update]
Other wingman and mine both returned valid and credited overnight.
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 30616 - Posted: 3 Jun 2017, 8:36:55 UTC - in response to Message 30599.  

Hi Ray, which Work Unit? I don't see it.
Very strange. Eric.
James Molson
Project administrator
Project developer
Project tester
Project scientist

Joined: 25 Jan 17
Posts: 27
Credit: 3,258,853
RAC: 0
Message 30618 - Posted: 3 Jun 2017, 10:42:09 UTC - in response to Message 30616.  

Workunit 69564441

Jone0_TSL_jta_bbo_2222_1.1_0.75__1__s__62.2965_60.3065__6_8__6__31.5_1_sixvf_boinc4320
Ray Murray
Volunteer moderator

Joined: 29 Sep 04
Posts: 281
Credit: 11,859,285
RAC: 1
Message 30620 - Posted: 3 Jun 2017, 17:55:55 UTC - in response to Message 30618.  

Yes, that's the one.
48,027.00 seconds wallclock but only 15,858.33 seconds of CPU time.
It took about 9 hrs to get to 1.4%, then progressed normally after the Boinc restart, so possibly some miscommunication within Boinc, although 2 Theory tasks running at the same time didn't have the slow-down.
Don't know if there'll be anything noteworthy in the returned logs.
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 30622 - Posted: 4 Jun 2017, 5:26:00 UTC - in response to Message 30620.  

Well SixTrack makes regular calls to BOINC to report progress
and to query/report checkpoints. I don't know about Theory.
I shall try and rerun the Task myself, if possible. Eric.
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 30623 - Posted: 4 Jun 2017, 6:38:02 UTC - in response to Message 30622.  

Hello again Ray, please find below the times for the results of this WU.
Your result is indeed strange, but the WU appears normal.
The fact that the server was restarted implies some problem,
but I do not have any details. Eric.

mysql> source results.src;
+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144251805 | 14749.69 | 14864.921875 |
+-----------+----------+--------------+
1 row in set (0.00 sec)

+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144251806 | 8100.555 |  8108.781187 |
+-----------+----------+--------------+
1 row in set (0.01 sec)

+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144086903 | 11805.66 | 11949.363195 |
+-----------+----------+--------------+
1 row in set (0.00 sec)

+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144259444 | 15858.33 | 48027.004255 |
+-----------+----------+--------------+
1 row in set (0.00 sec)

+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144086902 | 18083.67 | 20462.942898 |
+-----------+----------+--------------+
1 row in set (0.00 sec)

+-----------+----------+--------------+
| id        | cpu_time | elapsed_time |
+-----------+----------+--------------+
| 144086902 | 18083.67 | 20462.942898 |
+-----------+----------+--------------+
1 row in set (0.00 sec)

mysql> quit
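
The gap between CPU time and elapsed time in these rows can be expressed as a simple ratio. Below is a rough Python sketch with the numbers copied from the output above. The helper names `cpu_efficiency` and `starved` and the 0.5 cut-off are arbitrary choices for illustration, not anything BOINC itself reports.

```python
# (result id, cpu_time in seconds, elapsed_time in seconds),
# copied from the query output above (the repeated row is listed once).
results = [
    (144251805, 14749.69, 14864.921875),
    (144251806, 8100.555, 8108.781187),
    (144086903, 11805.66, 11949.363195),
    (144259444, 15858.33, 48027.004255),
    (144086902, 18083.67, 20462.942898),
]


def cpu_efficiency(cpu_time: float, elapsed_time: float) -> float:
    """Fraction of the wallclock time the task actually spent on the CPU."""
    return cpu_time / elapsed_time


def starved(rows, threshold: float = 0.5):
    """Ids of results whose CPU efficiency is below the (assumed) threshold."""
    return [rid for rid, cpu, wall in rows if cpu_efficiency(cpu, wall) < threshold]


for rid, cpu, wall in results:
    print(f"{rid}: {cpu_efficiency(cpu, wall):.2f}")

print("low CPU usage:", starved(results))
```

With these figures only result 144259444 falls below the cut-off, at roughly 0.33, which matches the report that the task was hardly using any CPU before the restart.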
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 30625 - Posted: 4 Jun 2017, 11:34:25 UTC - in response to Message 30302.  

We are hot on the trail of a BOINC infrastructure problem which caused a
(very) large number of undetected failures with very short run times. Eric.

©2024 CERN