(Native) Theory - Sherpa looooooong runners
Author | Message |
---|---|
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
Would it be worth keeping the Theory-Native sub-project running, but dedicated to Sherpa tasks? That would provide an easy way for volunteers to opt-in/opt-out of the babysitting... (and the machinery's already there). I am wondering how they handle these in-house. Do they let them go forever? Probably not. In fact, I am wondering whether they do them at all. Maybe they just throw them over the wall to us. (Though not to me. I am out of Theory.) |
Send message Joined: 7 May 08 Posts: 218 Credit: 1,575,053 RAC: 67 |
|
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
I am wondering how they handle these in-house. Do they let them go forever? Probably not. In fact, I am wondering whether they do them at all. Maybe they just throw them over the wall to us. (Though not to me. I am out of Theory.) In our internal batch system we have Job Flavours, with job time limits of up to 1 week. Failures can be as scientifically valuable as success. See the Michelson–Morley experiment. |
Send message Joined: 7 Feb 14 Posts: 99 Credit: 5,180,005 RAC: 0 |
This one was a success: pp jets 7000 300 - sherpa 1.4.2 default 41000 190 https://lhcathome.cern.ch/lhcathome/result.php?resultid=253954301 (80,534.88 s) |
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
Failures can be as scientifically valuable as success. See the Michelson–Morley experiment. Certainly, as long as they are testing the science and not just the instability of the Sherpa code. As a user, I have to rely on their judgement. Some versions seem better than others. |
Send message Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 56,545 |
A faster computer may be necessary to finish this task within the limits: https://lhcathome.cern.ch/lhcathome/result.php?resultid=253350398 ===> [runRivet] Thu Nov 28 11:32:01 UTC 2019 [boinc ee zhad 91.2 - - sherpa 2.2.5 default 2000 188] . . . integration time: ( 7d 3h 9m 57s elapsed / 1943d 17h 16m 49s left ) [12:55:34] 4.23518e+18 pb +- ( 7.00423e+17 pb = 16.5382 % ) 2459080000 ( 2459080579 -> 99.9 % ) integration time: ( 7d 3h 10m 2s elapsed / 1943d 17h 42m 38s left ) [12:55:44] Poincare::Poincare(): Inaccurate rotation { a = (-0.262584,0.220149,0.939459) b = (0,0,1) a' = (0.083949,0.22735,0.970188) -> rel. dev. (inf,inf,-0.0298121) m_ct = 0.939459 m_st = -0.342661 m_n = (0,1.27743e-06,-2.99348e-07) } Poincare::Poincare(): Inaccurate rotation { a = (-0.262584,0.220149,0.939459) b = (0,0,1) a' = (0.083949,0.22735,0.970188) -> rel. dev. (inf,inf,-0.0298121) m_ct = 0.939459 m_st = -0.342661 m_n = (0,1.27743e-06,-2.99348e-07) } |
Send message Joined: 31 Jan 11 Posts: 12 Credit: 3,557,813 RAC: 0 |
Hi Henry,
That's an interesting proposal. Sequestering any jobs that are judged as being not completely stable (for whatever reason) in a dedicated queue that people can opt into, while the main queue would be reserved for more streamlined production runs.
I think that the actual name "Theory-Native" would not be good to retain for the 'development' sub-project. It would be a mis-badging, that would eventually confuse people. But I'm curious to ask other LHC@home developers if it would have merit and be possible to set up something like a "Theory-Beta" sub-project, where we could put those tasks that are problematic.
As I've written about elsewhere, Sherpa is a complex code, with advanced capabilities, and unfortunately also sometimes advanced ways of failing. Well, an infinite loop is not a particularly advanced failure of course, but in its defence typically the things it is doing when it enters those loops you see are things that the other codes are not even trying to do.
Moreover, since none of the Test4Theory/LHC@home developers are Sherpa authors, all we can usually do is the same as ordinary users: submit a bug report to the Sherpa authors about what you guys are seeing, and hope that eventually in a future version, some of those loops will get fixed. Meanwhile, we still want to run the existing version of the code, since it is interesting to compare it, for those cases where it succeeds, to the available data.
On our side, we can try to find ways of running it as "correctly" as possible, and use tricks to abort jobs that we can somehow detect as loopers, but despite some improvements on that, there clearly are still issues remaining.
I'd welcome feedback from other Test4Theory/LHC@home developers (and volunteers) to comment on this idea, and whether we would be able to implement it?
All the best, Peter. |
Send message Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 56,545 |
https://lhcathome.cern.ch/lhcathome/result.php?resultid=253690832 ===> [runRivet] Mon Dec 2 12:15:40 UTC 2019 [boinc pp jets 7000 80,-,960 - sherpa 1.4.3 default 100000 190] . . . 1000 events processed dumping histograms... Event 1100 ( 1m 33s elapsed / 2h 20m 38s left ) -> ETA: Mon Dec 02 16:32 1100 events processed Updating display... Event 1200 ( 1m 45s elapsed / 2h 24m 17s left ) -> ETA: Mon Dec 02 16:36 1200 events processed Display update finished (9 histograms, 1000 events). Event 1300 ( 1m 55s elapsed / 2h 26m 2s left ) -> ETA: Mon Dec 02 16:38 1300 events processed Event 1400 ( 2m 3s elapsed / 2h 24m 52s left ) -> ETA: Mon Dec 02 16:37 1400 events processed Updating display... Display update finished (9 histograms, 1000 events). Updating display... Display update finished (9 histograms, 1000 events). . . . But a week later: . . . Display update finished (9 histograms, 1000 events). Updating display... Display update finished (9 histograms, 1000 events). Updating display... Display update finished (9 histograms, 1000 events). |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 677 |
McPlots shows: sherpa 1.4.3 default pp jets 7000 80,-,960 - 0+13/68. |
Send message Joined: 7 Feb 14 Posts: 99 Credit: 5,180,005 RAC: 0 |
https://lhcathome.cern.ch/lhcathome/result.php?resultid=253690832 Yeah, it looks like the same behaviour as pp jets 7000 40,-,760 - sherpa 1.4.5 default 100000 190, mentioned above, which was stuck at Updating display... Display update finished (9 histograms, 75000 events). |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 677 |
Time limit reached in db yesterday, but still running 11 days and 18 hours, with no end. Shutting it down now. [boinc ee zhad 43.6 - - sherpa 2.2.5 default 2000 189] https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=127302369 |
Send message Joined: 7 Feb 14 Posts: 99 Credit: 5,180,005 RAC: 0 |
Time limit reached in db yesterday, but still running 11 days and 18 hours, with no end. How do you search for these tasks (successful ones)? I would like to know whether it would be useful for me to check all the active hosts (~3600) with a PHP script. Edit: OK, that was one of your tasks. Do you post only your own tasks? |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 1,882 |
How do you search for these tasks (successful ones)? I would like to know whether it would be useful for me to check all the active hosts (~3600) with a PHP script. Here you can find all tasks of the current batch 2279: Sherpa runs of batch 2279: # of events, attempts, success, fail and lost. Be aware that the link to MC Production mentioned there takes a long time to load. |
Send message Joined: 7 Feb 14 Posts: 99 Credit: 5,180,005 RAC: 0 |
I tried to scan 100 hosts, as a test.
MAX_RUNTIME=86400
Long runner condition:
$exit_status == 0 and $runtime > $MAX_RUNTIME and ( $plan_class == "native_theory" or ( $plan_class == "vbox64_theory" and $version >= 300.00 ) )
8 sherpa long runners were found.
(resultid, wuid, userid, hostid, run, sent_timestamp, report_deadline_timestamp, runtime, cputime, application_name, plan_class, version)
254955577 128154506 67 10509223 pp jets 7000 400 - sherpa 1.2.2p default 47000 192 1575957398 1576257282 186377.97 184224.6 Theory Simulation vbox64_theory 300.02
254725396 128029652 67 10509223 pp jets 13000 150,-,2360 - pythia8 8.235 tune-AU2lox 90000 192 1575820771 1575937324 116385.78 115760.2 Theory Simulation vbox64_theory 300.02
254410071 127857452 67 10509223 pp jets 13000 180,-,3560 - pythia8 8.235 tune-AU2ct10 65000 191 1575629833 1575767665 132898.3 131505.9 Theory Simulation vbox64_theory 300.02
254740811 128040944 248 10624524 pp jets 7000 300 - sherpa 1.4.2 default 47000 192 1575852462 1576021366 153905.26 114838.7 Theory Simulation vbox64_theory 300.02
254731110 128033833 248 10624524 pp jets 7000 150 - sherpa 2.1.0 default 71000 192 1575831679 1576021366 179895.46 134835.3 Theory Simulation vbox64_theory 300.02
254387019 127843745 248 10624524 pp jets 8000 800 - pythia8 8.235 cr1 100000 191 1575587355 1575714872 99413.44 74366.95 Theory Simulation vbox64_theory 300.02
254260169 127778796 248 10624524 pp jets 7000 80,-,1160 - pythia8 8.235 early 81000 191 1575556381 1575653145 88837.18 66426.13 Theory Simulation vbox64_theory 300.02
254731795 128034287 248 10624548 pp jets 13000 250,-,4760 - pythia8 8.235 tune-AU2lox 85000 192 1575832505 1575924383 88728.4 66311.83 Theory Simulation vbox64_theory 300.02
254536231 127892806 248 10624548 pp jets 8000 600 - pythia8 8.235 early 69000 191 1575638647 1575750844 108514.62 81099.94 Theory Simulation vbox64_theory 300.02
254384925 127842473 248 10624548 pp jets 8000 800 - pythia8 8.235 tune-AU2m 85000 191 1575582420 1575686538 89614.75 66915 Theory Simulation vbox64_theory 300.02
254126655 127707205 1013 10408677 pp jets 7000 300 - sherpa 1.4.5 default 41000 190 1575483946 1575580180 93937.45 83361.72 Theory Simulation vbox64_theory 300.02
254390931 127846456 1525 10546691 pp z1j 7000 100 - pythia8 8.235 tune-1 100000 191 1575596518 1575805828 112213.8 110974.1 Theory Simulation vbox64_theory 300.02
253592355 127427973 1731 10309088 pp jets 7000 80,-,1160 - pythia8 8.235 default-MBR 100000 190 1575251124 1575943567 188413.47 162158.1 Theory Simulation vbox64_theory 300.02
255062615 128210948 2139 10372172 1576013512 1576256496 105158.79 102382.5 Theory Simulation vbox64_theory 300.02
254395033 127848909 2139 10372172 1575601855 1576031520 141318.19 138428.9 Theory Simulation vbox64_theory 300.02
254745497 128044231 3210 9936309 pp jets 8000 800 - sherpa 1.4.2 default 17000 192 1575861233 1576114071 167609.51 154827 Theory Simulation vbox64_theory 300.02
255216964 128293991 3689 10580251 pp jets 7000 28 - pythia8 8.235 early 90000 193 1576132906 1576280280 108959.74 105950.3 Theory Simulation vbox64_theory 300.02
254990460 128172326 3775 10525345 pp jets 7000 65 - pythia8 8.235 cr1 81000 192 1575967981 1576070498 93187.88 84652.28 Theory Simulation vbox64_theory 300.02
254602898 127959283 3855 10463494 pp jets 7000 400 - sherpa 2.1.1 default 42000 192 1575702265 1576057301 349029.78 344602.6 Theory Simulation vbox64_theory 300.02
254760248 128052626 3855 10605580 pp jets 7000 250 - sherpa 1.4.0 default 58000 192 1575887266 1576160237 248303.5 235340.9 Theory Simulation vbox64_theory 300.02
253370499 127299459 3949 10511876 1574997154 1575686688 185962.38 139855.2 Theory Simulation vbox64_theory 300.02
254902362 128126598 4837 10391134 pp jets 7000 150,-,2360 - pythia8 8.235 tune-AZ 100000 192 1575930129 1576034304 91894.86 85350.45 Theory Simulation vbox64_theory 300.02
254741279 128041310 4837 10391134 pp jets 7000 600 - sherpa 2.1.1 default 17000 192 1575853233 1576233183 362646.35 341416.3 Theory Simulation vbox64_theory 300.02
254630592 127977466 4837 10391134 pp zinclusive 7000 -,-,50,130 - pythia8 8.235 tune-AU2 100000 192 1575751776 1575862778 96628.02 93324.64 Theory Simulation vbox64_theory 300.02
254625034 127973791 4837 10391134 pp jets 13000 180,-,3560 - pythia8 8.235 tune-AU2ct10 62000 192 1575740078 1575930110 173697.48 167787.8 Theory Simulation vbox64_theory 300.02
253420913 127163759 5326 10504105 pp jets 7000 80,-,960 - pythia8 8.209 default-CD 100000 188 1575039288 1576016739 296105.53 283197.5 Theory Simulation vbox64_theory 300.02
253422242 126977012 5326 10504105 pp z1j 7000 100 - pythia8 8.180 default 100000 188 1575039288 1576089048 159666.67 148856 Theory Simulation vbox64_theory 300.02
254139932 127714391 5729 10525461 pp jets 13000 150,-,1860 - pythia8 8.235 tune-AU2lox 85000 190 1575492171 1575667360 136217.81 127170.5 Theory Simulation vbox64_theory 300.02
255074042 128217424 5729 10576889 pp jets 8000 180,-,3560 - herwig7 7.1.0 softTune 100000 192 1576027864 1576269124 118828.47 117313 Theory Simulation vbox64_theory 300.02
254577714 127943174 5729 10576889 pp jets 13000 150,-,2360 - pythia8 8.235 tune-4cx 100000 192 1575656145 1576067031 164301.9 163863.5 Theory Simulation vbox64_theory 300.02
253941568 127611080 5729 10576889 pp jets 13000 180,-,3560 - pythia8 8.235 tune-4c 100000 190 1575414568 1575606991 110049.36 109642.3 Theory Simulation vbox64_theory 300.02
254576122 127942035 6039 10555092 pp jets 13000 150,-,1860 - pythia8 8.235 early 72000 191 1575650268 1576038491 125044.98 93737.42 Theory Simulation vbox64_theory 300.02
Script log
MODE: SEARCH FOR LONG RUNNERS USERID: 31; HOSTID: 10481542; OFFSET: 0 USERID: 31; HOSTID: 10481542; OFFSET: 20 USERID: 31; HOSTID: 10481542; OFFSET: 40 USERID: 31; HOSTID: 10481542; OFFSET: 60 USERID: 56; HOSTID: 10623869; OFFSET: 0 USERID: 56; HOSTID: 10623869; OFFSET: 20 USERID: 67; HOSTID: 10509223; OFFSET: 0 USERID: 67; HOSTID: 10509223; OFFSET: 20 USERID: 67; 
HOSTID: 10509223; OFFSET: 40 Task 254955577 is (sherpa) long runner! USERID: 67; HOSTID: 10509223; OFFSET: 60 USERID: 67; HOSTID: 10509223; OFFSET: 80 USERID: 67; HOSTID: 10509223; OFFSET: 100 Task 254725396 is (sherpa) long runner! USERID: 67; HOSTID: 10509223; OFFSET: 120 USERID: 67; HOSTID: 10509223; OFFSET: 140 USERID: 67; HOSTID: 10509223; OFFSET: 160 Task 254410071 is (sherpa) long runner! USERID: 67; HOSTID: 10509223; OFFSET: 180 USERID: 67; HOSTID: 10509223; OFFSET: 200 USERID: 248; HOSTID: 10622647; OFFSET: 0 USERID: 248; HOSTID: 10622647; OFFSET: 20 USERID: 248; HOSTID: 10623274; OFFSET: 0 USERID: 248; HOSTID: 10623646; OFFSET: 0 USERID: 248; HOSTID: 10623669; OFFSET: 0 USERID: 248; HOSTID: 10624337; OFFSET: 0 USERID: 248; HOSTID: 10624354; OFFSET: 0 USERID: 248; HOSTID: 10624524; OFFSET: 0 USERID: 248; HOSTID: 10624524; OFFSET: 20 USERID: 248; HOSTID: 10624524; OFFSET: 40 Task 254740811 is (sherpa) long runner! Task 254731110 is (sherpa) long runner! USERID: 248; HOSTID: 10624524; OFFSET: 60 USERID: 248; HOSTID: 10624524; OFFSET: 80 Task 254387019 is (sherpa) long runner! USERID: 248; HOSTID: 10624524; OFFSET: 100 Task 254260169 is (sherpa) long runner! USERID: 248; HOSTID: 10624524; OFFSET: 120 USERID: 248; HOSTID: 10624548; OFFSET: 0 USERID: 248; HOSTID: 10624548; OFFSET: 20 USERID: 248; HOSTID: 10624548; OFFSET: 40 USERID: 248; HOSTID: 10624548; OFFSET: 60 Task 254731795 is (sherpa) long runner! USERID: 248; HOSTID: 10624548; OFFSET: 80 USERID: 248; HOSTID: 10624548; OFFSET: 100 Task 254536231 is (sherpa) long runner! USERID: 248; HOSTID: 10624548; OFFSET: 120 Task 254384925 is (sherpa) long runner! 
USERID: 248; HOSTID: 10624548; OFFSET: 140 USERID: 248; HOSTID: 10626076; OFFSET: 0 USERID: 248; HOSTID: 10626076; OFFSET: 20 USERID: 248; HOSTID: 10626076; OFFSET: 40 USERID: 248; HOSTID: 10626076; OFFSET: 60 USERID: 248; HOSTID: 10626076; OFFSET: 80 USERID: 387; HOSTID: 10411018; OFFSET: 0 USERID: 387; HOSTID: 10411018; OFFSET: 20 USERID: 417; HOSTID: 10138801; OFFSET: 0 USERID: 417; HOSTID: 10138801; OFFSET: 20 USERID: 417; HOSTID: 10138801; OFFSET: 40 USERID: 417; HOSTID: 10138801; OFFSET: 60 USERID: 417; HOSTID: 10389967; OFFSET: 0 USERID: 417; HOSTID: 10389967; OFFSET: 20 USERID: 417; HOSTID: 10389967; OFFSET: 40 USERID: 462; HOSTID: 10625542; OFFSET: 0 USERID: 462; HOSTID: 10625542; OFFSET: 20 USERID: 462; HOSTID: 10625542; OFFSET: 40 USERID: 723; HOSTID: 10474310; OFFSET: 0 USERID: 723; HOSTID: 10474310; OFFSET: 20 USERID: 807; HOSTID: 10487134; OFFSET: 0 USERID: 829; HOSTID: 10346050; OFFSET: 0 USERID: 829; HOSTID: 10346050; OFFSET: 20 USERID: 1013; HOSTID: 10408677; OFFSET: 0 USERID: 1013; HOSTID: 10408677; OFFSET: 20 USERID: 1013; HOSTID: 10408677; OFFSET: 40 USERID: 1013; HOSTID: 10408677; OFFSET: 60 USERID: 1013; HOSTID: 10408677; OFFSET: 80 USERID: 1013; HOSTID: 10408677; OFFSET: 100 USERID: 1013; HOSTID: 10408677; OFFSET: 120 USERID: 1013; HOSTID: 10408677; OFFSET: 140 USERID: 1013; HOSTID: 10408677; OFFSET: 160 USERID: 1013; HOSTID: 10408677; OFFSET: 180 Task 254126655 is (sherpa) long runner! 
USERID: 1013; HOSTID: 10408677; OFFSET: 200 USERID: 1013; HOSTID: 10408728; OFFSET: 0 USERID: 1013; HOSTID: 10408728; OFFSET: 20 USERID: 1013; HOSTID: 10408728; OFFSET: 40 USERID: 1013; HOSTID: 10408728; OFFSET: 60 USERID: 1013; HOSTID: 10408728; OFFSET: 80 USERID: 1013; HOSTID: 10408728; OFFSET: 100 USERID: 1013; HOSTID: 10408728; OFFSET: 120 USERID: 1095; HOSTID: 10452531; OFFSET: 0 USERID: 1095; HOSTID: 10452531; OFFSET: 20 USERID: 1211; HOSTID: 10569580; OFFSET: 0 USERID: 1211; HOSTID: 10569580; OFFSET: 20 USERID: 1391; HOSTID: 10570005; OFFSET: 0 USERID: 1400; HOSTID: 10379522; OFFSET: 0 USERID: 1452; HOSTID: 10617473; OFFSET: 0 USERID: 1452; HOSTID: 10617473; OFFSET: 20 USERID: 1499; HOSTID: 10605693; OFFSET: 0 USERID: 1525; HOSTID: 10546691; OFFSET: 0 USERID: 1525; HOSTID: 10546691; OFFSET: 20 Task 254390931 is (sherpa) long runner! USERID: 1525; HOSTID: 10546691; OFFSET: 40 USERID: 1731; HOSTID: 10309088; OFFSET: 0 Task 253592355 is (sherpa) long runner! USERID: 1731; HOSTID: 10309088; OFFSET: 20 USERID: 1911; HOSTID: 10363794; OFFSET: 0 USERID: 1981; HOSTID: 10588410; OFFSET: 0 USERID: 1981; HOSTID: 10588410; OFFSET: 20 USERID: 2055; HOSTID: 10374085; OFFSET: 0 USERID: 2055; HOSTID: 10374085; OFFSET: 20 USERID: 2060; HOSTID: 10596199; OFFSET: 0 USERID: 2060; HOSTID: 10596199; OFFSET: 20 USERID: 2060; HOSTID: 10596199; OFFSET: 40 USERID: 2060; HOSTID: 10596202; OFFSET: 0 USERID: 2060; HOSTID: 10596202; OFFSET: 20 USERID: 2060; HOSTID: 10596213; OFFSET: 0 USERID: 2060; HOSTID: 10596213; OFFSET: 20 USERID: 2060; HOSTID: 10596213; OFFSET: 40 USERID: 2060; HOSTID: 10596213; OFFSET: 60 USERID: 2060; HOSTID: 10596213; OFFSET: 80 USERID: 2060; HOSTID: 10596218; OFFSET: 0 USERID: 2060; HOSTID: 10596218; OFFSET: 20 USERID: 2139; HOSTID: 10372172; OFFSET: 0 Task 255062615 is (sherpa) long runner! Task 254395033 is (sherpa) long runner! 
USERID: 2139; HOSTID: 10372172; OFFSET: 20 USERID: 2291; HOSTID: 10364695; OFFSET: 0 USERID: 2291; HOSTID: 10364695; OFFSET: 20 USERID: 2322; HOSTID: 10547004; OFFSET: 0 USERID: 2322; HOSTID: 10547004; OFFSET: 20 USERID: 2322; HOSTID: 10547004; OFFSET: 40 USERID: 2361; HOSTID: 10312023; OFFSET: 0 USERID: 2361; HOSTID: 10312023; OFFSET: 20 USERID: 2361; HOSTID: 10312023; OFFSET: 40 USERID: 2438; HOSTID: 10409807; OFFSET: 0 USERID: 2537; HOSTID: 10485944; OFFSET: 0 USERID: 2537; HOSTID: 10485944; OFFSET: 20 USERID: 2537; HOSTID: 10600848; OFFSET: 0 USERID: 2739; HOSTID: 10294511; OFFSET: 0 USERID: 2739; HOSTID: 10294511; OFFSET: 20 USERID: 2739; HOSTID: 10294511; OFFSET: 40 USERID: 2739; HOSTID: 10294511; OFFSET: 60 USERID: 2739; HOSTID: 10294511; OFFSET: 80 USERID: 2739; HOSTID: 10294511; OFFSET: 100 USERID: 2739; HOSTID: 10294511; OFFSET: 120 USERID: 2739; HOSTID: 10294511; OFFSET: 140 USERID: 2739; HOSTID: 10509390; OFFSET: 0 USERID: 2739; HOSTID: 10509390; OFFSET: 20 USERID: 2739; HOSTID: 10509390; OFFSET: 40 USERID: 3016; HOSTID: 10298563; OFFSET: 0 USERID: 3016; HOSTID: 10298563; OFFSET: 20 USERID: 3109; HOSTID: 10458123; OFFSET: 0 USERID: 3109; HOSTID: 10458123; OFFSET: 20 USERID: 3137; HOSTID: 10363404; OFFSET: 0 USERID: 3137; HOSTID: 10363404; OFFSET: 20 USERID: 3156; HOSTID: 10363583; OFFSET: 0 USERID: 3156; HOSTID: 10363583; OFFSET: 20 USERID: 3168; HOSTID: 10565243; OFFSET: 0 USERID: 3168; HOSTID: 10565243; OFFSET: 20 USERID: 3168; HOSTID: 10565243; OFFSET: 40 USERID: 3168; HOSTID: 10565243; OFFSET: 60 USERID: 3168; HOSTID: 10565243; OFFSET: 80 USERID: 3168; HOSTID: 10566007; OFFSET: 0 USERID: 3168; HOSTID: 10566007; OFFSET: 20 USERID: 3168; HOSTID: 10566007; OFFSET: 40 USERID: 3210; HOSTID: 9936309; OFFSET: 0 USERID: 3210; HOSTID: 9936309; OFFSET: 20 USERID: 3210; HOSTID: 9936309; OFFSET: 40 Task 254745497 is (sherpa) long runner! 
USERID: 3210; HOSTID: 9936309; OFFSET: 60 USERID: 3210; HOSTID: 9936309; OFFSET: 80 USERID: 3210; HOSTID: 9936309; OFFSET: 100 USERID: 3210; HOSTID: 9936309; OFFSET: 120 USERID: 3239; HOSTID: 10395886; OFFSET: 0 USERID: 3239; HOSTID: 10395886; OFFSET: 20 USERID: 3277; HOSTID: 10589437; OFFSET: 0 USERID: 3284; HOSTID: 10622682; OFFSET: 0 USERID: 3284; HOSTID: 10622682; OFFSET: 20 USERID: 3285; HOSTID: 10620488; OFFSET: 0 USERID: 3285; HOSTID: 10620488; OFFSET: 20 USERID: 3285; HOSTID: 10620488; OFFSET: 40 USERID: 3328; HOSTID: 9848732; OFFSET: 0 USERID: 3328; HOSTID: 9848732; OFFSET: 20 USERID: 3353; HOSTID: 10617180; OFFSET: 0 USERID: 3353; HOSTID: 10617180; OFFSET: 20 USERID: 3689; HOSTID: 10580251; OFFSET: 0 Task 255216964 is (sherpa) long runner! USERID: 3689; HOSTID: 10580251; OFFSET: 20 USERID: 3689; HOSTID: 10580251; OFFSET: 40 USERID: 3689; HOSTID: 10580251; OFFSET: 60 USERID: 3775; HOSTID: 10370707; OFFSET: 0 USERID: 3775; HOSTID: 10525345; OFFSET: 0 USERID: 3775; HOSTID: 10525345; OFFSET: 20 Task 254990460 is (sherpa) long runner! USERID: 3775; HOSTID: 10525345; OFFSET: 40 USERID: 3775; HOSTID: 10525345; OFFSET: 60 USERID: 3775; HOSTID: 10525345; OFFSET: 80 USERID: 3775; HOSTID: 10525345; OFFSET: 100 USERID: 3775; HOSTID: 10525345; OFFSET: 120 USERID: 3855; HOSTID: 10457508; OFFSET: 0 USERID: 3855; HOSTID: 10457508; OFFSET: 20 USERID: 3855; HOSTID: 10457508; OFFSET: 40 USERID: 3855; HOSTID: 10457508; OFFSET: 60 USERID: 3855; HOSTID: 10457508; OFFSET: 80 USERID: 3855; HOSTID: 10457508; OFFSET: 100 USERID: 3855; HOSTID: 10457508; OFFSET: 120 USERID: 3855; HOSTID: 10457508; OFFSET: 140 USERID: 3855; HOSTID: 10457508; OFFSET: 160 USERID: 3855; HOSTID: 10463494; OFFSET: 0 USERID: 3855; HOSTID: 10463494; OFFSET: 20 USERID: 3855; HOSTID: 10463494; OFFSET: 40 Task 254602898 is (sherpa) long runner! USERID: 3855; HOSTID: 10463494; OFFSET: 60 USERID: 3855; HOSTID: 10463494; OFFSET: 80 USERID: 3855; HOSTID: 10605580; OFFSET: 0 Task 254760248 is (sherpa) long runner! 
USERID: 3855; HOSTID: 10605580; OFFSET: 20 USERID: 3855; HOSTID: 10605580; OFFSET: 40 USERID: 3949; HOSTID: 9889908; OFFSET: 0 USERID: 3949; HOSTID: 9889908; OFFSET: 20 USERID: 3949; HOSTID: 9889908; OFFSET: 40 USERID: 3949; HOSTID: 10511876; OFFSET: 0 Task 253370499 is (sherpa) long runner! USERID: 3949; HOSTID: 10511876; OFFSET: 20 USERID: 3949; HOSTID: 10557139; OFFSET: 0 USERID: 3949; HOSTID: 10580769; OFFSET: 0 USERID: 3949; HOSTID: 10580769; OFFSET: 20 USERID: 3949; HOSTID: 10580769; OFFSET: 40 USERID: 3949; HOSTID: 10580769; OFFSET: 60 USERID: 3949; HOSTID: 10580769; OFFSET: 80 USERID: 4013; HOSTID: 10557791; OFFSET: 0 USERID: 4023; HOSTID: 10346523; OFFSET: 0 USERID: 4023; HOSTID: 10346523; OFFSET: 20 USERID: 4197; HOSTID: 10587886; OFFSET: 0 USERID: 4219; HOSTID: 19063; OFFSET: 0 USERID: 4219; HOSTID: 9763811; OFFSET: 0 USERID: 4219; HOSTID: 9763811; OFFSET: 20 USERID: 4345; HOSTID: 10486325; OFFSET: 0 USERID: 4345; HOSTID: 10486325; OFFSET: 20 USERID: 4465; HOSTID: 10619631; OFFSET: 0 USERID: 4465; HOSTID: 10619631; OFFSET: 20 USERID: 4716; HOSTID: 10581411; OFFSET: 0 USERID: 4716; HOSTID: 10581411; OFFSET: 20 USERID: 4716; HOSTID: 10615347; OFFSET: 0 USERID: 4837; HOSTID: 10391134; OFFSET: 0 USERID: 4837; HOSTID: 10391134; OFFSET: 20 Task 254902362 is (sherpa) long runner! Task 254741279 is (sherpa) long runner! USERID: 4837; HOSTID: 10391134; OFFSET: 40 Task 254630592 is (sherpa) long runner! Task 254625034 is (sherpa) long runner! USERID: 4837; HOSTID: 10391134; OFFSET: 60 USERID: 5326; HOSTID: 10504105; OFFSET: 0 Task 253420913 is (sherpa) long runner! Task 253422242 is (sherpa) long runner! 
USERID: 5326; HOSTID: 10504105; OFFSET: 20 USERID: 5373; HOSTID: 10583586; OFFSET: 0 USERID: 5472; HOSTID: 10447575; OFFSET: 0 USERID: 5472; HOSTID: 10447575; OFFSET: 20 USERID: 5472; HOSTID: 10451775; OFFSET: 0 USERID: 5472; HOSTID: 10451775; OFFSET: 20 USERID: 5670; HOSTID: 10322182; OFFSET: 0 USERID: 5670; HOSTID: 10322182; OFFSET: 20 USERID: 5670; HOSTID: 10322182; OFFSET: 40 USERID: 5670; HOSTID: 10322182; OFFSET: 60 USERID: 5670; HOSTID: 10322182; OFFSET: 80 USERID: 5729; HOSTID: 10522104; OFFSET: 0 USERID: 5729; HOSTID: 10522104; OFFSET: 20 USERID: 5729; HOSTID: 10522104; OFFSET: 40 USERID: 5729; HOSTID: 10522104; OFFSET: 60 USERID: 5729; HOSTID: 10522104; OFFSET: 80 USERID: 5729; HOSTID: 10522104; OFFSET: 100 USERID: 5729; HOSTID: 10522104; OFFSET: 120 USERID: 5729; HOSTID: 10522104; OFFSET: 140 USERID: 5729; HOSTID: 10522104; OFFSET: 160 USERID: 5729; HOSTID: 10522104; OFFSET: 180 USERID: 5729; HOSTID: 10522104; OFFSET: 200 USERID: 5729; HOSTID: 10522104; OFFSET: 220 USERID: 5729; HOSTID: 10522104; OFFSET: 240 USERID: 5729; HOSTID: 10522104; OFFSET: 260 USERID: 5729; HOSTID: 10522104; OFFSET: 280 USERID: 5729; HOSTID: 10522104; OFFSET: 300 USERID: 5729; HOSTID: 10525461; OFFSET: 0 USERID: 5729; HOSTID: 10525461; OFFSET: 20 USERID: 5729; HOSTID: 10525461; OFFSET: 40 Task 254139932 is (sherpa) long runner! USERID: 5729; HOSTID: 10525461; OFFSET: 60 USERID: 5729; HOSTID: 10576889; OFFSET: 0 Task 255074042 is (sherpa) long runner! USERID: 5729; HOSTID: 10576889; OFFSET: 20 USERID: 5729; HOSTID: 10576889; OFFSET: 40 Task 254577714 is (sherpa) long runner! USERID: 5729; HOSTID: 10576889; OFFSET: 60 Task 253941568 is (sherpa) long runner! 
USERID: 5729; HOSTID: 10576889; OFFSET: 80 USERID: 5905; HOSTID: 10406695; OFFSET: 0 USERID: 5905; HOSTID: 10406695; OFFSET: 20 USERID: 5914; HOSTID: 10486095; OFFSET: 0 USERID: 5914; HOSTID: 10486095; OFFSET: 20 USERID: 5914; HOSTID: 10530433; OFFSET: 0 USERID: 5941; HOSTID: 10620272; OFFSET: 0 USERID: 5943; HOSTID: 10588290; OFFSET: 0 USERID: 5943; HOSTID: 10588290; OFFSET: 20 USERID: 5943; HOSTID: 10603952; OFFSET: 0 USERID: 5943; HOSTID: 10603952; OFFSET: 20 USERID: 5943; HOSTID: 10603952; OFFSET: 40 USERID: 5943; HOSTID: 10603952; OFFSET: 60 USERID: 5943; HOSTID: 10603952; OFFSET: 80 USERID: 5943; HOSTID: 10603952; OFFSET: 100 USERID: 5943; HOSTID: 10613888; OFFSET: 0 USERID: 5943; HOSTID: 10613888; OFFSET: 20 USERID: 5943; HOSTID: 10613888; OFFSET: 40 USERID: 5943; HOSTID: 10613888; OFFSET: 60 USERID: 5943; HOSTID: 10616579; OFFSET: 0 USERID: 5943; HOSTID: 10616579; OFFSET: 20 USERID: 6039; HOSTID: 10555092; OFFSET: 0 USERID: 6039; HOSTID: 10555092; OFFSET: 20 USERID: 6039; HOSTID: 10555092; OFFSET: 40 USERID: 6039; HOSTID: 10555092; OFFSET: 60 USERID: 6039; HOSTID: 10555092; OFFSET: 80 USERID: 6039; HOSTID: 10555092; OFFSET: 100 Task 254576122 is (sherpa) long runner! USERID: 6039; HOSTID: 10555092; OFFSET: 120 USERID: 6039; HOSTID: 10555092; OFFSET: 140 USERID: 6101; HOSTID: 10323545; OFFSET: 0 USERID: 6101; HOSTID: 10323545; OFFSET: 20 USERID: 6101; HOSTID: 10323545; OFFSET: 40 USERID: 6190; HOSTID: 10525261; OFFSET: 0 USERID: 6190; HOSTID: 10525261; OFFSET: 20 USERID: 6205; HOSTID: 10532965; OFFSET: 0 USERID: 6205; HOSTID: 10532965; OFFSET: 20 USERID: 6228; HOSTID: 10619860; OFFSET: 0 USERID: 6228; HOSTID: 10619860; OFFSET: 20 USERID: 6298; HOSTID: 10382093; OFFSET: 0 USERID: 6298; HOSTID: 10382093; OFFSET: 20 |
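For anyone who wants to repeat the scan, the long runner condition quoted in the post above can be written as a small filter. This is only a sketch in Python, not the PHP script that was actually used: the `Task` record, the `is_sherpa` helper and the sample list are assumptions made for illustration, the two long tasks reuse values from the table above (with `exit_status` assumed to be 0, since they matched the condition), and the third task is made up.

```python
from dataclasses import dataclass

MAX_RUNTIME = 86400  # seconds (24 h), the threshold used in the test scan


@dataclass
class Task:
    """Hypothetical record holding only the fields the condition needs."""
    resultid: int
    run: str           # e.g. "pp jets 7000 300 - sherpa 1.4.2 default 47000 192"
    exit_status: int
    runtime: float     # wall-clock seconds
    plan_class: str
    version: float


def is_long_runner(task: Task, max_runtime: float = MAX_RUNTIME) -> bool:
    """Mirror the posted condition: exit status 0, runtime above the limit,
    and either native Theory or a vbox64 Theory app version >= 300.00."""
    eligible = task.plan_class == "native_theory" or (
        task.plan_class == "vbox64_theory" and task.version >= 300.00
    )
    return task.exit_status == 0 and task.runtime > max_runtime and eligible


def is_sherpa(task: Task) -> bool:
    """Assumed extra check: the generator name appears as a word in the run string."""
    return "sherpa" in task.run.split()


sample = [
    Task(254955577, "pp jets 7000 400 - sherpa 1.2.2p default 47000 192",
         0, 186377.97, "vbox64_theory", 300.02),
    Task(254725396, "pp jets 13000 150,-,2360 - pythia8 8.235 tune-AU2lox 90000 192",
         0, 116385.78, "vbox64_theory", 300.02),
    # Made-up short task, only here to show a non-match.
    Task(1, "pp jets 7000 65 - sherpa 1.4.1 default 100000 194",
         0, 3600.0, "vbox64_theory", 300.02),
]

long_runners = [t.resultid for t in sample if is_long_runner(t)]
sherpa_long_runners = [t.resultid for t in sample if is_long_runner(t) and is_sherpa(t)]
```

Separating the generator check from the runtime check makes it easy to report Sherpa tasks on their own, since the condition by itself also matches pythia8 and herwig7 runs.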
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 677 |
Time limit reached in db yesterday, but still running 11 days and 18 hours, with no end. This Sherpa task has now been finished by another volunteer in half an hour? |
Send message Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 56,545 |
time left >3543d and increasing. https://lhcathome.cern.ch/lhcathome/result.php?resultid=255020975 ===> [runRivet] Tue Dec 10 11:18:18 UTC 2019 [boinc ee zhad 14 - - sherpa 2.2.0 default 1000 188] . . . 6.97613e+18 pb +- ( 2.2631e+18 pb = 32.4406 % ) 2172520000 ( 2172540913 -> 99.9 % ) integration time: ( 3d 8h 52m 24s elapsed / 3543d 7h 23m 28s left ) [09:35:07] Poincare::Poincare(): Inaccurate rotation { a = (-0.14095,0.824337,-0.548271) b = (0,0,1) a' = (0.905231,-0.35381,0.235321) -> rel. dev. (inf,-inf,-0.764679) m_ct = -0.548271 m_st = -0.836301 m_n = (0,-4.69413e-07,-7.05773e-07) } Poincare::Poincare(): Inaccurate rotation { a = (-0.14095,0.824337,-0.548271) b = (0,0,1) a' = (0.905231,-0.35381,0.235321) -> rel. dev. (inf,-inf,-0.764679) m_ct = -0.548271 m_st = -0.836301 m_n = (0,-4.69413e-07,-7.05773e-07) } Poincare::Poincare(): Inaccurate rotation { a = (0.4385,-0.489443,0.753766) b = (0,0,1) a' = (0.921121,-0.211997,0.326485) -> rel. dev. (inf,-inf,-0.673515) m_ct = 0.753766 m_st = -0.657143 m_n = (-0,6.46517e-07,4.19804e-07) } Poincare::Poincare(): Inaccurate rotation { a = (0.4385,-0.489443,0.753766) b = (0,0,1) a' = (0.921121,-0.211997,0.326485) -> rel. dev. (inf,-inf,-0.673515) m_ct = 0.753766 m_st = -0.657143 m_n = (-0,6.46517e-07,4.19804e-07) } Poincare::Poincare(): Inaccurate rotation { a = (-0.733593,-0.18741,-0.653238) b = (0,0,1) a' = (0.993764,-0.0307506,-0.107184) -> rel. dev. (inf,-inf,-1.10718) m_ct = -0.653238 m_st = -0.757153 m_n = (0,-3.56106e-07,1.02165e-07) } Poincare::Poincare(): Inaccurate rotation { a = (-0.733593,-0.18741,-0.653238) b = (0,0,1) a' = (0.993764,-0.0307506,-0.107184) -> rel. dev. (inf,-inf,-1.10718) m_ct = -0.653238 m_st = -0.757153 m_n = (0,-3.56106e-07,1.02165e-07) } Poincare::Poincare(): Inaccurate rotation { a = (-0.0616526,0.214378,-0.974803) b = (0,0,1) a' = (0.282742,-0.206022,0.936809) -> rel. dev. 
(inf,-inf,-0.0631907) m_ct = -0.974803 m_st = -0.223067 m_n = (0,-1.46214e-06,-3.21553e-07) } Poincare::Poincare(): Inaccurate rotation { a = (-0.0616526,0.214378,-0.974803) b = (0,0,1) a' = (0.282742,-0.206022,0.936809) -> rel. dev. (inf,-inf,-0.0631907) m_ct = -0.974803 m_st = -0.223067 m_n = (0,-1.46214e-06,-3.21553e-07) } 6.97607e+18 pb +- ( 2.26308e+18 pb = 32.4406 % ) 2172540000 ( 2172560913 -> 99.9 % ) integration time: ( 3d 8h 52m 27s elapsed / 3543d 8h 15m 20s left ) [09:35:12] |
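A volunteer could check for this pattern automatically by parsing the `integration time: ( ... elapsed / ... left )` lines. The sketch below is an assumption about how that might be done, not anything the project runs: it assumes one log message per line, and the `max_ratio` threshold is an arbitrary illustrative value.

```python
import re

# Durations look like "3d 8h 52m 24s"; days, hours and minutes are optional.
_DURATION = r"(?:(\d+)d )?(?:(\d+)h )?(?:(\d+)m )?(\d+)s"
_INTEGRATION = re.compile(
    r"integration time: \( " + _DURATION + r" elapsed / " + _DURATION + r" left \)"
)


def _to_seconds(days, hours, minutes, seconds):
    """Convert the captured duration fields (strings or None) to seconds."""
    return (int(days or 0) * 86400 + int(hours or 0) * 3600
            + int(minutes or 0) * 60 + int(seconds))


def parse_integration_time(line):
    """Return (elapsed_seconds, left_seconds), or None if the line has no report."""
    match = _INTEGRATION.search(line)
    if match is None:
        return None
    groups = match.groups()
    return _to_seconds(*groups[:4]), _to_seconds(*groups[4:])


def looks_runaway(lines, max_ratio=100.0):
    """Flag a log whose latest 'left' estimate is not shrinking and is more
    than max_ratio times the elapsed time (illustrative threshold)."""
    reports = [r for r in map(parse_integration_time, lines) if r is not None]
    if len(reports) < 2:
        return False
    (_, previous_left), (elapsed, left) = reports[-2], reports[-1]
    return left >= previous_left and left > max_ratio * elapsed


log = [
    "integration time: ( 3d 8h 52m 24s elapsed / 3543d 7h 23m 28s left ) [09:35:07]",
    "integration time: ( 3d 8h 52m 27s elapsed / 3543d 8h 15m 20s left ) [09:35:12]",
]
runaway = looks_runaway(log)
```

For the two reports quoted above, the estimate grows from 306141808 s to 306144920 s within three seconds of elapsed time, which is what "increasing" means here.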
Send message Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 56,545 |
Looks like it got stuck after 92100 events. https://lhcathome.cern.ch/lhcathome/result.php?resultid=255475321 ===> [runRivet] Sat Dec 14 05:12:22 UTC 2019 [boinc pp jets 7000 65 - sherpa 1.4.1 default 100000 194] . . . Event 92100 ( 10h 27m 32s elapsed / 53m 49s left ) -> ETA: Sat Dec 14 18:05 92100 events processed Updating display... Poincare::Poincare(): Inaccurate rotation { a = (0,0,1) b = (0.23570226039552,0.94280904158206,0.23570226039552) a' = (0.97182531580755,0,0.23570226039552) -> rel. dev. (3.1231056256177,-1,0) m_ct = 0.23570226039552 m_st = -0.97182531580755 m_n = (0,1,0) } Display update finished (127 histograms, 92000 events). Updating display... Display update finished (127 histograms, 92000 events). . . . Updating display... Display update finished (127 histograms, 92000 events). |
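The "stuck" pattern in this log (display updates keep arriving while the event counter stays put) could be detected with a simple counter. This is a rough sketch under the assumption that the log has one message per line; the `max_updates` threshold is a made-up value, and healthy runs do show a few display updates in a row, so it should not be set too low.

```python
import re

_PROCESSED = re.compile(r"(\d+) events processed")
_DISPLAY = re.compile(r"Display update finished \((\d+) histograms, (\d+) events\)")


def stalled_updates(lines):
    """Count display updates logged since the last 'N events processed' line."""
    count = 0
    for line in lines:
        if _PROCESSED.search(line):
            count = 0          # progress was made, so reset the counter
        elif _DISPLAY.search(line):
            count += 1
    return count


def looks_stuck(lines, max_updates=10):
    """Flag the log once max_updates display updates pass with no new events."""
    return stalled_updates(lines) >= max_updates


stuck_log = ["92100 events processed"] + [
    "Updating display...",
    "Display update finished (127 histograms, 92000 events).",
] * 12
stuck = looks_stuck(stuck_log)
```

A check like this only sees what reaches the log, so it cannot tell a genuine loop from a very slow event; it is a hint for when to look at a task, not a reason to abort it automatically.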
Send message Joined: 7 Feb 14 Posts: 99 Credit: 5,180,005 RAC: 0 |
Another successful long sherpa for me. pp jets 8000 800 - sherpa 1.3.1 default 26000 194, 2 days 5 hours 18 minutes 54 seconds https://lhcathome.cern.ch/lhcathome/result.php?resultid=255556375 Is it useful to report them here? Or should I write about problematic/buggy tasks only? Is my list useful to anyone? Do admins/project scientists save information about sherpa runtimes to a database? |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 1,882 |
Shall I give this one a try? ===> [runRivet] Thu Dec 19 15:36:23 UTC 2019 [boinc pp jets 7000 150,-,2360 - sherpa 2.2.5 default 1000 195] Of 137 attempts, 'only' 7 succeeded in processing 1000 events each, 2 of them among the last 8 attempts. That's interesting: so 5 of the first 129 attempts and 2 of the last 8. Maybe it has something to do with the fact that the VMs are no longer killed after 18 hours of wall-clock run time. Edit: It's a repair job. The previous try ended in an error after 2 days and 18 hours: 2019-12-19 11:31:20 (11120): Status Report: Elapsed Time: '222022.785223' 2019-12-19 11:31:20 (11120): Status Report: CPU Time: '226057.859375' 2019-12-19 11:32:15 (11120): Guest Log: job: CPU usage: 2019-12-19 11:32:15 (11120): Guest Log: 0m0.115s 0m0.217s 2019-12-19 11:32:15 (11120): Guest Log: 3265m39.044s 221m39.547s 2019-12-19 11:32:17 (11120): Guest Log: 11:32:16 CET +01:00 2019-12-19: cranky: [ERROR] Container 'runc' terminated with status code 1. 2019-12-19 11:32:31 (11120): Guest Log: [ERROR] Job Failed |
Send message Joined: 9 Feb 16 Posts: 48 Credit: 537,111 RAC: 0 |
This Sherpa task has now been finished by another volunteer in half an hour? I've seen that happen with my VirtualBox long-runners; tasks that I aborted after days or weeks of running were completed in a fraction of the time by another host. |
©2024 CERN