Theory's endless looping
Author | Message |
---|---|
Send message Joined: 14 Jan 10 Posts: 1384 Credit: 9,169,183 RAC: 4,226 |
computezrmle wrote:
When the number of events slowly increases, let it run. On the first line of the running.log (second last number), you can see the total number of events to process. |
Send message Joined: 15 Jun 08 Posts: 2509 Credit: 249,197,047 RAC: 127,312 |
Crystal Pellet wrote: On the first line of the running.log (second last number), you can see the total number of events to process. If you refer to the output on console 2: Unfortunately I can only see the last 25 lines of the log there. If you refer to a file on the local filesystem: There is no file with this name with reference to the running WU. |
Send message Joined: 14 Jan 10 Posts: 1384 Credit: 9,169,183 RAC: 4,226 |
Crystal Pellet wrote: On the first line of the running.log (second last number), you can see the total number of events to process. In BOINC Manager, when the running task is highlighted, you can use the "Show Graphics" button. From there you have access to the logs inside the VM. |
Send message Joined: 24 Oct 04 Posts: 1157 Credit: 53,054,009 RAC: 62,641 |
I usually look at that log with the VB Manager, but I guess there you don't get to see that picture: [SHERPA ASCII-art startup banner, "Event generation run with SHERPA started", version 1.4.0] (I'm watching CERN LHC on the History Channel right now.) It is from 2014, so they are still talking about that man-made black hole. |
Send message Joined: 15 Jun 08 Posts: 2509 Credit: 249,197,047 RAC: 127,312 |
In BOINC Manager you are able when the running task is highlighted to use the button "Show Graphics". Yes, thank you for reminding me. I have not used this button since it failed to work a long while ago, but now it works perfectly. By the way, the Sherpa job recovered and finished, so it was the right decision to let it run. |
Send message Joined: 15 Jun 08 Posts: 2509 Credit: 249,197,047 RAC: 127,312 |
WU: https://lhcathome.cern.ch/lhcathome/result.php?resultid=150343289 I opened the running.log of the last Sherpa job in an editor just a few minutes before the WU was closed (18 h limit). If somebody from the Sherpa team needs the complete log for examination, I will keep it for a while. On the other hand I'm nearly sure that the log is already available on the project server, isn't it? Some snippets: ===> [runRivet] Mon Jul 3 15:30:38 CEST 2017 [boinc pp jets 7000 150 - sherpa 2.1.0 default 33000 968] |
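For anyone who wants to pull the total event count out of that first running.log line automatically, here is a minimal sketch. It assumes the bracketed `[boinc ...]` block always ends with generator, version, label, total events and one final number, which is only inferred from the headers quoted in this thread and is not an official format description. The function name `parse_run_header` and the dictionary keys are made up for illustration.

```python
import re

# Assumed layout of the bracketed block in the first running.log line,
# inferred from the headers quoted in this thread (not an official spec).
HEADER_RE = re.compile(r"\[boinc\s+(?P<fields>[^\]]+)\]")


def parse_run_header(line):
    """Extract generator, version, label and total events from a runRivet header."""
    match = HEADER_RE.search(line)
    if match is None:
        raise ValueError("no [boinc ...] block found in line")
    fields = match.group("fields").split()
    if len(fields) < 5:
        raise ValueError("header has fewer fields than expected")
    return {
        "generator": fields[-5],
        "version": fields[-4],
        "label": fields[-3],
        "total_events": int(fields[-2]),  # the "second last number"
        "last_field": int(fields[-1]),    # meaning not stated in this thread
    }


header = ("===> [runRivet] Mon Jul 3 15:30:38 CEST 2017 "
          "[boinc pp jets 7000 150 - sherpa 2.1.0 default 33000 968]")
info = parse_run_header(header)
print(info["generator"], info["version"], info["total_events"])
```

Counting fields from the end of the block keeps the sketch independent of how many beam and process parameters come first, since those vary between the headers posted here.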
Send message Joined: 14 Jan 10 Posts: 1384 Credit: 9,169,183 RAC: 4,226 |
On the other hand I'm nearly sure that the log is already available on the project server, isn't it? I don't think so, because these endless tasks will never finish, and I believe there's no (partial) upload until the task ends normally. |
Send message Joined: 15 Jun 08 Posts: 2509 Credit: 249,197,047 RAC: 127,312 |
On the other hand I'm nearly sure that the log is already available on the project server, isn't it? Hm, ok. I saved the logfile and will not delete it for a few days. If it is of interest I can mail it. To post it here wouldn't be a good idea as it has more than 4300 lines. |
Send message Joined: 14 Jan 10 Posts: 1384 Credit: 9,169,183 RAC: 4,226 |
===> [runRivet] Sat Jul 22 22:01:11 CEST 2017 [boinc pp uemb-soft 53 - - sherpa 2.1.1 default 3000 960] . . . integration time: ( 6m 36s elapsed / 42s left ) [22:17:10] Updating display... Display update finished (0 histograms, 0 events). 7.48543e+08 pb +- ( 2.28853e+06 pb = 0.305731 % ) 300000 ( 695894 -> 42.2 % ) integration time: ( 7m 6s elapsed / 15s left ) [22:17:42] 7.48963e+08 pb +- ( 2.24534e+06 pb = 0.299794 % ) 310000 ( 719533 -> 42.2 % ) integration time: ( 7m 22s elapsed / 0s left ) [22:17:57] 2_2__j__j__j__j : 7.48963e+08 pb +- ( 2.24534e+06 pb = 0.299794 % ) exp. eff: 0.637125 % reduce max for 2_2__j__j__j__j to 0.65124 ( eps = 0.001 ) Output_Phase::Output_Phase(): Set output interval 1000000000 events. ---------------------------------------------------------- -- SHERPA generates events with the following structure -- ---------------------------------------------------------- Perturbative : Signal_Processes Perturbative : Hard_Decays Perturbative : Jet_Evolution:CSS Perturbative : Lepton_FS_QED_Corrections:Photons Perturbative : Multiple_Interactions:Amisic Perturbative : Minimum_Bias:Off Hadronization : Beam_Remnants Hadronization : Hadronization:Ahadic Hadronization : Hadron_Decays Analysis : HepMC2 Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). endless . . . |
Send message Joined: 24 Oct 04 Posts: 1157 Credit: 53,054,009 RAC: 62,641 |
The only place I am having ANY luck is at vLHC-dev and even there the server is asleep half the time. https://lhcathome.cern.ch/lhcathome/results.php?userid=5472 About 140 hours of nothing but that *heartbeat* ........still after almost 7 years I am not a fan of Oracle VB (must admit Theory is basically all I have been running and I haven't had one of those looping tasks for several years) |
Send message Joined: 14 Jan 10 Posts: 1384 Credit: 9,169,183 RAC: 4,226 |
===> [runRivet] Thu Jul 27 07:57:37 CEST 2017 [boinc pp uemb-hard 7000 - - sherpa 1.4.1 default 100000 960] . . . 17000 events processed dumping histograms... Event 17100 ( 52m 15s elapsed / 4h 13m 19s left ) -> ETA: Thu Jul 27 13:21 17100 events processed Updating display... Display update finished (6 histograms, 17000 events). Error in Splitting_Tools::ConstructKinematics(kt = -nan, z = 0.657727, y = 0.927099). Event 17200 ( 52m 32s elapsed / 4h 12m 55s left ) -> ETA: Thu Jul 27 13:21 17200 events processed Updating display... Display update finished (6 histograms, 17000 events). Updating display... Display update finished (6 histograms, 17000 events). Updating display... Display update finished (6 histograms, 17000 events). Updating display... Display update finished (6 histograms, 17000 events). etc etc etc |
Send message Joined: 29 Sep 04 Posts: 281 Credit: 11,866,264 RAC: 0 |
20:18:17 +0100 2017-08-10 [INFO] New Job Starting in slot1 20:18:17 +0100 2017-08-10 [INFO] Condor JobID: 4095653.0 in slot1 20:18:22 +0100 2017-08-10 [INFO] MCPlots JobID: 36792248 in slot1 ===> [runRivet] Thu Aug 10 20:18:17 BST 2017 [boinc ppbar uemb-soft 63 - - sherpa 2.1.1 default 4000 914] Did all the setup stuff but 10 hrs overnight of: Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). Updating display... And another on a different machine today; 07:03:48 +0100 2017-08-11 [INFO] New Job Starting in slot1 07:03:48 +0100 2017-08-11 [INFO] Condor JobID: 4130927.0 in slot1 07:03:53 +0100 2017-08-11 [INFO] MCPlots JobID: 37718454 in slot1 ===> [runRivet] Fri Aug 11 07:03:48 BST 2017 [boinc ppbar uemb-soft 53 - - sherpa 2.1.1 default 5000 976] |
Send message Joined: 29 Sep 04 Posts: 281 Credit: 11,866,264 RAC: 0 |
Another looping Sherpa: 02:26:12 +0100 2017-09-10 [INFO] New Job Starting in slot1 02:26:12 +0100 2017-09-10 [INFO] Condor JobID: 4553367.0 in slot1 02:26:17 +0100 2017-09-10 [INFO] MCPlots JobID: 38258727 in slot1 [runRivet] Sun Sep 10 02:26:12 BST 2017 [boinc pp jets 7000 20,-,610 - sherpa 2.2.0 default 100000 1014] I just happened to notice the elapsed time being 15+ hours so had a look at the logs and, sure enough, 14 of those hours have been Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). I have manually reset the VM so it will do something more useful for its remaining couple of hours. |
Send message Joined: 29 Sep 04 Posts: 281 Credit: 11,866,264 RAC: 0 |
Most often a looping job will be a Sherpa, but for a novelty I have a looping Pythia job today: 11:22:08 +0200 2017-10-24 [INFO] New Job Starting in slot1 11:22:08 +0200 2017-10-24 [INFO] Condor JobID: 4935273.111 in slot1 11:22:13 +0200 2017-10-24 [INFO] MCPlots JobID: 38834924 in slot1 Tue Oct 24 11:22:09 CEST 2017 [boinc pp uemb-soft 2360 - - pythia8 8.165 default-MBR 100000 1066] Nothing unexpected until: 81400 events processed 81500 events processed 81600 events processed Updating display... Display update finished (28 histograms, 81000 events). Updating display... Display update finished (28 histograms, 81000 events). Updating display... Display update finished (28 histograms, 81000 events). If it survives the VBox upgrade I'm about to do, I'll reset the VM. |
Send message Joined: 31 Jan 11 Posts: 12 Credit: 3,557,813 RAC: 0 |
For the looping Sherpa jobs, we are considering deprecating some of the older Sherpa versions which should improve things somewhat. Generally, patches are not retroactively added to older code versions, so to the extent the underlying problem is fixed by the Sherpa authors, it will appear as improved behaviour in the newer versions, and most of the posts I see here about looping Sherpa jobs cite the older ones. Ray, thanks for letting us know about the looping of the Pythia job. In this particular run, the "default-MBR" label means that the generator is using an alternative model for so-called diffractive processes, called the Minimum-Bias-Rockefeller model. (Diffraction, in the particle-physics context, occurs when one or both of the colliding protons fluctuate to spit off a little "ball" of gluons (called a Pomeron) which coherently carry some fraction of the proton momentum, and the other beam particle hits that ball of glue rather than the original proton). In order to pinpoint if the looping is associated with that particular model, or with a rare occurrence in Pythia in general, I would be very glad to know if anyone sees another looping Pythia job, and if so if it was again with the default-MBR model, or a different setting. |
Send message Joined: 14 Jan 10 Posts: 1384 Credit: 9,169,183 RAC: 4,226 |
===> [runRivet] Thu Feb 22 12:09:00 CET 2018 [boinc ppbar uemb-soft 53 - - sherpa 2.1.0 default 1000 102] . . . full optimization: ( 3m 53s (3m 47s) elapsed / 1m 35s (1m 32s) left ) [12:18:54] Updating display... Display update finished (0 histograms, 0 events). 7.63235e+08 pb +- ( 2.733e+06 pb = 0.358081 % ) 240000 ( 524344 -> 47 % ) integration time: ( 4m 16s (4m 9s) elapsed / 1m 14s (1m 12s) left ) [12:19:16] 7.63114e+08 pb +- ( 2.59818e+06 pb = 0.340471 % ) 260000 ( 566629 -> 47.1 % ) integration time: ( 4m 40s (4m 32s) elapsed / 53s (52s) left ) [12:19:40] 7.63429e+08 pb +- ( 2.48336e+06 pb = 0.32529 % ) 280000 ( 608611 -> 47.3 % ) integration time: ( 5m 3s (4m 55s) elapsed / 33s (31s) left ) [12:20:04] Updating display... Display update finished (0 histograms, 0 events). 7.64921e+08 pb +- ( 2.387e+06 pb = 0.312059 % ) 300000 ( 650575 -> 47.4 % ) integration time: ( 5m 25s (5m 16s) elapsed / 11s (11s) left ) [12:20:26] 7.6399e+08 pb +- ( 2.33729e+06 pb = 0.305932 % ) 310000 ( 671845 -> 47.3 % ) integration time: ( 5m 37s (5m 27s) elapsed / 0s (0s) left ) [12:20:37] 2_2__j__j__j__j : 7.6399e+08 pb +- ( 2.33729e+06 pb = 0.305932 % ) exp. eff: 0.697804 % reduce max for 2_2__j__j__j__j to 0.678184 ( eps = 0.001 ) Output_Phase::Output_Phase(): Set output interval 1000000000 events. ---------------------------------------------------------- -- SHERPA generates events with the following structure -- ---------------------------------------------------------- Perturbative : Signal_Processes Perturbative : Hard_Decays Perturbative : Jet_Evolution:CSS Perturbative : Lepton_FS_QED_Corrections:Photons Perturbative : Multiple_Interactions:Amisic Perturbative : Minimum_Bias:Off Hadronization : Beam_Remnants Hadronization : Hadronization:Ahadic Hadronization : Hadron_Decays Analysis : HepMC2 Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). etc etc etc |
Send message Joined: 14 Jan 10 Posts: 1384 Credit: 9,169,183 RAC: 4,226 |
===> [runRivet] Mon Mar 12 09:18:50 CET 2018 [boinc ee zhad 133 - - sherpa 1.4.5 default 65000 126] . . . 11.9572 pb +- ( 0.0590684 pb = 0.493997 % ) 290000 ( 337476 -> 86.3 % ) integration time: ( 3m 53s(3m 37s) elapsed / 16s(15s) left ) 11.9601 pb +- ( 0.0580373 pb = 0.485257 % ) 300000 ( 349098 -> 86.3 % ) integration time: ( 4m(3m 43s) elapsed / 8s(8s) left ) 11.9548 pb +- ( 0.0569209 pb = 0.476135 % ) 310000 ( 360697 -> 86.3 % ) integration time: ( 4m 7s(3m 50s) elapsed / 0s(0s) left ) 2_4__e-__e+__j__j__j__j : 11.9548 pb +- ( 0.0569209 pb = 0.476135 % ) exp. eff: 0.272756 % reduce max for 2_4__e-__e+__j__j__j__j to 0.797682 ( eps = 0.001 ) Process_Group::CalculateTotalXSec(): Calculate xs for '2_5__e-__e+__j__j__j__j__j' (Comix) Starting the calculation. Lean back and enjoy ... . Exception_Handler::GenerateStackTrace(..): Generating stack trace { } Exception_Handler::SignalHandler: Signal (6) caught. Cannot continue. Exception_Handler::GenerateStackTrace(..): Generating stack trace { } ------------>> above Exception 32 times and then endless Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). Updating display... Display update finished (0 histograms, 0 events). etc etc etc |
Send message Joined: 1 Sep 04 Posts: 139 Credit: 2,579 RAC: 0 |
Hi CP. I'll be seeing Peter and Anton tomorrow or Wednesday and will ask them about this. Thanks a lot! Ben |
Send message Joined: 1 Sep 04 Posts: 139 Credit: 2,579 RAC: 0 |
Hi CP. Hi again, I just met with Peter and Anton and they said: 1. Even looping jobs are sometimes useful for development, especially for Sherpa. 2. Anton will try and add a check in the code to detect looping and terminate gracefully. 3. Thanks to all our volunteers! And tomorrow we pass 4 TRILLION EVENTS for Theory !!!! Ben |
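To make the looping pattern reported in this thread concrete, here is a rough sketch of how a volunteer could spot it in a saved running.log. This is not the check Anton plans to add and says nothing about how the project would implement one. It simply assumes, based on the excerpts posted above, that a looping job ends in an unbroken run of "Updating display..." / "Display update finished (...)" messages with no other output in between. The function names and the threshold of 50 are hypothetical choices.

```python
import re

# Matches the display message seen in the log excerpts posted in this thread.
DISPLAY_RE = re.compile(r"Display update finished \(\d+ histograms, \d+ events\)")


def trailing_idle_updates(log_lines):
    """Count display updates at the end of the log with no other output between them."""
    idle = 0
    for line in reversed(log_lines):
        text = line.strip()
        if not text or text == "Updating display...":
            continue  # blank lines and the announcement line carry no information
        if DISPLAY_RE.search(text):
            idle += 1
        else:
            break  # any other message (progress, integration, errors) ends the run
    return idle


def looks_like_looping(log_lines, threshold=50):
    """Heuristic only: the threshold is an arbitrary value chosen for illustration."""
    return trailing_idle_updates(log_lines) >= threshold


stalled = ["17200 events processed"] + [
    "Updating display...",
    "Display update finished (6 histograms, 17000 events).",
] * 60
print(trailing_idle_updates(stalled), looks_like_looping(stalled))
```

Because display updates with 0 events also occur normally during Sherpa's integration phase, the sketch only counts updates that are not interrupted by any other log message, and a sensible threshold would depend on how often the display is refreshed.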
Send message Joined: 14 Jan 10 Posts: 1384 Credit: 9,169,183 RAC: 4,226 |
I just met with Peter and Anton and they said: Hello Ben, I don't understand why looping jobs on BOINC-clients can be useful, except when they are reported in this thread. Normally the job will be killed after BOINC's 18 hours time limit without notice to the client and no result will be reported to the server. |
©2024 CERN