(Native) Theory - Sherpa looooooong runners
Author | Message |
---|---|
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,920,098 RAC: 35,202 |
Crystal Pellet wrote on Feb. 19: "The complete list of 613 failing sherpas up to now IMO:" You could add to this list: ppbar zinclusive 1960 -,-,50,120 - sherpa 2.1.0 default. I just aborted it, since console F2 said "Exception.Handler::Signal Handler: Signal (6) caught. Cannot continue" and no CPU activity was shown on F3. |
Send message Joined: 7 Jan 07 Posts: 41 Credit: 16,102,983 RAC: 10 |
Crystal Pellet, in your list, I have one "ppbar jets 1960 37 - sherpa 2.2.6 default" succeeded. |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 2,078 |
"You could add to this list:" There's 1 success in the MC Plots database for that description. |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 2,078 |
"Crystal Pellet, in your list, I have one 'ppbar jets 1960 37 - sherpa 2.2.6 default' succeeded." Thanks. The list was based on 'only' 8 attempts for each description, but it seems more attempts can result in a success, and yours was a fast one: 3 hours and 22 minutes. One can build one's own list from http://mcplots-dev.cern.ch/production.php?view=runs&rev=2363&display=all by filtering on the keyword sherpa. Edit: reduced the previous sherpa list from 613 to 577 entries. |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,920,098 RAC: 35,202 |
hello Crystal Pellet, There is a "boinc ee zhad 206 - - sherpa 2.2.8 default" on one of my machines, NOT contained (yet) in your list - 22 hrs 35 min elapsed, 1108 days left (number is increasing); I guess I can abort this task, right? P.S. I just found it in http://mcplots-dev.cern.ch/production.php?view=runs&rev=2363&display=all with 54000 / 18 / 3 / 5 / 10 but still, the increasing number of "days left" is indicating a failure, isn't it? |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 2,078 |
"There is a 'boinc ee zhad 206 - - sherpa 2.2.8 default' on one of my machines, NOT contained (yet) in your list - 22 hrs 35 min elapsed, 1108 days left (number is increasing)" Since it's not in 'my' list, it will never be added to it: it has 3 successes. It's up to you to gamble between a great disappointment and great satisfaction ;) Increasing left time does not always mean that a task will end in an error, but it surely means a heavy task. I myself have seen successes where, after such an increasing left time, the task suddenly jumped to event processing. |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 4 |
"One may build an own list from http://mcplots-dev.cern.ch/production.php?view=runs&rev=2363&display=all and filtering on keyword sherpa." Out of interest, what does the "2363" refer to? For "PbPb heavyion-mb 2760" I get run / events / attempts / success / failure / lost, but 265306266 is still running and reporting "events processed" to the log (I'm estimating an 80-hour run time). Or is lost a badly-chosen euphemism for still running? |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 2,078 |
"Out of interest, what does the '2363' refer to?" That is the revision of MC Production -> http://mcplots-dev.cern.ch/production.php?view=control "For 'PbPb heavyion-mb 2760' I get run / events / attempts / success / failure / lost, but 265306266 is still running and reporting 'events processed' to the log (I'm estimating an 80-hour run time). Or is lost a badly-chosen euphemism for still running?" No, probably not. In theory you could be right, but a task would then be running for weeks and weeks, I think. I don't know when a job is declared lost (no return, but since when?). The last number of your job description (after the # events) is the attempt sequence. It will be something like 40 atm. In the list it's still 10. |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 4 |
Thanks. "The last number of your job description (after the # events) is the attempt sequence. It will be something like 40 atm. In the list it's still 10." The answer is that it is, of course, 42! See other thread. |
Send message Joined: 18 Dec 15 Posts: 1821 Credit: 118,920,098 RAC: 35,202 |
Crystal Pellet wrote: "Increasing left time does not always mean that a task will end in an error, but it surely means a heavy task." Meanwhile, "left time" has jumped up to 4983 days. Should I get worried by now? |
Send message Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 56,545 |
"Should I get worried by now?" No - as long as there are no Vogon ships around. Just prepare a towel. |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 4 |
"Should I get worried by now?" "No - as long as there are no Vogon ships around. Just prepare a towel." ... and for the rest of you, just keep banging the |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 4 |
"Or is lost a badly-chosen euphemism for still running?" "No, probably not. In theory you could be right, but a task would be running then for weeks and weeks, I think." I think that's my point - it looks like lost (which has negative connotations) is being used for everything which hasn't yet returned a success or failure, which could be for entirely reasonable reasons such as still being queued or running. |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 2,078 |
===> [runRivet] Thu Mar 5 14:17:32 UTC 2020 [boinc pp jets 7000 300 - sherpa 1.4.1 default 100000 44] Run time: 6 days 5 hours 31 min 35 sec; CPU time: 6 days 2 hours 28 min 50 sec; Peak disk usage: 3.80 GB. Task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=266301804 |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 2,078 |
Passed the 10-day deadline, but accepted: ===> [runRivet] Wed Mar 4 11:41:37 UTC 2020 [boinc pp jets 7000 400 - sherpa 1.4.2 default 100000 44] Task: https://lhcathome.cern.ch/lhcathome/result.php?resultid=265985279 Run time: 11 days 5 hours 38 min 47 sec; CPU time: 11 days 1 hours 25 min 51 sec; Peak disk usage: 4.22 GB. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 748 |
This Sherpa 1.4.2 task [boinc pp jets 8000 600 - sherpa 1.4.2 default 100000 32] https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=143167540 failed to finish a second time (because of a restart of the Linux VM), stopping at the same point in the log: 55900 events processed Event 56000 ( 8h 16m 56s elapsed / 6h 30m 27s left ) -> ETA: Tue Aug 04 23:50 XS = 24.568347762363 pb +- ( 0.10380316347774 pb = 0.42 % ) 56000 events processed dumping histograms... Event 56100 ( 8h 17m 49s elapsed / 6h 29m 33s left ) -> ETA: Tue Aug 04 23:50 56100 events processed Error in Cluster_Formation_Handler::ClustersToHadrons : Did not find a kinematically allowed solution for the cluster list. Will trigger a new event. Event 56200 ( 8h 18m 49s elapsed / 6h 28m 46s left ) -> ETA: Tue Aug 04 23:50 56200 events processed Event 56300 ( 8h 19m 31s elapsed / 6h 27m 44s left ) -> ETA: Tue Aug 04 23:50 56300 events processed This needs investigation, because the system has now started it for another user (a third time!!). |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 748 |
Theory_2390-1148370-34 - [boinc pp winclusive 7000 10 - sherpa 1.4.5 default 3000 34] Now at 22 hours and 40 min. Any chance to see an end? Channel_Elements::GenerateYForward(2.1912818017204e-09,{-8.98847e+307,0,-8.98847e+307,0,0,},{-10,10,-9.24186,}): Y out of bounds ! ymin, ymax vs. y : -9.96939 9.96939 vs. -9.96939 Setting y to lower bound ymin=-9.96939 ISR_Handler::MakeISR(..): s' out of bounds. s'_{min}, s'_{max 1,2} vs. s': 0.0049, 4.9e+07, 4.9e+07 vs. 0.0049 ISR_Handler::MakeISR(..): s' out of bounds. s'_{min}, s'_{max 1,2} vs. s': 0.0049, 4.9e+07, 4.9e+07 vs. 0.0049 Channel_Elements::GenerateYBackward(1.4743920286663e-10,{-8.98847e+307,0,-8.98847e+307,0,0,},{-10,10,3.99746,}): Y out of bounds ! ymin, ymax vs. y : -10 10 vs. 10 Setting y to upper bound ymax=10 |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 15,000,737 RAC: 4 |
Try to grep the log file for events or elapsed - this can make it easier to see if there's any actual progress. But a continuing stream of ISR_Handler errors is usually a bad sign in my experience. |
Send message Joined: 2 May 07 Posts: 2244 Credit: 173,902,375 RAC: 748 |
Thank you Henry, this is from the beginning, will cancel the task: Comix was compiled for multithreading. Matrix_Element_Handler::BuildProcesses(): Looking for processes ............................................................................................ done ( 232156 kB, 19s / 19s ). Matrix_Element_Handler::InitializeProcesses(): Performing tests .................................................................................... done ( 232552 kB, 0s / 0s ). Initialized the Matrix_Element_Handler for the hard processes. Initialized the Soft_Photon_Handler. Hadron_Decay_Map::Read: Initializing HadronDecays.dat. This may take some time. Initialized the Hadron_Decay_Handler, Decay model = Hadrons Process_Group::CalculateTotalXSec(): Calculate xs for '2_2__j__j__e-__nu_eb' (Internal) Starting the calculation. Lean back and enjoy ... . Channel_Elements::GenerateYBackward(2.2879576294321e-09,{-8.98847e+307,0,-8.98847e+307,0,0,},{-10,10,5.42984,}): Y out of bounds ! ymin, ymax vs. y : -9.9478031411026 9.9478031411026 vs. 9.9478031411026 |
Send message Joined: 15 Jun 08 Posts: 2541 Credit: 254,608,838 RAC: 56,545 |
===> [runRivet] Fri Aug 7 00:05:40 UTC 2020 [boinc pp winclusive 7000 20 - sherpa 2.2.5 default 2000 34] . . . +----------------------------------+ | | | CCC OOO M M I X X | | C O O MM MM I X X | | C O O M M M I X | | C O O M M I X X | | CCC OOO M M I X X | | | +==================================+ | Color dressed Matrix Elements | | http://comix.freacafe.de | | please cite JHEP12(2008)039 | +----------------------------------+ Matrix_Element_Handler::BuildProcesses(): Looking for processes .................................................................................................................................................................................... done ( 45 MB, 7s / 7s ). Matrix_Element_Handler::InitializeProcesses(): Performing tests .................................................................................................................................................................................... done ( 45 MB, 0s / 0s ). Initialized the Matrix_Element_Handler for the hard processes. Initialized the Beam_Remnant_Handler. Hadron_Decay_Map::Read: Initializing HadronDecays.dat. This may take some time. Initialized the Hadron_Decay_Handler, Decay model = Hadrons Initialized the Soft_Photon_Handler. Variations::InitialiseParametersVector(0 variations){ Named variations: } Process_Group::CalculateTotalXSec(): Calculate xs for '2_2__j__j__e-__veb' (Comix) Starting the calculation at 00:08:46. Lean back and enjoy ... . No more logfile lines for >5 days. |
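
For anyone wanting to follow the advice earlier in the thread (grep the log for "events" or "elapsed" to see whether a Sherpa task is making real progress), here is a minimal Python sketch. It assumes the progress lines look like the ones quoted in this thread ("Event 56000 ( 8h 16m 56s elapsed / 6h 30m 27s left )" and "56000 events processed"). The function name `summarize_progress` and the keys it returns are made up for illustration, and since the log location isn't given here, the log is simply passed in as text.

```python
import re

# Patterns modelled on the log excerpts quoted in this thread (an assumption;
# other Sherpa versions may format these lines differently).
EVENT_RE = re.compile(r"Event (\d+) \(\s*(.+?) elapsed / (.+?) left\s*\)")
PROCESSED_RE = re.compile(r"(\d+) events processed")


def summarize_progress(log_text):
    """Hypothetical helper: pull the latest progress markers out of a log.

    Returns a dict with the last "N events processed" count, the last
    "Event N ( ... elapsed / ... left )" entry, and how many
    ISR_Handler::MakeISR messages were seen.
    """
    processed = [int(m.group(1)) for m in PROCESSED_RE.finditer(log_text)]
    timings = [
        (int(m.group(1)), m.group(2), m.group(3))
        for m in EVENT_RE.finditer(log_text)
    ]
    return {
        "last_processed": processed[-1] if processed else None,
        "last_timing": timings[-1] if timings else None,
        "isr_warnings": log_text.count("ISR_Handler::MakeISR"),
    }


if __name__ == "__main__":
    # Sample lines copied from a log excerpt posted in this thread.
    sample = (
        "55900 events processed\n"
        "Event 56000 ( 8h 16m 56s elapsed / 6h 30m 27s left ) -> ETA: Tue Aug 04 23:50\n"
        "56000 events processed\n"
    )
    print(summarize_progress(sample))
```

If `last_processed` stays at `None` while `isr_warnings` keeps growing between checks, that matches the "bad sign" pattern described above; a rising `last_processed` suggests the task is still working, however large the "days left" estimate looks.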
©2024 CERN