Theory Sherpa
Author | Message |
---|---|
Send message Joined: 24 Oct 04 Posts: 1155 Credit: 51,415,952 RAC: 38,687 |
https://lhcathome.cern.ch/lhcathome/result.php?resultid=281574899 These tasks still run for 10 days, using CPU the whole time, only to end as *Computation error*. It is the same with versions 300.06 and 5.21. They use CPU and internet upload/download the entire time but just end up wasting 10 days, and some members are running so many cores that they don't notice they got Sherpas along with the other event generators. |
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
sherpa 2.2.4 default pp winclusive 7000 20 - 0+20/20 - "Lean back and enjoy" for the last 24 hours?!? Theory_2390-1149680-46_0 https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=145132057 |
Send message Joined: 13 Jul 05 Posts: 169 Credit: 14,980,093 RAC: 135 |
"sherpa 2.2.4 default": I'd give up on it. |
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
"I'd give up on it." OK, this is a Ryzen 3950X. I will wait two more days. |
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
After 5 days showing only the "Lean back and enjoy" text, I canceled it. Now the next volunteer is running the task. Sherpa 2.2.4 needs investigation by CERN IT! |
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
Sherpa always stops at the same event. I canceled it now! Is there any interest in stopping these Sherpas?
INFO: (display) vars=pp jets 7000 25,-,100 sherpa 1.4.1 default
INFO: display service switched off
===> [rungen] Wed Aug 9 04:49:47 UTC 2023 [boinc pp jets 7000 25,-,100 - sherpa 1.4.1 default 100000 524 /shared/tmp/tmp.D0HQTRKnmB/generator.hepmc]
Setting environment for sherpa 1.4.1 ...
22700 events processed
Error in Splitting_Tools::ConstructKinematics(kt = -nan, z = 0.676233, y = 0.559974).
Event 22800 ( 19m 31s elapsed / 1h 6m 6s left ) -> ETA: Wed Aug 09 06:24
22800 events processed
Event 22900 ( 19m 36s elapsed / 1h 6m left ) -> ETA: Wed Aug 09 06:24
22900 events processed |
Send message Joined: 15 Jul 05 Posts: 23 Credit: 2,331,449 RAC: 2,441 |
Looks like I too have a problem with a Sherpa process: https://lhcathome.cern.ch/lhcathome/result.php?resultid=397140224
7900 events processed
Event 8000 ( 18m 15s elapsed / 3h 29m 54s left ) -> ETA: Sat Aug 05 23:09
XS = 178969 pb +- ( 1997.22 pb = 1.11 % )
8000 events processed
dumping histograms ...
Error in Splitting_Tools::ConstructKinematics(kt = -nan, z = 0.589889, y = 0.300539).
Event 8100 ( 18m 30s elapsed / 3h 29m 54s left ) -> ETA: Sat Aug 05 23:09
Rivet.Analysis.CMS_2011_S8968497: WARN Skipping histo with null area /CMS_2011_S8968497/d01-x01-y01
Rivet.Analysis.CMS_2011_S8968497: WARN Skipping histo with null area /CMS_2011_S8968497/d02-x01-y01
Rivet.Analysis.CMS_2011_S8968497: WARN Skipping histo with null area /CMS_2011_S8968497/d03-x01-y01
Rivet.Analysis.CMS_2011_S8968497: WARN Skipping histo with null area /CMS_2011_S8968497/d04-x01-y01
8100 events processed
Event 8200 ( 18m 41s elapsed / 3h 29m 13s left ) -> ETA: Sat Aug 05 23:09
8200 events processed
Matthias |
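
Both log excerpts quoted so far show the same pattern: an `Error in Splitting_Tools::ConstructKinematics(kt = -nan, ...)` line sitting between otherwise normal progress lines. As a rough sketch only (it is not part of the Theory application), a volunteer could scan a saved log for such lines instead of watching the console. The regular expressions and the helper name `summarize_log` are assumptions modeled solely on the excerpts in this thread; real logs may look different.

```python
import re

# Patterns modeled on the log excerpts quoted in this thread,
# not on any official Sherpa log format specification.
PROGRESS_RE = re.compile(r"^(\d+) events processed$")
ERROR_RE = re.compile(r"^Error in (\S+?)\(")


def summarize_log(lines):
    """Return (last processed event count, list of (event count, routine) errors)."""
    last_events = None
    errors = []
    for raw in lines:
        line = raw.strip()
        match = PROGRESS_RE.match(line)
        if match:
            last_events = int(match.group(1))
            continue
        match = ERROR_RE.match(line)
        if match:
            # Record the most recent progress count seen before the error.
            errors.append((last_events, match.group(1)))
    return last_events, errors


sample = [
    "22700 events processed",
    "Error in Splitting_Tools::ConstructKinematics(kt = -nan, z = 0.676233, y = 0.559974).",
    "Event 22800 ( 19m 31s elapsed / 1h 6m 6s left ) -> ETA: Wed Aug 09 06:24",
    "22800 events processed",
    "Event 22900 ( 19m 36s elapsed / 1h 6m left ) -> ETA: Wed Aug 09 06:24",
    "22900 events processed",
]

last_events, errors = summarize_log(sample)
print("last progress line:", last_events)
print("errors:", errors)
```

If the last progress count stops changing between two scans of the same log, the task may be stuck; whether that justifies aborting it is a judgment call this sketch does not make.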
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
We need Theory WITHOUT Sherpa, or with Sherpa only if the volunteer wants it. Having to watch tasks daily when you have 64 cores is NOT a good way to handle this!! |
Send message Joined: 15 Jun 08 Posts: 2473 Credit: 245,701,514 RAC: 68,432 |
Errors can be annoying, but so far this sherpa setup computed 11 million events with a success rate of close to 90 %.
From mc-plots:
run: pp jets 7000 25,-,100 - sherpa 1.4.1 default
events: 11000000, attempts: 124, success: 110, failure: 2, unknown: 12
Hence, I doubt it will be stopped completely. |
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
Sorry, we have no Cray or Summit to do Sherpa. |
Send message Joined: 15 Jun 08 Posts: 2473 Credit: 245,701,514 RAC: 68,432 |
How to read the mc-plots values:
124 tasks with the "pp jets 7000 25,-,100 - sherpa 1.4.1 default" parameter set have been sent to BOINC clients.
110 tasks have successfully been returned.
2 failed.
12 are unknown (may either be lost or still in progress).
The 110 valid tasks returned a total of 11 million events. |
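
The reading described above can be reproduced with a few lines of Python. This is a minimal sketch, assuming the row layout quoted in this thread (a run label followed by five integer columns); `parse_mcplots_row` and `rates` are hypothetical helper names and not part of mcplots. The second ratio, success among tasks with a known outcome, is an extra illustration and not a figure reported by mcplots.

```python
def parse_mcplots_row(row):
    """Split one summary row into the run label and its five integer counters."""
    # The last five whitespace-separated fields are the counters;
    # everything before them is the run label.
    *label, events, attempts, success, failure, unknown = row.split()
    return {
        "run": " ".join(label),
        "events": int(events),
        "attempts": int(attempts),
        "success": int(success),
        "failure": int(failure),
        "unknown": int(unknown),
    }


def rates(stats):
    """Success rate relative to all tasks sent, and relative to finished tasks only."""
    finished = stats["success"] + stats["failure"]
    return {
        "success_of_sent": stats["success"] / stats["attempts"],
        "success_of_finished": stats["success"] / finished,
    }


row = "pp jets 7000 25,-,100 - sherpa 1.4.1 default 11000000 124 110 2 12"
stats = parse_mcplots_row(row)
r = rates(stats)
print(f"{stats['run']}: {r['success_of_sent']:.1%} of sent, "
      f"{r['success_of_finished']:.1%} of finished")
```

For this row, 110 of the 124 sent tasks succeeded (about 88.7 %), and 110 of the 112 tasks with a known outcome succeeded (about 98.2 %); the 12 unknown tasks account for the gap between the two figures.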
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
"Sorry, we have no Cray or Summit to do Sherpa." SIGUSR1, SIGUSR1, SIGUSR1... more than 3 days of runtime for nothing today. |
Send message Joined: 15 Jun 08 Posts: 2473 Credit: 245,701,514 RAC: 68,432 |
Calm down. The typical failure rate per (good) computer is between 1 and 5 %. Your computers show a 2 % failure rate. This means 98 % valid results. |
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
"Calm down." These invalids come from a lot of time spent watching: looking in the VirtualBox Manager and restarting tasks with SIGUSR1. Most of these restarts fail on the third restart. Something about Theory tasks on Windows is not the same as on Linux. Native Linux has no problem in a CentOS7 VM. When I have time, I will look at WSL2 with Oracle Linux 9.1. WCG shows what is possible. |
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
Theory_2390-1099814-526_0 Run time 1 day 4 hours 4 min 37 sec. CPU time 1 day 3 hours 49 min 38 sec.
===> [runRivet] Sun Aug 13 03:00:48 UTC 2023 [boinc ppbar ue 1800 15 - sherpa 1.4.0 default 100000 526]
Event 31400 ( 9m 57s elapsed / 21m 44s left ) -> ETA: Sun Aug 13 03:42
31400 events processed
I see this task so often!! It always stops working at the same event. |
Send message Joined: 15 Jun 08 Posts: 2473 Credit: 245,701,514 RAC: 68,432 |
run: ppbar ue 1800 15 - sherpa 1.4.0 default
events: 11400000, attempts: 126, success: 114, failure: 3, unknown: 9
It has been sent to BOINC clients 126 times. 114 of them succeeded. Success rate: >90 %. So far only 3 failed. |
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
Theory_2390-1099214-526_0 Run time 22 hours 58 min 46 sec. CPU time 22 hours 45 min 47 sec.
#date_d ngood nbad total
2023-08-13 1130 32 1162 1. AMD Ryzen Threadripper PRO 3995WX 64-Cores [Family 23 Model 49 Stepping 0]
2023-08-13 902 22 924 2. AMD Ryzen Threadripper PRO 3995WX 64-Cores [Family 23 Model 49 Stepping 0]
About 1K Theory tasks run OK per day; only Sherpa is the long runner that does nothing after some time of running? |
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
Theory_2390-1131819-528_0 Workunit 214193875 Run time 7 hours 50 min 53 sec. CPU time 7 hours 46 min 31 sec.
Theory_2390-1120389-520_2 Workunit 213708523 Run time 1 day 10 hours 10 min 45 sec. CPU time 1 day 9 hours 56 min 19 sec. |
Send message Joined: 2 May 07 Posts: 2176 Credit: 172,465,898 RAC: 76,559 |
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=215071518 Please stop sending this Sherpa!! |
Send message Joined: 15 Jun 08 Posts: 2473 Credit: 245,701,514 RAC: 68,432 |
From mcplots:
run: pp jets 7000 20,-,310 - sherpa 1.4.0 default
events: 11800000, attempts: 129, success: 118, failure: 2, unknown: 9
118 successful attempts out of 129 attempts sent out. => 91.5 % success rate. There's no reason not to send them out. |
©2024 CERN