41) Message boards : Theory Application : MadGraph5 (Message 43270)
Posted 24 Aug 2020 by Henry Nebrensky
Post:
Have found this thread you wrote - Extreme overload:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5323#41736
Have native Linux with ONE CPU, but in the log there is an entry to use two CPUs (set nb_core 2).
How can this second CPU be used?

Same way as it used all 232 cores on computezrmle's machine! :(
It'll just chuck processes at the OS and see what happens - isn't there a rivetvm.exe as well, or is that idle while madgraph does its multiprocessing thing?

The Running: count is 2. Now 548 Completed and Idle: 50 (it seems 600 is the max.)

Looking back at that thread you might want to do a
grep subprocess /var/lib/boinc/slots/?/cernvm/shared/runRivet.log
to check that 600 is indeed the correct number (edit: just in case "idle" doesn't mean what I think it does).
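Something like
grep -o 'Generated [0-9]* subprocesses' /var/lib/boinc/slots/?/cernvm/shared/runRivet.log
(untested - it assumes the "Generated N subprocesses" wording quoted further down this page) would pull out just the counts.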
42) Message boards : Theory Application : MadGraph5 (Message 43268)
Posted 24 Aug 2020 by Henry Nebrensky
Post:
My guess is that the "idle" number is slowly reducing until "Completed" reaches 600 when either the task completes, or starts a whole new phase...
Can you leave it for ~20hrs and see what happens? As long as the log file is growing then there's some grounds for optimism it'll finish OK.
Back-stepping through the log file to the start of the current phase should tell you what it's trying to do in this phase.
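(Something like
less +G /var/lib/boinc/slots/?/cernvm/shared/runRivet.log
is one way to do that - the +G opens the file at the end so you can page backwards; adjust the slot path to suit.)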

My experience with madgraph hasn't been good - native it will run 2 cores, forcing other tasks off the machine,
Is there a chance of finishing? runRivet.log is still growing; the last line so far:
INFO: Idle:185, Running: 2, Completed: 413 [ 35h 51 min]

It also has significant stretches of not actually using CPU at all. We did have a thread about it some months back.
43) Message boards : Theory Application : (Native) Theory - Sherpa looooooong runners (Message 43215)
Posted 12 Aug 2020 by Henry Nebrensky
Post:
No more logfile lines for >5 days.
I've seen a number of those and they've never recovered - I now abort anything I find that hasn't updated the log file in the past 2 hours.
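A quick way to spot them (a sketch, assuming the native-app slot layout used elsewhere in these posts):
find /var/lib/boinc/slots/*/cernvm/shared -name runRivet.log -mmin +120
lists any runRivet.log that hasn't been written to in the last two hours.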
44) Message boards : ATLAS application : Processor Time Locks Up Elapsed Time Continues to Climb (Message 43201)
Posted 9 Aug 2020 by Henry Nebrensky
Post:
I'm not seeing what you describe, but here are the few lines of the VBox.log...

A look at some of the failed Theory tasks reveals they all report something like
2020-08-03 22:57:53 (13876): Guest Log: Probing /cvmfs/alice.cern.ch... Failed!
suggesting the VM is seeing network issues. Have you worked through Yeti's checklist on that machine?
45) Message boards : Theory Application : (Native) Theory - Sherpa looooooong runners (Message 43194)
Posted 6 Aug 2020 by Henry Nebrensky
Post:
Try grepping the log file for "events" or "elapsed" - this can make it easier to see if there's any actual progress.
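For example (untested; adjust the slot path to suit):
grep -E 'events|elapsed' /var/lib/boinc/slots/?/cernvm/shared/runRivet.log | tail
If new "NNN events processed" lines keep appearing, it's still getting somewhere.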

But a continuing stream of ISR_Handler errors is usually a bad sign in my experience.
46) Message boards : Theory Application : Tasks run 4 days and finish with error (Message 43185)
Posted 4 Aug 2020 by Henry Nebrensky
Post:
You are running a Linux computer where SVM is disabled.
This causes all vbox tasks from LHC@home to fail.

Yes I know, it's probably a shortcoming of Boinc and nothing can be done. But is there no way of informing people they have important stuff like this disabled? If it can't come through Boinc, an automated email from the server?

I'm not sure it's any of BOINC's business - it's entirely valid for users to run tasks (Sixtrack) and other projects that don't use virtualisation, so why should they be lumbered with a mass of irrelevant emails?
Surely it's for the VirtualBox installer to make it clear that the process is incomplete? This crops up often enough that something's clearly missing. (Obligatory reminder of Yeti's checklist !)

I think the Boinc default for switching tasks is pretty short. I changed mine to infinity (or whatever the largest number it would accept is). I see no point in stopping a running task to do a bit of another one.
I'm still on 1200min (20 hours) - long enough that sane tasks should have finished, but leaves the client able to swap out never-ending Sherpas et al. if it wants.
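(For reference, on a Linux client that setting lives in global_prefs_override.xml in the BOINC data directory - a sketch from memory, so check the element name against your client's documentation:
<global_preferences>
    <cpu_scheduling_period_minutes>1200</cpu_scheduling_period_minutes>
</global_preferences>
then boinccmd --read_global_prefs_override picks it up without a restart.)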
On a brighter note, Theory tasks do seem to have been much better behaved over the past couple of months!
47) Message boards : Number crunching : VM Applications Errors (Message 43159)
Posted 1 Aug 2020 by Henry Nebrensky
Post:
a) This isn't the only project which has problems when the Formula Boinc Circus comes to town.
b) This isn't the only project.
...
I have the following suggestions :
1. Either LHC should opt-out of FB sprints, OR, preferably, try to ensure it has SixTrack available for the next one

Surely it's the "sprint organiser's" responsibility to ensure they choose appropriate projects with suitable work available. Sixtrack job availability has been known to be intermittent for decades.

Touch wood, my recent CMS tasks are all running and I'm getting tasks for all sub-projects selected in my preferences. Not sure why you think you're stuck with purely CMS tasks. What happened when you unticked CMS?
48) Questions and Answers : Windows : Dual CPU Xeon Windows 10 - configuration for LHC? (Message 43043)
Posted 13 Jul 2020 by Henry Nebrensky
Post:
When reducing the bandwidth in the BOINC Manager prefs -> Network Settings -> Download size to 3100 Mbit/sec:
...
I have checked this with the Windows Task Manager: the download is limited to 3100 Mbit/s.
Does this limit also apply to CVMFS downloads from within the VM, or only to traffic for the Boinc executable itself?
49) Questions and Answers : Unix/Linux : CentOS Homepage - Cern (Message 43017)
Posted 10 Jul 2020 by Henry Nebrensky
Post:
Hello and thanks for your answer!

you then need to do something like a "service boinc-client start" or systemctl equivalent to actually start it.

This is what I did and meant by "trying to start BOINC client manually". That didn't work, probably because of the "missing" file you mentioned. The daemon was not to be found in the task manager.
The file I was thinking of was gui_rpc_auth.cfg, which gets created automatically once the daemon first starts (and which boinccmd uses to connect to the daemon).
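So something like this should work (a sketch, untested from here; boinccmd looks for gui_rpc_auth.cfg in the current directory):
sudo cp /var/lib/boinc/gui_rpc_auth.cfg .
sudo chown $USER gui_rpc_auth.cfg    # make the copy readable by your own account
boinccmd --get_state                 # any boinccmd query should now authenticate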
50) Questions and Answers : Unix/Linux : CentOS Homepage - Cern (Message 43005)
Posted 10 Jul 2020 by Henry Nebrensky
Post:
Anyone here who can explain how to get BOINC working?
I did have a CentOS 8 box running for a long time but it's now down for the summer, so I can't easily check details.

The BOINC client from EPEL will install as a daemon in /var/lib/boinc; you then need to do something like a "service boinc-client start" or systemctl equivalent to actually start it. To authenticate as a user and use boinccmd, you need a local copy of a file I forget the name of, which you can (as root) copy from /var/lib/boinc to a suitable local directory.

The first thing to check (with ps) is whether boinc is already running or not.
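From memory (untested, for a systemd-based CentOS):
pgrep -a boinc                        # is the daemon already running?
sudo systemctl start boinc-client     # start it now
sudo systemctl enable boinc-client    # and on every boot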
51) Message boards : Number crunching : Setting up a local Squid to work with LHC@home - Comments and Questions (Message 43003)
Posted 10 Jul 2020 by Henry Nebrensky
Post:
Nice!

You could add that for "Connecting the BOINC Client" the command-line version is:
boinccmd --set_proxy_settings squid_hostname_or_IP 3128 '' '' '' ''  '' '' ''

(Those are pairs of single-quotes, to specify seven null parameters.)
52) Message boards : Theory Application : Tasks run 4 days and finish with error (Message 42534)
Posted 18 May 2020 by Henry Nebrensky
Post:
273368209 failed the same way - "Starting the calculation" after the Comix banner and then hogging a CPU with no further output. This time I gave it nearly 5 days to sort itself out, but no such luck.

Meanwhile, 272500168 has been sitting gobbling a CPU since it announced that
Initialized the Shower_Handler.
ME_Generator_Base::SetPSMasses(): Massive PS flavours for Internal: (c,cb,b,bb,e-,e+,mu-,mu+,tau-,tau+)
ME_Generator_Base::SetPSMasses(): Massive PS flavours for Comix: (c,cb,b,bb,e-,e+,mu-,mu+,tau-,tau+)
+----------------------------------+
|                                  |
|      CCC  OOO  M   M I X   X     |
|     C    O   O MM MM I  X X      |
|     C    O   O M M M I   X       |
|     C    O   O M   M I  X X      |
|      CCC  OOO  M   M I X   X     |
|                                  |
+==================================+
|  Color dressed  Matrix Elements  |
|     http://comix.freacafe.de     |
|   please cite  JHEP12(2008)039   |
+----------------------------------+
Matrix_Element_Handler::BuildProcesses(): Looking for processes .................................................................................................................................................................................... done ( 36 MB, 23s / 21s ).
Matrix_Element_Handler::InitializeProcesses(): Performing tests .................................................................................................................................................................................... done ( 36 MB, 0s / 0s ).
Initialized the Matrix_Element_Handler for the hard processes.
Initialized the Beam_Remnant_Handler.
Hadron_Decay_Map::Read:   Initializing HadronDecays.dat. This may take some time.
Initialized the Hadron_Decay_Handler, Decay model = Hadrons
Initialized the Soft_Photon_Handler.
Process_Group::CalculateTotalXSec(): Calculate xs for '2_2__j__j__e-__veb' (Comix)
Starting the calculation at 21:19:15. Lean back and enjoy ... .
(yes, that's 21:19 yesterday since it bothered with a progress report) - so I've leant back and enjoyed killing it.
53) Message boards : Theory Application : Extreme Overload caused by a Theory Task (Message 42483)
Posted 14 May 2020 by Henry Nebrensky
Post:
Another one: 273087449. boinccmd --get_tasks reports
2) -----------
   name: Theory_2390-1153380-3_0
   WU name: Theory_2390-1153380-3
   project URL: https://lhcathome.cern.ch/lhcathome/
   received: Thu May 14 00:45:03 2020
   report deadline: Sun May 24 00:45:02 2020
   ready to report: no
   state: downloaded
   scheduler state: scheduled
   active_task_state: EXECUTING
   app version num: 30006
   resources: 1 CPU
   estimated CPU time remaining: 0.017439
   slot: 1
   PID: 8741
   CPU time at last checkpoint: 0.000000
   current CPU time: 10407.640000
   fraction done: 1.000000
   swap size: 7842 MB
   working set size: 6124 MB
and pstree -c 8741 reports
wrapper_2019_03─┬─cranky-0.0.32───runc─┬─job───runRivet.sh─┬─rivetvm.exe
                │                      │                   ├─rungen.sh───python───python─┬─ajob1───madevent_mintMC
                │                      │                   │                             ├─ajob1───madevent_mintMC
                │                      │                   │                             ├─ajob1───madevent_mintMC
                │                      │                   │                             ├─ajob1───madevent_mintMC
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             └─{python}
                │                      │                   └─sleep
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      └─{runc}
                └─{wrapper_2019_03}

Again looks like the BOINC client is trying to do the right thing by not starting any new tasks to keep the load down to 4, but - as you've pointed out below - the tasks themselves should be running the subtasks in series, not parallel. This task has been hogging the entire machine for 12+ hours now.
54) Message boards : ATLAS application : How to only get Atlas 8 core tasks. (Message 42478)
Posted 14 May 2020 by Henry Nebrensky
Post:
I took a look at the max cores option in preferences but that seems to be max number and below, so if I set it to 8 I could still get other lower core-count tasks.
Just use that option - I don't remember ever being issued a lower core-count task, assuming the cores actually exist.
It's "max. cores" because some of us have a range of different machines: max. 8 allows a four-core machine in that locale to get four-core tasks.
55) Message boards : Theory Application : Extreme Overload caused by a Theory Task (Message 42417)
Posted 11 May 2020 by Henry Nebrensky
Post:
IMO it's also cheating us on the credit:

For single-threaded 272136516
02:58:17 (13728): cranky exited; CPU time 674129.550630
and 6,326.46 credit, i.e. 6.3k cr. for ~630k s CPU time. While for "multi-core" 272677122,
15:26:19 (28778): cranky exited; CPU time 447204.364571
and 657.71 credit, i.e. 0.7k cr. for 447k s CPU. I don't think my machines are a factor 5 different, but the Run times are!
56) Message boards : Theory Application : Extreme Overload caused by a Theory Task (Message 42416)
Posted 11 May 2020 by Henry Nebrensky
Post:
I guess the BOINC client still treats the task as 1-core.
Yes: boinccmd --get_tasks reported
   name: Theory_2390-1152716-2_0
   WU name: Theory_2390-1152716-2
   project URL: https://lhcathome.cern.ch/lhcathome/
   received: Sun May 10 13:17:48 2020
   report deadline: Wed May 20 13:17:47 2020
   ready to report: no
   state: downloaded
   scheduler state: scheduled
   active_task_state: EXECUTING
   app version num: 30005
   resources: 1 CPU
   estimated CPU time remaining: 1.650947
   CPU time at last checkpoint: 0.000000
   current CPU time: 20120.640000
   fraction done: 0.999978
   swap size: 8667 MB
   working set size: 5558 MB
but it (eventually) started 8 active processes, and the BOINC client was sensible and didn't start any more tasks as the existing ones finished, until the madgraph task completed:
15835: 11-May-2020 15:04:21 (low) [LHC@home] Starting task RAfMDmsbmrwnsSi4apGgGQJmABFKDmABFKDmBVrYDmABFKDmcy862n_0
15836: 11-May-2020 15:04:21 (low) [LHC@home] Starting task Theory_2390-1102868-2_0
15837: 11-May-2020 15:04:21 (low) [LHC@home] Starting task Theory_2390-1146431-2_1
15838: 11-May-2020 15:04:22 (low) [LHC@home] Starting task Theory_2390-1087074-2_0
15839: 11-May-2020 15:26:22 (low) [LHC@home] Computation for task Theory_2390-1152716-2_0 finished
15840: 11-May-2020 15:26:22 (low) [LHC@home] Starting task Theory_2390-1113717-2_0
15841: 11-May-2020 15:26:24 (low) [LHC@home] Started upload of Theory_2390-1152716-2_0_r1750715384_result
15842: 11-May-2020 15:26:29 (low) [LHC@home] Finished upload of Theory_2390-1152716-2_0_r1750715384_result

Worst case (on an 8 core CPU) would be that BOINC starts 8 of them concurrently and the load average jumps to 8*8=64 (plus normal work).
Maybe: I didn't check the load last night when it would still have been fighting with multi-core Atlas tasks. It looks like the BOINC client is trying to do the right thing, but - as you've pointed out below - the tasks themselves should be running the subtasks in series, not parallel.
57) Message boards : Theory Application : Extreme Overload caused by a Theory Task (Message 42414)
Posted 11 May 2020 by Henry Nebrensky
Post:
It looks like all subprocesses are running concurrently which puts an extreme load on the host.

Like other Theory tasks this type should also respect the 1 core behavior and avoid running that many processes concurrently.
I've stumbled on 272677122 which has ATM taken over all 8 cores on the host (though with a load average of only ~8); so not overwhelming the host completely, but certainly pushing out all other BOINC tasks. top reports the master python process is taking up 69.3% of the memory!

===> [runRivet] Sun May 10 16:33:34 UTC 2020 [boinc pp zinclusive 7000 20,-,50,200 - madgraph5amc 2.6.6.atlas nlo2jet 100000 2]

> grep subprocess /var/lib/boinc/slots/2/cernvm/shared/runRivet.log
INFO: Generated 16 subprocesses with 192 real emission diagrams, 32 born diagrams and 32 virtual diagrams 
INFO: Generated 48 subprocesses with 2944 real emission diagrams, 192 born diagrams and 1440 virtual diagrams 
INFO: Generated 232 subprocesses with 36320 real emission diagrams, 2560 born diagrams and 47392 virtual diagrams 

> pstree -c 28778
wrapper_2019_03─┬─cranky-0.0.31───runc─┬─job───runRivet.sh─┬─rivetvm.exe
                │                      │                   ├─rungen.sh───python───python─┬─ajob1───madevent_mintMC
                │                      │                   │                             ├─ajob1───madevent_mintMC
                │                      │                   │                             ├─ajob1───madevent_mintMC
                │                      │                   │                             ├─ajob1───madevent_mintMC
                │                      │                   │                             ├─ajob1───madevent_mintMC
                │                      │                   │                             ├─ajob1───madevent_mintMC
                │                      │                   │                             ├─ajob1───madevent_mintMC
                │                      │                   │                             ├─ajob1───madevent_mintMC
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             ├─{python}
                │                      │                   │                             └─{python}
                │                      │                   └─sleep
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      ├─{runc}
                │                      └─{runc}
                └─{wrapper_2019_03}

Log file presently ends with
INFO:  Idle: 0,  Running: 3,  Completed: 444 [  22m 15s  ] 
INFO:  Idle: 0,  Running: 2,  Completed: 445 [  22m 15s  ] 
INFO:  Idle: 0,  Running: 0,  Completed: 447 [  22m 15s  ] 
INFO:    Doing reweight 
INFO:  Idle: 0,  Running: 2,  Completed: 445 [ current time: 13h32 ] 
INFO:  Idle: 0,  Running: 1,  Completed: 446 [  0.12s  ] 
INFO:  Idle: 0,  Running: 0,  Completed: 447 [  0.74s  ] 
INFO: Collecting events 
and doesn't give any clear indication of progress - it's completed 447 out of... 232? 2560? what?
58) Message boards : Theory Application : Tasks run 4 days and finish with error (Message 42389)
Posted 8 May 2020 by Henry Nebrensky
Post:
OTOH, 272136516 has chugged along slowly towards 98% done and should finish some time tonight...
Completed and validated:
Name Theory_2378-1064671-8_0
Sent 30 Apr 2020, 3:42:10 UTC
Report deadline 11 May 2020, 3:42:10 UTC
Received 8 May 2020, 1:58:29 UTC
Outcome Success
Exit status 0 (0x00000000)
Run time 7 days 21 hours 21 min 28 sec
CPU time 7 days 19 hours 15 min 29 sec
Credit 6,326.46
59) Message boards : Theory Application : Tasks run 4 days and finish with error (Message 42384)
Posted 7 May 2020 by Henry Nebrensky
Post:
Well, 272478919 seems to have been spewing just
ISR_Handler::MakeISR(..): s' out of bounds.
  s'_{min}, s'_{max 1,2} vs. s': 0.0049, 49000000, 49000000 vs. 0.0049
ISR_Handler::MakeISR(..): s' out of bounds.
  s'_{min}, s'_{max 1,2} vs. s': 0.0049, 49000000, 49000000 vs. 0.0049
ISR_Handler::MakeISR(..): s' out of bounds.
  s'_{min}, s'_{max 1,2} vs. s': 0.0049, 49000000, 49000000 vs. 0.0049
ISR_Handler::MakeISR(..): s' out of bounds.
  s'_{min}, s'_{max 1,2} vs. s': 0.0049, 49000000, 49000000 vs. 0.0049
Channel_Elements::GenerateYForward(5.9997117627854e-10,{-8.98847e+307,0,-8.98847e+307,0,0},{-10,10,5.50191}):  Y out of bounds ! 
   ymin, ymax vs. y : -10 10 vs. -10
Setting y to lower bound  ymin=-10
Channel_Elements::GenerateYForward(1.1155e-10,{-8.98847e+307,0,-8.98847e+307,0,0},{-10,10,0.404185}):  Y out of bounds ! 
   ymin, ymax vs. y : -10 10 vs. -10
Setting y to lower bound  ymin=-10
Channel_Elements::GenerateYForward(8.17608e-10,{-8.98847e+307,0,-8.98847e+307,0,0},{-10,10,3.02746}):  Y out of bounds ! 
   ymin, ymax vs. y : -10 10 vs. -10
Setting y to lower bound  ymin=-10
Channel_Elements::GenerateYBackward(1.1046941250437e-10,{-8.98847e+307,0,-8.98847e+307,0,0},{-10,10,-0.0929604}):  Y out of bounds ! 
   ymin, ymax vs. y : -10 10 vs. 10
Setting y to upper bound ymax=10
Channel_Elements::GenerateYBackward(1.7451462098184e-10,{-8.98847e+307,0,-8.98847e+307,0,0},{-10,10,3.69002}):  Y out of bounds ! 
   ymin, ymax vs. y : -10 10 vs. 10
Setting y to upper bound ymax=10
Channel_Elements::GenerateYBackward(0.0021676548016573,{-8.98847e+307,0,-8.98847e+307,0,0},{-10,10,0.551644}):  Y out of bounds ! 
   ymin, ymax vs. y : -3.0670547162051 3.0670547162051 vs. 3.0670547162051
Setting y to upper bound ymax=3.0670547162051
ISR_Handler::MakeISR(..): s' out of bounds.
  s'_{min}, s'_{max 1,2} vs. s': 0.0049, 49000000, 49000000 vs. 0.0049
ISR_Handler::MakeISR(..): s' out of bounds.
  s'_{min}, s'_{max 1,2} vs. s': 0.0049, 49000000, 49000000 vs. 0.0049
for a day and a half now without any sign of useful progress in the log. Those usually go on for ever IME, so I've put a stop to that!
Meanwhile, 272500168 has been sitting gobbling a CPU since it announced that
Initialized the Shower_Handler.
ME_Generator_Base::SetPSMasses(): Massive PS flavours for Internal: (c,cb,b,bb,e-,e+,mu-,mu+,tau-,tau+)
ME_Generator_Base::SetPSMasses(): Massive PS flavours for Comix: (c,cb,b,bb,e-,e+,mu-,mu+,tau-,tau+)
+----------------------------------+
|                                  |
|      CCC  OOO  M   M I X   X     |
|     C    O   O MM MM I  X X      |
|     C    O   O M M M I   X       |
|     C    O   O M   M I  X X      |
|      CCC  OOO  M   M I X   X     |
|                                  |
+==================================+
|  Color dressed  Matrix Elements  |
|     http://comix.freacafe.de     |
|   please cite  JHEP12(2008)039   |
+----------------------------------+
Matrix_Element_Handler::BuildProcesses(): Looking for processes .................................................................................................................................................................................... done ( 36 MB, 23s / 21s ).
Matrix_Element_Handler::InitializeProcesses(): Performing tests .................................................................................................................................................................................... done ( 36 MB, 0s / 0s ).
Initialized the Matrix_Element_Handler for the hard processes.
Initialized the Beam_Remnant_Handler.
Hadron_Decay_Map::Read:   Initializing HadronDecays.dat. This may take some time.
Initialized the Hadron_Decay_Handler, Decay model = Hadrons
Initialized the Soft_Photon_Handler.
Process_Group::CalculateTotalXSec(): Calculate xs for '2_2__j__j__e-__veb' (Comix)
Starting the calculation at 21:19:15. Lean back and enjoy ... .
(yes, that's 21:19 yesterday since it bothered with a progress report) - so I've leant back and enjoyed killing it.

OTOH, 272136516 has chugged along slowly towards 98% done and should finish some time tonight - though whether it's actually produced anything meaningful isn't clear:
97000 events processed
dumping histograms...
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d01-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d02-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d03-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d04-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d05-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d06-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d07-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d08-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d09-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d10-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d11-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d12-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d13-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d14-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d15-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d16-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d17-x01-y01[4]
Rivet.Analysis.CMS_2017_I1605749: WARN  Skipping histo with null area /CMS_2017_I1605749/d18-x01-y01[4]
97100 events processed
97200 events processed
97300 events processed
...
60) Message boards : Theory Application : Tasks run 4 days and finish with error (Message 42376)
Posted 4 May 2020 by Henry Nebrensky
Post:
Do we think this https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5413 will fix our issues?

"This time the update addressed Sherpa event generator and particularly issues with the endless loops which should be significantly reduced or disappear now."
I don't actually know anything about the code...
The problematic Sherpas that sit gobbling CPU but giving no sign of progress might well be in an endless loop, so that is hopefully a thing of the past. Those that report an estimated time remaining with sudden spikes up to an infeasible value look to me as though something other than an "endless loop" is the problem.
I suppose there are release notes on the Web somewhere if I could be bothered to look... I'm sure we'll find out soon enough now that the tasks are coming through.

