1) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 50034)
Posted 3 hours ago by Dark Angel
Post:
Turned off proxy in Boinc manager, reset project. Still only downloading the CMS_2022_09_07_prod.vdi and not the 70.20 (vbox64_mt_mcore_cms) one.
Turning proxy back on now.
2) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 50033)
Posted 3 hours ago by Dark Angel
Post:
Without Proxy, the same?

I'll try disabling the proxy, but the new vdi has a different name and isn't being requested as far as I can tell.
3) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 50030)
Posted 12 hours ago by Dark Angel
Post:
2024-04-24 22:17:02 (1865718): Setting Memory Size for VM. (2048MB)
2024-04-24 22:17:02 (1865718): Setting CPU Count for VM. (1)

Still NOT working
https://lhcathome.cern.ch/lhcathome/result.php?resultid=410235598
4) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 50028)
Posted 1 day ago by Dark Angel
Post:
Reset project: still not downloading the CMS multithread vdi
5) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 50027)
Posted 1 day ago by Dark Angel
Post:
Cores per work unit set to four: check
Machine on correct profile: check
Guest Extensions correct version: check (VBox Version 7.0.16 r162802 (Qt5.15.3) )
Work fetch enabled: check
Abort all existing single core work units: check
Request fresh work: check
Check stderr for running CMS work unit: only requests single core and VM only allocates a single core, VBox Extension Pack recognised
Check VBox manager: all VMs show Extension pack available

Something isn't right.
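
For anyone wanting to repeat the stderr check above without eyeballing the whole file, here is a minimal Python sketch. It assumes the stderr text contains lines in the same format as the "Setting CPU Count for VM." and "Setting Memory Size for VM." lines quoted elsewhere in this thread; the function name vm_allocation is made up for illustration and is not part of BOINC or vboxwrapper.

```python
import re

# Patterns modelled on stderr lines quoted in this thread, e.g.
# "2024-04-24 22:17:02 (1865718): Setting CPU Count for VM. (1)"
CPU_RE = re.compile(r"Setting CPU Count for VM\. \((\d+)\)")
MEM_RE = re.compile(r"Setting Memory Size for VM\. \((\d+)MB\)")


def vm_allocation(stderr_text):
    """Return (cpu_count, memory_mb); each is None if its line is missing."""
    cpu = CPU_RE.search(stderr_text)
    mem = MEM_RE.search(stderr_text)
    return (int(cpu.group(1)) if cpu else None,
            int(mem.group(1)) if mem else None)


# Sample text copied from the two log lines posted in this thread.
sample = """2024-04-24 22:17:02 (1865718): Setting Memory Size for VM. (2048MB)
2024-04-24 22:17:02 (1865718): Setting CPU Count for VM. (1)"""

cpus, mem_mb = vm_allocation(sample)
print(f"VM was given {cpus} core(s) and {mem_mb} MB")
```

If the reported core count is 1 while the profile asks for four, the task was most likely issued as a single-core job rather than the VM ignoring the setting.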
6) Message boards : ATLAS application : PC almost unresponsive when running 4x Atlas units (Message 50017)
Posted 1 day ago by Dark Angel
Post:
You said you're on Windows so you're running VBox units.
There's a small overhead running VirtualBox.
You didn't specify any system details, so it's possible you've choked the CPU's cache on the Windows box but the others aren't as loaded.

You can set the number of cores used per work unit in your project preferences. If you want to run a different number on each machine, you can also set up different profiles with different settings.
7) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49989)
Posted 3 days ago by Dark Angel
Post:
I'm only running my queue set to 1:1
At this rate I'll halve that.
8) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49987)
Posted 3 days ago by Dark Angel
Post:
Reset the project and made sure it's set to use 4 cores. Atlas native is running OK on four cores (I've been playing with HDDs after I had a failure, so there are some errored and aborted tasks in my records), and Theory is as reliable as ever <sarcasm>. But CMS just won't grab any of the multi-core work and keeps getting single-core jobs that supposedly aren't even in the queue.
9) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49982)
Posted 4 days ago by Dark Angel
Post:
The single core back end that's cached now at CERN
Is completely gone they said
The single core back end that's cached now at CERN
Is completely gone ...
And still
They come!

<to the tune of The Eve of the War - Jeff Wayne's War of the Worlds>
10) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49963)
Posted 6 days ago by Dark Angel
Post:
I'm still only getting single-core work units at this stage, though my profile is set for four cores (for Atlas jobs originally).
I have a few to get through, so I'll just watch and see what pops up.
11) Questions and Answers : Unix/Linux : theory simulation gets error at 14-minute mark (Message 49945)
Posted 12 days ago by Dark Angel
Post:
I would also like to note that there is no warning/notice that additional software needs to be installed to run LHC tasks

It is mentioned at the homepage as well as a couple of times at the FAQ page:

https://lhcathome.cern.ch/lhcathome/
"Please note that some of the applications on LHC@home requre Virtual Box to be installed."


I am aware of that notice and (just like any other newbie that follows guidance) I installed Virtualbox when installing Boinc.
It should be more specific. Something like "LHC@home requires Virtualbox and CVMFS to be installed for most tasks to run properly. This is how to install CVMFS: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5594#44232"

Otherwise the task would not run, a search on Bing would prove useless, the Boinc event logs would only report a failure and the newbie would (if they even know how to use the forums) ask the same question I did.

You don't need CVMFS AND Virtualbox for things to run. You only need CVMFS if you plan to run Linux Native tasks. If you do not select that specific option, then you can run CMS, Theory, and Atlas tasks with Virtualbox alone.
12) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49906)
Posted 16 days ago by Dark Angel
Post:
Any word on release to production yet?
13) Questions and Answers : Windows : Having a lot of VM issues - now showing "No work available to process" (Message 49884)
Posted 21 days ago by Dark Angel
Post:
You can only run one hypervisor (the thing that manages virtual machines) on a given system. Hyper-V is a hypervisor. So is VirtualBox. Hyper-V loads before VirtualBox when you start the system. They work slightly differently but they compete for the same system resources. Since Hyper-V loads first it blocks VirtualBox.
You need to disable Hyper-V.
14) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49881)
Posted 22 days ago by Dark Angel
Post:
Yes, I am *clearly* a trouble maker after all. Utterly incorrigible. ;)
I'll take your word for it...

Thanks for that, I'm looking forward to seeing how these run in production.

You can see my poor underpowered machine's tasks here.
I think Laurence still has some holidays in his pocket, so we may not turn on multicore in production until next week.


Well ... I could if I had an account on the -dev project. Thanks anyway.
15) Message boards : CMS Application : CMS@Home difficulties in attempts to prepare for multi-core jobs (Message 49878)
Posted 22 days ago by Dark Angel
Post:
Yes, I am *clearly* a trouble maker after all. Utterly incorrigible. ;)

Thanks for that, I'm looking forward to seeing how these run in production.
16) Message boards : Theory Application : How long may Native-Theory-Tasks run (Message 49876)
Posted 22 days ago by Dark Angel
Post:
As I said, it's been aborted now.
I had only left it alone because I've had others that ran long but were otherwise working normally.
I don't make a habit of digging through workunit logs without cause.
17) Message boards : Theory Application : How long may Native-Theory-Tasks run (Message 49873)
Posted 22 days ago by Dark Angel
Post:
OK, poking around, I checked the stderr.txt and found this:

07:33:48 AEDT +11:00 2024-04-01: cranky-0.1.4: [INFO] Pausing container Theory_2743-2857700-30_0.

Apparently something DID cause it to pause at some point, and I do not have resume capability (wrong sudo version; I tried installing the latest version, but it crashed every unit that ran from then on, so I rolled it back). Since that makes it likely that I'm the one that broke it, I've aborted the unit.
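
A rough Python sketch for spotting this kind of pause without reading the whole stderr.txt by hand. It assumes the pause message always looks like the single line quoted above; I only have that one example to go on, so the pattern and the function name find_pauses are illustrative guesses, not anything cranky itself provides.

```python
import re

# Modelled on the single line quoted above:
# "07:33:48 AEDT +11:00 2024-04-01: cranky-0.1.4: [INFO] Pausing container Theory_2743-2857700-30_0."
PAUSE_RE = re.compile(
    r"^(?P<stamp>.+?): cranky-(?P<version>[\d.]+): "
    r"\[INFO\] Pausing container (?P<name>\S+?)\.?$"
)


def find_pauses(stderr_text):
    """Return a list of (timestamp, container_name) for each pause line found."""
    pauses = []
    for line in stderr_text.splitlines():
        match = PAUSE_RE.match(line.strip())
        if match:
            pauses.append((match.group("stamp"), match.group("name")))
    return pauses


sample = (
    "07:33:48 AEDT +11:00 2024-04-01: cranky-0.1.4: "
    "[INFO] Pausing container Theory_2743-2857700-30_0."
)

for stamp, name in find_pauses(sample):
    print(f"{name} was paused at {stamp}")
```

An empty result only means no line matched this particular pattern, so it is worth a manual look before concluding the task was never paused.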
18) Message boards : Theory Application : How long may Native-Theory-Tasks run (Message 49872)
Posted 22 days ago by Dark Angel
Post:
Keyword: pp z1j 8000 - - herwig++ 2.5.1 LHC-UE-EE-2-2760 (matched 1 of 202704 rows)
run                                              events  attempts  success  failure  unknown
pp z1j 8000 - - herwig++ 2.5.1 LHC-UE-EE-2-2760  0       1         0        0        1

It appears nobody else has run this.
19) Message boards : Theory Application : How long may Native-Theory-Tasks run (Message 49869)
Posted 22 days ago by Dark Angel
Post:
This is the complete log file:

===> [runRivet] Fri Mar 29 15:11:45 UTC 2024 [boinc pp z1j 8000 - - herwig++ 2.5.1 LHC-UE-EE-2-2760 100000 30]

Setting environment...
INFO: uname:
Linux runc 6.5.0-26-generic #26~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Mar 12 10:22:43 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
INFO: /etc/redhat-release:
cat: /etc/redhat-release: No such file or directory

MCGENERATORS=/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/MCGenerators
g++ = /cvmfs/sft.cern.ch/lcg/releases/gcc/11.2.0-8a51a/x86_64-centos7/bin/g++
g++ version = 11.2.0
RIVET=/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/MCGenerators/rivet/3.1.10/x86_64-centos7-gcc11-opt
YODA=/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/MCGenerators/yoda/1.9.10/x86_64-centos7-gcc11-opt
Rivet version = rivet v3.1.10
RIVET_ANALYSIS_PATH=/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/MCGenerators/rivet/3.1.10/x86_64-centos7-gcc11-opt/lib/Rivet:/shared/analyses
RIVET_DATA_PATH=/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/MCGenerators/rivet/3.1.10/x86_64-centos7-gcc11-opt/share/Rivet:/shared/analyses
GSL=/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/GSL/2.7/x86_64-centos7-gcc11-opt
HEPMC=/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/HepMC/2.06.11/x86_64-centos7-gcc11-opt
FASTJET=/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/fastjet/3.4.1/x86_64-centos7-gcc11-opt
PYTHON=/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/Python/3.9.12/x86_64-centos7-gcc11-opt

Input parameters:
mode=boinc
beam=pp
process=z1j
energy=8000
params=-
specific=-
generator=herwig++
version=2.5.1
tune=LHC-UE-EE-2-2760
nevts=100000
seed=30

Prepare temporary directories and files ...
workd=/shared
tmpd=/shared/tmp/tmp.IPuslKhFRO
tmp_params=/shared/tmp/tmp.IPuslKhFRO/generator.params
tmp_hepmc=/shared/tmp/tmp.IPuslKhFRO/generator.hepmc
tmp_yoda=/shared/tmp/tmp.IPuslKhFRO/generator.yoda
tmp_jobs=/shared/tmp/tmp.IPuslKhFRO/jobs.log
tmpd_flat=/shared/tmp/tmp.IPuslKhFRO/flat
tmpd_dump=/shared/tmp/tmp.IPuslKhFRO/dump
tmpd_rivetdb=/shared/tmp/tmp.IPuslKhFRO/rivetdb.map

Prepare Rivet parameters ...
Total histograms selected: 1
analysesNames=ATLAS_2019_I1744201
Total analyses selected: 1
analysesBaseNames=ATLAS_2019_I1744201
Total base analyses selected: 1

Unpack data histograms...
dataFiles =
/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/MCGenerators/rivet/3.1.10/x86_64-centos7-gcc11-opt/share/Rivet/ATLAS_2019_I1744201.yoda.gz
output = /shared/tmp/tmp.IPuslKhFRO/flat
make: Entering directory `/shared/rivetvm'
g++ yoda2flat-split.cc -o yoda2flat-split.exe -Wfatal-errors -Wl,-rpath /cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/MCGenerators/yoda/1.9.10/x86_64-centos7-gcc11-opt/lib `/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/MCGenerators/yoda/1.9.10/x86_64-centos7-gcc11-opt/bin/yoda-config --cppflags --libs`
make: Leaving directory `/shared/rivetvm'

Total histograms unpacked=20 / selected=1
complete ./REF_ATLAS_2019_I1744201_d02-x01-y01.dat

Building rivetvm ...
make: Entering directory `/shared/rivetvm'
g++ rivetvm.cc -o rivetvm.exe -DNDEBUG -Wfatal-errors -Wl,-rpath /cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/MCGenerators/rivet/3.1.10/x86_64-centos7-gcc11-opt/lib -Wl,-rpath /cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/HepMC/2.06.11/x86_64-centos7-gcc11-opt/lib `/cvmfs/sft.cern.ch/lcg/releases/LCG_104d_ATLAS_10/MCGenerators/rivet/3.1.10/x86_64-centos7-gcc11-opt/bin/rivet-config --cppflags --ldflags --libs` -lHepMC
make: Leaving directory `/shared/rivetvm'

Run herwig++ 2.5.1 and Rivet ...
generatorExecString = ./rungen.sh boinc pp z1j 8000 - - herwig++ 2.5.1 LHC-UE-EE-2-2760 100000 30 /shared/tmp/tmp.IPuslKhFRO/generator.hepmc
rivetExecString = /shared/rivetvm/rivetvm.exe -a ATLAS_2019_I1744201 -i /shared/tmp/tmp.IPuslKhFRO/generator.hepmc -o /shared/tmp/tmp.IPuslKhFRO/flat -H /shared/tmp/tmp.IPuslKhFRO/generator.yoda -d /shared/tmp/tmp.IPuslKhFRO/dump
INFO: (display) T4T_DISPLAY=
INFO: (display) datdir=/shared/tmp/tmp.IPuslKhFRO/dump
INFO: (display) vars=pp z1j 8000 - herwig++ 2.5.1 LHC-UE-EE-2-2760
INFO: display service switched off
===> [rungen] Fri Mar 29 15:11:56 UTC 2024 [boinc pp z1j 8000 - - herwig++ 2.5.1 LHC-UE-EE-2-2760 100000 30 /shared/tmp/tmp.IPuslKhFRO/generator.hepmc]

Setting environment for herwig++ 2.5.1 ...
tree = hepmc2.06.05
tag =

grep: /etc/redhat-release: No such file or directory
MCGENERATORS=/cvmfs/sft.cern.ch/lcg/external/MCGenerators_hepmc2.06.05
LCG_PLATFORM=x86_64-slc5-gcc43-opt
g++ = /shared/tmp/tmp.eldt4Q5K7G/g++
g++ version = 4.3.6
g++ orig = /cvmfs/sft.cern.ch/lcg/external/gcc/4.3.6/x86_64-slc5/bin/g++
AGILE=/cvmfs/sft.cern.ch/lcg/external/MCGenerators_hepmc2.06.05/agile/1.4.0/x86_64-slc5-gcc43-opt
HEPMC=/cvmfs/sft.cern.ch/lcg/external/HepMC/2.06.05/x86_64-slc5-gcc43-opt
AGILE_GEN_PATH=/cvmfs/sft.cern.ch/lcg/external/MCGenerators_hepmc2.06.05
LHAPDF=/cvmfs/sft.cern.ch/lcg/external/MCGenerators_hepmc2.06.05/lhapdf/5.8.9/x86_64-slc5-gcc43-opt

grep: /etc/redhat-release: No such file or directory
INFO: EL9/CC7 compat: herwig++ - added work-around for missing libraries:
-rwxr-xr-x 1 0 0 7504 Mar 29 15:11 empty.so
lrwxrwxrwx 1 0 0 8 Mar 29 15:11 libreadline.so.5 -> empty.so
lrwxrwxrwx 1 0 0 8 Mar 29 15:11 libtermcap.so.2 -> empty.so
/shared

Input parameters:
mode=boinc
beam=pp
process=z1j
energy=8000
params=-
specific=-
generator=herwig++
version=2.5.1
tune=LHC-UE-EE-2-2760
nevts=100000
seed=30
outfile=/shared/tmp/tmp.IPuslKhFRO/generator.hepmc

Prepare temporary directories and files ...
workd=/shared
tmpd=/shared/tmp/tmp.IPuslKhFRO
tmp_params=/shared/tmp/tmp.IPuslKhFRO/generator.params

Decoding parameters of generator...
pTmin = 0
pTmax = 8000
mHatMin = 0
mHatMax = 8000

processCode=z1j

beam1=p+
beam2=p+
beam energy = 4000.
INFO: steering file template = configuration/herwig++-z1j.params
INFO: cache is not active, CACHE=
Prepare herwig++ 2.5.1 parameters ...
=> /shared/tmp/tmp.IPuslKhFRO/generator.params :
# based on example from Herwig++ 2.4.2 distribution:
# share/Herwig++/TVT.in

# Run options:
cd /Herwig/Generators
set LHCGenerator:NumberOfEvents 100000
set LHCGenerator:RandomNumberGenerator:Seed 30
set LHCGenerator:DebugLevel 0
set LHCGenerator:PrintEvent 1
set LHCGenerator:MaxErrors 100000

# redirect all log output to stdout
set LHCGenerator:UseStdout true

# do output to a HepMC file
cd /Herwig/Generators
insert LHCGenerator:AnalysisHandlers 0 /Herwig/Analysis/HepMCFile
set /Herwig/Analysis/HepMCFile:PrintEvent 1000000
set /Herwig/Analysis/HepMCFile:Format GenEvent
set /Herwig/Analysis/HepMCFile:Filename /shared/tmp/tmp.IPuslKhFRO/generator.hepmc
# set /Herwig/Analysis/HepMCFile:Units GeV_mm


# Beam parameters:
set LHCGenerator:EventHandler:LuminosityFunction:Energy 8000
set LHCGenerator:EventHandler:BeamA /Herwig/Particles/p+
set LHCGenerator:EventHandler:BeamB /Herwig/Particles/p+
set LHCGenerator:MaxErrors -1


# Process setup
# Z+1jet production
cd /Herwig/MatrixElements
insert SimpleQCD:MatrixElements[0] MEZJet
DISABLEREADONLY
newdef MEZJet:ZDecay ChargedLeptons

## Set cuts
## Use this for hard leading-jets in a certain pT window
set /Herwig/Cuts/JetKtCut:MinKT 0*GeV # minimum jet pT
set /Herwig/Cuts/JetKtCut:MaxKT 8000*GeV # maximum jet pT
#
## Use this for a certain mHat window
#set /Herwig/Cuts/QCDCuts:MHatMin 0*GeV # minimum jet mHat
#set /Herwig/Cuts/QCDCuts:MHatMax 8000*GeV # maximum jet mHat


# Make particles with c*tau > 10 mm stable:
set /Herwig/Decays/DecayHandler:MaxLifeTime 10*mm
set /Herwig/Decays/DecayHandler:LifeTimeOption Average


# tune 'LHC-UE-EE-2-2760' parameters: -------------------
#%tuneFile%
# Based on LHC tune example from Herwig++ 2.5.1 distribution
# share/Herwig++/LHC-UE-EE-2.in

##################################################
# Override default MPI parameters
##################################################


# Colour reconnection settings
set /Herwig/Hadronization/ColourReconnector:ColourReconnection Yes
set /Herwig/Hadronization/ColourReconnector:ReconnectionProbability 0.55

# Colour Disrupt settings
set /Herwig/Partons/RemnantDecayer:colourDisrupt 0.15

# inverse hadron radius
set /Herwig/UnderlyingEvent/MPIHandler:InvRadius 1.1
## for \sqrt(s) = 2760 GeV
# Min KT parameter
set /Herwig/UnderlyingEvent/KtCut:MinKT 3.31
# This should always be 2*MinKT!!
set /Herwig/UnderlyingEvent/UECuts:MHatMin 6.62


# MPI model settings
set /Herwig/UnderlyingEvent/MPIHandler:softInt Yes
set /Herwig/UnderlyingEvent/MPIHandler:twoComp Yes
set /Herwig/UnderlyingEvent/MPIHandler:DLmode 3

# ---------------------------------------------


set /Herwig/UnderlyingEvent/MPIHandler:IdenticalToUE -1

# Run generator
cd /Herwig/Generators
run TVT LHCGenerator
--------------------------------------

HERWIGPP=/cvmfs/sft.cern.ch/lcg/external/MCGenerators_hepmc2.06.05/herwig++/2.5.1/x86_64-slc5-gcc43-opt
Run herwig++ 2.5.1 ...
generatorExecString = /cvmfs/sft.cern.ch/lcg/external/MCGenerators_hepmc2.06.05/herwig++/2.5.1/x86_64-slc5-gcc43-opt/bin/Herwig++ read -r /cvmfs/sft.cern.ch/lcg/external/MCGenerators_hepmc2.06.05/herwig++/2.5.1/x86_64-slc5-gcc43-opt/share/Herwig++/HerwigDefaults.rpo /shared/tmp/tmp.IPuslKhFRO/generator.params
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>> ThePEG - Toolkit for HEP Event Generation - version 1.7.1 <<<<<<<<<<
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

** An event exception of type ThePEG::Exception occurred while generating event number 1:
Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()
The event will be discarded.
** An event exception of type ThePEG::Exception occurred while generating event number 1:
Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()
The event will be discarded.
** An event exception of type ThePEG::Exception occurred while generating event number 1:
Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()
The event will be discarded.
** An event exception of type ThePEG::Exception occurred while generating event number 1:
Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()
The event will be discarded.
** An event exception of type ThePEG::Exception occurred while generating event number 1:
Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()
The event will be discarded.
** An event exception of type ThePEG::Exception occurred while generating event number 1:
Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()
The event will be discarded.
** An event exception of type ThePEG::Exception occurred while generating event number 1:
Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()
The event will be discarded.
** An event exception of type ThePEG::Exception occurred while generating event number 1:
Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()
The event will be discarded.
** An event exception of type ThePEG::Exception occurred while generating event number 1:
Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()
The event will be discarded.
** An event exception of type ThePEG::Exception occurred while generating event number 1:
Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()
The event will be discarded.
No more warnings of this kind will be reported.

It appears to have never got the first event running for some reason.
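
To back that up with a count rather than a scroll through the log, here is a small Python sketch that tallies which event numbers the exceptions refer to. It assumes the exception header always has the wording shown in the log above; the function name exception_counts is made up for illustration.

```python
import re
from collections import Counter

# Matches the header line seen in the log above, e.g.
# "** An event exception of type ThePEG::Exception occurred while generating event number 1:"
EXC_RE = re.compile(r"occurred while generating event number (\d+):")


def exception_counts(log_text):
    """Count logged event exceptions, keyed by event number."""
    return Counter(int(number) for number in EXC_RE.findall(log_text))


# Three repeats of the block from the log above, used as sample input.
sample = (
    "** An event exception of type ThePEG::Exception occurred while generating event number 1:\n"
    "Failed to generate the shower after 100 attempts in Evolver::showerHardProcess()\n"
    "The event will be discarded.\n"
) * 3

counts = exception_counts(sample)
for event_number, count in sorted(counts.items()):
    print(f"event {event_number}: {count} exception(s) logged")
```

Since the log says "No more warnings of this kind will be reported" after the tenth message, the count is only a lower bound. The telling part is that every exception refers to event number 1.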
20) Message boards : Theory Application : How long may Native-Theory-Tasks run (Message 49864)
Posted 22 days ago by Dark Angel
Post:
I don't follow.

Other Theory tasks I have running are processing events normally and showing them in their respective logs but this one is different.

Is this indicative of a failure, meaning I should abort the task, or is this just another normal variation I haven't happened to see before?


©2024 CERN