Message boards : Number crunching : "New" project, old problem (LHCb)
jjv
Joined: 28 Jun 06
Posts: 5
Credit: 11,147,697
RAC: 1,326
Message 27969 - Posted: 27 Nov 2016, 9:27:06 UTC

After moving my machines over to LHC@home, I left my settings at the defaults. So far I don't think I've crunched a single LHCb unit successfully. Theory, CMS, and SixTrack run with no problems, but LHCb is fail after fail. Good old EXIT_INIT_FAILURE.
It kind of seems pointless to keep this up. So should I just give up and disable LHCb again?

JJ
captainjack
Joined: 21 Jun 10
Posts: 25
Credit: 3,047,765
RAC: 4,132
Message 27971 - Posted: 27 Nov 2016, 13:04:14 UTC

jjv,

Yes, it is a known problem and has been reported in the "LHCb Application" topic. The virtual machine can't communicate with the HTCondor server, so it waits 600+ seconds and then aborts. My recommendation would be to turn LHCb off and monitor this post https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4014&postid=27898 to see when the project admins get it fixed.
jjv
Joined: 28 Jun 06
Posts: 5
Credit: 11,147,697
RAC: 1,326
Message 28058 - Posted: 5 Dec 2016, 17:40:31 UTC

Yeah well, it is "better" now. Not all of them fail. My current ratio is 18 valid vs. 33 errors. It seems kind of odd that my machines are running nothing but LHCb if there is no actual work to be done.
33 failed units x 15 minutes of runtime each equals 495 minutes, or over 8 hours, of wasted computation. I'm seriously considering disabling LHCb again.
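
For anyone who wants to check the numbers on their own hosts, here is a quick sketch of the arithmetic above. The task counts and the 15-minute figure are the ones quoted in this post; the variable names are made up for illustration and are not anything BOINC or LHC@home provides.

```python
# Figures quoted in the post above (variable names are illustrative only)
valid_tasks = 18
failed_tasks = 33
minutes_per_failed_task = 15  # approximate runtime before a task errors out

# Total runtime spent on tasks that ended in error
wasted_minutes = failed_tasks * minutes_per_failed_task
wasted_hours = wasted_minutes / 60

# Share of all finished tasks that ended in error
failure_rate = failed_tasks / (valid_tasks + failed_tasks)

print(f"Wasted runtime: {wasted_minutes} min ({wasted_hours:.2f} h)")
print(f"Failure rate: {failure_rate:.1%}")
```

With these inputs it works out to 495 minutes (8.25 hours) of wasted runtime and a failure rate of roughly 65%.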

JJ

