Message boards : News : CMS@Home -- ongoing problems
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 724
Credit: 5,688,626
RAC: 964
Message 41635 - Posted: 19 Feb 2020, 14:07:10 UTC

Sorry that the CMS@Home HTCondor server is still playing up. Again over the weekend it refused to serve jobs even though plenty were available. Together with Federica, we've decided not to inject another workflow this week, to let it "fail hard" again so that she can investigate which ClassAd preferences are not being met.
So, you will probably see the number of running jobs falling, and the number of errors increasing, in the next few days. Please feel free to set No New Tasks in that case. I won't, so that there is still some pressure for jobs on the server. I've also asked Laurence if I can run the CMS@Home VM outside of BOINC, to get around the quota back-off problem.
ID: 41635
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1553
Credit: 89,003,623
RAC: 94,972
Message 41637 - Posted: 19 Feb 2020, 14:16:31 UTC - in response to Message 41635.  

If it helps to identify the problem, I could leave my clients off NNT (No New Tasks) and instead configure a script to request tasks every x minutes.
Just shout if and when you need it.
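A minimal sketch of the kind of script described above, assuming the BOINC client is running locally and `boinccmd` is on the PATH. The project URL is inferred from the forum links in this thread, and the 10-minute interval and the function names are hypothetical. `boinccmd --project URL update` asks the client to contact the project's scheduler; whether work is actually requested still depends on the client's own settings.

```python
import subprocess
import time

# Assumption: LHC@home project URL, inferred from the forum links in this thread.
PROJECT_URL = "https://lhcathome.cern.ch/lhcathome/"
# Hypothetical interval; the post only says "every x min".
INTERVAL_MINUTES = 10


def build_update_command(project_url):
    """Return the boinccmd invocation that asks the client to contact the project."""
    return ["boinccmd", "--project", project_url, "update"]


def request_tasks(project_url):
    """Run the update once and return boinccmd's exit code."""
    result = subprocess.run(build_update_command(project_url), check=False)
    return result.returncode


def main():
    # Loop forever: trigger a scheduler contact, then wait for the next round.
    while True:
        code = request_tasks(PROJECT_URL)
        print(f"project update requested, boinccmd exit code {code}")
        time.sleep(INTERVAL_MINUTES * 60)


if __name__ == "__main__":
    main()
```

Stop it with Ctrl+C. A cron job or scheduled task calling the same `boinccmd` command would do equally well.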
ID: 41637
Peter Hucker

Joined: 12 Aug 06
Posts: 216
Credit: 1,173,688
RAC: 3,087
Message 41662 - Posted: 20 Feb 2020, 14:05:16 UTC - in response to Message 41635.  
Last modified: 20 Feb 2020, 14:08:04 UTC

I'll be leaving my main computer doing LHC. If an error helps, then it's worth doing.

Unfortunately, only 1 of my 4 computers can run LHC. The older 8GB RAM machines (my main one is 16GB) with rubbish processors seem to screw up when doing any VirtualBox stuff. They always give way too many computation errors on the tasks. They do have virtualisation capabilities, which are switched on in the BIOS, and VirtualBox is installed correctly in the same way as on the main machine. Oh well, they're on Rosetta and Universe instead.
ID: 41662
Peter Hucker

Joined: 12 Aug 06
Posts: 216
Credit: 1,173,688
RAC: 3,087
Message 41665 - Posted: 20 Feb 2020, 15:32:24 UTC

I can see 1254 tasks available for Theory in the server status, although unusually Atlas has run out.

But my computer is requesting tasks and not getting those Theory ones.

Is something up?

Note: my computer is currently processing 10 CMS tasks which are erroring due to the server problems. Is that causing me to get limited in downloading more tasks?
ID: 41665
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 983
Credit: 6,395,181
RAC: 574
Message 41668 - Posted: 20 Feb 2020, 15:44:20 UTC - in response to Message 41665.  

Peter Hucker wrote: "Note: my computer is currently processing 10 CMS tasks which are erroring due to the server problems. Is that causing me to get limited in downloading more tasks?"
The maximum number of tasks is twice the number of the cores.
ID: 41668
Peter Hucker

Joined: 12 Aug 06
Posts: 216
Credit: 1,173,688
RAC: 3,087
Message 41670 - Posted: 20 Feb 2020, 15:56:48 UTC - in response to Message 41668.  

Peter Hucker wrote: "Note: my computer is currently processing 10 CMS tasks which are erroring due to the server problems. Is that causing me to get limited in downloading more tasks?"
Crystal Pellet wrote: "The maximum number of tasks is twice the number of the cores."


It's a 6-core machine, although BOINC is only using 4, as the other 2 are reserved to assist GPUs on Einstein. I'm not sure which number the limit uses, but presumably the 6, as I had 10 CMS tasks at once.
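To put numbers on that, here is a quick sketch, assuming the limit really is a flat twice-the-cores rule as quoted above. Whether the server counts all cores or only those BOINC is allowed to use is not confirmed in this thread, and the function name is purely illustrative.

```python
def max_tasks(cores):
    """Task limit under the rule quoted above: twice the number of cores."""
    return 2 * cores


# Number of CMS tasks the machine held at once, as reported in this post.
observed_cms_tasks = 10

for cores in (4, 6):
    limit = max_tasks(cores)
    fits = observed_cms_tasks <= limit
    print(f"{cores} cores: limit {limit}, 10 tasks within limit: {fits}")
```

With 4 cores the limit would be 8, which 10 tasks exceeds; with 6 cores it would be 12, which is consistent with having held 10 at once.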

It's completed all the CMS tasks now, so I have 0 tasks, but it won't take the Theory ones that are available. I assume there's either:

A problem with the LHC Network outages:
https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5072&postid=41658
https://cern.service-now.com/service-portal/view-outage.do?n=OTG0054949
https://cern.service-now.com/service-portal/view-outage.do?n=OTG0054961

Or the server is upset about my CMS tasks erroring, so presumably I'll get tasks shortly.
ID: 41670



©2020 CERN