Message boards : CMS Application : New Version v60.00

Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 45324 - Posted: 9 Sep 2021, 12:16:31 UTC - in response to Message 45323.  

This requires a BOINC client update once they publish a version without that bug.

OK, I will look out for it. Forewarned is forearmed.
That answers a long-standing question. Thanks.
ID: 45324
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 997
Credit: 6,264,307
RAC: 71
Message 45325 - Posted: 9 Sep 2021, 14:21:31 UTC - in response to Message 45318.  

Every 12th minute a problem is reported in StartdLog:
09/09/21 07:45:52 (pid:16121) CONFIGURATION PROBLEM: Failed to insert ClassAd attribute GLIDEIN_Resource_Slots = Iotokens,80,,type=main.  The most common reason for this is that you forgot to quote a string value in the list of attributes being added to the STARTD ad.
09/09/21 07:45:52 (pid:16121) CONFIGURATION PROBLEM: Failed to insert ClassAd attribute STARTD_JOB_ATTRS = ,x509userproxysubject,x509UserProxyFQAN,x509UserProxyVOName,x509UserProxyEmail,x509UserProxyExpiration,MemoryUsage,ResidentSetSize,ProportionalSetSizeKb.  The most common reason for this is that you forgot to quote a string value in the list of attributes being added to the STARTD ad.
09/09/21 07:45:52 (pid:16121) CONFIGURATION PROBLEM: Failed to insert ClassAd attribute STARTD_PARTITIONABLE_SLOT_ATTRS = MemoryUsage,ProportionalSetSizeKb.  The most common reason for this is that you forgot to quote a string value in the list of attributes being added to the STARTD ad.
09/09/21 07:45:52 (pid:16121) slot1: CONFIGURATION PROBLEM: Failed to insert ClassAd attribute GLIDEIN_Resource_Slots = Iotokens,80,,type=main.  The most common reason for this is that you forgot to quote a string value in the list of attributes being added to the slot1 ad.
09/09/21 07:45:52 (pid:16121) slot1: CONFIGURATION PROBLEM: Failed to insert ClassAd attribute STARTD_JOB_ATTRS = ,x509userproxysubject,x509UserProxyFQAN,x509UserProxyVOName,x509UserProxyEmail,x509UserProxyExpiration,MemoryUsage,ResidentSetSize,ProportionalSetSizeKb.  The most common reason for this is that you forgot to quote a string value in the list of attributes being added to the slot1 ad.
09/09/21 07:45:52 (pid:16121) slot1: CONFIGURATION PROBLEM: Failed to insert ClassAd attribute STARTD_PARTITIONABLE_SLOT_ATTRS = MemoryUsage,ProportionalSetSizeKb.  The most common reason for this is that you forgot to quote a string value in the list of attributes being added to the slot1 ad.

According to one of our HTCondor wranglers, that error message is misleading and has yet to be corrected by CMS or WMCore people. Certainly, I've noticed it for a long time, but then I've been using the "new" VM for some while in the -dev application.
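For anyone who wants to check how often these messages recur in their own StartdLog, here is a minimal Python sketch. It assumes the line layout shown in the excerpt above and that the date is written month/day/year; the function name `summarize` and the shortened sample lines are only for illustration and are not part of HTCondor.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the "CONFIGURATION PROBLEM" lines quoted above.
# The optional "slotN:" prefix is allowed but not required.
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \(pid:\d+\) "
    r"(?:slot\d+: )?CONFIGURATION PROBLEM: "
    r"Failed to insert ClassAd attribute (?P<attr>\w+) ="
)


def summarize(lines):
    """Count the messages per attribute and collect their distinct timestamps."""
    counts = Counter()
    stamps = set()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # not one of the messages we are looking for
        counts[match.group("attr")] += 1
        # Assumption: the log date format is MM/DD/YY.
        stamps.add(datetime.strptime(match.group("ts"), "%m/%d/%y %H:%M:%S"))
    return counts, sorted(stamps)


# Shortened copies of lines from the excerpt, plus one unrelated line.
sample = [
    "09/09/21 07:45:52 (pid:16121) CONFIGURATION PROBLEM: Failed to insert "
    "ClassAd attribute GLIDEIN_Resource_Slots = Iotokens,80,,type=main.",
    "09/09/21 07:45:52 (pid:16121) slot1: CONFIGURATION PROBLEM: Failed to insert "
    "ClassAd attribute GLIDEIN_Resource_Slots = Iotokens,80,,type=main.",
    "09/09/21 07:45:52 (pid:16121) CONFIGURATION PROBLEM: Failed to insert "
    "ClassAd attribute STARTD_PARTITIONABLE_SLOT_ATTRS = MemoryUsage,ProportionalSetSizeKb.",
    "an unrelated line that should be ignored",
]

counts, stamps = summarize(sample)
for attr, n in counts.items():
    print(f"{attr}: {n}")
print("distinct timestamps:", len(stamps))
```

With a real log, one would pass the open file object instead of `sample`; the gaps between consecutive entries of `stamps` then show the reporting interval.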
ID: 45325
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,358,158
RAC: 101,726
Message 45326 - Posted: 9 Sep 2021, 18:59:20 UTC

here, on the two machines which run CMS, the change from v50.00 to v60.00 happened without any problems :-)
ID: 45326
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 997
Credit: 6,264,307
RAC: 71
Message 45327 - Posted: 9 Sep 2021, 22:00:43 UTC - in response to Message 45326.  

here, on the two machines which run CMS, the change from v50.00 to v60.00 happened without any problems :-)
Happy to hear that, given that you have had problems in the past IIRC.
Have you implemented a squid proxy? That would have halved your download traffic to get both copies of the (1.6 GB) VM image. :-)
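The arithmetic behind "halved" can be sketched as follows. This is only an illustration, assuming both machines fetch the image through one shared caching proxy and that the proxy is configured to store an object of that size; the function name and the `shared_cache` parameter are made up for the example.

```python
VM_IMAGE_GB = 1.6  # image size quoted in the post


def download_traffic_gb(machines, image_gb=VM_IMAGE_GB, shared_cache=False):
    """External download volume for one VM image update.

    Assumption: with a shared cache the image is fetched from the
    project server once and served locally to every other machine.
    """
    fetches = 1 if shared_cache else machines
    return fetches * image_gb


without_proxy = download_traffic_gb(2)
with_proxy = download_traffic_gb(2, shared_cache=True)
print(f"without proxy: {without_proxy:.1f} GB, with proxy: {with_proxy:.1f} GB")
```

The saving grows with the number of machines behind the proxy, since the external traffic stays at one copy of the image.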
ID: 45327
Erich56

Joined: 18 Dec 15
Posts: 1686
Credit: 100,358,158
RAC: 101,726
Message 45330 - Posted: 11 Sep 2021, 18:58:20 UTC - in response to Message 45327.  

here, on the two machines which run CMS, the change from v50.00 to v60.00 happened without any problems :-)
Happy to hear that, given that you have had problems in the past IIRC.
Have you implemented a squid proxy? That would have halved your download traffic to get both copies of the (1.6 GB) VM image. :-)
no squid implemented yet, but I am still hopeful :-)
ID: 45330
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,150,492
RAC: 15,942
Message 45334 - Posted: 13 Sep 2021, 17:56:32 UTC

All CMS tasks seem to be just spinning without any CPU load. This started a few hours ago; until then they had run normally. They run for 18 hours and terminate with a much lower CPU time.
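One rough way to spot such tasks is to compare CPU time with elapsed time. The sketch below is only an illustration: the 0.5 threshold, the task names, and the numbers are hypothetical, and the times would have to be read by hand from the BOINC task list.

```python
HOUR = 3600


def cpu_efficiency(cpu_seconds, elapsed_seconds):
    """Fraction of the elapsed time during which the task used the CPU."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return cpu_seconds / elapsed_seconds


def looks_idle(cpu_seconds, elapsed_seconds, threshold=0.5):
    """Flag a task whose CPU time is far below its run time.

    The threshold is an arbitrary assumption, not a project-defined value.
    """
    return cpu_efficiency(cpu_seconds, elapsed_seconds) < threshold


# Hypothetical tasks: (CPU time, elapsed time) in seconds.
tasks = {
    "task_a": (17.5 * HOUR, 18 * HOUR),  # crunched almost the whole time
    "task_b": (3 * HOUR, 18 * HOUR),     # mostly idle
}

idle = [name for name, (cpu, wall) in tasks.items() if looks_idle(cpu, wall)]
print("idle tasks:", idle)
```

For multi-core tasks the ratio can exceed 1, so the threshold would need to be scaled by the number of cores.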
ID: 45334
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 798
Credit: 644,720,699
RAC: 234,483
Message 45336 - Posted: 13 Sep 2021, 19:26:04 UTC - in response to Message 45334.  

Same for me
ID: 45336
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,150,492
RAC: 15,942
Message 45337 - Posted: 13 Sep 2021, 20:35:52 UTC
Last modified: 13 Sep 2021, 21:07:19 UTC

Tasks that have started after my previous post are running normally at the moment. Fingers crossed.
Edit: All tasks with a runtime below 12 hours are calculating again. Those above 12 hours are idling; they will probably finish at 18 hours without further calculations.
ID: 45337
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 798
Credit: 644,720,699
RAC: 234,483
Message 45338 - Posted: 13 Sep 2021, 21:08:00 UTC - in response to Message 45337.  

Based on your observations, it seems that the WUs that started couldn't get any work from the back end, and that there is no retry.
ID: 45338
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 997
Credit: 6,264,307
RAC: 71
Message 45339 - Posted: 14 Sep 2021, 7:11:48 UTC

Something went awry for a couple of hours last night, judging from the job graphs. As far as I can see, it didn't affect other production nodes so it must have been something specific to us. Seems OK again now, though.
ID: 45339
Laurence
Project administrator
Project developer

Joined: 20 Jun 14
Posts: 372
Credit: 238,712
RAC: 0
Message 45340 - Posted: 14 Sep 2021, 7:34:36 UTC - in response to Message 45339.  

There was a firewall issue. Tightening up the rules went a bit too far. Some of the rules were reverted and now everything should be fine.
ID: 45340
Harri Liljeroos
Joined: 28 Sep 04
Posts: 674
Credit: 43,150,492
RAC: 15,942
Message 45348 - Posted: 18 Sep 2021, 8:36:07 UTC

Something similar to last Monday is happening again. But this time there are no new tasks on BOINC and the CPUs are spinning idle without jobs to crunch. What's going on?
ID: 45348
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 798
Credit: 644,720,699
RAC: 234,483
Message 45349 - Posted: 18 Sep 2021, 8:57:52 UTC - in response to Message 45348.  

Yeah same here
ID: 45349
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 997
Credit: 6,264,307
RAC: 71
Message 45350 - Posted: 18 Sep 2021, 11:04:52 UTC - in response to Message 45348.  

Something similar to last Monday is happening again. But this time there are no new tasks on BOINC and the CPUs are spinning idle without jobs to crunch. What's going on?

Just seen that myself. It doesn't seem to be a general problem, but the status I'm seeing from the cmsweb-testbed that we use to run our WMAgent is highly unusual. I'll send a few emails...
ID: 45350
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 997
Credit: 6,264,307
RAC: 71
Message 45351 - Posted: 18 Sep 2021, 18:53:06 UTC - in response to Message 45350.  

Something similar to last Monday is happening again. But this time there are no new tasks on BOINC and the CPUs are spinning idle without jobs to crunch. What's going on?

Just seen that myself. It doesn't seem to be a general problem, but the status I'm seeing from the cmsweb-testbed that we use to run our WMAgent is highly unusual. I'll send a few emails...

OK, I've escalated it to a Service Ticket with CMS IT. No response as yet.
ID: 45351
tullio

Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 45428 - Posted: 14 Oct 2021, 4:38:21 UTC

After years of trying, I finally completed and validated a CMS task on my Windows 10 PC. Congratulations!
Tullio
ID: 45428
ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 997
Credit: 6,264,307
RAC: 71
Message 45429 - Posted: 14 Oct 2021, 20:01:11 UTC - in response to Message 45428.  

After years of trying, I finally completed and validated a CMS task on my Windows 10 PC. Congratulations!
Tullio

Oh, good! Any indication of what made the difference?
ID: 45429
tullio

Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 45432 - Posted: 15 Oct 2021, 3:41:36 UTC

No, but I am glad to see something working on my PC. Recently I had to stop my cooperation with QuChemPedIA@home because of a problem with security certificates, as outlined by Dave Anderson on the BOINC home page. I wrote to them that I have 4 projects working on Windows and VirtualBox. I am glad that LHC@home has no such problem.
Tullio
ID: 45432

©2024 CERN