Message boards : CMS Application : since about 2 hours: all tasks failing after few minutes (SOLVED)

Erich56
Joined: 18 Dec 15
Posts: 1691
Credit: 104,524,857
RAC: 120,016
Message 49750 - Posted: 12 Mar 2024, 9:19:17 UTC

On all my hosts, CMS tasks are failing after about 2-3 minutes - see here:

https://lhcathome.cern.ch/lhcathome/result.php?resultid=407746801

Excerpt from stderr:

2024-03-12 09:22:16 (6252): Guest Log: Ncat: Could not resolve hostname "vccs.cern.ch": Name or service not known. QUITTING.
2024-03-12 09:22:16 (6252): Guest Log: [ERROR] Could not connect to vccs.cern.ch on port 443

So this looks like a network problem.
However, a ping to "vccs.cern.ch" works well.

Atlas and Theory are being processed without any problem.

Any idea what's going on?
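
The Ncat error above is a name-resolution failure reported from inside the VM (it is a "Guest Log" line), which is a different check from a ping on the host. A minimal Python sketch that separates the two steps, DNS lookup first and TCP connect second, might look like the following. The function name `check_endpoint` is made up for illustration, and running it on the host only tests the host's resolver, not the guest's.

```python
import socket


def check_endpoint(host: str, port: int, timeout: float = 5.0) -> dict:
    """Report whether `host` resolves and whether a TCP connection to `port` succeeds."""
    result = {"host": host, "port": port, "resolved": [], "connected": False, "error": None}

    # Step 1: name resolution (the step that failed in the stderr excerpt).
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["resolved"] = sorted({info[4][0] for info in infos})
    except socket.gaierror as exc:
        result["error"] = f"DNS lookup failed: {exc}"
        return result

    # Step 2: TCP connect, only attempted if the name resolved.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            result["connected"] = True
    except OSError as exc:
        result["error"] = f"TCP connect failed: {exc}"
    return result


if __name__ == "__main__":
    # Host and port taken from the error message in the thread.
    print(check_endpoint("vccs.cern.ch", 443))
```

If the lookup succeeds on the host while the guest still reports "Could not resolve hostname", the problem is more likely on the path the VM uses, or on the server side, than in the host's own network setup.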

Erich56
Message 49751 - Posted: 12 Mar 2024, 9:34:21 UTC

I have now seen the same problem on other volunteers' hosts.
So obviously the problem is not a local one, but rather at CERN :-(

Crystal Pellet
Volunteer moderator
Volunteer tester
Joined: 14 Jan 10
Posts: 1284
Credit: 8,513,326
RAC: 2,911
Message 49752 - Posted: 12 Mar 2024, 9:38:11 UTC - in response to Message 49751.  

After the failing connection:

Guest Log: NCAT DEBUG: Using system default trusted CA certificates and those in /usr/share/ncat/ca-bundle.crt.
Guest Log: NCAT DEBUG: Unable to load trusted CA certificates from /usr/share/ncat/ca-bundle.crt: error:02001002:system library:fopen:No such file or directory

Laurence
Project administrator
Project developer
Joined: 20 Jun 14
Posts: 378
Credit: 238,712
RAC: 0
Message 49753 - Posted: 12 Mar 2024, 10:17:28 UTC - in response to Message 49750.  

The server is being updated. Should be back soon.

Laurence
Message 49754 - Posted: 12 Mar 2024, 10:36:49 UTC - in response to Message 49753.  

It is back.

Crystal Pellet
Message 49755 - Posted: 12 Mar 2024, 10:50:40 UTC - in response to Message 49754.  
Last modified: 12 Mar 2024, 10:50:59 UTC

Now the problem is that tasks are not getting the X509 credentials from LHC@home and vLHC@home-dev.

ivan
Volunteer moderator
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 29 Aug 05
Posts: 1007
Credit: 6,277,602
RAC: 682
Message 49756 - Posted: 12 Mar 2024, 11:14:55 UTC - in response to Message 49755.  

Crystal Pellet wrote: "Now the problem is not getting the X509 credentials from LHC@home and vLHC@home-dev"

Yes, see https://lhcathome.cern.ch/lhcathome/result.php?resultid=407751354
Laurence, is the idtoken authorisation in place?

Laurence
Message 49757 - Posted: 12 Mar 2024, 11:27:50 UTC - in response to Message 49756.  

Yes, the idtoken is in place. The server logs suggest the service is working.

maeax
Joined: 2 May 07
Posts: 2114
Credit: 159,914,613
RAC: 83,929
Message 49758 - Posted: 12 Mar 2024, 12:15:37 UTC - in response to Message 49757.  
Last modified: 12 Mar 2024, 12:40:37 UTC

Laurence,
the task is searching for an X509 credential from LHC@home or vLHC@home-dev every 30 seconds.
Win11 Pro workstation: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10664116
2024-03-12 13:17:25 (2940): Guest Log: [INFO] Requesting an X509 credential from LHC@home
2024-03-12 13:17:26 (2940): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev
2024-03-12 13:17:57 (2940): Guest Log: [DEBUG] % Total % Received % Xferd Average Speed Time Time Time Current
2024-03-12 13:17:57 (2940): Guest Log: Dload Upload Total Spent Left Speed
2024-03-12 13:17:57 (2940): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2024-03-12 13:17:57 (2940): Guest Log: 100 4924 0 4924 0 0 13462 0 --:--:-- --:--:-- --:--:-- 13490
2024-03-12 13:17:57 (2940): Guest Log: [DEBUG]
2024-03-12 13:17:57 (2940): Guest Log: ERROR: Couldn't read proxy from: /tmp/x509up_u0
2024-03-12 13:17:57 (2940): Guest Log: globus_credential: Error reading proxy credential
2024-03-12 13:17:57 (2940): Guest Log: globus_credential: Error reading proxy credential: Couldn't read PEM from bio
2024-03-12 13:17:57 (2940): Guest Log: OpenSSL Error: pem_lib.c:707: in library: PEM routines, function PEM_read_bio: no start line
2024-03-12 13:17:57 (2940): Guest Log: Use -debug for further information.
2024-03-12 13:17:57 (2940): Guest Log: [ERROR] Could not get an x509 credential
2024-03-12 13:17:57 (2940): Guest Log: [ERROR] The x509 proxy creation failed.
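
The OpenSSL message "no start line" means the PEM reader found no `-----BEGIN ...-----` line in the input. So something was downloaded (curl reports 4924 bytes) but it was not PEM data. A minimal Python sketch of that check is below. The helper names are hypothetical, the path is the one from the log, and the thread does not show what the server actually returned.

```python
import re
from pathlib import Path
from typing import List

# A PEM block starts with a line such as "-----BEGIN CERTIFICATE-----".
PEM_BEGIN = re.compile(r"-----BEGIN ([A-Z0-9 ]+)-----")


def pem_block_types(text: str) -> List[str]:
    """Return the labels of all PEM start lines found in `text`."""
    return PEM_BEGIN.findall(text)


def describe_credential_file(path: str) -> str:
    """Summarise whether the file at `path` looks like PEM data."""
    p = Path(path)
    if not p.is_file():
        return f"{path}: missing"
    text = p.read_text(errors="replace")
    types = pem_block_types(text)
    if not types:
        # Show the first characters, e.g. to spot an HTML or JSON error page.
        return f"{path}: {len(text)} characters, no PEM start line; begins with {text[:60]!r}"
    return f"{path}: PEM blocks found: {', '.join(types)}"


if __name__ == "__main__":
    print(describe_credential_file("/tmp/x509up_u0"))
```

A file with no start line at all points at the response body (for example an error page) rather than at a damaged certificate.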

Laurence
Message 49759 - Posted: 12 Mar 2024, 13:02:25 UTC - in response to Message 49758.  

From the server logs, it looks like it has been working for you since 13:11:06 +0100.
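
The server time carries a +0100 offset, while the post times on this board are in UTC, so the two cannot be compared directly. A small Python sketch of the conversion follows. The date is assumed to be that of the thread, 12 Mar 2024, and `to_utc` is a made-up helper name.

```python
from datetime import datetime, timezone


def to_utc(stamp: str) -> datetime:
    """Parse 'YYYY-MM-DD HH:MM:SS +HHMM' and convert it to UTC."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S %z").astimezone(timezone.utc)


# Date assumed from the thread; time and offset are from the server log quote.
server_time = to_utc("2024-03-12 13:11:06 +0100")
print(server_time.isoformat())  # 2024-03-12T12:11:06+00:00
```

In UTC the server-side time is 12:11:06, which is before the time of this post (13:02:25 UTC).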

maeax
Message 49760 - Posted: 12 Mar 2024, 13:15:36 UTC - in response to Message 49759.  

This is from Ivan in another message:
"In this case, the two meanings of the word "proxy" are quite different. The proxy credential is an authorisation to connect to the service; the (squid) proxy server is a caching server that saves requested files so that they don't need to be transported again if re-requested."

Laurence
Message 49761 - Posted: 12 Mar 2024, 13:30:12 UTC - in response to Message 49760.  

It looks like there is an issue with the proxy generated. I will put the old server back until we can find the cause of the issue.

Erich56
Message 49763 - Posted: 12 Mar 2024, 14:30:26 UTC - in response to Message 49761.  

Laurence wrote: "It looks like there is an issue with the proxy generated. I will put the old server back until we can find the cause of the issue."
Laurence, tasks are still failing:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=407757073

maeax
Message 49764 - Posted: 12 Mar 2024, 14:54:33 UTC - in response to Message 49763.  

Erich56,
it takes more than a few minutes to bring a server back.

Laurence
Message 49765 - Posted: 12 Mar 2024, 17:27:05 UTC - in response to Message 49764.  

It should be back now.

Erich56
Message 49766 - Posted: 12 Mar 2024, 19:18:26 UTC - in response to Message 49765.  
Last modified: 12 Mar 2024, 19:24:48 UTC

Laurence wrote: "It should be back now."
But it is still not working:
"Could not get an x509 credential":

2024-03-12 19:58:11 (3552): Guest Log: [INFO] Reading volunteer information
2024-03-12 19:58:15 (3552): Guest Log: [INFO] Requesting an X509 credential from LHC@home
2024-03-12 19:58:16 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 19:58:17 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 19:58:47 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 19:58:48 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 19:59:18 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 19:59:19 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 19:59:49 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 19:59:50 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 20:00:20 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 20:00:21 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 20:00:53 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 20:00:55 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 20:01:30 (3552): Guest Log: [DEBUG] % Total % Received % Xferd Average Speed Time Time Time Current
2024-03-12 20:01:30 (3552): Guest Log: Dload Upload Total Spent Left Speed
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 100 196 100 196 0 0 92 0 0:00:02 0:00:02 --:--:-- 92
2024-03-12 20:01:30 (3552): Guest Log: [DEBUG] % Total % Received % Xferd Average Speed Time Time Time Current
2024-03-12 20:01:30 (3552): Guest Log: Dload Upload Total Spent Left Speed
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0
2024-03-12 20:01:30 (3552): Guest Log: 100 196 100 196 0 0 92 0 0:00:02 0:00:02 --:--:-- 92
2024-03-12 20:01:30 (3552): Guest Log: [ERROR] Could not get an x509 credential
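
The repeated "Requesting an idtoken" lines show the retry loop mentioned earlier in the thread (about every 30 seconds). A short Python sketch that measures the gaps from such a log is below. The regular expression only assumes the line layout seen in this excerpt, and `retry_gaps` is a hypothetical helper name.

```python
import re
from datetime import datetime

# Layout assumed from the excerpt: "<date> <time> (<pid>): Guest Log: <message>"
LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \(\d+\): Guest Log: (.*)$")


def retry_gaps(log_text: str, message: str) -> list:
    """Return the gaps in seconds between consecutive lines carrying `message`."""
    times = []
    for line in log_text.splitlines():
        m = LINE.match(line.strip())
        if m and m.group(2) == message:
            times.append(datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S"))
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# First few lines copied from the log above.
SAMPLE = """\
2024-03-12 19:58:16 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 19:58:17 (3552): Guest Log: [INFO] Requesting an idtoken from vLHC@home-dev
2024-03-12 19:58:47 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
2024-03-12 19:59:18 (3552): Guest Log: [INFO] Requesting an idtoken from LHC@home
"""

print(retry_gaps(SAMPLE, "[INFO] Requesting an idtoken from LHC@home"))  # [31, 31]
```

Steady gaps of about 30 seconds suggest each attempt is answered or times out quickly and the task simply retries until it gives up.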

Crystal Pellet
Message 49767 - Posted: 13 Mar 2024, 7:57:16 UTC

This morning CMS is running OK for me.

Erich56
Message 49768 - Posted: 13 Mar 2024, 11:07:43 UTC - in response to Message 49767.  

Crystal Pellet wrote: "This morning CMS is running OK for me."
I restarted CMS about 1 hour ago; it's working fine now :-)
©2024 CERN