Message boards :
CMS Application :
Could not get X509 credentials
Message board moderation
Author | Message |
---|---|
Send message Joined: 27 Sep 08 Posts: 801 Credit: 649,773,035 RAC: 240,967 |
There seems to be a credential problem, like Theroy had a while back. e.g. https://lhcathome.cern.ch/lhcathome/result.php?resultid=307695455 https://lhcathome.cern.ch/lhcathome/result.php?resultid=305661561 |
Send message Joined: 12 Aug 16 Posts: 4 Credit: 889,358 RAC: 0 |
Agreed. Every CMS task I get is failing. (That said, it's stating that those tasks don't exist when I navigate to those URLs you provided). I might just disable CMS for now so I don't junk up the pipeline with errored-out tasks. |
Send message Joined: 8 Apr 21 Posts: 23 Credit: 32,965,212 RAC: 51,883 |
I have a lot of CMS task failures from my windows box. Looks like the VM has issues getting X509 certs from LHC. https://lhcathome.cern.ch/lhcathome/result.php?resultid=313565843 <core_client_version>7.16.11</core_client_version> <![CDATA[ <message> The filename or extension is too long. (0xce) - exit code 206 (0xce)</message> <stderr_txt> 2021-04-20 03:45:06 (1464): Detected: vboxwrapper 26197 2021-04-20 03:45:06 (1464): Detected: BOINC client v7.7 2021-04-20 03:45:06 (1464): Detected: VirtualBox VboxManage Interface (Version: 6.1.16) 2021-04-20 03:45:06 (1464): Detected: Heartbeat check (file: 'heartbeat' every 1200.000000 seconds) 2021-04-20 03:45:06 (1464): Successfully copied 'init_data.xml' to the shared directory. 2021-04-20 03:45:08 (1464): Create VM. (boinc_80a422322ddcce90, slot#11) 2021-04-20 03:45:08 (1464): Setting Memory Size for VM. (2048MB) 2021-04-20 03:45:09 (1464): Setting CPU Count for VM. (1) 2021-04-20 03:45:09 (1464): Setting Chipset Options for VM. 2021-04-20 03:45:09 (1464): Setting Boot Options for VM. 2021-04-20 03:45:09 (1464): Setting Network Configuration for NAT. 2021-04-20 03:45:10 (1464): Enabling VM Network Access. 2021-04-20 03:45:10 (1464): Disabling USB Support for VM. 2021-04-20 03:45:10 (1464): Disabling COM Port Support for VM. 2021-04-20 03:45:10 (1464): Disabling LPT Port Support for VM. 2021-04-20 03:45:11 (1464): Disabling Audio Support for VM. 2021-04-20 03:45:11 (1464): Disabling Clipboard Support for VM. 2021-04-20 03:45:11 (1464): Disabling Drag and Drop Support for VM. 2021-04-20 03:45:12 (1464): Adding storage controller(s) to VM. 2021-04-20 03:45:12 (1464): Adding virtual disk drive to VM. (vm_image.vdi) 2021-04-20 03:45:12 (1464): Adding VirtualBox Guest Additions to VM. 2021-04-20 03:45:12 (1464): Adding network bandwidth throttle group to VM. (Defaulting to 1024GB) 2021-04-20 03:45:13 (1464): forwarding host port 58261 to guest port 80 2021-04-20 03:45:13 (1464): Enabling remote desktop for VM. 2021-04-20 03:45:13 (1464): Enabling shared directory for VM. 2021-04-20 03:45:14 (1464): Starting VM using VBoxManage interface. (boinc_80a422322ddcce90, slot#11) 2021-04-20 03:45:18 (1464): Successfully started VM. (PID = '7968') 2021-04-20 03:45:18 (1464): Reporting VM Process ID to BOINC. 2021-04-20 03:45:18 (1464): Guest Log: BIOS: VirtualBox 6.1.16 2021-04-20 03:45:18 (1464): Guest Log: CPUID EDX: 0x178bfbff 2021-04-20 03:45:18 (1464): Guest Log: BIOS: ata0-0: PCHS=16383/16/63 LCHS=1024/255/63 2021-04-20 03:45:18 (1464): VM state change detected. (old = 'PoweredOff', new = 'Running') 2021-04-20 03:45:18 (1464): Detected: Web Application Enabled (http://localhost:58261) 2021-04-20 03:45:18 (1464): Detected: Remote Desktop Enabled (localhost:58262) 2021-04-20 03:45:18 (1464): Preference change detected 2021-04-20 03:45:18 (1464): Setting CPU throttle for VM. (100%) 2021-04-20 03:45:18 (1464): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 600 seconds) or (Vbox_job.xml: 600 seconds)) 2021-04-20 03:45:20 (1464): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032 2021-04-20 03:45:20 (1464): Guest Log: BIOS: Booting from Hard Disk... 2021-04-20 03:45:22 (1464): Guest Log: BIOS: KBD: unsupported int 16h function 03 2021-04-20 03:45:22 (1464): Guest Log: BIOS: AX=0305 BX=0000 CX=0000 DX=0000 2021-04-20 03:45:36 (1464): Guest Log: vgdrvHeartbeatInit: Setting up heartbeat to trigger every 2000 milliseconds 2021-04-20 03:45:36 (1464): Guest Log: vboxguest: misc device minor 56, IRQ 20, I/O port d020, MMIO at 00000000f0400000 (size 0x400000) 2021-04-20 03:45:56 (1464): Guest Log: VBoxService 5.2.6 r120293 (verbosity: 0) linux.amd64 (Jan 15 2018 14:51:00) release log 2021-04-20 03:45:56 (1464): Guest Log: 00:00:00.000074 main Log opened 2021-04-20T07:45:56.310756000Z 2021-04-20 03:45:56 (1464): Guest Log: 00:00:00.000244 main OS Product: Linux 2021-04-20 03:45:56 (1464): Guest Log: 00:00:00.000275 main OS Release: 4.14.157-17.cernvm.x86_64 2021-04-20 03:45:56 (1464): Guest Log: 00:00:00.000297 main OS Version: #1 SMP Wed Dec 4 17:26:45 CET 2019 2021-04-20 03:45:56 (1464): Guest Log: 00:00:00.000319 main Executable: /usr/share/vboxguest52/usr/sbin/VBoxService 2021-04-20 03:45:56 (1464): Guest Log: 00:00:00.000319 main Process ID: 2976 2021-04-20 03:45:56 (1464): Guest Log: 00:00:00.000320 main Package type: LINUX_64BITS_GENERIC 2021-04-20 03:45:56 (1464): Guest Log: 00:00:00.000886 main 5.2.6 r120293 started. Verbose level = 0 2021-04-20 03:47:51 (1464): Guest Log: [INFO] Mounting the shared directory 2021-04-20 03:47:51 (1464): Guest Log: [INFO] Shared directory mounted, enabling vboxmonitor 2021-04-20 03:47:51 (1464): Guest Log: [DEBUG] Testing network connection to cern.ch on port 80 2021-04-20 03:47:51 (1464): Guest Log: [DEBUG] Connection to cern.ch 80 port [tcp/http] succeeded! 2021-04-20 03:47:51 (1464): Guest Log: [DEBUG] 0 2021-04-20 03:47:51 (1464): Guest Log: [DEBUG] Testing VCCS connection to vccs.cern.ch on port 443 2021-04-20 03:47:51 (1464): Guest Log: [DEBUG] Connection to vccs.cern.ch 443 port [tcp/https] succeeded! 2021-04-20 03:47:51 (1464): Guest Log: [DEBUG] 0 2021-04-20 03:47:51 (1464): Guest Log: [DEBUG] Testing connection to Condor server on port 9618 2021-04-20 03:47:51 (1464): Guest Log: [DEBUG] Connection to vocms0840.cern.ch 9618 port [tcp/condor] succeeded! 2021-04-20 03:47:52 (1464): Guest Log: [DEBUG] 0 2021-04-20 03:50:39 (1464): Guest Log: [DEBUG] Probing CVMFS ... 2021-04-20 03:50:40 (1464): Guest Log: Probing /cvmfs/grid.cern.ch... OK 2021-04-20 03:50:45 (1464): Guest Log: VERSION PID UPTIME(M) MEM(K) REVISION EXPIRES(M) NOCATALOGS CACHEUSE(K) CACHEMAX(K) NOFDUSE NOFDMAX NOIOERR NOOPEN HITRATE(%) RX(K) SPEED(K/S) HOST PROXY ONLINE 2021-04-20 03:50:45 (1464): Guest Log: 2.4.4.0 3760 4 27856 11625 2 1 1234377 4096000 2 65024 0 3 100 0 0 http://s1asgc-cvmfs.openhtc.io:8080/cvmfs/grid.cern.ch http://128.142.161.84:3126 0 2021-04-20 03:54:16 (1464): Guest Log: [INFO] Reading volunteer information 2021-04-20 03:54:16 (1464): Guest Log: [INFO] Volunteer: scotth (787857) 2021-04-20 03:54:16 (1464): Guest Log: [INFO] VMID: 49b2fac1-df25-48d2-a4ee-4612ca6a31f8 2021-04-20 03:54:16 (1464): Guest Log: [INFO] Requesting an X509 credential from LHC@home 2021-04-20 03:54:16 (1464): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev 2021-04-20 03:54:47 (1464): Guest Log: [INFO] Requesting an X509 credential from LHC@home 2021-04-20 03:54:47 (1464): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev 2021-04-20 03:55:17 (1464): Guest Log: [INFO] Requesting an X509 credential from LHC@home 2021-04-20 03:55:18 (1464): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev 2021-04-20 03:55:48 (1464): Guest Log: [INFO] Requesting an X509 credential from LHC@home 2021-04-20 03:55:48 (1464): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev 2021-04-20 03:56:19 (1464): Guest Log: [INFO] Requesting an X509 credential from LHC@home 2021-04-20 03:56:19 (1464): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev 2021-04-20 03:56:49 (1464): Guest Log: [INFO] Requesting an X509 credential from LHC@home 2021-04-20 03:56:49 (1464): Guest Log: [INFO] Requesting an X509 credential from vLHC@home-dev 2021-04-20 03:57:20 (1464): Guest Log: [DEBUG] 2021-04-20 03:57:20 (1464): Guest Log: curl: (60) Peer certificate cannot be authenticated with known CA certificates 2021-04-20 03:57:20 (1464): Guest Log: More details here: http://curl.haxx.se/docs/sslcerts.html 2021-04-20 03:57:20 (1464): Guest Log: curl performs SSL certificate verification by default, using a "bundle" 2021-04-20 03:57:20 (1464): Guest Log: of Certificate Authority (CA) public keys (CA certs). If the default 2021-04-20 03:57:20 (1464): Guest Log: bundle file isn't adequate, you can specify an alternate file 2021-04-20 03:57:20 (1464): Guest Log: using the --cacert option. 2021-04-20 03:57:20 (1464): Guest Log: If this HTTPS server uses a certificate signed by a CA represented in 2021-04-20 03:57:20 (1464): Guest Log: the bundle, the certificate verification probably failed due to a 2021-04-20 03:57:20 (1464): Guest Log: problem with the certificate (it might be expired, or the name might 2021-04-20 03:57:20 (1464): Guest Log: not match the domain name in the URL). 2021-04-20 03:57:20 (1464): Guest Log: If you'd like to turn off curl's verification of the certificate, use 2021-04-20 03:57:20 (1464): Guest Log: the -k (or --insecure) option. 2021-04-20 03:57:20 (1464): Guest Log: [DEBUG] 2021-04-20 03:57:20 (1464): Guest Log: ERROR: Couldn't find a valid proxy. 2021-04-20 03:57:20 (1464): Guest Log: globus_sysconfig: File has zero length: File: /tmp/x509up_u0 2021-04-20 03:57:20 (1464): Guest Log: Use -debug for further information. 2021-04-20 03:57:20 (1464): Guest Log: [ERROR] Could not get an x509 credential 2021-04-20 03:57:20 (1464): Guest Log: [ERROR] The x509 proxy creation failed. 2021-04-20 03:57:20 (1464): Guest Log: [INFO] Shutting Down. 2021-04-20 03:57:20 (1464): VM Completion File Detected. 2021-04-20 03:57:20 (1464): VM Completion Message: The x509 proxy creation failed. . 2021-04-20 03:57:20 (1464): Powering off VM. 2021-04-20 04:02:21 (1464): VM did not power off when requested. 2021-04-20 04:02:21 (1464): VM was successfully terminated. 2021-04-20 04:02:21 (1464): Deregistering VM. (boinc_80a422322ddcce90, slot#11) 2021-04-20 04:02:21 (1464): Removing network bandwidth throttle group from VM. 2021-04-20 04:02:21 (1464): Removing VM from VirtualBox. 04:02:27 (1464): called boinc_finish(206) </stderr_txt> ]]> |
Send message Joined: 27 Sep 08 Posts: 801 Credit: 649,773,035 RAC: 240,967 |
I didn't get anymore for a while so I think it's some transient error |
Send message Joined: 8 Apr 21 Posts: 23 Credit: 32,965,212 RAC: 51,883 |
I removed my one Windows 10 host from getting CMS Sim (vbox64) workunits for a while, thinking that the issue is transient. The errors are back again in new workunits. Below are a few of them. https://lhcathome.cern.ch/lhcathome/result.php?resultid=316190883 https://lhcathome.cern.ch/lhcathome/result.php?resultid=316187982 https://lhcathome.cern.ch/lhcathome/result.php?resultid=316180651 This host is completing some of the CMS Simulation workunits, but the majority are failing from the certificate issue. |
Send message Joined: 2 May 07 Posts: 2090 Credit: 158,695,394 RAC: 129,577 |
Win11pro Requesting an idtoken from LHC@Home X509_USER_PROXY = /tmp/x509up_u1000 Since more than 10 minutes, doing nothing! https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5925&postid=47464#47464 2023-10-26 22:10:45 (24608): Guest Log: [INFO] Reading volunteer information 2023-10-26 22:11:06 (24608): Guest Log: [INFO] Requesting an X509 credential from LHC@home 2023-10-26 22:11:07 (24608): Guest Log: [INFO] CMS application starting. Check log files. 2023-10-26 22:11:07 (24608): Guest Log: [INFO] Requesting an idtoken from LHC@home 2023-10-26 22:38:24 (24608): Powering off VM. |
Send message Joined: 2 May 07 Posts: 2090 Credit: 158,695,394 RAC: 129,577 |
id-token: Software-solution or hardware-solution? |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,319,021 RAC: 123,225 |
A long number - requested by the (CMS-)VM - issued by a special service at CERN - limited validity - required to contact certain CERN systems |
Send message Joined: 2 May 07 Posts: 2090 Credit: 158,695,394 RAC: 129,577 |
freiwilliger Moderator freiwilliger Entwickler freiwilliger Tester Helpdesk Experte hoping for your help. |
Send message Joined: 24 Oct 04 Posts: 1115 Credit: 49,702,849 RAC: 13,784 |
Mine had no problems and then started doing this...... https://lhcathome.cern.ch/lhcathome/results.php?hostid=10451775 |
Send message Joined: 2 May 07 Posts: 2090 Credit: 158,695,394 RAC: 129,577 |
TPM 2.0 in W11pro is blocking CMS Task. Cern-IT have to disable this id-token, or registration our account. |
Send message Joined: 14 Jan 10 Posts: 1270 Credit: 8,479,164 RAC: 2,361 |
Mine had no problems and then started doing this...... Have a look into Virtual disk media with VirtualBox Manager. Probably there are several remnants of old VM's. Cannot close medium 'C:\ProgramData\BOINC\projects\lhcathome.cern.ch_lhcathome\CMS_2022_09_07_prod.vdi' because it has 7 child media |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,319,021 RAC: 123,225 |
TPM 2.0 in W11pro is blocking CMS Task. Why do you think so? Neither vboxwrapper nor the CMS VM configure/enable any TPM option since both have been created long before VirtualBox offered it. Beside that, other volunteer's Win11Pro computers seem to run fine. |
Send message Joined: 2 May 07 Posts: 2090 Credit: 158,695,394 RAC: 129,577 |
Requesting an idtoken from LHC@Home |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,319,021 RAC: 123,225 |
Requesting an idtoken from LHC@Home As already explained the VM contacts a system at CERN. That connection is encrypted (HTTPS port 443), hence your host can't see what the VM asks for and what the CERN system replies. Now, where is the link to TPM? So far the "TPM-failure" you are asking to be solved is nothing but a red herring. |
Send message Joined: 24 Oct 04 Posts: 1115 Credit: 49,702,849 RAC: 13,784 |
Mine had no problems and then started doing this...... Yes I checked that the first time and there was nothing there so I tried again and they all crashed in one second and that is when 7 of the 8 were suspended but when I went over to -dev it had no problem with that version of CMS so it wasn't that Manager problem |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,319,021 RAC: 123,225 |
The root cause is that all of the new series start concurrently which makes them run into an old VirtualBox issue. In short: VirtualBox does not allow to set the multiattach option within the main config file to remain compatible with version 4.x (which is otherwise not supported for many years). Nonetheless they set that option themselves(!) under certain circumstances. Vboxwrapper tries to clear this when it stumbles over it but must ensure it does not accidentally kill a valid running VM. Advice (in short, as it has been given a couple of times): - clean the VirtualBox media environment - next time you start VMs begin with just 1 of them Once the media settings have been done starting additional VMs (even concurrently) shouldn't be a problem. In addition: It is recommended not to suspend/interrupt a (CMS-)VM during startup since it makes lots of connections to external ressources. If those connections run into timeouts (server side) important data may be missing later on the client side. |
Send message Joined: 2 May 07 Posts: 2090 Credit: 158,695,394 RAC: 129,577 |
2023-11-10 06:56:02 (25648): Guest Log: [INFO] Requesting an idtoken from LHC@home 2023-11-10 06:59:50 (25648): Preference change detected 2023-11-10 06:59:50 (25648): Setting CPU throttle for VM. (100%) 2023-11-10 06:59:50 (25648): Setting checkpoint interval to 1200 seconds. (Higher value of (Preference: 1200 seconds) or (Vbox_job.xml: 600 seconds)) 2023-11-10 08:34:40 (25648): Status Report: Job Duration: '64800.000000' 2023-11-10 08:34:40 (25648): Status Report: Elapsed Time: '6000.000000' two and a half hour ? later first job inside the task starts. |
Send message Joined: 15 Jun 08 Posts: 2401 Credit: 225,319,021 RAC: 123,225 |
That VM started at 2023-11-10 06:54:20 2023-11-10 06:54:20 (25648): Detected: vboxwrapper 26206 . . . The VM requested an idtoken and got it without any issues. Issues would have been reported to the log. The VMs are very verbose if there are issues and are configured to finally shut down in this case. 2023-11-10 06:56:02 (25648): Guest Log: [INFO] Requesting an idtoken from LHC@home Reminder: This idtoken has nothing to do with TPM on the host and the VM is also not configured to use TPM. Any TPM relation is pure fantasy. This log entry is related to the heartbeat file check. The VM updates the heartbeat file's timestamp every minute as described here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=6060&postid=48850 This has no relation to the idtoken. 2023-11-10 06:59:50 (25648): Setting checkpoint interval to 1200 seconds. (Higher value of (Preference: 1200 seconds) or (Vbox_job.xml: 600 seconds)) These lines are just a runtime info from inside the VM. They update once every 6000 seconds. 2023-11-10 08:34:40 (25648): Status Report: Job Duration: '64800.000000' 2023-11-10 08:34:40 (25648): Status Report: Elapsed Time: '6000.000000' The 3rd corresponding line is missing in the post: 2023-11-10 08:34:40 (25648): Status Report: CPU Time: '5483.343750' So, the VM's internal accounting shows a CPU Time of 5483 s (~1:31 h) after ~1:40 total runtime => a nicely running subtask Obviously a miscalculation: two and a half hour ? later first job inside the task starts. |
Send message Joined: 24 Oct 04 Posts: 1115 Credit: 49,702,849 RAC: 13,784 |
All I can add is that this only happened with me on the 10th and before that it didn't happen and the way I run these has never changed and they are always Valid until this happened here. And of course the last one that failed was run after I set the pref.. to only give me ONE task instead of the usual 8 and it did the same thing. I may just do a clean install of VB and try a single task again and see if it works. |
©2024 CERN