CMS@Home -- jobs update
Author | Message |
---|---|
Joined: 29 Aug 05 Posts: 1072 Credit: 8,427,688 RAC: 6,684 |
Sorry for the long delay, I'd been hoping for good news... As you've no doubt noticed, we haven't had any CMS@Home jobs for some time -- tasks sit idle until they time out. It looks like we have finally tracked down the reason. Somehow, when the Condor server starts up after its update to Alma9 Linux, the firewall rules get corrupted so that port 546 is blocked; the connection to the DHCPv6 server is therefore disabled and IPv6 communication stops working. Now that the problem is identified, the responsible experts are working on a fix. I don't have an estimate of when it will be running again, but hopefully before Christmas! :-) |
Joined: 14 Jan 10 Posts: 1440 Credit: 9,662,815 RAC: 1,380 |
Thanks Ivan for letting us know! |
Joined: 24 Oct 04 Posts: 1193 Credit: 59,188,696 RAC: 68,372 |
Ivan Claus |
Joined: 28 Sep 04 Posts: 748 Credit: 52,016,316 RAC: 30,270 |
New BOINC tasks are available, but they don't get any new jobs. stderr gives: <core_client_version>8.0.2</core_client_version> <![CDATA[ <message> The global filename characters, * or ?, are entered incorrectly or too many global filename characters are specified. (0xd0) - exit code 208 (0xd0)</message> <stderr_txt> And further down: 2024-12-20 14:15:15 (200056): Guest Log: [INFO] Requesting an X509 credential from LHC@home 2024-12-20 14:15:15 (200056): Guest Log: [INFO] Requesting an idtoken from LHC@home 2024-12-20 14:15:16 (200056): Guest Log: [INFO] CMS application starting. Check log files. 2024-12-20 14:34:24 (200056): Guest Log: [ERROR] glidein exited with return value 1. |
Joined: 18 Dec 15 Posts: 1843 Credit: 126,601,542 RAC: 128,802 |
same here :-( When I saw new tasks being available late this morning, I was confident that the problem of no jobs was finally solved (for what other reason would tasks have been made available again?). However, no luck - same problem as before :-( |
Joined: 18 Dec 15 Posts: 1843 Credit: 126,601,542 RAC: 128,802 |
"- same problem as before :-(" Well, not quite the "same problem as before": whereas BEFORE the tasks ran for about 30 minutes, then finished and yielded a small amount of credit, NOW they stop after about 22 minutes with "computation error" and no credit. So there is a slight difference from the earlier situation. |
Joined: 28 Sep 04 Posts: 748 Credit: 52,016,316 RAC: 30,270 |
"same here :-(" Not quite the same as before: now it also gives an error in BOINC, so not even minimal credit gets awarded. |
Joined: 15 Jul 05 Posts: 26 Credit: 2,428,108 RAC: 188 |
Looks like I have the same error: 2024-12-20 14:24:46 (12684): Guest Log: [INFO] Probing /cvmfs/grid.cern.ch... OK 2024-12-20 14:24:49 (12684): Guest Log: [INFO] Probing /cvmfs/cms-ib.cern.ch... OK 2024-12-20 14:24:49 (12684): Guest Log: [INFO] Probing /cvmfs/singularity.opensciencegrid.org... OK 2024-12-20 14:24:50 (12684): Guest Log: [INFO] Probing /cvmfs/cms.cern.ch... OK 2024-12-20 14:24:51 (12684): Guest Log: [INFO] Probing /cvmfs/oasis.opensciencegrid.org... OK 2024-12-20 14:24:52 (12684): Guest Log: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY 2024-12-20 14:24:52 (12684): Guest Log: [INFO] 2.7.2.0 http://s1fnal-cvmfs.openhtc.io:8080 DIRECT 2024-12-20 14:24:52 (12684): Guest Log: [INFO] Environment HTTP proxy: not set 2024-12-20 14:24:53 (12684): Guest Log: [INFO] Reading volunteer information 2024-12-20 14:25:25 (12684): Guest Log: [INFO] Requesting an X509 credential from LHC@home 2024-12-20 14:25:26 (12684): Guest Log: [INFO] Requesting an idtoken from LHC@home 2024-12-20 14:25:27 (12684): Guest Log: [INFO] CMS application starting. Check log files. 2024-12-20 14:45:34 (12684): Guest Log: [ERROR] glidein exited with return value 1. Matthias |
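For anyone comparing their own stderr output with the excerpts in this thread, here is a minimal Python sketch that measures how long a task ran between "CMS application starting" and the glidein error. It assumes every Guest Log line follows the layout seen above (timestamp, a number in parentheses, level, message); the function names and the regular expression are illustrative only and are not part of BOINC or the CMS application.

```python
import re
from datetime import datetime

# Two lines copied from the log excerpt in the post above.
LOG_LINES = [
    "2024-12-20 14:25:27 (12684): Guest Log: [INFO] CMS application starting. Check log files.",
    "2024-12-20 14:45:34 (12684): Guest Log: [ERROR] glidein exited with return value 1.",
]

# Assumed layout: "<timestamp> (<number>): Guest Log: [<LEVEL>] <message>".
# The thread does not say what the number in parentheses means, so it is
# matched but otherwise ignored.
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \(\d+\): Guest Log: \[(\w+)\] (.*)$"
)


def parse_line(line):
    """Return (timestamp, level, message), or None if the line does not match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    stamp, level, message = match.groups()
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S"), level, message


def time_to_failure(lines):
    """Time from 'CMS application starting' to the glidein error, or None."""
    start = None
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # skip anything that is not a Guest Log line
        when, level, message = parsed
        if "CMS application starting" in message:
            start = when
        elif level == "ERROR" and "glidein exited" in message and start is not None:
            return when - start
    return None


elapsed = time_to_failure(LOG_LINES)
print(elapsed)
```

For the two lines shown, the difference is 20 minutes and 7 seconds, which is in line with the roughly 22 minute total runtime reported earlier in the thread.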
Joined: 12 Jul 08 Posts: 20 Credit: 340,498 RAC: 0 |
Same error here ... I crunch for Ukraine |
Joined: 18 Dec 15 Posts: 1843 Credit: 126,601,542 RAC: 128,802 |
Ivan - why do you send out tasks as long as no jobs are coming in? |
Joined: 15 Jun 08 Posts: 2607 Credit: 262,481,733 RAC: 137,876 |
It is obvious that tasks must be sent out in order to locate the error(s) in the process chain. Unfortunately, this can't be restricted to computers run by the developers. Hence, until the issues are solved: (1) best would be to uncheck CMS in the prefs and wait for a go in the forum; (2) do not run a full buffer of envelope tasks (the short ones running around 0.5 h with very little CPU usage, even if they claim to be valid); (3) if you want to do some tests, run only a handful of tasks spread over the whole day. |
Joined: 23 Dec 19 Posts: 18 Credit: 46,554,083 RAC: 30,444 |
I've investigated a bit how to limit CMS jobs while they are failing. The boinc prefs, unticking CMS, had no effect in my case. I run Win11, Win10, Ubuntu 24.04 and Ubuntu 22.04 machines, with VirtualBox on each. I am now trying an alternative: I have created an app_config for CMS and set max_concurrent to "1" there, which seems to have an effect. Hence, most of the CPU is used for something purposeful while still contributing to the project (at least by piling up error logs). Br Pekka |
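The concurrency limit described in the post above can be sketched as follows. In BOINC's `app_config.xml` the per-application limit is the `max_concurrent` element; this small Python script only builds and prints such a file. The application name "CMS" is an assumption (check the short app name in your own `client_state.xml`), and `build_app_config` is a hypothetical helper, not part of BOINC.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name, max_concurrent):
    """Build an app_config.xml document limiting one app's concurrent tasks."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    ET.indent(root)  # one element per line; needs Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")


# "CMS" is assumed to be the project's short application name.
xml_text = build_app_config("CMS", 1)
print(xml_text)

# To use it, save the printed text as app_config.xml in the project's folder
# inside the BOINC data directory, then have the client re-read its config
# files (or restart the client).
```

This only caps how many CMS tasks run at once; it does not stop the client from downloading them, so it complements rather than replaces the preference settings discussed in the thread.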
Joined: 15 Jun 08 Posts: 2607 Credit: 262,481,733 RAC: 137,876 |
"The boinc prefs, unticking CMS, had no effect in my case." Most likely you have enabled this on your prefs page: "If no work for selected applications is available, accept work from other applications?" Disable this and disable all apps you don't want to run. |
Joined: 23 Dec 19 Posts: 18 Credit: 46,554,083 RAC: 30,444 |
The boinc prefs, unticking CMS, had no effect in my case. Correct, I missed that tab. |
Joined: 13 May 20 Posts: 38 Credit: 2,045,533 RAC: 2,462 |
Hello, I have close to 200 CMS tasks with error code 208. |
Joined: 31 Dec 11 Posts: 2 Credit: 6,833,061 RAC: 3,538 |
Aim failed task code 208 https://lhcathome.cern.ch/lhcathome/result.php?resultid=418488241. :-( |
Joined: 15 Jun 08 Posts: 2607 Credit: 262,481,733 RAC: 137,876 |
Looks like CMS sends out jobs again since this afternoon. Cheers and happy crunching. |
Joined: 24 Oct 04 Posts: 1193 Credit: 59,188,696 RAC: 68,372 |
I have some running at -dev, but that site needs to be poked with a stick, since it shows just blank pages for everything and we can't see what is going on. |
Joined: 27 Sep 08 Posts: 859 Credit: 703,653,654 RAC: 156,567 |
|
Joined: 29 Aug 05 Posts: 1072 Credit: 8,427,688 RAC: 6,684 |
"Looks like CMS sends out jobs again since this afternoon." Yes, it took a while to get all our ducks in a row, what with the long holiday break (my uni was closed for 16 days!). However, it looks like Laurence got the right glide-in wrappers and id-tokens installed yesterday, with a bit of help from Federica, and things are starting to take off again. I notice there's one user with a high number of failures to access the Frontier servers (conditions database) -- that's usually a sign of a network misconfiguration on the client site. |
©2025 CERN