Message boards : CMS Application : CMS tasks failing
| Author | Message |
|---|---|
|
Send message Joined: 29 Aug 05 Posts: 1134 Credit: 11,579,777 RAC: 14,842 |
|
|
Send message Joined: 5 Apr 25 Posts: 66 Credit: 1,784,447 RAC: 13,818 |
Got a bunch that failed again in the past couple of hours, so I switched over to ATLAS for the time being. Actually, it seems to be in perfect correlation with the number of jobs, which has been dropping constantly since 10:24 UTC.
|
|
Send message Joined: 18 Dec 15 Posts: 1941 Credit: 156,018,218 RAC: 107,484 |
I just noticed that since about noontime of today most of the tasks failed after various periods of time (between a few minutes and half an hour). In many cases, stderr says that the filename or extension is too long: "The filename or extension is too long. (0xce) - exit code 206 (0xce)". What's going on? |
|
Send message Joined: 29 Aug 05 Posts: 1134 Credit: 11,579,777 RAC: 14,842 |
In reply to Erich56's message of 28 Jan 2026: I just noticed that since about noontime of today most of the tasks failed after various periods of time (between a few minutes and half an hour). Yes, I've noticed the jobs falling over the last few hours. Not sure what's going on yet; that's not a common error message. I'm tempted to believe it's coming from the operating system, but more digging is needed. |
|
Send message Joined: 29 Aug 05 Posts: 1134 Credit: 11,579,777 RAC: 14,842 |
In reply to Garrulus glandarius's message of 28 Jan 2026: Got a bunch that failed again the past couple of hours, switched over to ATLAS for the time being. Yes, that does seem to be the case. |
|
Send message Joined: 29 Aug 05 Posts: 1134 Credit: 11,579,777 RAC: 14,842 |
In reply to Erich56's message of 28 Jan 2026: I just noticed that since about noontime of today most of the tasks failed after various periods of time (between a few minutes and half an hour). Some of your failures are network related; could be a problem at CERN:
2026-01-28 15:27:42 (24776): Guest Log: Ncat: Version 7.50 ( https://nmap.org/ncat )
2026-01-28 15:27:42 (24776): Guest Log: Ncat: Connection to 137.138.53.124 failed: Connection timed out.
2026-01-28 15:27:42 (24776): Guest Log: Ncat: Trying next address...
2026-01-28 15:27:42 (24776): Guest Log: Ncat: Network is unreachable.
[TinyPC:words] > nslookup 137.138.53.124
Server: UnKnown
Address: 10.174.98.157
Name: vocms0204.cern.ch
Address: 137.138.53.124 |
Magic Quantum Mechanic Send message Joined: 24 Oct 04 Posts: 1261 Credit: 92,106,969 RAC: 109,679 |
Same here. Volunteer Mad Scientist For Life. Unbelievable, are you trying to promote Linux again? |
|
Send message Joined: 29 Aug 05 Posts: 1134 Credit: 11,579,777 RAC: 14,842 |
|
Magic Quantum Mechanic Send message Joined: 24 Oct 04 Posts: 1261 Credit: 92,106,969 RAC: 109,679 |
In reply to ivan's message of 28 Jan 2026: It looks like the dip might be bottoming out. Digits cruciate! I hope so, Ivan... fingers crossed. |
|
Send message Joined: 29 Aug 05 Posts: 1134 Credit: 11,579,777 RAC: 14,842 |
|
Magic Quantum Mechanic Send message Joined: 24 Oct 04 Posts: 1261 Credit: 92,106,969 RAC: 109,679 |
I figured two fingers crossed might work, so I tried a new CMS task. The ones that crashed did so in 25 minutes or less, and this one is still running and has just passed the first hour. If it ends up Valid, I will start up more on this host, and if they keep running I will reload the other hosts. I did have to go into the VB Manager and remove all those crashed ones. |
|
Send message Joined: 29 Aug 05 Posts: 1134 Credit: 11,579,777 RAC: 14,842 |
|
Magic Quantum Mechanic Send message Joined: 24 Oct 04 Posts: 1261 Credit: 92,106,969 RAC: 109,679 |
Sounds good, and have a fine day... it is a holiday here on the 29th. |
|
Send message Joined: 5 Apr 25 Posts: 66 Credit: 1,784,447 RAC: 13,818 |
Things seem to be going smoothly now. My main LHC rig is still busy with ATLAS, but an old i7 has been running a new-ish CMS task for over 13 hours with apparently normal progress.
|
|
Send message Joined: 18 Dec 15 Posts: 1941 Credit: 156,018,218 RAC: 107,484 |
same here (with the exception of only 1 task this late morning) :-) |
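
For anyone who wants to sort their own failed tasks into the two kinds of failure seen in this thread (the exit code 206 / 0xce error and the Ncat connection failures), here is a rough Python sketch. The function names and the regular expressions are my own assumptions, based only on the log excerpts quoted above; real stderr output may be formatted differently, so treat this as a starting point and not a proper tool.

```python
import re

# Both patterns are assumptions derived from the excerpts quoted in this thread.
# Matches e.g. "(0xce) - exit code 206", independent of the message language.
EXIT_RE = re.compile(r"\((0x[0-9a-fA-F]+)\)\s*-\s*exit code (\d+)")
# Matches e.g. "Ncat: Connection to 137.138.53.124 failed: Connection timed out."
NCAT_FAIL_RE = re.compile(
    r"Ncat: Connection to (\d{1,3}(?:\.\d{1,3}){3}) failed: (.+?)\.?$"
)


def parse_exit_code(stderr_text):
    """Return (decimal_code, hex_code_as_int) from the first exit-code line, or None."""
    match = EXIT_RE.search(stderr_text)
    if match is None:
        return None
    return int(match.group(2)), int(match.group(1), 16)


def failed_connections(log_lines):
    """Return (address, reason) pairs for each Ncat connection failure line."""
    results = []
    for line in log_lines:
        match = NCAT_FAIL_RE.search(line.strip())
        if match:
            results.append((match.group(1), match.group(2)))
    return results


if __name__ == "__main__":
    # Sample lines copied from the posts above.
    print(parse_exit_code("... (0xce) - exit code 206 (0xce)"))
    print(failed_connections([
        "2026-01-28 15:27:42 (24776): Guest Log: Ncat: Connection to "
        "137.138.53.124 failed: Connection timed out.",
        "2026-01-28 15:27:42 (24776): Guest Log: Ncat: Network is unreachable.",
    ]))
```

The decimal and hex values in the exit-code line are the same number (0xce is 206), so the first helper returns both mainly as a sanity check. The second helper only picks up lines that name an address; the bare "Network is unreachable" line is skipped on purpose.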
©2026 CERN