Message boards : Number crunching : Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout
| Author | Message |
|---|---|
|
Send message Joined: 27 Sep 08 Posts: 960 Credit: 786,249,649 RAC: 108,216 |
Is anyone else seeing this in their logs? |
|
Send message Joined: 28 Sep 04 Posts: 814 Credit: 66,559,580 RAC: 28,950 |
Yes, I'm seeing this too, but not on every request. My oldest message logs start on 5 March, and this message was already there then, but I think the situation has been getting worse since.
|
|
Send message Joined: 27 Sep 08 Posts: 960 Credit: 786,249,649 RAC: 108,216 |
I agree, it does seem to have gotten worse. It's a little irritating, as it seems to block the scheduler request, so completed WUs pile up until I can get a successful connection. |
|
Send message Joined: 17 Oct 06 Posts: 100 Credit: 65,682,801 RAC: 9,816 |
This has been happening to me too. The latest outage started around 7:19 PM EST yesterday (Mar 12th) and just cleared up at 8:51 AM EST (Mar 13th). Is the scheduling server getting turned off overnight or something, lol? |
|
Send message Joined: 27 Sep 08 Posts: 960 Credit: 786,249,649 RAC: 108,216 |
Any update to this? I'm still seeing:
HTTP gateway timeout / (HTTP gateway timeout + Scheduler request completed)
Failure rate: 21%
Failure rate: 21%
Failure rate: 16%
Failure rate: 21%
Failure rate: 27% |
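The ratio quoted in this post can be reproduced from a saved BOINC event log. This is only a minimal sketch: it assumes the log is available as plain text lines and that the two message strings appear exactly as written in this thread. The function name `failure_rate` is hypothetical, and the last sample line is illustrative rather than taken from a real log.

```python
def failure_rate(log_lines):
    """Return timeouts / (timeouts + completed), or None if no requests are found."""
    timeouts = 0
    completed = 0
    for line in log_lines:
        if "HTTP gateway timeout" in line:
            timeouts += 1
        elif "Scheduler request completed" in line:
            completed += 1
    total = timeouts + completed
    if total == 0:
        return None  # nothing to divide by
    return timeouts / total


sample = [
    "16/04/2026 06:46:03 | LHC@home | Sending scheduler request: To report completed tasks.",
    "16/04/2026 06:47:05 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout",
    # Illustrative line (assumed wording), not copied from a real log:
    "16/04/2026 06:53:40 | LHC@home | Scheduler request completed",
]
print(f"Failure rate: {failure_rate(sample):.0%}")
```

Counting per host and per day would give one figure per machine, which is presumably how several percentages end up in one post.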
|
Send message Joined: 15 Jun 08 Posts: 2760 Credit: 306,943,501 RAC: 142,537 |
In reply to Toby Broom's message of 7 Apr 2026:
"Any update to this? I'm still seeing:"
@Laurence @Nils This has been going on for a couple of months! It happens every day, even though we currently don't have any load from ATLAS/CMS/XTrack. Please investigate and fix it. |
|
Send message Joined: 27 Sep 08 Posts: 960 Credit: 786,249,649 RAC: 108,216 |
The timeout is after 1 min, which is the default in squid, should we increase the setting? |
|
Send message Joined: 15 Jun 08 Posts: 2760 Credit: 306,943,501 RAC: 142,537 |
In reply to Toby Broom's message of 8 Apr 2026:
"The timeout is after 1 min, which is the default in squid, should we increase the setting?"
Which timeout do you mean? Squid has many of them. |
|
Send message Joined: 27 Sep 08 Posts: 960 Credit: 786,249,649 RAC: 108,216 |
This one: https://www.squid-cache.org/Doc/config/connect_timeout/ The others don't seem to be 1 min by default. |
|
Send message Joined: 15 Jun 08 Posts: 2760 Credit: 306,943,501 RAC: 142,537 |
This means Squid times out if a connection can't be established within 1 min. Usually connections are established within a few ms. If connections to other destinations via Squid work fine, a timeout in this phase means the destination itself has issues. Transfers happen only once the connection is fully established, and they have other (longer) timeouts. |
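To check the connection phase described here without going through Squid, the time needed to open a plain TCP connection can be measured directly. A minimal sketch, assuming a direct route to the server is available; `connect_time` and its parameters are hypothetical names, the measured time also includes DNS resolution, and the 60 s default simply mirrors the 1 min limit discussed above.

```python
import socket
import time


def connect_time(host, port=443, timeout_s=60.0):
    """Return the seconds needed to open a TCP connection, or None on failure or timeout."""
    start = time.monotonic()
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_s):
            return time.monotonic() - start
    except OSError:
        # Covers timeouts, refused connections and name resolution errors.
        return None
```

Calling `connect_time("lhcathome.cern.ch")` a few times during an outage would show whether the handshake itself is slow. Since the complaint in this thread is a gateway timeout, the delay may well sit behind the front-end server, in which case this check would still report a fast connect.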
|
Send message Joined: 14 Sep 08 Posts: 53 Credit: 85,318,612 RAC: 23,448 |
I honestly don't think the scheduler request timeout has anything to do with Squid (or our local setups), as I don't run Squid and am still getting it. I noticed that my own tasks page also no longer opens and returns 504 Gateway Timeout instead. This feels like some kind of overload on the server side. |
|
Send message Joined: 22 Sep 08 Posts: 5 Credit: 14,318,070 RAC: 4,758 |
Maybe try flushing your local DNS cache:
sudo resolvectl flush-caches |
|
Send message Joined: 7 Aug 11 Posts: 122 Credit: 34,229,089 RAC: 2,112 |
Tue 14 Apr 2026 14:24:45 | LHC@home | Sending scheduler request: To fetch work.
Tue 14 Apr 2026 14:24:45 | LHC@home | Requesting new tasks for CPU
Tue 14 Apr 2026 14:25:17 | LHC@home | Computation for task d051a5a379a4_1 finished
Tue 14 Apr 2026 14:25:18 | LHC@home | Starting task e599a5b8536a_0
Tue 14 Apr 2026 14:25:20 | LHC@home | Started upload of d051a5a379a4_1_r1930168046_ATLAS_result
Tue 14 Apr 2026 14:25:20 | LHC@home | Started upload of d051a5a379a4_1_r1930168046_ATLAS_hits
Tue 14 Apr 2026 14:25:22 | LHC@home | [error] Error reported by file upload server: Server is out of disk space
Tue 14 Apr 2026 14:25:22 | LHC@home | [error] Error reported by file upload server: Server is out of disk space
Tue 14 Apr 2026 14:25:22 | LHC@home | Temporarily failed upload of d051a5a379a4_1_r1930168046_ATLAS_result: transient upload error
Tue 14 Apr 2026 14:25:22 | LHC@home | Backing off 00:03:05 on upload of d051a5a379a4_1_r1930168046_ATLAS_result
Tue 14 Apr 2026 14:25:22 | LHC@home | Temporarily failed upload of d051a5a379a4_1_r1930168046_ATLAS_hits: transient upload error
Tue 14 Apr 2026 14:25:22 | LHC@home | Backing off 00:02:01 on upload of d051a5a379a4_1_r1930168046_ATLAS_hits
Tue 14 Apr 2026 14:25:48 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout |
Magic Quantum Mechanic Send message Joined: 24 Oct 04 Posts: 1320 Credit: 100,264,268 RAC: 141,063 |
2.5 hours so far here:
4/13/2026 9:47:46 PM | LHC@home | Started upload of 21b63aa11053_0_r1027443174_ATLAS_hits
4/13/2026 9:47:48 PM | LHC@home | [error] Error reported by file upload server: Server is out of disk space
4/13/2026 9:47:48 PM | LHC@home | Temporarily failed upload of 21b63aa11053_0_r1027443174_ATLAS_hits: transient upload error
4/13/2026 9:47:48 PM | LHC@home | Backing off 02:15:43 on upload of 21b63aa11053_0_r1027443174_ATLAS_hits |
|
Send message Joined: 28 Sep 04 Posts: 814 Credit: 66,559,580 RAC: 28,950 |
Back to normal here.
|
|
Send message Joined: 4 Mar 20 Posts: 27 Credit: 8,279,158 RAC: 5,755 |
Not for me:
16/04/2026 06:46:03 | LHC@home | Sending scheduler request: To report completed tasks.
16/04/2026 06:46:03 | LHC@home | Reporting 2 completed tasks
16/04/2026 06:46:03 | LHC@home | Requesting new tasks for CPU
16/04/2026 06:47:05 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout
16/04/2026 06:52:17 | LHC@home | Sending scheduler request: To report completed tasks.
16/04/2026 06:52:17 | LHC@home | Reporting 2 completed tasks
16/04/2026 06:52:17 | LHC@home | Requesting new tasks for CPU
16/04/2026 06:53:19 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout |
|
Send message Joined: 27 Sep 08 Posts: 960 Credit: 786,249,649 RAC: 108,216 |
Claude wrote me a script that retries the communication if there is a timeout.
Failure rate: 42%
Failure rate: 42%
Failure rate: 43%
Failure rate: 37%
Failure rate: 42%
Failure rate: 17% |
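The script itself is not posted in the thread. The following is only a rough sketch of the retry idea, not the actual script: it assumes the `boinccmd` command-line tool is installed and can reach the local client, and that the result of a request can be recognised from the message strings quoted in this thread. The names `retry`, `last_request_ok` and `attempt_once`, the wait times and the number of tries are all hypothetical.

```python
import subprocess
import time


def retry(attempt, max_tries=5, delay_s=60.0, sleep=time.sleep):
    """Call attempt() until it returns True; return the tries used, or None if all failed."""
    for n in range(1, max_tries + 1):
        if attempt():
            return n
        if n < max_tries:
            sleep(delay_s)  # pause before contacting the scheduler again
    return None


def last_request_ok(messages):
    """True if the most recent scheduler result line in the text reports completion."""
    outcome = None
    for line in messages.splitlines():
        if "Scheduler request completed" in line:
            outcome = True
        elif "Scheduler request" in line and "failed" in line:
            outcome = False
    return outcome is True


def attempt_once(project_url, wait_s=90.0):
    """Trigger one scheduler request and check the event log afterwards.

    Assumption: 90 s is long enough for the request to either complete or
    hit the 1 min gateway timeout seen in this thread.
    """
    subprocess.run(["boinccmd", "--project", project_url, "update"], check=True)
    time.sleep(wait_s)
    result = subprocess.run(
        ["boinccmd", "--get_messages"], capture_output=True, text=True, check=True
    )
    return last_request_ok(result.stdout)
```

A call such as `retry(lambda: attempt_once(url))`, with `url` set to the project URL shown by the BOINC client, would then keep asking until one request goes through. On a host attached to several projects the log check would need to filter by project name first.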
|
Send message Joined: 7 May 08 Posts: 276 Credit: 2,138,230 RAC: 109 |
In reply to Anne Havinga's message of 16 Apr 2026:
"16/04/2026 06:53:19 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout"
Now a slightly different message:
19/04/2026 07:02:26 | LHC@home | Temporarily failed upload of Theory_2922-4884085-861_0_r509483109_result: transient upload error |
|
Send message Joined: 27 Sep 08 Posts: 960 Credit: 786,249,649 RAC: 108,216 |
This is since the service ran out of disk space. |
|
Send message Joined: 18 Dec 15 Posts: 1994 Credit: 164,329,187 RAC: 111,915 |
In reply to computezrmle's message of 8 Apr 2026:
"In reply to Toby Broom's message of 7 Apr 2026:"
These gateway timeouts have become even worse lately :-( Why is it so difficult to get the problem fixed?
©2026 CERN