Message boards : Number crunching : Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout

Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 960
Credit: 786,249,649
RAC: 108,216
Message 53160 - Posted: 12 Mar 2026, 8:21:35 UTC

Is anyone else seeing this in their logs?
ID: 53160
Harri Liljeroos
Joined: 28 Sep 04
Posts: 814
Credit: 66,559,580
RAC: 28,950
Message 53161 - Posted: 12 Mar 2026, 8:29:44 UTC

Yes, I'm seeing this too, but not on every request. My oldest message logs start on 5 March and the message was already there then, but I think the situation has been getting worse since.
ID: 53161
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 960
Credit: 786,249,649
RAC: 108,216
Message 53164 - Posted: 12 Mar 2026, 19:53:47 UTC - in response to Message 53161.  

I agree, it does seem to have got worse. It's a little irritating, as it seems to block the scheduler request, so completed WUs pile up until I can get a successful connection.
ID: 53164
CloverField
Joined: 17 Oct 06
Posts: 100
Credit: 65,682,801
RAC: 9,816
Message 53168 - Posted: 13 Mar 2026, 12:55:44 UTC

This has been happening to me too. The latest outage started around 7:19 PM EST yesterday (Mar 12th) and just cleared up at 8:51 AM EST (Mar 13th).
Is the scheduling server getting turned off overnight or something lol?
ID: 53168
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 960
Credit: 786,249,649
RAC: 108,216
Message 53374 - Posted: 7 Apr 2026, 19:43:36 UTC
Last modified: 7 Apr 2026, 20:00:38 UTC

Any update to this? I'm still seeing:

HTTP gateway timeout / (HTTP gateway timeout + Scheduler request completed )


Failure rate: 21%
Failure rate: 21%
Failure rate: 16%
Failure rate: 21%
Failure rate: 27%
ID: 53374
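The ratio quoted above can be reproduced from a saved client event log. This is a minimal sketch in Python, assuming the log is available as plain text lines and that successful requests contain the phrase "Scheduler request completed", as in the formula. The function name failure_rate and the sample lines are made up for illustration and are not from any poster's script.

```python
def failure_rate(log_lines):
    """Share of scheduler requests that ended in a gateway timeout.

    Returns None when the log contains neither outcome.
    """
    timeouts = sum("HTTP gateway timeout" in line for line in log_lines)
    completed = sum("Scheduler request completed" in line for line in log_lines)
    total = timeouts + completed
    return timeouts / total if total else None


# Hypothetical sample, not taken from a real host.
sample = [
    "LHC@home | Scheduler request completed",
    "LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout",
    "LHC@home | Scheduler request completed",
    "LHC@home | Scheduler request completed",
]
print(f"Failure rate: {failure_rate(sample):.0%}")  # Failure rate: 25%
```

Counting substrings keeps the sketch independent of the timestamp format, which differs between clients in this thread.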
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2760
Credit: 306,943,501
RAC: 142,537
Message 53376 - Posted: 8 Apr 2026, 6:41:01 UTC - in response to Message 53374.  

In reply to Toby Broom's message of 7 Apr 2026:
Any update to this? I'm still seeing:

HTTP gateway timeout / (HTTP gateway timeout + Scheduler request completed )


Failure rate: 21%
Failure rate: 21%
Failure rate: 16%
Failure rate: 21%
Failure rate: 27%


@Laurence
@Nils
This has been going on for a couple of months!
It happens every day, although we currently don't have any load from ATLAS/CMS/XTrack.

Please investigate and fix it.
ID: 53376
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 960
Credit: 786,249,649
RAC: 108,216
Message 53377 - Posted: 8 Apr 2026, 7:00:44 UTC - in response to Message 53376.  

The timeout occurs after 1 min, which is the default in Squid. Should we increase the setting?
ID: 53377
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2760
Credit: 306,943,501
RAC: 142,537
Message 53378 - Posted: 8 Apr 2026, 7:21:30 UTC - in response to Message 53377.  

In reply to Toby Broom's message of 8 Apr 2026:
The timeout occurs after 1 min, which is the default in Squid. Should we increase the setting?

Which timeout do you mean?
Squid has many of them.
ID: 53378
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 960
Credit: 786,249,649
RAC: 108,216
Message 53381 - Posted: 8 Apr 2026, 17:26:50 UTC - in response to Message 53378.  

This one: https://www.squid-cache.org/Doc/config/connect_timeout/

The others don't seem to be 1 min by default.
ID: 53381
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2760
Credit: 306,943,501
RAC: 142,537
Message 53385 - Posted: 8 Apr 2026, 20:55:17 UTC - in response to Message 53381.  

This means Squid times out if a connection can't be established within 1 min.
Usually connections should be established within a few ms.
If connections to other destinations via Squid work fine, a timeout in this phase means the destination itself has issues.

Transfers happen once the connection is fully established.
They have other (longer) timeouts.
ID: 53385
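To check the connection phase on its own, without Squid or BOINC in the path, the time needed to open a plain TCP connection can be measured directly. This is a rough sketch in Python; the helper name time_connect and the 60 second default (chosen to mirror the one-minute value discussed above) are assumptions for illustration.

```python
import socket
import time


def time_connect(host, port, timeout=60.0):
    """Return the seconds needed to establish a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        # Only the TCP handshake is timed; no data is transferred.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        # Covers refused connections, DNS failures and connect timeouts alike.
        return None
```

For example, time_connect("lhcathome.cern.ch", 443) could be run from an affected host. A result of a few milliseconds would suggest the connection phase is fine and the delay occurs later, while None after the full timeout would point at the connection phase itself.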
wujj123456
Joined: 14 Sep 08
Posts: 53
Credit: 85,318,612
RAC: 23,448
Message 53389 - Posted: 10 Apr 2026, 1:08:56 UTC

I honestly don't think the scheduler request timeout has anything to do with Squid (or our local setups), as I don't run Squid but still get it. I also noticed that my own tasks page no longer opens and returns 504 Gateway Timeout. This feels like some kind of overload on the server side.
ID: 53389
Boone
Joined: 22 Sep 08
Posts: 5
Credit: 14,318,070
RAC: 4,758
Message 53402 - Posted: 13 Apr 2026, 15:57:55 UTC - in response to Message 53389.  

Maybe try flushing your local DNS cache:
sudo resolvectl flush-caches
ID: 53402
Dark Angel
Joined: 7 Aug 11
Posts: 122
Credit: 34,229,089
RAC: 2,112
Message 53408 - Posted: 14 Apr 2026, 4:28:33 UTC

Tue 14 Apr 2026 14:24:45 | LHC@home | Sending scheduler request: To fetch work.
Tue 14 Apr 2026 14:24:45 | LHC@home | Requesting new tasks for CPU
Tue 14 Apr 2026 14:25:17 | LHC@home | Computation for task d051a5a379a4_1 finished
Tue 14 Apr 2026 14:25:18 | LHC@home | Starting task e599a5b8536a_0
Tue 14 Apr 2026 14:25:20 | LHC@home | Started upload of d051a5a379a4_1_r1930168046_ATLAS_result
Tue 14 Apr 2026 14:25:20 | LHC@home | Started upload of d051a5a379a4_1_r1930168046_ATLAS_hits
Tue 14 Apr 2026 14:25:22 | LHC@home | [error] Error reported by file upload server: Server is out of disk space
Tue 14 Apr 2026 14:25:22 | LHC@home | [error] Error reported by file upload server: Server is out of disk space
Tue 14 Apr 2026 14:25:22 | LHC@home | Temporarily failed upload of d051a5a379a4_1_r1930168046_ATLAS_result: transient upload error
Tue 14 Apr 2026 14:25:22 | LHC@home | Backing off 00:03:05 on upload of d051a5a379a4_1_r1930168046_ATLAS_result
Tue 14 Apr 2026 14:25:22 | LHC@home | Temporarily failed upload of d051a5a379a4_1_r1930168046_ATLAS_hits: transient upload error
Tue 14 Apr 2026 14:25:22 | LHC@home | Backing off 00:02:01 on upload of d051a5a379a4_1_r1930168046_ATLAS_hits
Tue 14 Apr 2026 14:25:48 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout
ID: 53408
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1320
Credit: 100,264,268
RAC: 141,063
Message 53409 - Posted: 14 Apr 2026, 4:51:40 UTC - in response to Message 53408.  

2.5 hours so far here

4/13/2026 9:47:46 PM | LHC@home | Started upload of 21b63aa11053_0_r1027443174_ATLAS_hits
4/13/2026 9:47:48 PM | LHC@home | [error] Error reported by file upload server: Server is out of disk space
4/13/2026 9:47:48 PM | LHC@home | Temporarily failed upload of 21b63aa11053_0_r1027443174_ATLAS_hits: transient upload error
4/13/2026 9:47:48 PM | LHC@home | Backing off 02:15:43 on upload of 21b63aa11053_0_r1027443174_ATLAS_hits
ID: 53409
Harri Liljeroos
Joined: 28 Sep 04
Posts: 814
Credit: 66,559,580
RAC: 28,950
Message 53411 - Posted: 14 Apr 2026, 7:41:18 UTC

Back to normal here.
ID: 53411
Anne Havinga
Joined: 4 Mar 20
Posts: 27
Credit: 8,279,158
RAC: 5,755
Message 53422 - Posted: 16 Apr 2026, 6:06:46 UTC - in response to Message 53411.  

Not for me

16/04/2026 06:46:03 | LHC@home | Sending scheduler request: To report completed tasks.
16/04/2026 06:46:03 | LHC@home | Reporting 2 completed tasks
16/04/2026 06:46:03 | LHC@home | Requesting new tasks for CPU
16/04/2026 06:47:05 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout
16/04/2026 06:52:17 | LHC@home | Sending scheduler request: To report completed tasks.
16/04/2026 06:52:17 | LHC@home | Reporting 2 completed tasks
16/04/2026 06:52:17 | LHC@home | Requesting new tasks for CPU
16/04/2026 06:53:19 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout
ID: 53422
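The two failures in the log above each arrive 62 seconds after the request was sent, which fits the one-minute timeout discussed earlier in the thread. This is a small Python sketch of that arithmetic, assuming the day/month/year timestamp format of this particular excerpt (other clients in the thread log dates differently). The function name timeout_delays is made up for illustration.

```python
from datetime import datetime

# Assumed from the excerpt above; the format varies with the client's locale.
TS_FORMAT = "%d/%m/%Y %H:%M:%S"


def timeout_delays(log_lines):
    """Seconds between each scheduler request and the gateway timeout that followed it."""
    delays = []
    sent_at = None
    for line in log_lines:
        stamp, _, rest = line.partition(" | ")
        if "Sending scheduler request" in rest:
            sent_at = datetime.strptime(stamp.strip(), TS_FORMAT)
        elif "HTTP gateway timeout" in rest and sent_at is not None:
            failed_at = datetime.strptime(stamp.strip(), TS_FORMAT)
            delays.append((failed_at - sent_at).total_seconds())
            sent_at = None  # wait for the next request before pairing again
    return delays


log = [
    "16/04/2026 06:46:03 | LHC@home | Sending scheduler request: To report completed tasks.",
    "16/04/2026 06:47:05 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout",
    "16/04/2026 06:52:17 | LHC@home | Sending scheduler request: To report completed tasks.",
    "16/04/2026 06:53:19 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout",
]
print(timeout_delays(log))  # [62.0, 62.0]
```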
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 960
Credit: 786,249,649
RAC: 108,216
Message 53423 - Posted: 16 Apr 2026, 7:10:00 UTC

Claude wrote me a script that retries the communication if there is a timeout.

Failure rate: 42%
Failure rate: 42%
Failure rate: 43%
Failure rate: 37%
Failure rate: 42%
Failure rate: 17%
ID: 53423
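The script itself was not posted. As a rough illustration of the general idea only, a retry loop with a growing delay could look like the Python sketch below. Everything here is an assumption: the name retry_with_backoff, the default values, and the idea that attempt would wrap something like a `boinccmd --project <URL> update` call followed by a check of the client messages.

```python
import time


def retry_with_backoff(attempt, max_tries=5, base_delay=60.0, sleep=time.sleep):
    """Call attempt() until it returns True or max_tries is reached.

    Returns the number of tries used on success, or None if every try failed.
    The wait doubles after each failure so a struggling server is not hammered.
    """
    delay = base_delay
    for tries in range(1, max_tries + 1):
        if attempt():
            return tries
        if tries < max_tries:
            sleep(delay)
            delay *= 2
    return None


# Hypothetical stand-in for a real scheduler contact: fails twice, then succeeds.
outcomes = iter([False, False, True])
waits = []
print(retry_with_backoff(lambda: next(outcomes), base_delay=1.0, sleep=waits.append))  # 3
print(waits)  # [1.0, 2.0]
```

Passing sleep as a parameter makes the loop testable without real waiting. Note that retrying only hides the timeouts on the client side and does not address whatever causes them on the server.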
[VENETO] boboviz
Joined: 7 May 08
Posts: 276
Credit: 2,138,230
RAC: 109
Message 53440 - Posted: 19 Apr 2026, 5:06:34 UTC - in response to Message 53422.  

In reply to Anne Havinga's message of 16 Apr 2026:
16/04/2026 06:53:19 | LHC@home | Scheduler request to https://lhcathome.cern.ch/lhcathome_cgi/cgi failed: HTTP gateway timeout


Now a slightly different message:
19/04/2026 07:02:26 | LHC@home | Temporarily failed upload of Theory_2922-4884085-861_0_r509483109_result: transient upload error
ID: 53440
Toby Broom
Volunteer moderator
Joined: 27 Sep 08
Posts: 960
Credit: 786,249,649
RAC: 108,216
Message 53443 - Posted: 19 Apr 2026, 6:49:44 UTC - in response to Message 53440.  

This is because the service ran out of disk space.
ID: 53443
Erich56
Joined: 18 Dec 15
Posts: 1994
Credit: 164,329,187
RAC: 111,915
Message 53468 - Posted: 22 Apr 2026, 18:02:51 UTC - in response to Message 53376.  

In reply to computezrmle's message of 8 Apr 2026:
In reply to Toby Broom's message of 7 Apr 2026:
Any update to this? I'm still seeing:

HTTP gateway timeout / (HTTP gateway timeout + Scheduler request completed )


Failure rate: 21%
Failure rate: 21%
Failure rate: 16%
Failure rate: 21%
Failure rate: 27%


@Laurence
@Nils
This has been going on for a couple of months!
It happens every day, although we currently don't have any load from ATLAS/CMS/XTrack.

Please investigate and fix it.

These gateway timeouts have become even worse lately :-(
Why is it so difficult to get the problem fixed?
ID: 53468


©2026 CERN