hits file upload fails immediately
Author | Message |
---|---|
Joined: 15 Jun 08 Posts: 2534 Credit: 254,022,666 RAC: 46,571 |
Let them finish and upload the smaller logfile. When the huge logfile upload gets stuck, cancel that upload (only the upload, not the task!). That way you may get credits for the task (worked for 2 of them from my hosts that got stuck recently). The scientific work gets lost but may be rescheduled by the backend systems. |
Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,652 |
Are those upload files compressed before upload? |
Joined: 15 Jul 05 Posts: 248 Credit: 5,974,599 RAC: 0 |
We have added swap on most of the file servers, so hopefully they should finish uploads and not crash. |
Joined: 4 Jul 06 Posts: 7 Credit: 339,475 RAC: 0 |
Still not working on my 1.37GB file. I'll likely try canceling the upload tomorrow as suggested, if it still doesn't work. |
Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 1,652 |
"Fehler Paket abgebrochen" (German, roughly "Error: work unit aborted") is shown in the details of the task. Over the last few days I saw, as wingman, some tasks that were canceled by the server, so I think your task is one of them too. |
Joined: 15 Jul 05 Posts: 248 Credit: 5,974,599 RAC: 0 |
Sorry for this, we have added more file servers, so hopefully the upload for those tasks should work now if you retry. |
Joined: 14 Sep 08 Posts: 52 Credit: 63,618,302 RAC: 29,267 |
Still failing the same way, and still only for those 1.4 GB uploads, while the smaller ones upload just fine. I saw that some WUs were aborted from the server side two days ago. Example: https://lhcathome.cern.ch/lhcathome/result.php?resultid=407100028. Does that mean those WUs were generated by mistake and have no science value anyway? If that's the case for all the other big uploads, I feel we might as well just abort them. Personally I don't care much about credits if the results aren't meaningful anyway. Losing them is better than crashing the upload server all the time. |
Joined: 6 Sep 08 Posts: 118 Credit: 12,573,537 RAC: 1,238 |
Failing here, too. Using a local proxy. This is an extract from the BOINC log. Given the "Payload Too Large" error, does it look as if a definite limit is being exceeded, rather than something running out of memory?
[http] [ID#12] Sent header to server: Content-Type: application/x-www-form-urlencoded
[http] [ID#12] Sent header to server: Accept-Language: en_GB
[http] [ID#12] Sent header to server: Content-Length: 1486489603
[http] [ID#12] Sent header to server: Expect: 100-continue
[http] [ID#12] Sent header to server:
[http] [ID#12] Received header from server: HTTP/1.1 413 Payload Too Large
[http] [ID#12] Received header from server: Date: Fri, 08 Mar 2024 00:21:32 GMT
[http] [ID#12] Received header from server: Server: Apache
[http] [ID#12] Received header from server: Content-Type: text/html; charset=iso-8859-1
[http] [ID#12] Received header from server: X-Cache: MISS from Teec00
[http] [ID#12] Received header from server: X-Cache-Lookup: MISS from Teec00:3128
[http] [ID#12] Received header from server: Transfer-Encoding: chunked
[http] [ID#12] Received header from server: Via: 1.1 Teec00 (squid/3.5.12)
[http] [ID#12] Received header from server: Connection: keep-alive
[http] [ID#12] Info: HTTP error before end of send, stop sending
[http] [ID#12] Received header from server:
[http_xfer] [ID#12] HTTP: wrote 316 bytes
[http] [ID#12] Info: Closing connection 3
[file_xfer] http op done; retval -224 (permanent HTTP error)
[file_xfer] file transfer status -224 (permanent HTTP error)
Backing off 05:58:26 on upload of ID3NDmgTuz4np2BDcpmwOghnABFKDmABFKDm73LSDmv4hKDmP85t6n_1_r717556734_ATLAS_hits |
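For anyone comparing extracts like the one above, here is a minimal Python sketch that pulls the announced upload size and the returned HTTP status out of the `[http]` log lines. The function name `summarize_upload` and the two regular expressions are made up for illustration and are not part of BOINC; the sample lines are copied from the log in this post.

```python
import re

# Request size announced by the client in its Content-Length header.
CONTENT_LENGTH = re.compile(r"Sent header to server: Content-Length: (\d+)")
# HTTP status line returned by the server (or by a proxy in between).
STATUS_LINE = re.compile(r"Received header from server: HTTP/\d(?:\.\d)? (\d{3}) (.*)")


def summarize_upload(log_lines):
    """Return (content_length, status_code, reason) for one upload attempt.

    If several status lines appear (e.g. "100 Continue" followed by a final
    status), the last one seen is reported.
    """
    content_length = None
    status_code = None
    reason = None
    for line in log_lines:
        match = CONTENT_LENGTH.search(line)
        if match:
            content_length = int(match.group(1))
            continue
        match = STATUS_LINE.search(line)
        if match:
            status_code = int(match.group(1))
            reason = match.group(2).strip()
    return content_length, status_code, reason


# Lines copied from the log extract in this post.
sample = [
    "[http] [ID#12] Sent header to server: Content-Length: 1486489603",
    "[http] [ID#12] Sent header to server: Expect: 100-continue",
    "[http] [ID#12] Received header from server: HTTP/1.1 413 Payload Too Large",
    "[http] [ID#12] Received header from server: Via: 1.1 Teec00 (squid/3.5.12)",
]

size, code, reason = summarize_upload(sample)
print(f"{size} bytes ({size / 2**30:.2f} GiB) -> HTTP {code} {reason}")
```

A 413 status means the request body was rejected for its size by whichever hop answered, which points to a configured limit rather than a memory problem. The status line alone does not say whether the origin server or the proxy produced it.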
Joined: 6 Sep 08 Posts: 118 Credit: 12,573,537 RAC: 1,238 |
I ran out of editing time... I have changed the "client_request_buffer_max_size" setting in squid.conf to 1500 MB (it was previously set to 10240 KB); see here, which applies to much later squid versions than the one in use here, but may be relevant. A log extract:
Fri 08 Mar 2024 13:58:16 GMT | LHC@home | [http] [ID#17] Sent header to server: Accept-Language: en_GB
Fri 08 Mar 2024 13:58:16 GMT | LHC@home | [http] [ID#17] Sent header to server: Content-Length: 1483171612
Fri 08 Mar 2024 13:58:16 GMT | LHC@home | [http] [ID#17] Sent header to server: Expect: 100-continue
Fri 08 Mar 2024 13:58:16 GMT | LHC@home | [http] [ID#17] Sent header to server:
Fri 08 Mar 2024 13:58:16 GMT | LHC@home | [http] [ID#17] Received header from server: HTTP/1.1 100 Continue
Fri 08 Mar 2024 13:58:16 GMT | LHC@home | [http] [ID#17] Received header from server: Connection: keep-alive
Fri 08 Mar 2024 13:58:28 GMT | LHC@home | [http] [ID#17] Info: Recv failure: Connection reset by peer
Fri 08 Mar 2024 13:58:28 GMT | LHC@home | [http] [ID#17] Info: Closing connection 7
Fri 08 Mar 2024 13:58:28 GMT | LHC@home | [http] HTTP error: Failure when receiving data from the peer
Fri 08 Mar 2024 13:58:29 GMT | | Project communication failed: attempting access to reference site
Fri 08 Mar 2024 13:58:29 GMT | | [http] HTTP_OP::init_get(): http://www.google.com/
Fri 08 Mar 2024 13:58:29 GMT | LHC@home | [file_xfer] http op done; retval -184 (transient HTTP error)
Fri 08 Mar 2024 13:58:29 GMT | LHC@home | [file_xfer] file transfer status -184 (transient HTTP error)
Fri 08 Mar 2024 13:58:29 GMT | LHC@home | Temporarily failed upload of ID3NDmgTuz4np2BDcpmwOghnABFKDmABFKDm73LSDmv4hKDmP85t6n_1_r717556734_ATLAS_hits: transient HTTP error
Fri 08 Mar 2024 13:58:29 GMT | LHC@home | Backing off 04:35:21 on upload of ID3NDmgTuz4np2BDcpmwOghnABFKDmABFKDm73LSDmv4hKDmP85t6n_1_r717556734_ATLAS_hits
This looks completely different, although the upload still fails. |
Joined: 15 Jul 05 Posts: 248 Credit: 5,974,599 RAC: 0 |
More recent web servers may have a default limit of 1 GB, while there was no limit in the past. We have increased the limit to 2 GB on our latest file servers, so uploads should normally work on the next attempt. However, if the squids are limited too, you might be blocked earlier. |
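To make the numbers in this thread concrete, here is a small Python sketch that compares the two upload sizes from the logs with the limits mentioned so far. The sizes come from the Content-Length headers quoted earlier. Treating KB/MB/GB as binary units, and reading the raised server limit as 2 GB, are assumptions; the helper `fits` is only for illustration.

```python
# Upload sizes in bytes, copied from the Content-Length headers in the logs.
UPLOADS = {"ID#12": 1486489603, "ID#17": 1483171612}

KIB, MIB, GIB = 2**10, 2**20, 2**30

# Limits mentioned in the thread. Binary units are an assumption, and the
# "2 GB" value for the raised server limit is an assumed reading of the post.
LIMITS = {
    "squid buffer, old (10240 KB)": 10240 * KIB,
    "squid buffer, new (1500 MB)": 1500 * MIB,
    "web server default (1 GB)": 1 * GIB,
    "file server, raised (2 GB)": 2 * GIB,
}


def fits(size_bytes, limit_bytes):
    """Return True if an upload of size_bytes is within limit_bytes."""
    return size_bytes <= limit_bytes


for name, limit in LIMITS.items():
    for upload_id, size in UPLOADS.items():
        verdict = "fits" if fits(size, limit) else "too large"
        print(f"{upload_id}: {size} bytes vs {name} = {limit} bytes: {verdict}")
```

Under these assumptions both uploads exceed the old proxy buffer and a 1 GB server default, and both fit within 1500 MB and 2 GB. The outcome is the same if the units are read as decimal (10^6 and 10^9 bytes).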
Joined: 14 Sep 08 Posts: 52 Credit: 63,618,302 RAC: 29,267 |
Thank you for fixing this. My pending uploads started draining a few hours ago. Cheers. |
©2024 CERN