Message boards : ATLAS application : Unable to upload an Atlas task

Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1418
Credit: 9,456,566
RAC: 2,066
Message 34546 - Posted: 5 Mar 2018, 9:51:12 UTC - in response to Message 34542.  

The upload problems for ATLAS tasks are back.

Yeah, 15 tasks are stuck uploading :-(

The number of running jobs is over 10,000 again, almost 11,000. Maybe that is why the upload problems are back?
ID: 34546
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2532
Credit: 253,788,415
RAC: 35,203
Message 34549 - Posted: 6 Mar 2018, 12:47:01 UTC

A couple of my ATLAS WUs have now been trying to upload for nearly 30 h.
Is anybody working on a (short-term) solution?
ID: 34549
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 455
Credit: 201,236,367
RAC: 27,752
Message 34550 - Posted: 6 Mar 2018, 13:00:04 UTC - in response to Message 34542.  

The upload problems for ATLAS tasks are back.

Yeah, 15 tasks are stuck uploading :-(

Meanwhile, 43 are stuck :-((


Supporting BOINC, a great concept !
ID: 34550
nairb

Joined: 1 May 07
Posts: 27
Credit: 2,336,992
RAC: 1
Message 34551 - Posted: 6 Mar 2018, 16:42:10 UTC

The file transfer reaches 100% and then fails. The client then tries to upload the entire file again and fails at 100% again... frustrating.
ID: 34551
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 34552 - Posted: 6 Mar 2018, 19:15:17 UTC - in response to Message 34551.  

Is it the same "HTTP transient error" that has been seen before? I see there is increased load on our file servers, but I'm not sure what is causing it. However, some results are getting through OK.
ID: 34552
maeax

Joined: 2 May 07
Posts: 2243
Credit: 173,902,375
RAC: 2,454
Message 34553 - Posted: 6 Mar 2018, 19:43:33 UTC - in response to Message 34552.  
Last modified: 6 Mar 2018, 19:44:34 UTC

File upload server error... file_upload_handler PID=-1

Some WUs have uploaded, one of them 30 min ago, but a lot are still waiting...
ID: 34553
mmonnin

Joined: 22 Mar 17
Posts: 63
Credit: 14,576,403
RAC: 6,872
Message 34554 - Posted: 6 Mar 2018, 23:06:27 UTC

1 for me as well:
3570k

6313 LHC@home 3/6/2018 5:53:12 PM Temporarily failed upload of j70KDm0NUCsnDDn7oo6G73TpABFKDmABFKDmSwKKDmABFKDmohi1Vn_2_r1676589227_ATLAS_result: transient upload error
ID: 34554
rbpeake

Joined: 17 Sep 04
Posts: 105
Credit: 32,824,853
RAC: 475
Message 34555 - Posted: 6 Mar 2018, 23:49:58 UTC

I have 10 waiting, some from as long as 2 days ago.

Here is the BOINC log:

3/6/2018 6:41:17 PM | LHC@home | Started upload of qzJODmxEjCsnSu7Ccp2YYBZmABFKDmABFKDmyQMKDmABFKDmUGlfOn_2_r1113909317_ATLAS_result
3/6/2018 6:41:17 PM | LHC@home | Started upload of vUaKDmlmrCsnSu7Ccp2YYBZmABFKDmABFKDmfhGKDmABFKDmEH2fzm_0_r963091061_ATLAS_result
3/6/2018 6:41:28 PM | LHC@home | [error] Error reported by file upload server: [vUaKDmlmrCsnSu7Ccp2YYBZmABFKDmABFKDmfhGKDmABFKDmEH2fzm_0_r963091061_ATLAS_result] locked by file_upload_handler PID=-1
3/6/2018 6:41:28 PM | LHC@home | Temporarily failed upload of vUaKDmlmrCsnSu7Ccp2YYBZmABFKDmABFKDmfhGKDmABFKDmEH2fzm_0_r963091061_ATLAS_result: transient upload error
3/6/2018 6:41:28 PM | LHC@home | Backing off 03:18:54 on upload of vUaKDmlmrCsnSu7Ccp2YYBZmABFKDmABFKDmfhGKDmABFKDmEH2fzm_0_r963091061_ATLAS_result
3/6/2018 6:41:29 PM | LHC@home | [error] Error reported by file upload server: [qzJODmxEjCsnSu7Ccp2YYBZmABFKDmABFKDmyQMKDmABFKDmUGlfOn_2_r1113909317_ATLAS_result] locked by file_upload_handler PID=-1
3/6/2018 6:41:29 PM | LHC@home | Temporarily failed upload of qzJODmxEjCsnSu7Ccp2YYBZmABFKDmABFKDmyQMKDmABFKDmUGlfOn_2_r1113909317_ATLAS_result: transient upload error
3/6/2018 6:41:29 PM | LHC@home | Backing off 05:35:19 on upload of qzJODmxEjCsnSu7Ccp2YYBZmABFKDmABFKDmyQMKDmABFKDmUGlfOn_2_r1113909317_ATLAS_result

Regards,
Bob P.
ID: 34555
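
The log excerpts quoted in this thread follow a recognisable pattern, so the affected files can be pulled out of a saved event log automatically. What follows is a minimal Python sketch and not part of BOINC: it assumes the log has been saved as plain text, and the function name `summarize_upload_errors` is made up for illustration. It matches only the two message shapes quoted above (the `locked by file_upload_handler` error and the `Backing off HH:MM:SS` line) and ignores the timestamp prefix, which differs between client locales.

```python
import re
from datetime import timedelta

# Patterns copied from the message texts quoted in this thread.
LOCKED_RE = re.compile(
    r"Error reported by file upload server: "
    r"\[(?P<name>[^\]]+)\] locked by file_upload_handler PID=(?P<pid>-?\d+)"
)
BACKOFF_RE = re.compile(
    r"Backing off (?P<h>\d+):(?P<m>\d{2}):(?P<s>\d{2}) on upload of (?P<name>\S+)"
)


def summarize_upload_errors(lines):
    """Return {file name: {"locked": count, "backoff": timedelta or None}}."""
    summary = {}
    for line in lines:
        locked = LOCKED_RE.search(line)
        if locked:
            entry = summary.setdefault(
                locked["name"], {"locked": 0, "backoff": None}
            )
            entry["locked"] += 1
            continue
        backoff = BACKOFF_RE.search(line)
        if backoff:
            entry = summary.setdefault(
                backoff["name"], {"locked": 0, "backoff": None}
            )
            # Keep the most recent backoff interval seen for this file.
            entry["backoff"] = timedelta(
                hours=int(backoff["h"]),
                minutes=int(backoff["m"]),
                seconds=int(backoff["s"]),
            )
    return summary
```

Passing an open text file (for example a saved copy of the event log under a hypothetical name such as `boinc_log.txt`) works because the function only iterates over lines. The result makes it easy to list which result files are still reported as locked when posting here.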
David Cameron
Project administrator
Project developer
Project scientist

Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 34556 - Posted: 7 Mar 2018, 8:09:14 UTC

Are things better now? The file server load has gone back to normal overnight and the half-uploaded files which cause the "locked by file_upload_handler" messages should have been cleaned.
ID: 34556
maeax

Joined: 2 May 07
Posts: 2243
Credit: 173,902,375
RAC: 2,454
Message 34557 - Posted: 7 Mar 2018, 8:17:37 UTC
Last modified: 7 Mar 2018, 8:18:31 UTC

Since yesterday morning, only ONE ATLAS task has downloaded.
The uploads are always waiting. After a manual retry, they fall back to waiting.

Edit: ATLAS on -dev runs normally with the new storage backend!
ID: 34557
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2532
Credit: 253,788,415
RAC: 35,203
Message 34558 - Posted: 7 Mar 2018, 8:47:58 UTC - in response to Message 34556.  

David Cameron wrote:
Are things better now?

Partly.

David Cameron wrote:
... and the half-uploaded files which cause the "locked by file_upload_handler" messages should have been cleaned.

Unfortunately not all of them:
Wed 07 Mar 2018 09:25:46 CET | LHC@home | Started upload of fkHKDmW16CsnDDn7oo6G73TpABFKDmABFKDmXfLKDmABFKDmcutGYo_0_r629916232_ATLAS_result
Wed 07 Mar 2018 09:25:48 CET | LHC@home | [error] Error reported by file upload server: [fkHKDmW16CsnDDn7oo6G73TpABFKDmABFKDmXfLKDmABFKDmcutGYo_0_r629916232_ATLAS_result] locked by file_upload_handler PID=-1
Wed 07 Mar 2018 09:25:48 CET | LHC@home | Temporarily failed upload of fkHKDmW16CsnDDn7oo6G73TpABFKDmABFKDmXfLKDmABFKDmcutGYo_0_r629916232_ATLAS_result: transient upload error
Wed 07 Mar 2018 09:25:48 CET | LHC@home | Backing off 03:22:08 on upload of fkHKDmW16CsnDDn7oo6G73TpABFKDmABFKDmXfLKDmABFKDmcutGYo_0_r629916232_ATLAS_result

Also affected by the same type of error:
GeoNDmDJOCsnSu7Ccp2YYBZmABFKDmABFKDmvALKDmABFKDmuGxLLo_0_r931142306_ATLAS_result
h4eLDmFDLDsnSu7Ccp2YYBZmABFKDmABFKDm0aIKDmABFKDm5bw40n_0_r12622427_ATLAS_result
sGvNDmXMqCsnDDn7oo6G73TpABFKDmABFKDm3eFKDmABFKDmaYr87m_0_r1115052517_ATLAS_result
79MLDm9DJDsnDDn7oo6G73TpABFKDmABFKDm71HKDmABFKDmmbpShm_2_r233222736_ATLAS_result


It may be necessary to run the cleanup script more frequently.
When do you expect the new storage backend will be activated?

Thanks for taking care.
ID: 34558
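
As an illustration of what a more frequent cleanup pass might look like, here is a small Python sketch. It is not the project's actual cleanup script, which is not shown in this thread. It assumes that half-uploaded files are ordinary files under a single upload directory and that "stale" can be approximated by "not modified for some number of hours"; the function names, the directory and the threshold are all hypothetical.

```python
import os
import time


def find_stale_files(upload_dir, max_age_hours, now=None):
    """List files under upload_dir whose mtime is older than max_age_hours."""
    now = time.time() if now is None else now
    cutoff = now - max_age_hours * 3600
    stale = []
    for root, _dirs, files in os.walk(upload_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                if os.path.getmtime(path) < cutoff:
                    stale.append(path)
            except OSError:
                # The file vanished between listing and stat; skip it.
                continue
    return sorted(stale)


def remove_files(paths, dry_run=True):
    """Return the paths that were (or, in a dry run, would be) deleted."""
    handled = []
    for path in paths:
        if not dry_run:
            os.remove(path)
        handled.append(path)
    return handled
```

The dry run is the default on purpose: a file that looks stale by age alone may still belong to a slow but live upload, so a real script would need a safer criterion than modification time before deleting anything.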
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 1418
Credit: 9,456,566
RAC: 2,066
Message 34559 - Posted: 7 Mar 2018, 11:51:49 UTC - in response to Message 34556.  
Last modified: 7 Mar 2018, 11:52:08 UTC

Are things better now?

I had, and still have, only one task in "Upload pending" status. I have not requested more ATLAS tasks until this is fixed, but retrying the upload does not work:

LHC@home 07 Mar 12:47:56 [error] Error reported by file upload server: [lFvLDmDP1CsnDDn7oo6G73TpABFKDmABFKDmsUJKDmABFKDmwnH64m_0_r894275464_ATLAS_result] locked by file_upload_handler PID=-1
LHC@home 07 Mar 12:47:56 Temporarily failed upload of lFvLDmDP1CsnDDn7oo6G73TpABFKDmABFKDmsUJKDmABFKDmwnH64m_0_r894275464_ATLAS_result: transient upload error
ID: 34559
thomasroderick

Joined: 22 May 17
Posts: 15
Credit: 1,226,011
RAC: 29
Message 34560 - Posted: 7 Mar 2018, 15:32:26 UTC - in response to Message 34556.  

Yes, better now. I had two tasks stuck for over 24 hours, and this morning when I booted the machine, both of them immediately completed their upload and transferred. Thank you!
ID: 34560
grumpy

Joined: 1 Sep 04
Posts: 57
Credit: 2,835,005
RAC: 0
Message 34562 - Posted: 8 Mar 2018, 0:37:57 UTC

2018-03-07 5:40:17 PM | LHC@home | Temporarily failed upload of su3MDmEYCCsnSu7Ccp2YYBZmABFKDmABFKDmnbGKDmABFKDm1owStm_0_r1889291022_ATLAS_result: transient upload error
ID: 34562
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2532
Credit: 253,788,415
RAC: 35,203
Message 34563 - Posted: 8 Mar 2018, 9:53:52 UTC - in response to Message 34558.  

Finally, all pending uploads have finished and been reported.
ID: 34563
maeax

Joined: 2 May 07
Posts: 2243
Credit: 173,902,375
RAC: 2,454
Message 34564 - Posted: 8 Mar 2018, 10:57:36 UTC - in response to Message 34563.  

+1 :-)
ID: 34564
mmonnin

Joined: 22 Mar 17
Posts: 63
Credit: 14,576,403
RAC: 6,872
Message 34587 - Posted: 12 Mar 2018, 10:42:16 UTC

Another locked:
316316 LHC@home 3/12/2018 6:41:14 AM [error] Error reported by file upload server: [E8dMDmPJSFsnDDn7oo6G73TpABFKDmABFKDmquKKDmABFKDmkyA17m_0_r860623117_ATLAS_result] locked by file_upload_handler PID=-1
ID: 34587
maeax

Joined: 2 May 07
Posts: 2243
Credit: 173,902,375
RAC: 2,454
Message 34592 - Posted: 12 Mar 2018, 14:53:36 UTC

same here.
ID: 34592
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2532
Credit: 253,788,415
RAC: 35,203
Message 34593 - Posted: 12 Mar 2018, 15:11:51 UTC - in response to Message 34587.  

Same here.
ID: 34593
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2532
Credit: 253,788,415
RAC: 35,203
Message 34598 - Posted: 12 Mar 2018, 18:06:58 UTC - in response to Message 34593.  

My uploads are back to normal.
May have been just a glitch.
ID: 34598

©2024 CERN