Message boards : ATLAS application : Uploads of finished tasks not possible since last night

Erich56

Joined: 18 Dec 15
Posts: 1821
Credit: 118,983,656
RAC: 18,514
Message 33289 - Posted: 13 Dec 2017, 5:53:47 UTC

Since last night, uploads of finished tasks have been getting stuck in "backoff" status.
Is there a problem with the ATLAS upload server?
ID: 33289
csbyseti

Joined: 6 Jul 17
Posts: 22
Credit: 29,430,354
RAC: 0
Message 33290 - Posted: 13 Dec 2017, 6:58:53 UTC - in response to Message 33289.  

Got the same problem: upload goes to 100% but will not finish.
This morning I first thought it was a problem caused by last night's Windows 10 update.
ID: 33290
Erich56

Joined: 18 Dec 15
Posts: 1821
Credit: 118,983,656
RAC: 18,514
Message 33291 - Posted: 13 Dec 2017, 7:09:20 UTC - in response to Message 33290.  

upload goes to 100% but will not finish.
Here, the progress bar in the BOINC Manager shows 0.00% upload :-(
ID: 33291
csbyseti

Joined: 6 Jul 17
Posts: 22
Credit: 29,430,354
RAC: 0
Message 33292 - Posted: 13 Dec 2017, 7:58:25 UTC - in response to Message 33291.  

Tested with PrimeGrid: no upload problem.

With my telephone DSL line (600 kbit/s upload speed), the upload won't start.
With the cable line (10 Mbit/s upload speed), the upload starts and goes to 100% but will not finish.
When BOINC restarts the upload, the upload will be reset and start at 0%.
Upload speed seems to be OK: it is limited to 200 KB/s for each BOINC session, so 2 upload tasks per session run at 100 KB/s each.
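
As a rough check of the arithmetic above, here is a minimal sketch. It assumes the 200 KB limit is a per-second cap shared evenly between concurrent uploads. The function names and the 50 MB file size are illustrative assumptions, not values taken from BOINC or ATLAS.

```python
def per_transfer_rate(limit_kb_per_s: float, concurrent_uploads: int) -> float:
    """Even share of a total upload cap across concurrent transfers."""
    if concurrent_uploads < 1:
        raise ValueError("need at least one upload")
    return limit_kb_per_s / concurrent_uploads


def upload_seconds(file_size_kb: float, rate_kb_per_s: float) -> float:
    """Time to push one file at a steady rate (ignores protocol overhead)."""
    return file_size_kb / rate_kb_per_s


rate = per_transfer_rate(200, 2)            # two uploads sharing a 200 KB/s cap
seconds = upload_seconds(50 * 1024, rate)   # hypothetical 50 MB result file
print(f"{rate:.0f} KB/s each, about {seconds / 60:.1f} min per file")
```

At these rates a transfer that reaches 100% was not starved for bandwidth, which fits the observation that the stall happens at the finishing step rather than during the transfer.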
ID: 33292
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 33293 - Posted: 13 Dec 2017, 8:12:00 UTC
Last modified: 13 Dec 2017, 8:15:21 UTC

I have one that just got stuck too. After two automatic upload attempts, I tried a manual retry. It started out at 350 Kbps and slowed down to 200 Kbps at the end, and finally stuck at 100%. So it is not a network problem.
EDIT: Finally, after sitting at 100% for a minute or two, it completed and has been reported. It is now waiting for validation.
ID: 33293
Erich56

Joined: 18 Dec 15
Posts: 1821
Credit: 118,983,656
RAC: 18,514
Message 33294 - Posted: 13 Dec 2017, 8:24:35 UTC - in response to Message 33292.  

When BOINC restarts the upload, the upload will be reset and start at 0%
Same here. I just noticed the following: an upload which was in "retry" status started again, very slowly, with many interruptions in between. Then it stopped at 99.73% with the status reverting to "retry", and after it restarted from there, the progress fell back to 0.00%. And there it stopped again.
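
The fail, wait, retry cycle described above is typical of clients that back off between attempts. Below is a minimal sketch of exponential backoff with jitter. It is a generic illustration and not BOINC's actual algorithm; the base delay, the cap, and the function name are hypothetical.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, base_s: float = 60.0, cap_s: float = 4 * 3600.0,
                  rng: Optional[random.Random] = None) -> float:
    """Delay before retry number `attempt` (1-based): doubling ceiling, capped, with jitter."""
    if attempt < 1:
        raise ValueError("attempt is 1-based")
    rng = rng or random.Random()
    ceiling = min(cap_s, base_s * 2 ** (attempt - 1))
    # "Full jitter": pick uniformly in [0, ceiling] so clients do not retry in lockstep.
    return rng.uniform(0.0, ceiling)


rng = random.Random(0)  # seeded only to make the demo repeatable
for attempt in range(1, 6):
    print(attempt, round(backoff_delay(attempt, rng=rng)))
```

With a scheme like this, repeated failures push the next attempt further out, which would explain uploads sitting in "retry" for a long time while the server is overloaded.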
ID: 33294
Erich56

Joined: 18 Dec 15
Posts: 1821
Credit: 118,983,656
RAC: 18,514
Message 33298 - Posted: 13 Dec 2017, 9:27:16 UTC

It would be interesting to know how many users are affected by this problem.
Only a very few? Or more, with many of them not having noticed it?
ID: 33298
AuxRx

Joined: 16 Sep 17
Posts: 100
Credit: 1,618,469
RAC: 0
Message 33299 - Posted: 13 Dec 2017, 9:52:52 UTC
Last modified: 13 Dec 2017, 9:56:06 UTC

Same issue here. But, like the download issue, the upload issues are erratic. One WU finished and checked out, others didn't.

My BOINC log says (paraphrasing) "Project Communication failed ... Internet Access OK. Must be server side. Retrying later".

EDIT: Looking at the status page you can now see a lot of WUs are queuing. A lot of volunteers should be experiencing the same issue.
ID: 33299
Erich56

Joined: 18 Dec 15
Posts: 1821
Credit: 118,983,656
RAC: 18,514
Message 33301 - Posted: 13 Dec 2017, 10:30:49 UTC - in response to Message 33300.  
Last modified: 13 Dec 2017, 10:31:08 UTC

Temporarily failed upload of ...ATLAS_result: transient HTTP error
I have now checked the BOINC event log. The first such entry here was last night at 00:22:59 (UTC+1).
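
For anyone who wants to find the first such entry without scrolling, here is a minimal sketch. It assumes event log lines look like "13-Dec-2017 00:22:59 [Project] message", and the sample lines are made up for illustration, not copied from a real log. Month abbreviations are parsed with `%b`, so an English locale is assumed.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Assumed line layout: "13-Dec-2017 00:22:59 [Project] message text"
LINE_RE = re.compile(r"^(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2}) \[(.*?)\] (.*)$")


def first_failed_upload(lines: Iterable[str]) -> Optional[datetime]:
    """Return the timestamp of the first 'Temporarily failed upload' entry, if any."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match and match.group(3).startswith("Temporarily failed upload"):
            return datetime.strptime(match.group(1), "%d-%b-%Y %H:%M:%S")
    return None


# Made-up sample lines for illustration only; file names are hypothetical.
sample = [
    "12-Dec-2017 23:50:10 [LHC@home] Started upload of example_result_0",
    "13-Dec-2017 00:22:59 [LHC@home] Temporarily failed upload of example_result_0: transient HTTP error",
    "13-Dec-2017 00:23:00 [LHC@home] Backing off before next upload attempt",
]
print(first_failed_upload(sample))
```

To run it against a real log, pass an open file object instead of `sample`; the regular expression may need adjusting if your client writes lines differently.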

I was just able to observe another attempt at sending a task: the progress bar proceeds to 100%, then the upload stops, and a few minutes later it's back at 0.00%. Always the same game.
I'm wondering whether no one at CERN has found out about this problem yet. Are there no bells around that would start ringing if something like this happens?

The bad thing is that as long as uploads don't get finished, no new tasks are being downloaded :-(
ID: 33301
Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester

Joined: 15 Jul 05
Posts: 249
Credit: 5,974,599
RAC: 0
Message 33302 - Posted: 13 Dec 2017, 10:36:50 UTC - in response to Message 33289.  

One of our file servers crashed during the night, and the fail-over server took over. Should be back to normal soon.

There is an increase of load with the current ATLAS simulation campaign, so we will add another file server to increase capacity.
ID: 33302
Erich56

Joined: 18 Dec 15
Posts: 1821
Credit: 118,983,656
RAC: 18,514
Message 33305 - Posted: 13 Dec 2017, 10:58:30 UTC - in response to Message 33302.  

Nils, many thanks for the information :-)
ID: 33305
mmonnin

Joined: 22 Mar 17
Posts: 64
Credit: 14,576,403
RAC: 858
Message 33306 - Posted: 13 Dec 2017, 11:04:42 UTC

Thanks for the update. I have one stuck as well.

Theory and LHCb are uploading on the same PC for me.
ID: 33306
csbyseti

Joined: 6 Jul 17
Posts: 22
Credit: 29,430,354
RAC: 0
Message 33311 - Posted: 13 Dec 2017, 13:15:48 UTC - in response to Message 33302.  

.....There is an increase of load with the current ATLAS simulation campaign, so we will add another file server to increase capacity.


Thanks for the information.

I hope the new file server will be working soon; at the moment all uploads get stuck at 100% and do not finish.
Interesting that the upload itself works but the finishing step does not. It seems the database entry is blocked.

My WU cache will be empty in 10 hours, and it's snowing...
ID: 33311
Erich56

Joined: 18 Dec 15
Posts: 1821
Credit: 118,983,656
RAC: 18,514
Message 33312 - Posted: 13 Dec 2017, 14:22:42 UTC - in response to Message 33302.  

Nils Høimyr wrote at 10:36 hrs:

One of our file servers crashed during the night, and the fail-over server took over. Should be back to normal soon.
There is an increase of load with the current ATLAS simulation campaign, so we will add another file server to increase capacity.
Nils, any idea when you will be back to normal?
At this point, uploads are still not working.
ID: 33312
Erich56

Joined: 18 Dec 15
Posts: 1821
Credit: 118,983,656
RAC: 18,514
Message 33314 - Posted: 13 Dec 2017, 17:10:23 UTC - in response to Message 33312.  

So far, none of my finished ATLAS tasks could be uploaded successfully.
What is the situation for the other crunchers who reported the same problem here during the day?
Are your finished tasks also still waiting, or have they been uploaded in the meantime?
ID: 33314
maeax

Joined: 2 May 07
Posts: 2244
Credit: 173,902,375
RAC: 307
Message 33315 - Posted: 13 Dec 2017, 17:33:20 UTC
Last modified: 13 Dec 2017, 18:32:47 UTC

The first task uploaded successfully at 17:00 UTC.
https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=81710691

Thank you, Nils and team.
ID: 33315
hsdecalc

Joined: 26 Jan 15
Posts: 10
Credit: 6,517,680
RAC: 0
Message 33316 - Posted: 13 Dec 2017, 18:35:49 UTC

I still have four tasks that have not been uploaded. The upload stops at 100% or less, or does not start. No upload at the moment.
ID: 33316
Erich56

Joined: 18 Dec 15
Posts: 1821
Credit: 118,983,656
RAC: 18,514
Message 33317 - Posted: 13 Dec 2017, 19:05:18 UTC - in response to Message 33316.  

I still have four tasks that have not been uploaded. The upload stops at 100% or less, or does not start. No upload at the moment.
Same situation here!
Nils, is someone out there taking care of this problem?
ID: 33317
AuxRx

Joined: 16 Sep 17
Posts: 100
Credit: 1,618,469
RAC: 0
Message 33318 - Posted: 13 Dec 2017, 20:58:42 UTC - in response to Message 33317.  

It's still hit and miss for me too. The assurance that the issue is being worked on is all I wanted really. It could have come a day earlier when downloads started failing, but ... oh well.

You can always check the server status page to see if the problem has been resolved. The WUs in question remain stuck "in progress". Once that number falls to reasonable levels you can assume the problem has been fixed for most users.

On the upside: time for maintenance!
ID: 33318
Erich56

Joined: 18 Dec 15
Posts: 1821
Credit: 118,983,656
RAC: 18,514
Message 33319 - Posted: 13 Dec 2017, 21:06:17 UTC - in response to Message 33318.  

It's still hit and miss for me too. The assurance that the issue is being worked on is all I wanted really.
Well, the fact is: the whole day is over, and nothing has happened :-(
So, I am not so sure that the issue has been worked on.
ID: 33319
©2024 CERN