Message boards : ATLAS application : Uploads of finished tasks not possible since last night

Erich56
Joined: 18 Dec 15
Posts: 1784
Credit: 117,025,369
RAC: 67,380
Message 33386 - Posted: 15 Dec 2017, 10:47:40 UTC - in response to Message 33384.  
Last modified: 15 Dec 2017, 10:48:39 UTC

The cleanup script is stuck due to load ...
which I can fully believe; that's why I was so surprised that even more new tasks (3257 are available right now, according to the info from the Server Status Page) were bumped into the mills - which is bound to be counter-productive in such a situation.
ID: 33386

Erich56
Joined: 18 Dec 15
Posts: 1784
Credit: 117,025,369
RAC: 67,380
Message 33387 - Posted: 15 Dec 2017, 13:06:50 UTC

The upload server has obviously been switched on again.
And, how nice, two of my finished tasks were uploaded. For the others, the same thing happens as on Tuesday, right after the problems began: the upload progresses to 100%, stays there for a while, and is then reset to 0.00% :-(
ID: 33387

maeax
Joined: 2 May 07
Posts: 2228
Credit: 173,749,323
RAC: 19,106
Message 33388 - Posted: 15 Dec 2017, 15:33:15 UTC

Could this be an HTTP problem behind the upload error?

http://boinc.berkeley.edu/dev/forum_thread.php?id=954
ID: 33388

Erich56
Joined: 18 Dec 15
Posts: 1784
Credit: 117,025,369
RAC: 67,380
Message 33389 - Posted: 15 Dec 2017, 15:38:24 UTC - in response to Message 33388.  

Could this be an HTTP problem?
Whatever the problem really is, stopping the server for some time helped only very briefly.
For hours now, the uploads have not been working again :-(
ID: 33389

csbyseti
Joined: 6 Jul 17
Posts: 22
Credit: 29,430,354
RAC: 0
Message 33390 - Posted: 15 Dec 2017, 16:27:35 UTC - in response to Message 33389.  

It's not a good idea to place new ATLAS tasks on the server.
Even the new ones placed today run into the upload failure.
So the problem will only get bigger.
ID: 33390

Jim1348
Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 33391 - Posted: 15 Dec 2017, 16:39:59 UTC

I finally aborted all four of my stuck ATLAS tasks (two on each of two machines). I didn't even get an acknowledgement back that they had been aborted.
I think they are dead to the world.
ID: 33391

Erich56
Joined: 18 Dec 15
Posts: 1784
Credit: 117,025,369
RAC: 67,380
Message 33392 - Posted: 15 Dec 2017, 16:47:52 UTC - in response to Message 33391.  

It's not a good idea to place new ATLAS tasks on the server.
Even the new ones placed today run into the upload failure.
So the problem will only get bigger.
I guess they want to reach the 25,000-task mark by all means, even knowing that this doesn't make any sense in the current situation :-(

I finally aborted all four of my stuck ATLAS tasks
I'll also switch to other projects, all the more so since most probably not much will happen over the upcoming weekend.
ID: 33392

Jim1348
Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 33394 - Posted: 15 Dec 2017, 17:30:27 UTC

Well, speak of the devil. After I aborted my ATLAS tasks (VBox), they sent me "2.52 ATLAS Simulation (native_mt)" for my Ubuntu machine.

It was worth the trouble.
ID: 33394

ritterm
Joined: 30 May 08
Posts: 93
Credit: 5,160,246
RAC: 0
Message 33396 - Posted: 15 Dec 2017, 21:38:58 UTC

Some progress, perhaps... I've managed to upload and report six of the eight tasks that were stuck uploading. The two that are still there were the first ones to get stuck, I believe.
ID: 33396

maeax
Joined: 2 May 07
Posts: 2228
Credit: 173,749,323
RAC: 19,106
Message 33397 - Posted: 15 Dec 2017, 22:24:54 UTC

+1
ID: 33397

mmonnin
Joined: 22 Mar 17
Posts: 60
Credit: 13,864,571
RAC: 34,778
Message 33398 - Posted: 15 Dec 2017, 22:43:27 UTC

47290 LHC@home 12/15/2017 5:42:38 PM Temporarily failed upload of 63gNDml8GirnDDn7oo6G73TpABFKDmABFKDmrUNKDmABFKDm1XIRLo_1_r1909933188_ATLAS_result: transient HTTP error

This one is still stuck for me. Others have uploaded since.
ID: 33398

Tom*
Joined: 11 Aug 11
Posts: 6
Credit: 16,714,519
RAC: 0
Message 33399 - Posted: 16 Dec 2017, 1:49:16 UTC

After all this, I now get this message:
12/15/2017 8:44:05 PM | LHC@home | [error] Error reported by file upload server: Server is out of disk space

What gives? Multiple new upload servers and no disk space?
ID: 33399

Erich56
Joined: 18 Dec 15
Posts: 1784
Credit: 117,025,369
RAC: 67,380
Message 33400 - Posted: 16 Dec 2017, 7:16:11 UTC - in response to Message 33399.  

Meanwhile, all my ATLAS tasks (the "old" ones as well as the new ones) are in backoff status. No matter how often I push the "retry now" button, they fall back to backoff within 1 or 2 seconds. So the whole upload system seems to have broken down by now. No wonder, though: despite the problems that have persisted since Tuesday, more and more new tasks were, and still are, being bumped into the pipeline (a procedure which I find very unprofessional).

Some of my "waiting" tasks have their deadline later today, so they will become invalid, and all those many hours of crunching time will have been for nothing :-(

Since it is the weekend now, I suspect that nothing will happen until Monday, or even later. So many more tasks from many volunteers will reach their deadline in the meantime and become invalid, and many more tasks will join the queue of tasks waiting for upload. Everyone can imagine the overload that is bound to follow.
Does any of this make sense?
ID: 33400

maeax
Joined: 2 May 07
Posts: 2228
Credit: 173,749,323
RAC: 19,106
Message 33401 - Posted: 16 Dec 2017, 7:35:17 UTC - in response to Message 33400.  

Erich,
your first ATLAS task to reach its deadline does so on Monday.
At the moment, the problem with ATLAS is the upload server.
Downloads are OK and running the tasks is OK. Last night some tasks did upload.
So, if we are lucky, someone at CERN IT is working today; otherwise it will be Monday.
ID: 33401

Brummig
Joined: 9 Feb 16
Posts: 48
Credit: 537,111
RAC: 0
Message 33403 - Posted: 16 Dec 2017, 8:34:36 UTC

I have been patient as requested, but I still can't upload:
16/12/2017 08:29:55 | LHC@home | Started upload of 4FYKDm9M7irnSu7Ccp2YYBZmABFKDmABFKDmStGKDmABFKDm3GgJWn_1_r494246443_ATLAS_result
16/12/2017 08:29:58 | LHC@home | [error] Error reported by file upload server: Server is out of disk space
16/12/2017 08:29:58 | LHC@home | Temporarily failed upload of 4FYKDm9M7irnSu7Ccp2YYBZmABFKDmABFKDmStGKDmABFKDm3GgJWn_1_r494246443_ATLAS_result: transient upload error
16/12/2017 08:29:58 | LHC@home | Backing off 04:46:25 on upload of 4FYKDm9M7irnSu7Ccp2YYBZmABFKDmABFKDmStGKDmABFKDm3GgJWn_1_r494246443_ATLAS_result
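For anyone who wants to see how long each stuck upload is being held back, here is a minimal Python sketch that pulls the "Backing off" entries out of event-log text. It assumes the `timestamp | project | message` layout seen in the lines above; the function name `parse_backoffs` and the regular expressions are illustrative and not part of BOINC, and other client versions or locales may format the log differently.

```python
import re
from datetime import timedelta

# Assumed layout, inferred only from the log lines quoted in this thread:
#   "<timestamp> | <project> | <message>"
LINE_RE = re.compile(r"^(?P<stamp>[^|]+?) \| (?P<project>[^|]+?) \| (?P<message>.*)$")
# Assumed message wording: "Backing off HH:MM:SS on upload of <file name>"
BACKOFF_RE = re.compile(r"^Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)$")


def parse_backoffs(log_text):
    """Return {file_name: timedelta} for every 'Backing off' line in log_text."""
    backoffs = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that does not fit the assumed layout
        backoff = BACKOFF_RE.match(match.group("message"))
        if backoff:
            hours, minutes, seconds, name = backoff.groups()
            # A later entry for the same file overwrites the earlier one.
            backoffs[name] = timedelta(
                hours=int(hours), minutes=int(minutes), seconds=int(seconds)
            )
    return backoffs


sample = (
    "16/12/2017 08:29:58 | LHC@home | [error] Error reported by file upload server: "
    "Server is out of disk space\n"
    "16/12/2017 08:29:58 | LHC@home | Backing off 04:46:25 on upload of "
    "4FYKDm9M7irnSu7Ccp2YYBZmABFKDmABFKDmStGKDmABFKDm3GgJWn_1_r494246443_ATLAS_result\n"
)

for name, delay in parse_backoffs(sample).items():
    print(name, "is backed off for", delay)
```

The sketch only reads text that has already been copied out of the event log; it does not talk to the BOINC client and does not retry anything.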
ID: 33403

David Cameron
Project administrator
Project developer
Project scientist
Joined: 13 May 14
Posts: 387
Credit: 15,314,184
RAC: 0
Message 33404 - Posted: 16 Dec 2017, 8:40:41 UTC - in response to Message 33400.  

We stopped submitting new tasks yesterday afternoon but there are still a few left in the queue.

The upload server got completely full so I'm cleaning it now - there is an automatic cleaning tool but it wasn't cleaning enough to handle the huge volumes of data we got this week.

It is clear we have reached the limits of the current infrastructure (thank you all for getting us to these limits :) , but early in the new year we will move to different filesystems which will handle this load much better.
ID: 33404

Brummig
Joined: 9 Feb 16
Posts: 48
Credit: 537,111
RAC: 0
Message 33405 - Posted: 16 Dec 2017, 9:24:36 UTC - in response to Message 33404.  
Last modified: 16 Dec 2017, 9:24:52 UTC

We'll do our best to make that fall over ASAP, I'm sure :)
Thank you for coming out on a Saturday morning with the drain cleaning rods.
It is clear we have reached the limits of the current infrastructure (thank you all for getting us to these limits :) , but early in the new year we will move to different filesystems which will handle this load much better.
ID: 33405

Erich56
Joined: 18 Dec 15
Posts: 1784
Credit: 117,025,369
RAC: 67,380
Message 33406 - Posted: 16 Dec 2017, 9:44:28 UTC - in response to Message 33401.  

Erich, your first ATLAS task to reach its deadline does so on Monday.
That's what is shown in the list here on the homepage. However, in the BOINC Manager, today's date is shown as the deadline. I have no idea how this discrepancy comes about, nor which date will end up being the valid one.

Last night some tasks did upload
None of mine, unfortunately :-(
ID: 33406

Jim1348
Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 33408 - Posted: 16 Dec 2017, 10:54:21 UTC
Last modified: 16 Dec 2017, 10:56:17 UTC

Good News: My first three native ATLAS tasks on my two Ubuntu machines ran properly and completed without error.
They averaged about 3 1/2 hours on two cores per work unit (i7-4770, i7-4790).

Bad News: They are stuck in upload also.

May the New Year bring us some joy.
ID: 33408

Erich56
Joined: 18 Dec 15
Posts: 1784
Credit: 117,025,369
RAC: 67,380
Message 33409 - Posted: 16 Dec 2017, 11:15:25 UTC - in response to Message 33404.  

David wrote:
The upload server got completely full so I'm cleaning it now
any results yet?
ID: 33409