Message boards : ATLAS application : Download failures
bronco

Message 35840 - Posted: 9 Jul 2018, 12:45:02 UTC - in response to Message 35837.  
Last modified: 9 Jul 2018, 12:45:54 UTC

There doesn't seem to be a fundamental problem. Internally everything seems to work fine. This result uploaded 143M in 14s. We are monitoring the transfers and will post some details on failures shortly.

Check this host and this host. There is a problem. If your monitoring tools aren't seeing a problem then, with all due respect, your monitoring tools are broken.
ID: 35840
maeax

Message 35845 - Posted: 10 Jul 2018, 4:24:04 UTC

Is there a permission issue with the scheduler?
For three days now, I have had 87 download errors with an HTTP error!
It needs some investigation, please.
ID: 35845
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Message 35846 - Posted: 10 Jul 2018, 5:49:22 UTC

Error rate "download error" (last 24 h)

ATLAS: 62 %
other projects: 0 %
ID: 35846
David Cameron
Project administrator
Project developer
Project scientist

Message 35848 - Posted: 10 Jul 2018, 7:04:46 UTC - in response to Message 35846.  

There are actually two problems at the moment:

- The instant download failures are caused by the database problem we had last week, which wiped all running and recently completed WUs. Basically, BOINC is generating new WUs for tasks which had already finished before the database accident, and for those tasks the input was already deleted. These are relatively harmless since the failure is instant.

- The stalled transfer problem, where downloads start and then at some point stop working. This problem is worse because it can be many hours before the transfers succeed. We currently don't know the reason for this problem, but we are investigating.
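
For illustration only, the distinction between the two failure modes above can be sketched in Python. This is not the project's monitoring code: the `Transfer` record, the `classify` function, and both thresholds (`instant_s`, `stall_bytes_per_s`) are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transfer:
    """Hypothetical snapshot of one download, as a client might observe it."""
    seconds_elapsed: float
    bytes_received: int
    error: Optional[str] = None  # e.g. "permanent HTTP error"; None if no error was reported


def classify(t: Transfer, instant_s: float = 5.0, stall_bytes_per_s: float = 1000.0) -> str:
    """Label a transfer using assumed thresholds (not values from the project)."""
    if t.error is not None:
        # Downloads whose input was already deleted fail almost immediately.
        return "instant failure" if t.seconds_elapsed <= instant_s else "late failure"
    rate = t.bytes_received / t.seconds_elapsed if t.seconds_elapsed > 0 else 0.0
    # No error reported, but hardly any data is arriving: treat as stalled.
    return "stalled" if rate < stall_bytes_per_s else "progressing"


print(classify(Transfer(2.0, 0, "permanent HTTP error")))
print(classify(Transfer(7200.0, 1_000_000)))
```

With these assumed thresholds, a download that gives up after 2 s is labelled an instant failure, while one that has received about 1 MB in two hours without reporting an error is labelled stalled.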
ID: 35848
Magic Quantum Mechanic

Message 35849 - Posted: 10 Jul 2018, 7:20:01 UTC - in response to Message 35848.  

Yes David, as you probably know, the ATLAS multi-cores are running with no problems over at -dev on Linux and Windows with ATLAS Simulation v0.50 (native_mt) x86_64-pc-linux-gnu and v0.51 (vbox64_mt_mcore_atlas) windows_x86_64, so 1.01 is having a problem at the server, since none of the OS versions work here right now.

I only tried it once to see if it was all OS's and what part of the planet since that is a problem once in a while (server hand shaking)

But Theory, SixTrack and LHCb work here fine.
ID: 35849
Jim1348

Message 35850 - Posted: 10 Jul 2018, 7:55:15 UTC - in response to Message 35849.  
Last modified: 10 Jul 2018, 8:20:35 UTC

I only tried it once to see if it was all OS's and what part of the planet since that is a problem once in a while (server hand shaking).

I just reattached, and the .vdi (ATLASM_2017_03_01.vdi) downloaded OK at 2500 Kbps (Ubuntu 16.04, i7-4790). All the others did too, except:
"BJAODmuhrwsnyYickojUe11pABFKDmABFKDmIeJZDmABFKDmO06ZKm_EVNT.14296435._001513.pool.root.1" is now down to 20 Kbps, and slowing down. This is in eastern Pennsylvania.

EDIT: I just attached a Win7 64-bit machine to LHCb, and LHCb_2017_12_14.vdi is now stuck at 9% downloaded. All the others downloaded OK. Whether this is related to ATLAS is another question.
ID: 35850
Laurence
Project administrator
Project developer

Message 35851 - Posted: 10 Jul 2018, 8:49:08 UTC - in response to Message 35848.  

The instant download failure issues should now be resolved. There may still be a few, but the number of failures should be much lower. We are still investigating the slowing downloads.
ID: 35851
tullio

Message 35855 - Posted: 10 Jul 2018, 14:43:59 UTC

Two Atlas tasks downloaded on my Windows 10 PC. One downloaded OK, ran and validated with no HITS file, as usual on this PC. The other is stuck.
Tullio
ID: 35855
Jim1348

Message 35856 - Posted: 10 Jul 2018, 14:48:40 UTC - in response to Message 35855.  

Two native ATLAS just downloaded here without a problem.
ID: 35856
maeax

Message 36536 - Posted: 23 Aug 2018, 8:59:34 UTC
Last modified: 23 Aug 2018, 9:03:28 UTC

I have had some download failures today, at 8:30 UTC.
Uploads have also stopped.
ID: 36536
gyllic

Message 36537 - Posted: 23 Aug 2018, 9:59:38 UTC - in response to Message 36536.  

ID: 36537
tullio

Message 36538 - Posted: 23 Aug 2018, 10:17:21 UTC

All Atlas downloads fail on my Windows 10 PC. I have not changed anything.
Tullio
ID: 36538
Crystal Pellet
Volunteer moderator
Volunteer tester

Message 36539 - Posted: 23 Aug 2018, 11:24:39 UTC

LHC@home 23 Aug 13:20:14 CEST Finished download of jlQNDmympCtnlyackoJh5iwnABFKDmABFKDmZFEODmABFKDmHuy9Eo_EVNT.14808120._000761.pool.root.1
LHC@home 23 Aug 13:20:23 CEST Finished download of APMMDmOezCtnyYickojUe11pABFKDmABFKDmhA5ODmABFKDmCh75un_EVNT.14808120._000793.pool.root.1
ID: 36539
tullio

Message 36541 - Posted: 23 Aug 2018, 11:56:42 UTC - in response to Message 36539.  

One more download failure. The other is running.
Tullio
ID: 36541
Erich56

Message 36542 - Posted: 23 Aug 2018, 12:22:48 UTC

hm, strange.

All my ATLAS downloads are okay.
ID: 36542
mrchips

Message 36544 - Posted: 23 Aug 2018, 14:37:52 UTC

I'm getting this on both ATLAS and LHC downloads:

8/23/2018 9:34:11 AM | LHC@home | Giving up on download of w-c5_-0.012_job.B1inj_c5_-0.012.2938__47__s__64.28_59.31__6.1_8.1__6__27_1_sixvf_boinc8042.zip: permanent HTTP error
ID: 36544
Toby Broom
Volunteer moderator

Message 36545 - Posted: 23 Aug 2018, 17:17:31 UTC

I see the issue with SixTrack.
ID: 36545
bronco

Message 36546 - Posted: 24 Aug 2018, 0:18:54 UTC

Numerous download errors here too.
ID: 36546
Crystal Pellet
Volunteer moderator
Volunteer tester

Message 36547 - Posted: 24 Aug 2018, 4:59:52 UTC

Me too now:

24 Aug 06:56:24 CEST Giving up on download of AIMNDmexGDtnlyackoJh5iwnABFKDmABFKDmHlEVDmABFKDmU9wZOo_EVNT.15161510._000143.pool.root.1: permanent HTTP error
24 Aug 06:56:24 CEST Giving up on download of AIMNDmexGDtnlyackoJh5iwnABFKDmABFKDmHlEVDmABFKDmU9wZOo_input.tar.gz: permanent HTTP error
24 Aug 06:56:27 CEST Giving up on download of rte_AIMNDmexGDtnlyackoJh5iwnABFKDmABFKDmHlEVDmABFKDmU9wZOo.tar.gz: permanent HTTP error
24 Aug 06:56:27 CEST Giving up on download of boinc_job_script.1gxrGC: permanent HTTP error
ID: 36547
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Message 36548 - Posted: 24 Aug 2018, 5:53:31 UTC

Here too.
It can be seen on all of my hosts and Sixtrack is also affected.

Typical log entries:
Fr 24 Aug 2018 06:22:48 CEST | LHC@home | Scheduler request completed: got 1 new tasks
Fr 24 Aug 2018 06:22:50 CEST | LHC@home | Started download of IVlKDmIiGDtnlyackoJh5iwnABFKDmABFKDmmGAVDmABFKDmpPIvin_EVNT.15161510._000139.pool.root.1
Fr 24 Aug 2018 06:22:50 CEST | LHC@home | Started download of IVlKDmIiGDtnlyackoJh5iwnABFKDmABFKDmmGAVDmABFKDmpPIvin_input.tar.gz
Fr 24 Aug 2018 06:22:52 CEST | LHC@home | Giving up on download of IVlKDmIiGDtnlyackoJh5iwnABFKDmABFKDmmGAVDmABFKDmpPIvin_EVNT.15161510._000139.pool.root.1: permanent HTTP error
Fr 24 Aug 2018 06:22:52 CEST | LHC@home | Giving up on download of IVlKDmIiGDtnlyackoJh5iwnABFKDmABFKDmmGAVDmABFKDmpPIvin_input.tar.gz: permanent HTTP error
Fr 24 Aug 2018 06:22:52 CEST | LHC@home | Started download of rte_IVlKDmIiGDtnlyackoJh5iwnABFKDmABFKDmmGAVDmABFKDmpPIvin.tar.gz
Fr 24 Aug 2018 06:22:52 CEST | LHC@home | Started download of boinc_job_script.ma5FQa
Fr 24 Aug 2018 06:22:54 CEST | LHC@home | Giving up on download of rte_IVlKDmIiGDtnlyackoJh5iwnABFKDmABFKDmmGAVDmABFKDmpPIvin.tar.gz: permanent HTTP error
Fr 24 Aug 2018 06:22:54 CEST | LHC@home | Giving up on download of boinc_job_script.ma5FQa: permanent HTTP error



The same host was successful just a few minutes later:
Fr 24 Aug 2018 06:34:39 CEST | LHC@home | Scheduler request completed: got 1 new tasks
Fr 24 Aug 2018 06:34:41 CEST | LHC@home | Started download of KlZNDmTQADtnlyackoJh5iwnABFKDmABFKDml1KTDmABFKDmSU8xVo_EVNT.15161510._000103.pool.root.1
Fr 24 Aug 2018 06:34:41 CEST | LHC@home | Started download of KlZNDmTQADtnlyackoJh5iwnABFKDmABFKDml1KTDmABFKDmSU8xVo_input.tar.gz
Fr 24 Aug 2018 06:34:43 CEST | LHC@home | Finished download of KlZNDmTQADtnlyackoJh5iwnABFKDmABFKDml1KTDmABFKDmSU8xVo_input.tar.gz
Fr 24 Aug 2018 06:34:43 CEST | LHC@home | Started download of rte_KlZNDmTQADtnlyackoJh5iwnABFKDmABFKDml1KTDmABFKDmSU8xVo.tar.gz
Fr 24 Aug 2018 06:34:44 CEST | LHC@home | Finished download of rte_KlZNDmTQADtnlyackoJh5iwnABFKDmABFKDml1KTDmABFKDmSU8xVo.tar.gz
Fr 24 Aug 2018 06:34:44 CEST | LHC@home | Started download of boinc_job_script.qkP1OA
Fr 24 Aug 2018 06:34:45 CEST | LHC@home | Finished download of boinc_job_script.qkP1OA
Fr 24 Aug 2018 06:35:06 CEST | LHC@home | Finished download of KlZNDmTQADtnlyackoJh5iwnABFKDmABFKDml1KTDmABFKDmSU8xVo_EVNT.15161510._000103.pool.root.1
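
A minimal sketch, assuming log lines in exactly the format quoted above, of how such entries could be paired up to see how long each download took and how it ended. The function names `summarize_downloads` and `failure_rate` are made up for this example, and the regular expression is only an assumption derived from the lines shown here; it may not match other BOINC client locales or versions.

```python
import re
from datetime import datetime

# Assumed line format, taken from the entries quoted above, e.g.
# "Fr 24 Aug 2018 06:22:50 CEST | LHC@home | Started download of <file>"
LINE_RE = re.compile(
    r"^\S+ (?P<stamp>\d{1,2} \w{3} \d{4} \d{2}:\d{2}:\d{2}) \S+ \| [^|]+ \| "
    r"(?P<event>Started|Finished|Giving up on) download of (?P<name>[^:\s]+)"
)


def summarize_downloads(lines):
    """Return {file name: (outcome, seconds between start and outcome)}."""
    started = {}
    results = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a download event in the assumed format
        # English month abbreviations are assumed (default "C" locale).
        when = datetime.strptime(m["stamp"], "%d %b %Y %H:%M:%S")
        name = m["name"]
        if m["event"] == "Started":
            started[name] = when
        elif name in started:
            outcome = "ok" if m["event"] == "Finished" else "failed"
            results[name] = (outcome, (when - started.pop(name)).total_seconds())
    return results


def failure_rate(results):
    """Fraction of completed download attempts that ended in 'failed'."""
    if not results:
        return 0.0
    failed = sum(1 for outcome, _ in results.values() if outcome == "failed")
    return failed / len(results)


# Four of the lines quoted above (the two job-script downloads).
sample = [
    "Fr 24 Aug 2018 06:22:52 CEST | LHC@home | Started download of boinc_job_script.ma5FQa",
    "Fr 24 Aug 2018 06:22:54 CEST | LHC@home | Giving up on download of boinc_job_script.ma5FQa: permanent HTTP error",
    "Fr 24 Aug 2018 06:34:44 CEST | LHC@home | Started download of boinc_job_script.qkP1OA",
    "Fr 24 Aug 2018 06:34:45 CEST | LHC@home | Finished download of boinc_job_script.qkP1OA",
]
summary = summarize_downloads(sample)
print(summary)
print(failure_rate(summary))
```

For these four lines, the first job script is recorded as failed after 2 seconds and the second as finished after 1 second, which matches the "instant" failures described earlier in the thread.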
ID: 36548


©2024 CERN