Message boards : ATLAS application : Download failures

maeax

Send message
Joined: 2 May 07
Posts: 2230
Credit: 173,849,615
RAC: 17,881
Message 32754 - Posted: 10 Oct 2017, 12:25:22 UTC - in response to Message 32724.  

For about a week, the ATLAS download file (200 MB) has been downloading very slowly.
The network rate counter starts at, for example, 100 kps and then drops to zero.
The download takes about 1 hour instead of the usual one minute.


The speed is back since this morning, thank you.
ID: 32754 · Report as offensive     Reply Quote
captainjack

Send message
Joined: 21 Jun 10
Posts: 40
Credit: 11,235,512
RAC: 6,175
Message 32824 - Posted: 13 Oct 2017, 17:51:22 UTC

The task fetch seems to ignore the parameter for "Max # CPUs". For computer 10476963 the Max # CPUs was changed to 2, but the server keeps sending 4 core tasks. The client_state.xml says

<app_version>
<app_name>ATLAS</app_name>
<version_num>101</version_num>
<platform>windows_x86_64</platform>
<avg_ncpus>4.000000</avg_ncpus>
<max_ncpus>2.000000</max_ncpus>


Seems odd that the max_ncpus is 2, but the avg_ncpus is 4.
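
A quick way to look for this mismatch across all app versions is sketched below. It is only a minimal illustration: the sample string closes the tags that the excerpt above cuts off, the `<client_state>` wrapper is assumed, and `find_ncpus_mismatches` is a hypothetical helper name, not part of BOINC.

```python
import xml.etree.ElementTree as ET


def find_ncpus_mismatches(client_state_xml):
    """Return (app_name, version_num, avg_ncpus, max_ncpus) for every
    <app_version> whose avg_ncpus is larger than its max_ncpus."""
    root = ET.fromstring(client_state_xml)
    mismatches = []
    for app_version in root.iter("app_version"):
        avg = app_version.findtext("avg_ncpus")
        mx = app_version.findtext("max_ncpus")
        if avg is None or mx is None:
            continue  # entry does not carry both values
        if float(avg) > float(mx):
            mismatches.append((
                app_version.findtext("app_name"),
                app_version.findtext("version_num"),
                float(avg),
                float(mx),
            ))
    return mismatches


# Sample built from the excerpt above; the closing tags and the
# <client_state> wrapper are assumptions added to make it parseable.
SAMPLE = """<client_state>
<app_version>
<app_name>ATLAS</app_name>
<version_num>101</version_num>
<platform>windows_x86_64</platform>
<avg_ncpus>4.000000</avg_ncpus>
<max_ncpus>2.000000</max_ncpus>
</app_version>
</client_state>"""

print(find_ncpus_mismatches(SAMPLE))
```

For a real host, the text would be read from the client_state.xml file in the BOINC data directory instead of from `SAMPLE`.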

Computer is using the default preferences.

Please let me know if I can provide more information.
ID: 32824 · Report as offensive     Reply Quote
Crystal Pellet
Volunteer moderator
Volunteer tester

Send message
Joined: 14 Jan 10
Posts: 1413
Credit: 9,434,983
RAC: 9,630
Message 32849 - Posted: 17 Oct 2017, 5:28:19 UTC

ID: 32849 · Report as offensive     Reply Quote
Crystal Pellet
Volunteer moderator
Volunteer tester

Send message
Joined: 14 Jan 10
Posts: 1413
Credit: 9,434,983
RAC: 9,630
Message 33575 - Posted: 30 Dec 2017, 7:44:27 UTC

I'm only getting download errors for ATLAS on the huge tar.gz file and I'm not the only one: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=83130712

419 LHC@home 30 Dec 07:36:10 Scheduler request completed: got 1 new tasks
420 LHC@home 30 Dec 07:36:12 Started download of jf_e9c416e28c3a0df3577d1a4347ea5818
421 LHC@home 30 Dec 07:36:12 Started download of Cx4MDmxPFprnDDn7oo6G73TpABFKDmABFKDmtcKKDmABFKDm3lTrXm_input.tar.gz
422 LHC@home 30 Dec 07:36:17 Finished download of Cx4MDmxPFprnDDn7oo6G73TpABFKDmABFKDmtcKKDmABFKDm3lTrXm_input.tar.gz
423 LHC@home 30 Dec 07:36:17 Started download of rte_Cx4MDmxPFprnDDn7oo6G73TpABFKDmABFKDmtcKKDmABFKDm3lTrXm.tar.gz
424 LHC@home 30 Dec 07:36:17 [error] MD5 check failed for Cx4MDmxPFprnDDn7oo6G73TpABFKDmABFKDmtcKKDmABFKDm3lTrXm_input.tar.gz
425 LHC@home 30 Dec 07:36:17 [error] expected d4c45f1281f2e4f7721d92e90f7908e1, got 8df9381a4fd7bc20104d0fc3d2683aaf
426 LHC@home 30 Dec 07:36:17 [error] Checksum or signature error for Cx4MDmxPFprnDDn7oo6G73TpABFKDmABFKDmtcKKDmABFKDm3lTrXm_input.tar.gz
427 LHC@home 30 Dec 07:36:21 Finished download of rte_Cx4MDmxPFprnDDn7oo6G73TpABFKDmABFKDmtcKKDmABFKDm3lTrXm.tar.gz
428 LHC@home 30 Dec 07:36:21 Started download of boinc_job_script.ecranI
429 LHC@home 30 Dec 07:36:21 [error] MD5 check failed for rte_Cx4MDmxPFprnDDn7oo6G73TpABFKDmABFKDmtcKKDmABFKDm3lTrXm.tar.gz
430 LHC@home 30 Dec 07:36:21 [error] expected 216663a5b915aec743d59c590c29651a, got 00a3cb556ef47a5523352b641eb827ef
431 LHC@home 30 Dec 07:36:21 [error] Checksum or signature error for rte_Cx4MDmxPFprnDDn7oo6G73TpABFKDmABFKDmtcKKDmABFKDm3lTrXm.tar.gz
432 LHC@home 30 Dec 07:36:23 Finished download of boinc_job_script.ecranI
433 LHC@home 30 Dec 07:36:45 update requested by user
434 LHC@home 30 Dec 07:36:46 Sending scheduler request: Requested by user.
435 LHC@home 30 Dec 07:36:46 Reporting 1 completed tasks
436 LHC@home 30 Dec 07:36:46 Requesting new tasks for CPU
437 LHC@home 30 Dec 07:36:47 Scheduler request completed: got 1 new tasks
438 LHC@home 30 Dec 07:36:49 Started download of jf_e6d03d9d12fcd2e55803461b00c56f59
439 LHC@home 30 Dec 07:36:49 Started download of Y4yNDmyLNprnDDn7oo6G73TpABFKDmABFKDmSIFKDmABFKDmyBEmSn_input.tar.gz
441 LHC@home 30 Dec 07:36:51 update requested by user
442 LHC@home 30 Dec 07:36:53 Sending scheduler request: Requested by user.
443 LHC@home 30 Dec 07:36:53 Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: )
444 LHC@home 30 Dec 07:36:54 Scheduler request completed
445 LHC@home 30 Dec 07:36:55 Finished download of Y4yNDmyLNprnDDn7oo6G73TpABFKDmABFKDmSIFKDmABFKDmyBEmSn_input.tar.gz
446 LHC@home 30 Dec 07:36:55 Started download of rte_Y4yNDmyLNprnDDn7oo6G73TpABFKDmABFKDmSIFKDmABFKDmyBEmSn.tar.gz
447 LHC@home 30 Dec 07:36:58 Finished download of rte_Y4yNDmyLNprnDDn7oo6G73TpABFKDmABFKDmSIFKDmABFKDmyBEmSn.tar.gz
448 LHC@home 30 Dec 07:36:58 Started download of boinc_job_script.QE7MWX
451 LHC@home 30 Dec 07:37:01 Finished download of boinc_job_script.QE7MWX
458 LHC@home 30 Dec 07:37:36 update requested by user
460 LHC@home 30 Dec 07:37:40 Sending scheduler request: Requested by user.
461 LHC@home 30 Dec 07:37:40 Not requesting tasks: don't need (CPU: job cache full; AMD/ATI GPU: )
462 LHC@home 30 Dec 07:37:42 Scheduler request completed
471 LHC@home 30 Dec 07:41:50 Temporarily failed download of jf_e6d03d9d12fcd2e55803461b00c56f59: transient HTTP error
472 LHC@home 30 Dec 07:41:50 Backing off 00:02:41 on download of jf_e6d03d9d12fcd2e55803461b00c56f59
490 30 Dec 07:42:30 Project communication failed: attempting access to reference site
491 30 Dec 07:42:32 Internet access OK - project servers may be temporarily down.
503 LHC@home 30 Dec 07:44:53 work fetch suspended by user
515 LHC@home 30 Dec 07:46:00 Sending scheduler request: To report completed tasks.
516 LHC@home 30 Dec 07:46:00 Reporting 1 completed tasks
517 LHC@home 30 Dec 07:46:00 Not requesting tasks: "no new tasks" requested via Manager
518 LHC@home 30 Dec 07:46:03 Scheduler request completed
519 LHC@home 30 Dec 07:46:17 work fetch resumed by user
520 LHC@home 30 Dec 07:46:18 Sending scheduler request: To fetch work.
521 LHC@home 30 Dec 07:46:18 Requesting new tasks for CPU
522 LHC@home 30 Dec 07:46:19 update requested by user
523 LHC@home 30 Dec 07:46:20 Scheduler request completed: got 1 new tasks
524 LHC@home 30 Dec 07:46:23 Started download of jf_74bc076db3e3dbd01293ed14d8e0c447
525 LHC@home 30 Dec 07:46:23 Started download of dOnNDmLVKprnSu7Ccp2YYBZmABFKDmABFKDmR0JKDmABFKDmF10I3m_input.tar.gz
526 LHC@home 30 Dec 07:46:27 Finished download of dOnNDmLVKprnSu7Ccp2YYBZmABFKDmABFKDmR0JKDmABFKDmF10I3m_input.tar.gz
527 LHC@home 30 Dec 07:46:27 Started download of rte_dOnNDmLVKprnSu7Ccp2YYBZmABFKDmABFKDmR0JKDmABFKDmF10I3m.tar.gz
528 LHC@home 30 Dec 07:46:30 Finished download of rte_dOnNDmLVKprnSu7Ccp2YYBZmABFKDmABFKDmR0JKDmABFKDmF10I3m.tar.gz
529 LHC@home 30 Dec 07:46:30 Started download of boinc_job_script.W5ORuc
530 LHC@home 30 Dec 07:46:33 Finished download of boinc_job_script.W5ORuc
531 30 Dec 07:46:45 Project communication failed: attempting access to reference site
532 LHC@home 30 Dec 07:46:45 Temporarily failed download of jf_74bc076db3e3dbd01293ed14d8e0c447: transient HTTP error
533 LHC@home 30 Dec 07:46:45 Backing off 00:02:25 on download of jf_74bc076db3e3dbd01293ed14d8e0c447
534 30 Dec 07:46:48 Internet access OK - project servers may be temporarily down.
557 LHC@home 30 Dec 07:50:54 Sending scheduler request: To fetch work.
558 LHC@home 30 Dec 07:50:54 Reporting 1 completed tasks
559 LHC@home 30 Dec 07:50:54 Requesting new tasks for CPU
560 LHC@home 30 Dec 07:50:57 Scheduler request completed: got 0 new tasks
561 LHC@home 30 Dec 07:50:57 No tasks sent
562 LHC@home 30 Dec 07:50:57 No tasks are available for ATLAS Simulation
563 LHC@home 30 Dec 07:50:57 Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
608 LHC@home 30 Dec 08:01:25 Sending scheduler request: To fetch work.
609 LHC@home 30 Dec 08:01:25 Requesting new tasks for CPU
610 LHC@home 30 Dec 08:01:28 Scheduler request completed: got 1 new tasks
611 LHC@home 30 Dec 08:01:31 Started download of jf_2bda7d45a82124f62e31770e0949aec0
612 LHC@home 30 Dec 08:01:31 Started download of TbILDmwwNprnDDn7oo6G73TpABFKDmABFKDmzvHKDmABFKDmEC1VJn_input.tar.gz
613 LHC@home 30 Dec 08:01:34 Finished download of TbILDmwwNprnDDn7oo6G73TpABFKDmABFKDmzvHKDmABFKDmEC1VJn_input.tar.gz
614 LHC@home 30 Dec 08:01:34 Started download of rte_TbILDmwwNprnDDn7oo6G73TpABFKDmABFKDmzvHKDmABFKDmEC1VJn.tar.gz
616 LHC@home 30 Dec 08:01:36 Finished download of rte_TbILDmwwNprnDDn7oo6G73TpABFKDmABFKDmzvHKDmABFKDmEC1VJn.tar.gz
617 LHC@home 30 Dec 08:01:36 Started download of boinc_job_script.B1nSao
618 LHC@home 30 Dec 08:01:38 Finished download of boinc_job_script.B1nSao
630 LHC@home 30 Dec 08:06:33 Temporarily failed download of jf_2bda7d45a82124f62e31770e0949aec0: transient HTTP error
631 LHC@home 30 Dec 08:06:33 Backing off 00:03:07 on download of jf_2bda7d45a82124f62e31770e0949aec0
632 30 Dec 08:06:34 Project communication failed: attempting access to reference site
633 30 Dec 08:06:35 Internet access OK - project servers may be temporarily down.
699 LHC@home 30 Dec 08:11:35 Sending scheduler request: To fetch work.
700 LHC@home 30 Dec 08:11:35 Reporting 1 completed tasks
701 LHC@home 30 Dec 08:11:35 Requesting new tasks for CPU
706 LHC@home 30 Dec 08:11:38 Scheduler request completed: got 1 new tasks
708 LHC@home 30 Dec 08:11:40 Started download of jf_cd064344c8fd8aa023594b6c9e170d82
709 LHC@home 30 Dec 08:11:40 Started download of Bq5KDm5LLprnDDn7oo6G73TpABFKDmABFKDm6MJKDmABFKDmspqlkm_input.tar.gz
710 LHC@home 30 Dec 08:11:43 Finished download of Bq5KDm5LLprnDDn7oo6G73TpABFKDmABFKDm6MJKDmABFKDmspqlkm_input.tar.gz
711 LHC@home 30 Dec 08:11:43 Started download of rte_Bq5KDm5LLprnDDn7oo6G73TpABFKDmABFKDm6MJKDmABFKDmspqlkm.tar.gz
714 LHC@home 30 Dec 08:11:45 Finished download of rte_Bq5KDm5LLprnDDn7oo6G73TpABFKDmABFKDm6MJKDmABFKDmspqlkm.tar.gz
715 LHC@home 30 Dec 08:11:45 Started download of boinc_job_script.TP89TQ
718 LHC@home 30 Dec 08:11:48 Finished download of boinc_job_script.TP89TQ
729 LHC@home 30 Dec 08:16:42 Temporarily failed download of jf_cd064344c8fd8aa023594b6c9e170d82: transient HTTP error
730 LHC@home 30 Dec 08:16:42 Backing off 00:03:04 on download of jf_cd064344c8fd8aa023594b6c9e170d82
738 30 Dec 08:17:34 Project communication failed: attempting access to reference site
739 30 Dec 08:17:38 Internet access OK - project servers may be temporarily down.
762 LHC@home 30 Dec 08:21:46 Sending scheduler request: To fetch work.
763 LHC@home 30 Dec 08:21:46 Reporting 1 completed tasks
764 LHC@home 30 Dec 08:21:46 Requesting new tasks for CPU
765 LHC@home 30 Dec 08:21:48 Scheduler request completed: got 0 new tasks
766 LHC@home 30 Dec 08:21:48 No tasks sent
767 LHC@home 30 Dec 08:21:48 No tasks are available for ATLAS Simulation
768 LHC@home 30 Dec 08:21:48 Tasks for AMD/ATI GPU are available, but your preferences are set to not accept them
813 LHC@home 30 Dec 08:31:43 Sending scheduler request: To fetch work.
814 LHC@home 30 Dec 08:31:43 Requesting new tasks for CPU
815 LHC@home 30 Dec 08:31:45 Scheduler request completed: got 1 new tasks
816 LHC@home 30 Dec 08:31:47 Started download of jf_f11424b99f4c26ae26ce0165f9890cea
817 LHC@home 30 Dec 08:31:47 Started download of diYMDmFAQprnSu7Ccp2YYBZmABFKDmABFKDmksKKDmABFKDm5v2g7m_input.tar.gz
818 LHC@home 30 Dec 08:31:50 Finished download of diYMDmFAQprnSu7Ccp2YYBZmABFKDmABFKDmksKKDmABFKDm5v2g7m_input.tar.gz
819 LHC@home 30 Dec 08:31:50 Started download of rte_diYMDmFAQprnSu7Ccp2YYBZmABFKDmABFKDmksKKDmABFKDm5v2g7m.tar.gz
821 LHC@home 30 Dec 08:31:52 Finished download of rte_diYMDmFAQprnSu7Ccp2YYBZmABFKDmABFKDmksKKDmABFKDm5v2g7m.tar.gz
822 LHC@home 30 Dec 08:31:52 Started download of boinc_job_script.OVbz9K
823 LHC@home 30 Dec 08:31:53 Finished download of boinc_job_script.OVbz9K
827 LHC@home 30 Dec 08:32:10 Temporarily failed download of jf_f11424b99f4c26ae26ce0165f9890cea: connect() failed
828 LHC@home 30 Dec 08:32:10 Backing off 00:03:18 on download of jf_f11424b99f4c26ae26ce0165f9890cea
829 30 Dec 08:32:11 Project communication failed: attempting access to reference site
830 30 Dec 08:32:15 Internet access OK - project servers may be temporarily down.
853 LHC@home 30 Dec 08:36:39 Sending scheduler request: To fetch work.
854 LHC@home 30 Dec 08:36:39 Reporting 1 completed tasks
855 LHC@home 30 Dec 08:36:39 Requesting new tasks for CPU
856 LHC@home 30 Dec 08:36:42 Scheduler request completed: got 1 new tasks
857 LHC@home 30 Dec 08:36:44 Started download of jf_0986c136d979b64aa0f2e017b298c3a3
858 LHC@home 30 Dec 08:36:44 Started download of YXtNDmUBGprnDDn7oo6G73TpABFKDmABFKDmhHNKDmABFKDmm9rFMn_input.tar.gz
859 LHC@home 30 Dec 08:36:50 Finished download of YXtNDmUBGprnDDn7oo6G73TpABFKDmABFKDmhHNKDmABFKDmm9rFMn_input.tar.gz
860 LHC@home 30 Dec 08:36:50 Started download of rte_YXtNDmUBGprnDDn7oo6G73TpABFKDmABFKDmhHNKDmABFKDmm9rFMn.tar.gz
861 LHC@home 30 Dec 08:36:50 [error] MD5 check failed for YXtNDmUBGprnDDn7oo6G73TpABFKDmABFKDmhHNKDmABFKDmm9rFMn_input.tar.gz
862 LHC@home 30 Dec 08:36:50 [error] expected 67fe0de8eb27149eac296262af780f95, got dbb0c0ec8aa8ff083df303586af4478e
863 LHC@home 30 Dec 08:36:50 [error] Checksum or signature error for YXtNDmUBGprnDDn7oo6G73TpABFKDmABFKDmhHNKDmABFKDmm9rFMn_input.tar.gz
864 LHC@home 30 Dec 08:36:54 Finished download of rte_YXtNDmUBGprnDDn7oo6G73TpABFKDmABFKDmhHNKDmABFKDmm9rFMn.tar.gz
865 LHC@home 30 Dec 08:36:54 Started download of boinc_job_script.GpSCMj
866 LHC@home 30 Dec 08:36:54 [error] MD5 check failed for rte_YXtNDmUBGprnDDn7oo6G73TpABFKDmABFKDmhHNKDmABFKDmm9rFMn.tar.gz
867 LHC@home 30 Dec 08:36:54 [error] expected df7c7ead51d4190a8be32e0204d09b8b, got d9d97e19dad9d7df3636741fc0828a5a
868 LHC@home 30 Dec 08:36:54 [error] Checksum or signature error for rte_YXtNDmUBGprnDDn7oo6G73TpABFKDmABFKDmhHNKDmABFKDmm9rFMn.tar.gz
869 LHC@home 30 Dec 08:36:56 Finished download of boinc_job_script.GpSCMj
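
To check by hand whether a downloaded file matches the checksum the client expected, the MD5 can be recomputed locally. The sketch below uses only Python's standard `hashlib`; the helper names `md5_of_file` and `matches_expected` are made up for illustration, and the file path would be wherever the file sits in the project directory on your host.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks so that
    large tar.gz files do not have to fit into memory at once."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected_md5):
    """Compare the local digest with the 'expected' value from the log."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

If the locally computed digest equals the "got" value from the log and differs from the "expected" value, the file on disk really is different from what the server announced, which points away from a client-side hashing problem.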
ID: 33575 · Report as offensive     Reply Quote
gyllic

Send message
Joined: 9 Dec 14
Posts: 202
Credit: 2,533,875
RAC: 0
Message 33576 - Posted: 30 Dec 2017, 8:59:12 UTC
Last modified: 30 Dec 2017, 9:06:15 UTC

same here, but some downloads worked.
30.12.2017 09:42:22 |  | Project communication failed: attempting access to reference site
30.12.2017 09:42:22 | LHC@home | Temporarily failed download of jf_21942dcaeaed52b07bbdfcd9fcd1721a: transient HTTP error
30.12.2017 09:42:22 | LHC@home | Backing off 00:17:02 on download of jf_21942dcaeaed52b07bbdfcd9fcd1721a
30.12.2017 09:42:22 | LHC@home | Started download of jf_a694038633034ba51282676943ac38c5
30.12.2017 09:42:26 |  | Internet access OK - project servers may be temporarily down.
30.12.2017 09:42:26 | LHC@home | Temporarily failed download of jf_224977cb72b04b940e6db3d73e82d690: transient HTTP error
30.12.2017 09:42:26 | LHC@home | Backing off 00:20:10 on download of jf_224977cb72b04b940e6db3d73e82d690
30.12.2017 09:42:26 | LHC@home | Started download of jf_f758c229cb04f251c90c863d96a983fc
30.12.2017 09:47:24 |  | Project communication failed: attempting access to reference site
30.12.2017 09:47:24 | LHC@home | Temporarily failed download of jf_a694038633034ba51282676943ac38c5: transient HTTP error
30.12.2017 09:47:24 | LHC@home | Backing off 00:04:03 on download of jf_a694038633034ba51282676943ac38c5
30.12.2017 09:47:24 | LHC@home | Started download of ty7MDmJCJprnSu7Ccp2YYBZmABFKDmABFKDmfRJKDmABFKDmce7Yfn_input.tar.gz
30.12.2017 09:47:26 |  | Internet access OK - project servers may be temporarily down.
30.12.2017 09:47:29 | LHC@home | Finished download of ty7MDmJCJprnSu7Ccp2YYBZmABFKDmABFKDmfRJKDmABFKDmce7Yfn_input.tar.gz
30.12.2017 09:47:29 | LHC@home | Started download of jf_0ab0e5a259caa86d390aea49f32cbf8f
30.12.2017 09:48:42 |  | Project communication failed: attempting access to reference site
30.12.2017 09:48:42 | LHC@home | Temporarily failed download of jf_0ab0e5a259caa86d390aea49f32cbf8f: transient HTTP error
30.12.2017 09:48:42 | LHC@home | Backing off 00:04:51 on download of jf_0ab0e5a259caa86d390aea49f32cbf8f
30.12.2017 09:48:42 | LHC@home | Started download of rte_ty7MDmJCJprnSu7Ccp2YYBZmABFKDmABFKDmfRJKDmABFKDmce7Yfn.tar.gz
30.12.2017 09:48:43 |  | Internet access OK - project servers may be temporarily down.
30.12.2017 09:48:43 | LHC@home | Finished download of rte_ty7MDmJCJprnSu7Ccp2YYBZmABFKDmABFKDmfRJKDmABFKDmce7Yfn.tar.gz
30.12.2017 09:48:43 | LHC@home | Started download of boinc_job_script.uXmTfU
30.12.2017 09:48:45 | LHC@home | Finished download of boinc_job_script.uXmTfU
30.12.2017 09:50:34 | LHC@home | Finished download of jf_f758c229cb04f251c90c863d96a983fc
30.12.2017 09:51:28 | LHC@home | Started download of jf_a694038633034ba51282676943ac38c5
30.12.2017 09:53:33 | LHC@home | Started download of jf_0ab0e5a259caa86d390aea49f32cbf8f
30.12.2017 09:54:48 |  | Project communication failed: attempting access to reference site
30.12.2017 09:54:48 | LHC@home | Temporarily failed download of jf_0ab0e5a259caa86d390aea49f32cbf8f: transient HTTP error
30.12.2017 09:54:48 | LHC@home | Backing off 00:15:21 on download of jf_0ab0e5a259caa86d390aea49f32cbf8f
30.12.2017 09:54:52 |  | Internet access OK - project servers may be temporarily down.
30.12.2017 09:56:30 | LHC@home | Started download of jf_224977cb72b04b940e6db3d73e82d690
30.12.2017 09:59:35 | LHC@home | update requested by user
30.12.2017 09:59:39 | LHC@home | Sending scheduler request: Requested by user.
30.12.2017 09:59:39 | LHC@home | Reporting 2 completed tasks
30.12.2017 09:59:39 | LHC@home | Requesting new tasks for CPU
30.12.2017 09:59:40 | LHC@home | Scheduler request completed: got 0 new tasks
30.12.2017 09:59:40 | LHC@home | No tasks sent
30.12.2017 09:59:40 | LHC@home | No tasks are available for ATLAS Simulation
30.12.2017 09:59:41 | LHC@home | Started download of jf_21942dcaeaed52b07bbdfcd9fcd1721a
30.12.2017 10:01:31 |  | Project communication failed: attempting access to reference site
30.12.2017 10:01:31 | LHC@home | Temporarily failed download of jf_224977cb72b04b940e6db3d73e82d690: transient HTTP error
30.12.2017 10:01:31 | LHC@home | Backing off 00:53:11 on download of jf_224977cb72b04b940e6db3d73e82d690
30.12.2017 10:01:31 | LHC@home | Started download of jf_0ab0e5a259caa86d390aea49f32cbf8f
30.12.2017 10:01:34 |  | Internet access OK - project servers may be temporarily down.
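
To see how the back-off times grow per file in a log like the one above, the "Backing off" lines can be collected with a small script. This is a rough sketch that assumes only the message wording visible in this thread; `backoff_seconds_by_file` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches the message text shown in the logs above, independent of the
# timestamp format that precedes it.
BACKOFF_RE = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on download of (\S+)")


def backoff_seconds_by_file(log_lines):
    """Map each file name to the list of its back-off delays in seconds,
    in the order they appear in the log."""
    delays = defaultdict(list)
    for line in log_lines:
        match = BACKOFF_RE.search(line)
        if match:
            hours, minutes, seconds, name = match.groups()
            delays[name].append(int(hours) * 3600 + int(minutes) * 60 + int(seconds))
    return dict(delays)


sample = [
    "30.12.2017 09:42:26 | LHC@home | Backing off 00:20:10 on download of jf_224977cb72b04b940e6db3d73e82d690",
    "30.12.2017 10:01:31 | LHC@home | Backing off 00:53:11 on download of jf_224977cb72b04b940e6db3d73e82d690",
]
print(backoff_seconds_by_file(sample))
```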
ID: 33576 · Report as offensive     Reply Quote
Harri Liljeroos
Avatar

Send message
Joined: 28 Sep 04
Posts: 722
Credit: 48,414,670
RAC: 27,571
Message 33577 - Posted: 30 Dec 2017, 9:03:10 UTC

I have problems downloading the jf_* file on two hosts now. Download doesn't progress at all:
30-Dec-2017 10:50:38 [LHC@home] Started download of jf_f89dd7c2231e901df3b1d7bb7647e4b3
30-Dec-2017 10:50:38 [LHC@home] [file_xfer] URL: http://atlasathome.cern.ch/ATLAS/download/16b/X2qLDmz5KprnSu7Ccp2YYBZmABFKDmABFKDmj1HKDmABFKDme0AnFn_EVNT.12502723._000670.pool.root.1
30-Dec-2017 10:52:55 [LHC@home] update requested by user
30-Dec-2017 10:52:55 [LHC@home] sched RPC pending: Requested by user
30-Dec-2017 10:52:55 [LHC@home] [sched_op] Starting scheduler request
30-Dec-2017 10:52:56 [LHC@home] Sending scheduler request: Requested by user.
30-Dec-2017 10:52:56 [LHC@home] Not requesting tasks: don't need (CPU: job cache full; NVIDIA GPU: not highest priority project; Intel GPU: )
30-Dec-2017 10:52:56 [LHC@home] [sched_op] CPU work request: 0.00 seconds; 0.00 devices
30-Dec-2017 10:52:56 [LHC@home] [sched_op] NVIDIA GPU work request: 0.00 seconds; 0.00 devices
30-Dec-2017 10:52:56 [LHC@home] [sched_op] Intel GPU work request: 0.00 seconds; 0.00 devices
30-Dec-2017 10:52:59 [LHC@home] Scheduler request completed
30-Dec-2017 10:52:59 [LHC@home] [sched_op] Server version 707
30-Dec-2017 10:52:59 [LHC@home] Project requested delay of 6 seconds
30-Dec-2017 10:52:59 [LHC@home] [sched_op] Deferring communication for 00:00:06
30-Dec-2017 10:52:59 [LHC@home] [sched_op] Reason: requested by project
30-Dec-2017 10:55:39 [LHC@home] [file_xfer] http op done; retval -184 (transient HTTP error)
30-Dec-2017 10:55:39 [LHC@home] [file_xfer] file transfer status -184 (transient HTTP error)
30-Dec-2017 10:55:39 [LHC@home] Temporarily failed download of jf_f89dd7c2231e901df3b1d7bb7647e4b3: transient HTTP error
30-Dec-2017 10:55:39 [LHC@home] [file_xfer] project-wide xfer delay for 737.893612 sec
30-Dec-2017 10:55:39 [LHC@home] Backing off 00:15:44 on download of jf_f89dd7c2231e901df3b1d7bb7647e4b3

ID: 33577 · Report as offensive     Reply Quote
Profile Magic Quantum Mechanic
Avatar

Send message
Joined: 24 Oct 04
Posts: 1169
Credit: 54,348,787
RAC: 60,041
Message 33578 - Posted: 30 Dec 2017, 9:11:10 UTC
Last modified: 30 Dec 2017, 9:14:15 UTC

Posting is overloading the server

Or maybe

<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>Cx4MDmxPFprnDDn7oo6G73TpABFKDmABFKDmtcKKDmABFKDm3lTrXm_input.tar.gz</file_name>
<error_code>-200 (wrong size)</error_code>
</file_xfer_error>

Yeah, I have been running the alpha version for a couple of years, and that tar.gz file is huge. I have had to let the download run for 3 to 10 hours just to get a task that I could then finish in a few hours.

I think somebody needs to check the file, find the error, and fix it so that these tasks can be downloaded here and produce some valid results (download URL error).
ID: 33578 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 15 Jun 08
Posts: 2520
Credit: 252,427,586
RAC: 135,630
Message 33871 - Posted: 15 Jan 2018, 12:17:36 UTC

Just got a new ATLAS job.

Download of the largest file (>200 MB) was at full DSL speed.
Download of the smaller files failed, e.g. boinc_job_script (14 KB).

Does it help to identify the "bottleneck of today"?
ID: 33871 · Report as offensive     Reply Quote
Harri Liljeroos
Avatar

Send message
Joined: 28 Sep 04
Posts: 722
Credit: 48,414,670
RAC: 27,571
Message 34710 - Posted: 21 Mar 2018, 12:36:39 UTC

Some of the new Atlas WUs fail to download: https://lhcathome.cern.ch/lhcathome/workunit.php?wuid=90621781
One of these is mine. All have failed with -186 (0xFFFFFF46) ERR_RESULT_DOWNLOAD

<core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>TwgMDmoYUIsnDDn7oo6G73TpABFKDmABFKDmi0FKDmABFKDmCftWXn_input.tar.gz</file_name>
  <error_code>-200 (wrong size)</error_code>
</file_xfer_error>
<file_xfer_error>
  <file_name>rte_TwgMDmoYUIsnDDn7oo6G73TpABFKDmABFKDmi0FKDmABFKDmCftWXn.tar.gz</file_name>
  <error_code>-119 (md5 checksum failed for file)</error_code>
  <error_message>MD5 check failed</error_message>
</file_xfer_error>

</message>
]]>


I now have two of these on two different hosts. New ATLAS tasks downloaded correctly after these.
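
As a side note on the two notations in the exit status above: 0xFFFFFF46 is simply -186 written as an unsigned 32-bit value. A minimal sketch of the conversion (the function name `to_signed32` is only illustrative):

```python
def to_signed32(value):
    """Interpret an unsigned 32-bit value as a signed two's-complement integer."""
    value &= 0xFFFFFFFF
    # If the top bit is set, the number is negative in two's complement.
    return value - 0x100000000 if value & 0x80000000 else value


print(to_signed32(0xFFFFFF46))  # -186
```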
ID: 34710 · Report as offensive     Reply Quote
rushmash

Send message
Joined: 19 Nov 14
Posts: 2
Credit: 2,250,744
RAC: 549
Message 35721 - Posted: 1 Jul 2018, 2:19:12 UTC
Last modified: 1 Jul 2018, 2:20:28 UTC

I also have several tasks stuck downloading. What could be the reason? Tasks from another project (Einstein@home) download without any issue.
RBYLDmCMUtsnlyackoJh5iwnABFKDmABFKDmi61MDmABFKDminD6Jm_EVNT has been paused for no apparent reason for about an hour (it could have been downloaded 10 times in that time).

7/1/2018 5:02:01 AM | LHC@home | Requesting new tasks for CPU
7/1/2018 5:02:03 AM | LHC@home | Scheduler request completed: got 4 new tasks
7/1/2018 5:02:03 AM | LHC@home | Resent lost task 03CODmufTtsnyYickojUe11pABFKDmABFKDmCTOZDmABFKDmOJqgKo_0
7/1/2018 5:02:03 AM | LHC@home | Resent lost task RBYLDmCMUtsnlyackoJh5iwnABFKDmABFKDmi61MDmABFKDminD6Jm_0
7/1/2018 5:02:03 AM | LHC@home | Resent lost task Theory_4120747_1530378908.001814_0
7/1/2018 5:02:03 AM | LHC@home | Resent lost task Theory_4120754_1530378908.088289_0
7/1/2018 5:02:05 AM | LHC@home | Started download of vboxwrapper_26196_windows_x86_64.exe
7/1/2018 5:02:05 AM | LHC@home | Started download of ATLAS_2017_01_09.xml
7/1/2018 5:02:08 AM | LHC@home | Finished download of vboxwrapper_26196_windows_x86_64.exe
7/1/2018 5:02:08 AM | LHC@home | Finished download of ATLAS_2017_01_09.xml
7/1/2018 5:02:08 AM | LHC@home | Started download of ATLASM_2017_03_01.vdi
7/1/2018 5:02:08 AM | LHC@home | Started download of vboxwrapper_26196_windows_x86_64.pdb
7/1/2018 5:02:12 AM | LHC@home | Finished download of vboxwrapper_26196_windows_x86_64.pdb
7/1/2018 5:02:12 AM | LHC@home | Started download of vboxwrapper_26198ab7_windows_x86_64.exe
7/1/2018 5:02:13 AM | LHC@home | Finished download of vboxwrapper_26198ab7_windows_x86_64.exe
7/1/2018 5:02:13 AM | LHC@home | Started download of Theory_2017_05_29.xml
7/1/2018 5:02:14 AM | LHC@home | Finished download of Theory_2017_05_29.xml
7/1/2018 5:02:14 AM | LHC@home | Started download of Theory_2018_06_25.vdi
7/1/2018 5:03:26 AM | LHC@home | Finished download of Theory_2018_06_25.vdi
7/1/2018 5:03:26 AM | LHC@home | Started download of 03CODmufTtsnyYickojUe11pABFKDmABFKDmCTOZDmABFKDmOJqgKo_EVNT.14296435._000098.pool.root.1
7/1/2018 5:03:31 AM | LHC@home | Starting task Theory_4120747_1530378908.001814_0
7/1/2018 5:03:31 AM | LHC@home | Starting task Theory_4120754_1530378908.088289_0
7/1/2018 5:05:52 AM | LHC@home | Finished download of ATLASM_2017_03_01.vdi
7/1/2018 5:05:52 AM | LHC@home | Started download of 03CODmufTtsnyYickojUe11pABFKDmABFKDmCTOZDmABFKDmOJqgKo_input.tar.gz
7/1/2018 5:05:54 AM | LHC@home | Finished download of 03CODmufTtsnyYickojUe11pABFKDmABFKDmCTOZDmABFKDmOJqgKo_input.tar.gz
7/1/2018 5:05:54 AM | LHC@home | Started download of rte_03CODmufTtsnyYickojUe11pABFKDmABFKDmCTOZDmABFKDmOJqgKo.tar.gz
7/1/2018 5:05:55 AM | LHC@home | Finished download of rte_03CODmufTtsnyYickojUe11pABFKDmABFKDmCTOZDmABFKDmOJqgKo.tar.gz
7/1/2018 5:05:55 AM | LHC@home | Started download of boinc_job_script.CfqS7Y
7/1/2018 5:05:56 AM | LHC@home | Finished download of boinc_job_script.CfqS7Y
7/1/2018 5:05:56 AM | LHC@home | Started download of RBYLDmCMUtsnlyackoJh5iwnABFKDmABFKDmi61MDmABFKDminD6Jm_EVNT.14296450._001787.pool.root.1
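
To spot stuck transfers in a long event log, one option is to list every file that has a "Started download" line but no matching "Finished download" line. The sketch below relies only on the message wording seen in this thread; `unfinished_downloads` is a hypothetical helper name.

```python
import re

# Only "Started" and "Finished" events are matched; lines such as
# "Temporarily failed download of ..." are ignored on purpose.
EVENT_RE = re.compile(r"\b(Started|Finished) download of (\S+)")


def unfinished_downloads(log_lines):
    """Return the names of files whose download started but never finished,
    in the order the downloads were started."""
    pending = {}  # dicts keep insertion order
    for line in log_lines:
        match = EVENT_RE.search(line)
        if not match:
            continue
        event, name = match.groups()
        if event == "Started":
            pending[name] = True
        else:
            pending.pop(name, None)
    return list(pending)


sample = [
    "7/1/2018 5:05:55 AM | LHC@home | Started download of boinc_job_script.CfqS7Y",
    "7/1/2018 5:05:56 AM | LHC@home | Finished download of boinc_job_script.CfqS7Y",
    "7/1/2018 5:05:56 AM | LHC@home | Started download of RBYLDmCMUtsnlyackoJh5iwnABFKDmABFKDmi61MDmABFKDminD6Jm_EVNT.14296450._001787.pool.root.1",
]
print(unfinished_downloads(sample))
```

Applied to the full log above, this would list both _EVNT files, since neither has a "Finished download" line in the excerpt.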
ID: 35721 · Report as offensive     Reply Quote
maeax

Send message
Joined: 2 May 07
Posts: 2230
Credit: 173,849,615
RAC: 17,881
Message 35825 - Posted: 8 Jul 2018, 7:13:23 UTC

last 12 hours - five download errors:
<core_client_version>7.10.2</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>od6NDmrXBtsnlyackoJh5iwnABFKDmABFKDmlFwXDmABFKDmjQsYfn_EVNT.14296435._000047.pool.root.1</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>
]]>
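
When collecting many of these reports, it can help to pull the file name and error code out of each `<file_xfer_error>` block automatically. The following is a minimal sketch using a regular expression on the text as it appears in the task output; `parse_xfer_errors` is a made-up helper name, and the pattern assumes the `code (reason)` layout shown above.

```python
import re

XFER_ERROR_RE = re.compile(
    r"<file_xfer_error>.*?"
    r"<file_name>(.*?)</file_name>.*?"
    r"<error_code>\s*(-?\d+)\s*\((.*?)\)\s*</error_code>.*?"
    r"</file_xfer_error>",
    re.DOTALL,
)


def parse_xfer_errors(text):
    """Return a list of (file_name, error_code, reason) tuples, one per
    <file_xfer_error> block found in the text."""
    return [
        (name.strip(), int(code), reason.strip())
        for name, code, reason in XFER_ERROR_RE.findall(text)
    ]


sample = """<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>od6NDmrXBtsnlyackoJh5iwnABFKDmABFKDmlFwXDmABFKDmjQsYfn_EVNT.14296435._000047.pool.root.1</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>"""
print(parse_xfer_errors(sample))
```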
ID: 35825 · Report as offensive     Reply Quote
tullio

Send message
Joined: 19 Feb 08
Posts: 708
Credit: 4,336,250
RAC: 0
Message 35826 - Posted: 8 Jul 2018, 8:08:51 UTC

I have no problem downloading ATLAS tasks on my Linux hosts. Admittedly, I download only a few of them because I have shifted my interest to GPUGRID, which sends GPU tasks (on Windows and Linux) as well as CPU tasks on Linux only, which use both of my cores. They use neural networks for post-processing on their computers.
Tullio
ID: 35826 · Report as offensive     Reply Quote
flashawk

Send message
Joined: 13 Jul 13
Posts: 1
Credit: 708,432
RAC: 0
Message 35827 - Posted: 8 Jul 2018, 9:55:37 UTC

I also have stuck downloads; I've had to abort several now. The problem is only with ATLAS WUs.
ID: 35827 · Report as offensive     Reply Quote
Harri Liljeroos
Avatar

Send message
Joined: 28 Sep 04
Posts: 722
Credit: 48,414,670
RAC: 27,571
Message 35828 - Posted: 8 Jul 2018, 10:09:17 UTC - in response to Message 35825.  

I also had a couple of download errors today: https://lhcathome.cern.ch/lhcathome/result.php?resultid=199868057 and https://lhcathome.cern.ch/lhcathome/result.php?resultid=199868057 and https://lhcathome.cern.ch/lhcathome/result.php?resultid=199866996

These errors happened instantly; they were not the kind that keep downloading for hours.
ID: 35828 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 15 Jun 08
Posts: 2520
Credit: 252,427,586
RAC: 135,630
Message 35832 - Posted: 8 Jul 2018, 15:29:15 UTC

Things are getting worse:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=199875734
<core_client_version>7.8.4</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>jCOKDmU57ssnyYickojUe11pABFKDmABFKDmmcZRDmABFKDm9PSXim_EVNT.14296450._001568.pool.root.1</file_name>
  <error_code>-224 (permanent HTTP error)</error_code>
  <error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>
]]>
ID: 35832 · Report as offensive     Reply Quote
maeax

Send message
Joined: 2 May 07
Posts: 2230
Credit: 173,849,615
RAC: 17,881
Message 35835 - Posted: 9 Jul 2018, 7:25:12 UTC - in response to Message 35825.  

last 12 hours - five download errors:
<core_client_version>7.10.2</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>od6NDmrXBtsnlyackoJh5iwnABFKDmABFKDmlFwXDmABFKDmjQsYfn_EVNT.14296435._000047.pool.root.1</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>
]]>

The number of download errors has grown to 35 since this weekend!
ID: 35835 · Report as offensive     Reply Quote
Profile Magic Quantum Mechanic
Avatar

Send message
Joined: 24 Oct 04
Posts: 1169
Credit: 54,348,787
RAC: 60,041
Message 35836 - Posted: 9 Jul 2018, 7:50:24 UTC

I decided to grab a couple of these tasks to see if I get the same thing here.
And I am.

<core_client_version>7.8.3</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
<file_name>u2ILDmL9GtsnlyackoJh5iwnABFKDmABFKDmt26ZDmABFKDmFFLvvm_EVNT.14296435._000073.pool.root.1</file_name>
<error_code>-224 (permanent HTTP error)</error_code>
<error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>
]]>

https://lhcathome.cern.ch/lhcathome/result.php?resultid=199890751

Back to Theory, SixTrack, and LHCb tasks here again. Goodnight.
ID: 35836 · Report as offensive     Reply Quote
Profile Laurence
Project administrator
Project developer

Send message
Joined: 20 Jun 14
Posts: 380
Credit: 238,712
RAC: 0
Message 35837 - Posted: 9 Jul 2018, 9:35:01 UTC - in response to Message 35836.  

There doesn't seem to be a fundamental problem. Internally everything seems to work fine. This result uploaded 143M in 14s. We are monitoring the transfers and will post some details on failures shortly.
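
For a sense of scale, the figure above can be turned into a transfer rate. The short sketch below assumes that "143M" means 143 megabytes; the 200 MB in one hour case is taken from the slow download reported at the start of this page.

```python
def throughput_mb_per_s(megabytes, seconds):
    """Average transfer rate in MB/s for a given size and duration."""
    return megabytes / seconds


# Upload mentioned above (assuming "143M" means megabytes).
print(f"{throughput_mb_per_s(143, 14):.1f} MB/s")    # 10.2 MB/s
# Slow download reported earlier: 200 MB in about one hour.
print(f"{throughput_mb_per_s(200, 3600):.3f} MB/s")  # 0.056 MB/s
```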
ID: 35837 · Report as offensive     Reply Quote
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Avatar

Send message
Joined: 15 Jun 08
Posts: 2520
Credit: 252,427,586
RAC: 135,630
Message 35838 - Posted: 9 Jul 2018, 10:15:59 UTC

3 in a row this morning:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=199891597
https://lhcathome.cern.ch/lhcathome/result.php?resultid=199890836
https://lhcathome.cern.ch/lhcathome/result.php?resultid=199890839
<core_client_version>7.8.4</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>b3JLDmnTQtsnlyackoJh5iwnABFKDmABFKDm4VhLDmABFKDmTonoIn_EVNT.14296450._001314.pool.root.1</file_name>
  <error_code>-224 (permanent HTTP error)</error_code>
  <error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>
]]>
ID: 35838 · Report as offensive     Reply Quote
Harri Liljeroos
Avatar

Send message
Joined: 28 Sep 04
Posts: 722
Credit: 48,414,670
RAC: 27,571
Message 35839 - Posted: 9 Jul 2018, 11:12:20 UTC - in response to Message 35838.  

3 in a row this morning:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=199891597
https://lhcathome.cern.ch/lhcathome/result.php?resultid=199890836
https://lhcathome.cern.ch/lhcathome/result.php?resultid=199890839
<core_client_version>7.8.4</core_client_version>
<![CDATA[
<message>
WU download error: couldn't get input files:
<file_xfer_error>
  <file_name>b3JLDmnTQtsnlyackoJh5iwnABFKDmABFKDm4VhLDmABFKDmTonoIn_EVNT.14296450._001314.pool.root.1</file_name>
  <error_code>-224 (permanent HTTP error)</error_code>
  <error_message>permanent HTTP error</error_message>
</file_xfer_error>
</message>
]]>

I got 24 failed downloads during the last 2.5 days. All successful downloads seem to be working speedwise, at least I haven't spotted any stuck downloads during the same period.
ID: 35839 · Report as offensive     Reply Quote
©2024 CERN