File upload issues
Author | Message |
---|---|
Volunteer moderator Project administrator Project developer Project tester Joined: 15 Jul 05 Posts: 165 Credit: 1,827,412 RAC: 4,291 |
Our NFS storage backend became saturated, so uploads are failing intermittently. The underlying cause is an issue with file deletion, which we are trying to resolve. Sorry for the trouble, and thanks for your patience with transfers to LHC@home. |
Peter Ingham Joined: 22 Sep 04 Posts: 6 Credit: 571,590 RAC: 692 |
There also seem to be problems (possibly closely related) with downloads. I have some downloads that have made 28 attempts without success, and two uploads that have been retrying for approximately 10 days without success. Thanks for looking into these issues - a nice thing to come back to after your break! |
Erich56 Joined: 18 Dec 15 Posts: 686 Credit: 4,851,241 RAC: 4,738 |
What has changed since around noon is that the finished but still not uploaded ATLAS tasks do upload once in a while, with the progress bar going from 0% to 100%, then everything stops for a while, and finally the 100% value reverts to 0%. This, in fact, is exactly what we had some 3 weeks ago, when there was this big trouble with the ATLAS tasks. So whatever you were trying to fix so far - it didn't work (yet). |
Toby Broom Volunteer moderator Joined: 27 Sep 08 Posts: 414 Credit: 108,872,630 RAC: 166,916 |
Things seem better for me; my only stuck ones are those waiting for locks to be released. |
Empie Joined: 28 Jul 05 Posts: 24 Credit: 2,365,248 RAC: 2,441 |
Some WUs are being uploaded/downloaded over here, but I still have one stuck on Windows. It's one of the first ones to get stuck: 9-1-2018 0:27:09 | LHC@home | [error] Error reported by file upload server: [LHC_2015_LHC_2015_260_BOINC_errors__59__s__62.31_60.32__5.5_5.6__5__84_1_sixvf_boinc207441_0_r348705869_0] locked by file_upload_handler PID=-1 Deadline 9-1-2018 9:40:54 Linux is still showing some issues: locked by file_upload_handler PID=-1 |
Erich56 Joined: 18 Dec 15 Posts: 686 Credit: 4,851,241 RAC: 4,738 |
My finished ATLAS tasks still show "transient HTTP error" when trying to upload. So whatever the CERN people tried to fix yesterday was obviously without success :-( |
Alessio Mereghetti Volunteer moderator Project administrator Project developer Project tester Project scientist Joined: 29 Feb 16 Posts: 76 Credit: 917,474 RAC: 3,843 |
A first action (cleanup of upload/download files) worked and unblocked the situation yesterday around noon. There were still hiccups but, as a volunteer, I managed to upload my results and download new WUs until ~23:00 GVA local time. I guess that the IT guys are planning a deeper intervention on the NFS storage backend (see post by Nils - https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4567&postid=33713) - let's wait for more news from their side. |
Lasse Hintze Tøndering-Jensen Joined: 24 Jun 14 Posts: 1 Credit: 1,212,892 RAC: 943 |
One of my servers is slowly but surely getting clogged by LHC tasks it can't finish uploading, with 60+ upload tasks and just as many downloads at the moment. It seems that LHC_2015 tasks, while tedious, are slowly, very slowly, getting uploaded and downloaded; workspace1_hl13 tasks, however, are a dead end. They simply will not upload. I have tasks finished in December that are still not uploaded. |
Volunteer moderator Project administrator Project developer Project tester Joined: 15 Jul 05 Posts: 165 Credit: 1,827,412 RAC: 4,291 |
Thanks Lasse, that is useful information. These files were probably half-uploaded earlier, and should under normal circumstances be deleted on the server to allow a fresh upload. We also ran out of space again, and will stop uploads/downloads for a while today to add more disk space on the old NFS backend server. Thanks again to you all for your patience. |
entigy Joined: 24 Oct 04 Posts: 7 Credit: 142,096 RAC: 314 |
This. 09/01/2018 07:52:34 | LHC@home | Started upload of 6isKDmfKtsrnDDn7oo6G73TpABFKDmABFKDmx9FKDmABFKDmNK13Mo_0_r1496110899_ATLAS_result 09/01/2018 07:52:34 | LHC@home | Started upload of Rk1MDmP6xsrnDDn7oo6G73TpABFKDmABFKDmKgJKDmABFKDmDER4Xm_0_r1968357439_ATLAS_result 09/01/2018 07:52:34 | LHC@home | Started download of LHC_2015_LHC_2015_290_BOINC_errors__34__s__62.31_60.32__5.9_6.0__5__87_1_sixvf_boinc119179.zip 09/01/2018 07:52:37 | LHC@home | Temporarily failed upload of 6isKDmfKtsrnDDn7oo6G73TpABFKDmABFKDmx9FKDmABFKDmNK13Mo_0_r1496110899_ATLAS_result: connect() failed 09/01/2018 07:52:37 | LHC@home | Backing off 05:33:56 on upload of 6isKDmfKtsrnDDn7oo6G73TpABFKDmABFKDmx9FKDmABFKDmNK13Mo_0_r1496110899_ATLAS_result 09/01/2018 07:52:37 | LHC@home | Temporarily failed upload of Rk1MDmP6xsrnDDn7oo6G73TpABFKDmABFKDmKgJKDmABFKDmDER4Xm_0_r1968357439_ATLAS_result: connect() failed 09/01/2018 07:52:37 | LHC@home | Backing off 00:07:38 on upload of Rk1MDmP6xsrnDDn7oo6G73TpABFKDmABFKDmKgJKDmABFKDmDER4Xm_0_r1968357439_ATLAS_result 09/01/2018 07:52:37 | LHC@home | Temporarily failed download of LHC_2015_LHC_2015_290_BOINC_errors__34__s__62.31_60.32__5.9_6.0__5__87_1_sixvf_boinc119179.zip: connect() failed 09/01/2018 07:52:37 | LHC@home | Backing off 01:52:25 on download of LHC_2015_LHC_2015_290_BOINC_errors__34__s__62.31_60.32__5.9_6.0__5__87_1_sixvf_boinc119179.zip 09/01/2018 07:52:37 | | Project communication failed: attempting access to reference site 09/01/2018 07:52:39 | | Internet access OK - project servers may be temporarily down. |
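For anyone who wants to see how long the client intends to wait before each retry, the "Backing off" messages in an event log like the one above can be summarised with a short script. This is only a minimal Python sketch, not a BOINC tool: the function name `parse_backoffs` and the regular expression are assumptions based solely on the message wording visible in this thread, and the sample lines are copied from the post above.

```python
import re
from datetime import timedelta

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not an official BOINC log specification).
BACKOFF_RE = re.compile(
    r"Backing off (\d+):(\d{2}):(\d{2}) on (upload|download) of (\S+)"
)


def parse_backoffs(log_text):
    """Return (direction, file_name, delay) for every back-off message found."""
    results = []
    for match in BACKOFF_RE.finditer(log_text):
        hours, minutes, seconds, direction, name = match.groups()
        delay = timedelta(
            hours=int(hours), minutes=int(minutes), seconds=int(seconds)
        )
        results.append((direction, name, delay))
    return results


# Three back-off lines taken from the event log posted above.
SAMPLE = (
    "09/01/2018 07:52:37 | LHC@home | Backing off 05:33:56 on upload of "
    "6isKDmfKtsrnDDn7oo6G73TpABFKDmABFKDmx9FKDmABFKDmNK13Mo_0_r1496110899_ATLAS_result\n"
    "09/01/2018 07:52:37 | LHC@home | Backing off 00:07:38 on upload of "
    "Rk1MDmP6xsrnDDn7oo6G73TpABFKDmABFKDmKgJKDmABFKDmDER4Xm_0_r1968357439_ATLAS_result\n"
    "09/01/2018 07:52:37 | LHC@home | Backing off 01:52:25 on download of "
    "LHC_2015_LHC_2015_290_BOINC_errors__34__s__62.31_60.32__5.9_6.0__5__87_1_sixvf_boinc119179.zip\n"
)

if __name__ == "__main__":
    # Print the longest waits first.
    for direction, name, delay in sorted(
        parse_backoffs(SAMPLE), key=lambda item: item[2], reverse=True
    ):
        print(f"{direction:8s} {delay}  {name}")
```

Because the pattern is searched across the whole text, it also works when several log entries have been pasted onto a single line, as often happens in forum posts.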
Harri Liljeroos Joined: 28 Sep 04 Posts: 261 Credit: 8,186,438 RAC: 14,025 |
|
Volunteer moderator Project administrator Project developer Project tester Joined: 15 Jul 05 Posts: 165 Credit: 1,827,412 RAC: 4,291 |
The NFS server upgrade is taking longer than expected, as the server is trying to clear pending file deletes. Sorry for this; uploads should resume later, once the server status is back to green. |
Volunteer moderator Project administrator Project developer Project tester Joined: 15 Jul 05 Posts: 165 Credit: 1,827,412 RAC: 4,291 |
Our NFS server is finally up again, hopefully with better performance. Transfers should now resume; at least my Sixtrack tasks uploaded correctly. Sorry again for the trouble with uploads caused by this. |
Erich56 Joined: 18 Dec 15 Posts: 686 Credit: 4,851,241 RAC: 4,738 |
My 44 kB Sixtrack file also got uploaded okay. However, none of the finished ATLAS files succeed: the progress bar goes from 0% to 100% (with a few interruptions in between), then it sits at 100% for a short while, and later it reverts to 0% and jumps to "retry". The BOINC event log always shows "transient upload error" :-( Same as what we had around mid-December. |
Harri Liljeroos Joined: 28 Sep 04 Posts: 261 Credit: 8,186,438 RAC: 14,025 |
My sixtrack and Atlas tasks have all been uploaded. Those that were over deadline (sixtrack) are now pending and waiting for wingmates. Edit: New tasks have also been downloaded and crunched. |
Gunde Joined: 9 Jan 15 Posts: 8 Credit: 82,384,344 RAC: 144,451 |
My host managed to upload half of its tasks yesterday but is now stuck again. Once it had returned some tasks it started downloading new ones, but downloads are now pending, and it takes 2 min to download a task that has a 30 sec runtime. I'll just wait it out; hopefully the server will catch up. Edit: Most tasks are now uploaded, and downloads no longer have pending time. |
Lars Vindal Joined: 24 Sep 08 Posts: 4 Credit: 336,587 RAC: 331 |
I don't do ATLAS tasks, but my most recently completed sixTrack task is stuck like this: 10.01.2018 02:31:18 | LHC@home | Started upload of workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1__22__s__62.31_60.32__4_6__5__45_1_sixvf_boinc1876_1_r612308196_0 10.01.2018 02:31:21 | LHC@home | [error] Error reported by file upload server: [workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1__22__s__62.31_60.32__4_6__5__45_1_sixvf_boinc1876_1_r612308196_0] locked by file_upload_handler PID=-1 10.01.2018 02:31:21 | LHC@home | Temporarily failed upload of workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1__22__s__62.31_60.32__4_6__5__45_1_sixvf_boinc1876_1_r612308196_0: transient upload error Does this file locking issue depend on the NFS storage issues, or is it something on my end? |
Erich56 Joined: 18 Dec 15 Posts: 686 Credit: 4,851,241 RAC: 4,738 |
Most probably nothing on your end. I, too, got this "transient upload error" for several days, and finally last night my remaining ATLAS tasks were uploaded :-) |
Lars Vindal Joined: 24 Sep 08 Posts: 4 Credit: 336,587 RAC: 331 |
Strange thing is that this file lock issue only affects one of my tasks so far. After my previous post BOINC started another Sixtrack task and successfully uploaded it, while the one mentioned above still has problems: 10.01.2018 13:26:58 | LHC@home | Starting task workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1__22__s__62.31_60.32__6_8__5__15_1_sixvf_boinc1883_1 10.01.2018 14:17:34 | LHC@home | Computation for task workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1__22__s__62.31_60.32__6_8__5__15_1_sixvf_boinc1883_1 finished 10.01.2018 14:17:36 | LHC@home | Started upload of workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1__22__s__62.31_60.32__6_8__5__15_1_sixvf_boinc1883_1_r821681728_0 10.01.2018 14:17:56 | LHC@home | Finished upload of workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1__22__s__62.31_60.32__6_8__5__15_1_sixvf_boinc1883_1_r821681728_0 10.01.2018 14:17:57 | LHC@home | Sending scheduler request: To report completed tasks. 10.01.2018 14:17:57 | LHC@home | Reporting 1 completed tasks 10.01.2018 18:02:55 | LHC@home | Started upload of workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1__22__s__62.31_60.32__4_6__5__45_1_sixvf_boinc1876_1_r612308196_0 10.01.2018 18:02:58 | LHC@home | [error] Error reported by file upload server: [workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1__22__s__62.31_60.32__4_6__5__45_1_sixvf_boinc1876_1_r612308196_0] locked by file_upload_handler PID=-1 10.01.2018 18:02:58 | LHC@home | Temporarily failed upload of workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1__22__s__62.31_60.32__4_6__5__45_1_sixvf_boinc1876_1_r612308196_0: transient upload error |
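To tell a single stuck file apart from a general upload failure, one can list the files that the upload server reported as locked and that never reached "Finished upload" in the same log. The following is a rough Python sketch under stated assumptions: the helper name `stuck_locked_uploads` and both regular expressions are hypothetical and derived only from the messages quoted in this thread, and the sample log is a subset of the lines posted above.

```python
import re

# Patterns inferred from the messages quoted in this thread (assumptions,
# not an official description of the BOINC event log format).
LOCK_RE = re.compile(
    r"Error reported by file upload server: \[(\S+)\] "
    r"locked by file_upload_handler PID=(-?\d+)"
)
FINISHED_RE = re.compile(r"Finished upload of (\S+)")


def stuck_locked_uploads(log_text):
    """Return sorted names of files reported as locked and never finished."""
    locked = {match.group(1) for match in LOCK_RE.finditer(log_text)}
    finished = {match.group(1) for match in FINISHED_RE.finditer(log_text)}
    return sorted(locked - finished)


# File names from the post above, split only to keep the lines short.
PREFIX = (
    "workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B1"
    "__22__s__62.31_60.32__"
)
OK_NAME = PREFIX + "6_8__5__15_1_sixvf_boinc1883_1_r821681728_0"
LOCKED_NAME = PREFIX + "4_6__5__45_1_sixvf_boinc1876_1_r612308196_0"

SAMPLE_LOG = "\n".join([
    f"10.01.2018 14:17:36 | LHC@home | Started upload of {OK_NAME}",
    f"10.01.2018 14:17:56 | LHC@home | Finished upload of {OK_NAME}",
    f"10.01.2018 18:02:55 | LHC@home | Started upload of {LOCKED_NAME}",
    "10.01.2018 18:02:58 | LHC@home | [error] Error reported by file upload "
    f"server: [{LOCKED_NAME}] locked by file_upload_handler PID=-1",
    f"10.01.2018 18:02:58 | LHC@home | Temporarily failed upload of "
    f"{LOCKED_NAME}: transient upload error",
])

if __name__ == "__main__":
    for name in stuck_locked_uploads(SAMPLE_LOG):
        print("still locked:", name)
```

The set difference is deliberately simple: a file that was locked at one point but finished uploading later in the same log is not reported as stuck.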
Saharak Joined: 28 Apr 07 Posts: 1 Credit: 103,843 RAC: 324 |
"Strange thing is that this file lock issue only affects one of my tasks so far. After my previous post BOINC started another Sixtrack task and successfully uploaded it, while the one mentioned above still has problems." I am experiencing the same issue. |
©2018 CERN