Message boards : News : File upload issues
Harri Liljeroos
Joined: 28 Sep 04
Posts: 707
Credit: 47,271,317
RAC: 28,705
Message 33791 - Posted: 12 Jan 2018, 8:12:38 UTC

I have two of these. Both have now been in upload status for 67 hours. The removal of partly uploaded files does not seem to work: both have uploaded about 0.5% and never progress beyond that. These two are currently the only tasks I have with upload or download problems. Here's the log:
6547	LHC@home	12.1.2018 10:04:09	Started upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__6_8__5__45_1_sixvf_boinc4967_0_r902298660_0	
6548	LHC@home	12.1.2018 10:04:09	Started upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__10_12__5__82.5_1_sixvf_boinc4994_0_r1263781623_0	
6549	LHC@home	12.1.2018 10:04:15	[error] Error reported by file upload server: [workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__6_8__5__45_1_sixvf_boinc4967_0_r902298660_0] locked by file_upload_handler PID=-1	
6550	LHC@home	12.1.2018 10:04:15	[error] Error reported by file upload server: [workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__10_12__5__82.5_1_sixvf_boinc4994_0_r1263781623_0] locked by file_upload_handler PID=-1	
6551	LHC@home	12.1.2018 10:04:15	Temporarily failed upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__6_8__5__45_1_sixvf_boinc4967_0_r902298660_0: transient upload error	
6552	LHC@home	12.1.2018 10:04:15	Backing off 04:23:48 on upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__6_8__5__45_1_sixvf_boinc4967_0_r902298660_0	
6553	LHC@home	12.1.2018 10:04:15	Temporarily failed upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__10_12__5__82.5_1_sixvf_boinc4994_0_r1263781623_0: transient upload error	
6554	LHC@home	12.1.2018 10:04:15	Backing off 04:33:03 on upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__10_12__5__82.5_1_sixvf_boinc4994_0_r1263781623_0	
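For anyone who wants to check their own event log for the same symptom, here is a minimal Python sketch that counts the "locked by file_upload_handler" errors per file. It assumes the log is pasted in the tab-separated form shown above (sequence number, project, timestamp, message); the function names `parse_event_line` and `count_lock_errors` are made up for this example and are not part of BOINC.

```python
import re
from collections import Counter

# Matches the server-side lock error quoted in the log above and captures the file name.
LOCK_ERROR = re.compile(
    r"\[error\] Error reported by file upload server: "
    r"\[(?P<file>[^\]]+)\] locked by file_upload_handler"
)


def parse_event_line(line):
    """Split one tab-separated log line into (seq, project, timestamp, message).

    Returns None for lines that do not have the assumed four-field layout.
    """
    parts = line.rstrip("\t\r\n").split("\t")
    if len(parts) != 4 or not parts[0].isdigit():
        return None
    seq, project, timestamp, message = parts
    return int(seq), project, timestamp, message


def count_lock_errors(lines):
    """Count how often each file was rejected because the server holds a lock on it."""
    counts = Counter()
    for line in lines:
        parsed = parse_event_line(line)
        if parsed is None:
            continue
        match = LOCK_ERROR.search(parsed[3])
        if match:
            counts[match.group("file")] += 1
    return counts
```

A file whose count keeps growing across retries is probably stuck behind a stale server-side lock rather than failing on the client side.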

Harri Liljeroos
Joined: 28 Sep 04
Posts: 707
Credit: 47,271,317
RAC: 28,705
Message 33836 - Posted: 13 Jan 2018, 22:59:49 UTC
Last modified: 13 Jan 2018, 23:01:43 UTC

I now have two new tasks (ATLAS tasks this time) locked by file_upload_handler PID=-1. They are on another host than the two I reported on Friday.

Tasks are here: https://www.cpdn.org/cpdnboinc/result.php?resultid=20921445 and here: https://lhcathome.cern.ch/lhcathome/result.php?resultid=173295916
Greger

Joined: 9 Jan 15
Posts: 151
Credit: 431,596,822
RAC: 0
Message 33838 - Posted: 14 Jan 2018, 1:09:48 UTC - in response to Message 33836.  

LHC would probably not be able to help with your CPDN task issue.
AuxRx

Joined: 16 Sep 17
Posts: 100
Credit: 1,618,469
RAC: 0
Message 33848 - Posted: 14 Jan 2018, 13:30:23 UTC

... and the servers are overwhelmed again. Both uploads and downloads fail with transient HTTP errors.
Alessio Mereghetti
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 29 Feb 16
Posts: 157
Credit: 2,659,975
RAC: 0
Message 33850 - Posted: 14 Jan 2018, 14:15:20 UTC - in response to Message 33848.  
Last modified: 14 Jan 2018, 14:17:29 UTC

So it seems. I have problems downloading a couple of SixTrack tasks: transient HTTP errors, same as you.
I am contacting the IT guys.
Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1155
Credit: 52,305,052
RAC: 57,440
Message 33851 - Posted: 14 Jan 2018, 15:16:48 UTC - in response to Message 33850.  

So it seems. I have problems downloading a couple of SixTrack tasks: transient HTTP errors, same as you.
I am contacting the IT guys.


THANKS
I have 32 cores running these, and the problem started for me just 2 hours ago. I sure hope we don't get a repeat of the problem where lots of completed tasks turn into nothing and just get aborted.

1/14/2018 5:02:18 AM | LHC@home | Temporarily failed upload of LHC_2015_LHC_2015_234_BOINC_errors__8__s__62.31_60.32__7.9_8.0__5__28.5_1_sixvf_boinc28280_0_r1268262920_0: transient HTTP error
Volunteer Mad Scientist For Life
Harri Liljeroos
Joined: 28 Sep 04
Posts: 707
Credit: 47,271,317
RAC: 28,705
Message 33857 - Posted: 14 Jan 2018, 19:00:43 UTC - in response to Message 33838.  

LHC would probably not be able to help with your CPDN task issue.

DUH! I don't know how I managed to f*ck that up. Anyway, more LHC tasks are in the upload queue. Downloads seem to be coming through better, but not all in one go.
Lars Vindal

Joined: 24 Sep 08
Posts: 4
Credit: 397,080
RAC: 0
Message 33859 - Posted: 14 Jan 2018, 22:04:30 UTC

Now the one I reported as stuck earlier has timed out and shows up as an error in my account, because the server locked it with just over half a percent uploaded... :-(

Please fix this locking issue on the server side before many more tasks error out because of it and have to be sent out again!
Alessio Mereghetti
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 29 Feb 16
Posts: 157
Credit: 2,659,975
RAC: 0
Message 33863 - Posted: 15 Jan 2018, 8:32:13 UTC - in response to Message 33859.  

OK, upload/download seems to be functional again - at least, my problems resolved themselves without any particular action from me.
Erich56

Joined: 18 Dec 15
Posts: 1742
Credit: 114,934,486
RAC: 93,948
Message 33864 - Posted: 15 Jan 2018, 8:54:43 UTC - in response to Message 33863.  

OK, upload/download seems to be functional again
I can confirm :-)
Harri Liljeroos
Joined: 28 Sep 04
Posts: 707
Credit: 47,271,317
RAC: 28,705
Message 33866 - Posted: 15 Jan 2018, 10:09:20 UTC

My uploads are still stuck:

25615	LHC@home	15.1.2018 11:59:25	Started upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__6_8__5__45_1_sixvf_boinc4967_0_r902298660_0	
25616	LHC@home	15.1.2018 11:59:25	Started upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__10_12__5__82.5_1_sixvf_boinc4994_0_r1263781623_0	
25617	LHC@home	15.1.2018 11:59:51	[error] Error reported by file upload server: [workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__6_8__5__45_1_sixvf_boinc4967_0_r902298660_0] locked by file_upload_handler PID=-1	
25618	LHC@home	15.1.2018 11:59:51	[error] Error reported by file upload server: [workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__10_12__5__82.5_1_sixvf_boinc4994_0_r1263781623_0] locked by file_upload_handler PID=-1	
25619	LHC@home	15.1.2018 11:59:51	Temporarily failed upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__6_8__5__45_1_sixvf_boinc4967_0_r902298660_0: transient upload error	
25620	LHC@home	15.1.2018 11:59:51	Backing off 04:26:11 on upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__6_8__5__45_1_sixvf_boinc4967_0_r902298660_0	
25621	LHC@home	15.1.2018 11:59:51	Temporarily failed upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__10_12__5__82.5_1_sixvf_boinc4994_0_r1263781623_0: transient upload error	
25622	LHC@home	15.1.2018 11:59:51	Backing off 03:24:11 on upload of workspace1_hl13_collision_scan_62.3250_60.3300_chrom_15_oct_-300_B4__57__s__62.31_60.32__10_12__5__82.5_1_sixvf_boinc4994_0_r1263781623_0	

Upload is being retried automatically every five minutes by BoincTasks. Both tasks will expire later today. The crunch time for both was below 10 seconds so aborting the uploads would not be any major loss.
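The client's own backoff in the log above is several hours per file, much longer than the five-minute forced retries. As a rough illustration, the following Python sketch pulls the delay out of a "Backing off" message so the two can be compared. The regular expression is only based on the message format visible in the log above, and `parse_backoff` is a hypothetical helper name, not a BOINC or BoincTasks function.

```python
import re
from datetime import timedelta

# Based on log messages such as "Backing off 04:26:11 on upload of <file>".
BACKOFF = re.compile(r"Backing off (\d+):(\d{2}):(\d{2}) on upload of (\S+)")


def parse_backoff(message):
    """Return (file_name, delay) for a backoff message, or None if it is another message."""
    match = BACKOFF.search(message)
    if match is None:
        return None
    hours, minutes, seconds = (int(part) for part in match.group(1, 2, 3))
    delay = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return match.group(4), delay
```

For example, `parse_backoff("Backing off 04:26:11 on upload of some_file")` yields a delay of 4 hours, 26 minutes and 11 seconds for `some_file`.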
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 455
Credit: 198,139,853
RAC: 86,453
Message 33867 - Posted: 15 Jan 2018, 10:33:45 UTC

Same here, several WUs are stuck in upload:

LHC_2015_LHC_2015_234_BOINC_errors__20__s__62.31_60.32__5.5_5.6__5__55.5_1_sixvf_boinc69362_0_r960578163_0 0,517 44,00 K 00:07:22 - 25:55:03 0,00 Kbps Upload pending (Retry in: 03:41:54), retried: 11 AHuW74
workspace1_hl13_collision_scan_62.3250_60.3100_chrom_15_oct_-300_B1__59__s__62.31_60.32__6_8__5__45_1_sixvf_boinc5143_1_r726992513_0 0,575 44,00 K 00:17:39 - 196:19:28 0,00 Kbps Upload pending (Project backoff: 00:09:31) DEV21
workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B4__24__s__62.31_60.32__8_10__5__30_1_sixvf_boinc2072_1_r145347505_0 0,577 44,00 K 00:11:00 - 179:08:13 0,00 Kbps Upload pending (Project backoff: 00:09:31) DEV21
workspace1_hl13_collision_scan_62.3250_60.3125_chrom_15_oct_-300_B4__24__s__62.31_60.32__10_12__5__15_1_sixvf_boinc2081_1_r1521857171_0 0,581 44,00 K 00:18:17 - 179:49:21 0,00 Kbps Upload pending (Project backoff: 00:09:31) DEV21
LHC_2015_LHC_2015_234_BOINC_errors__20__s__62.31_60.32__7.0_7.1__5__10.5_1_sixvf_boinc70217_1_r1739503837_0 0,519 44,00 K 00:14:48 - 27:02:47 0,00 Kbps Upload pending (Retry in: 05:05:20), retried: 12 PHuW72


Supporting BOINC, a great concept !
Erich56

Joined: 18 Dec 15
Posts: 1742
Credit: 114,934,486
RAC: 93,948
Message 33869 - Posted: 15 Jan 2018, 10:53:20 UTC

My assumption is that the upload problems with SixTrack are caused by the large number of tasks currently in progress.

From what I just saw, there are about 1,700 unsent ATLAS tasks - so we might end up with the same problems which came up a month ago.
Jim1348

Joined: 15 Nov 14
Posts: 602
Credit: 24,371,321
RAC: 0
Message 33870 - Posted: 15 Jan 2018, 11:11:33 UTC - in response to Message 33869.  
Last modified: 15 Jan 2018, 11:20:47 UTC

Every time I have tried ATLAS recently, I have gotten burned. But since they now have a full crew back at CERN, I am trying again. I have one VirtualBox machine that does only CMS, LHCb and Theory. They work fine, with no upload issues. And any problems on ATLAS or Theory do not affect them. Yesterday I started a second machine without VirtualBox, set to receive only ATLAS (native) and SixTrack. So far, so good. But we will see how long LHC can keep the servers running properly.
EDIT: One ATLAS is stuck in download, and two SixTrack are stuck in upload. But I have enough work to keep going.
Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester

Joined: 15 Jul 05
Posts: 246
Credit: 5,974,599
RAC: 0
Message 33874 - Posted: 15 Jan 2018, 14:57:16 UTC

We obviously hit the limit of the current infrastructure with 10k ATLAS and 200k SixTrack tasks. The NFS server is OK, but the upload/download servers are struggling.

Sorry about this, please be patient (again). :-(
Erich56

Joined: 18 Dec 15
Posts: 1742
Credit: 114,934,486
RAC: 93,948
Message 33884 - Posted: 17 Jan 2018, 6:00:50 UTC - in response to Message 33874.  

We obviously hit the limit of the current infrastructure with 10k ATLAS and 200k Sixtrack tasks. The NFS server is ok, but the upload-download servers are struggling
One of my finished ATLAS tasks has been trying unsuccessfully to upload since yesterday.
It always shows "locked by upload handler ... transient upload error", no matter how often (several hundred times, for sure) I push the "retry now" button.
Obviously the tool that was said to run every 6 hours to delete partly uploaded files is not doing its job.

I question what sense it makes to keep filling the "unsent" queue with new ATLAS tasks as long as these severe transfer problems exist.
nairb

Joined: 1 May 07
Posts: 27
Credit: 2,336,954
RAC: 335
Message 33894 - Posted: 17 Jan 2018, 18:37:22 UTC

Same here with :-
17/01/2018 18:33:32 | LHC@home | Temporarily failed upload of Au0KDmLGRurnDDn7oo6G73TpABFKDmABFKDmQrFKDmABFKDmRzxOQn_0_r70336218_ATLAS_result: transient upload error

I presume it will get uploaded at some stage... before its deadline.
Gunnar Hjern

Joined: 14 Jul 17
Posts: 7
Credit: 260,936
RAC: 0
Message 33895 - Posted: 17 Jan 2018, 19:03:08 UTC - in response to Message 33874.  

Hi!

I'm not sure if I have understood this issue correctly:

Those "partly uploaded files": are they on my machine or on the server?

Do I need to take any actions, or is the problem going to solve itself when the servers are less busy?

I currently have half a dozen or so tasks stuck in the uploading state, and together they represent
several days of hard computing, so I'd hate to have to abort them! :-(

As I can see on the server status page, there are several thousand items in the task and WU
"waiting for deletion" queues, and a whopping 768973 tasks to send!! :-O

Will this issue resolve itself once they are crunched and validated?
(hopefully before the deadlines expire)

Kindest regards,
Gunnar
Erich56

Joined: 18 Dec 15
Posts: 1742
Credit: 114,934,486
RAC: 93,948
Message 33897 - Posted: 17 Jan 2018, 19:16:00 UTC - in response to Message 33895.  

Do I need to take any actions, or is the problem going to solve itself when the servers are less busy?
There is nothing you can do other than wait and hope that the tasks will be uploaded before the expiration date.
AuxRx

Joined: 16 Sep 17
Posts: 100
Credit: 1,618,469
RAC: 0
Message 33898 - Posted: 17 Jan 2018, 20:02:56 UTC - in response to Message 33897.  

This answer needs clarification. Most tasks have a chance of being returned and validated with credit even after the deadline has passed. The first (or first two, depending on the quorum) results to be *returned* will receive credit, regardless of deadline.

Therefore the answer should be that nothing can be done other than hoping they will upload before the minimum quorum has been reached.
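To make the rule described above concrete, here is a small Python sketch of it as stated in this post: the first results to be returned, up to the quorum size, receive credit even if they arrive after their deadline. This is a deliberately simplified model. The `Result` class and `credited_hosts` function are invented for illustration, and real BOINC validation also compares the returned results against each other, which this sketch ignores.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    host: str
    returned_at: Optional[float]  # time of return; None if the upload never completed
    deadline: float               # kept for illustration; not used to decide credit


def credited_hosts(results, quorum):
    """Return the hosts of the first `quorum` results to be returned.

    The deadline is intentionally ignored, mirroring the claim that late
    results still count as long as the quorum has not been reached yet.
    """
    returned = [r for r in results if r.returned_at is not None]
    returned.sort(key=lambda r: r.returned_at)
    return [r.host for r in returned[:quorum]]
```

In this model, a result returned at time 120 against a deadline of 100 still earns credit if it arrives before the quorum is filled, while a result whose upload stays stuck forever never does.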

