Transfer issues
Author | Message |
---|---|
Send message Joined: 16 Sep 17 Posts: 100 Credit: 1,618,469 RAC: 0 |
Two Sixtrack tasks for me. Deadline is the 27th according to BOINC or 28th according to LHC. This time I'll wait to see when it will actually happen. |
Send message Joined: 28 Sep 04 Posts: 728 Credit: 48,863,601 RAC: 21,448 |
I have one Sixtrack task that is about to expire in 6 hours according to BOINC. Task is here: https://lhcathome.cern.ch/lhcathome/result.php?resultid=169614100 It has been uploading for 5 days now. Interesting to see what will happen. |
Send message Joined: 16 Sep 17 Posts: 100 Credit: 1,618,469 RAC: 0 |
I noticed you have several tasks waiting for validation just like I do, too. What happens if results cannot be validated, because the second volunteer hasn't been able to deliver on time i.e. the necessary quorum is not met? Many tasks have been aborted, failed or cancelled, but with more than a million tasks in queue, those retries probably won't be completed in time. Even more file fragments clogging up the server? |
Send message Joined: 18 Dec 15 Posts: 1811 Credit: 118,315,837 RAC: 27,666 |
"...Many tasks have been aborted, failed or cancelled, but with more than a million tasks in queue, those retries probably won't be completed in time. Even more file fragments clogging up the server?" I am afraid your assumption is correct :-( |
Send message Joined: 28 Sep 04 Posts: 728 Credit: 48,863,601 RAC: 21,448 |
"I noticed you have several tasks waiting for validation just like I do, too." I think that a task is sent out as at most 5 copies. If a copy goes beyond its deadline, the next copy will be sent out. But as these resends go to the end of the ready-to-send queue, it will take some time before they actually arrive at a host for crunching. If all 5 copies are out in the field and have gone over their deadline, I think that the WU is declared an error (I'm not sure if this is how it works, correct me if I'm wrong). Anyway, with the RTS queue being so long at the moment, there will be a lot of time to upload your results before final failure. That is, if the servers can hold the fort under the constant pounding. The RTS queue shows some signs of slowly draining, although the number of in-progress tasks is also dropping. So let's keep the holiday spirit up and not sink into despair. ;-) |
Send message Joined: 18 Dec 15 Posts: 1811 Credit: 118,315,837 RAC: 27,666 |
"... So let's keep the holiday spirit up and not sink in despair. ;-)" Harri, I think that's all we can do anyway :-) |
Send message Joined: 22 Sep 13 Posts: 11 Credit: 660,161 RAC: 0 |
Unable to upload at the moment; I get an error message I have not seen before: 26/12/2017 15:00:35 | LHC@home | [error] Error reported by file upload server: [w5_hllhc10_sqz1500_Qcol_chr20_w5__6__s__62.31_60.32__18_20__5__22.5_1_sixvf_boinc806_1_r2118365745_0] locked by file_upload_handler PID=-1 Tom |
Send message Joined: 18 Dec 15 Posts: 1811 Credit: 118,315,837 RAC: 27,666 |
"Unable to upload at the moment; I get an error message I have not seen before:" This is exactly the kind of error message we got for ATLAS 2 weeks ago, when the connections and servers were totally overloaded. Obviously, when uploads get stuck and only fragments of them arrive at the server, retries of the same upload won't be successful until some cleaning tool is run there (no idea whether this tool is in operation over the holidays or not; as far as I remember, it had to be initiated manually). The same thing seems to be true now for Sixtrack, probably due to too high a number of tasks in the mills (as seen from the Project Status Page). |
Send message Joined: 16 Sep 17 Posts: 100 Credit: 1,618,469 RAC: 0 |
Error reported by file upload server: Server is out of disk space Uh-oh. |
Send message Joined: 1 May 07 Posts: 27 Credit: 2,336,992 RAC: 1 |
Yup, same thing here... 27/12/2017 16:10:33 | LHC@home | Started upload of BT2KDmof9nrnDDn7oo6G73TpABFKDmABFKDmRLFKDmABFKDmtodCCn_0_r1075799161_ATLAS_result 27/12/2017 16:10:35 | LHC@home | [error] Error reported by file upload server: Server is out of disk space Oops... it's an ATLAS issue. Not Sixtrack.... Ignore |
Send message Joined: 22 Mar 17 Posts: 63 Credit: 14,576,403 RAC: 10,212 |
Still have one locked :( Others have completed and sent. LHC@home 12/27/2017 10:07:38 PM [error] Error reported by file upload server: [LHC_2015_LHC_2015_260_BOINC_errors__19__s__62.31_60.32__5.6_5.7__5__39_1_sixvf_boinc65870_0_r1527001410_0] locked by file_upload_handler PID=-1 |
Send message Joined: 18 Dec 15 Posts: 1811 Credit: 118,315,837 RAC: 27,666 |
my two remaining ones (which got finished about 2 days ago) finally got uploaded an hour ago :-) |
Send message Joined: 16 Sep 17 Posts: 100 Credit: 1,618,469 RAC: 0 |
I have lost my first Sixtrack result of 180 GFLOPs. The next task expires this afternoon. In addition, two more tasks are stuck since last night. Wouldn't it be wiser to cancel now and open the blocked slots for other volunteers? Chances of recovery are pretty slim as far as I can tell and my account -- not the project -- is credited with the failure. What would be the harm if I cancelled now? |
Send message Joined: 28 Sep 04 Posts: 728 Credit: 48,863,601 RAC: 21,448 |
I think that you still have a chance of getting your credit for expired tasks if no credit has been granted to anybody for them yet. The first two who return valid results will get the credit. I have one task whose second copy was aborted by its host a couple of hours after it was issued. That was 9 days ago, but the third copy, which was created soon after the abort, has not yet reached a new host. It is probably still in the Ready To Send queue waiting to be downloaded (as is the fourth copy, which was created yesterday after my copy expired too). My copy has been stuck in the Transfers tab for 164 hours now. |
Send message Joined: 16 Sep 17 Posts: 100 Credit: 1,618,469 RAC: 0 |
"I think that you still have a chance of getting your credit for expired tasks if they have not yet been granted any credit to anybody. The first two who will return valid results will get the credit." This seems to be accurate. Although the project shows the WUs in question as Errors, the results have not been cancelled. Although it has been days, the results are still being uploaded -- and continue to fail. So far only one result has been returned per WU. |
Send message Joined: 17 Feb 07 Posts: 86 Credit: 968,855 RAC: 0 |
Happy New Year. I would think all the staff are back from the holidays and have started looking at the servers. Maybe they have, maybe they are still on holiday, but I still have two WUs that will not upload; one has been trying for 14 days already... It is also reported in my results list as an error, which is of course because the return deadline has already passed. So the "not uploading" issue is still ongoing. Thanks. Greetings from, TJ |
Send message Joined: 26 Dec 17 Posts: 2 Credit: 1,205,590 RAC: 0 |
Yep, same here. Yesterday three of my WUs passed their deadline and now have the same status as yours (Timed out - no response). I also have another six that will presumably end in the same manner; so far I haven't seen any WUs upload successfully after getting the dreaded "locked by file_upload_handler PID=-1" message. What a waste. If the partially cached uploads are the fly in the ointment, does anyone know how to hack the client state so that the WU is seen as a new upload again? I did try a couple of things myself, but my edits didn't work as planned and were ignored. :( |
Send message Joined: 28 Jul 05 Posts: 24 Credit: 6,603,623 RAC: 0 |
Same here. One that doesn't want to upload (Windows) and has already been stuck for a couple of days. Other Sixtrack workunits went fine before and after this one completed. I am now also seeing the message "05-Jan-2018 19:37:30 [LHC@home] Error reported by file upload server: [workspace1_hl13_collision_scan_62.3275_60.3000_chrom_15_oct_-300_B4__46__s__62.31_60.32__6_8__5__60_1_sixvf_boinc4001_1_r1970795041_0] locked by file_upload_handler PID=-1" on Linux too. |
Send message Joined: 18 Dec 15 Posts: 1811 Credit: 118,315,837 RAC: 27,666 |
What is likewise frustrating: there are 1,027,147 "unsent" Sixtrack tasks waiting for download, but the download doesn't work :-( I guess the best thing for us crunchers would be to switch to other projects in the meantime, until LHC gets their infrastructure straightened out sometime this year, as promised last month. |
Send message Joined: 29 Feb 16 Posts: 157 Credit: 2,659,975 RAC: 0 |
Yes it is frustrating - I am experiencing similar issues - see https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4539&postid=33680 But it works in fits and starts. |
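
The resend and credit rules discussed in this thread (at most 5 copies per workunit, a replacement copy when one fails or times out, credit once two valid results are in) can be sketched as a small state model. This is only an illustration of what the posters describe, not the project's actual server logic: the names `Workunit`, `Outcome`, `MAX_COPIES` and `QUORUM` are hypothetical, and the values 5 and 2 are taken from the posts above, whose authors themselves were unsure of them.

```python
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    IN_PROGRESS = "in progress"
    VALID = "valid"
    ERROR = "error"  # aborted, failed, or past its deadline


# Assumptions taken from the forum posts, not from any project configuration.
MAX_COPIES = 5  # "the maximum number a task is sent out is 5 copies"
QUORUM = 2      # "the first two who will return valid results"


@dataclass
class Workunit:
    outcomes: list = field(default_factory=list)

    def send_copy(self) -> bool:
        """Issue one more copy unless the copy limit has been reached."""
        if len(self.outcomes) >= MAX_COPIES:
            return False
        self.outcomes.append(Outcome.IN_PROGRESS)
        return True

    def report(self, copy_index: int, outcome: Outcome) -> None:
        """Record (or later overwrite) the outcome of one copy."""
        self.outcomes[copy_index] = outcome

    def status(self) -> str:
        """Summarize the workunit: validated, error, or still pending."""
        if self.outcomes.count(Outcome.VALID) >= QUORUM:
            return "validated"
        in_progress = self.outcomes.count(Outcome.IN_PROGRESS)
        if len(self.outcomes) >= MAX_COPIES and in_progress == 0:
            return "error"
        return "pending"


wu = Workunit()
wu.send_copy()
wu.send_copy()
wu.report(0, Outcome.VALID)
wu.report(1, Outcome.ERROR)  # e.g. the second host aborted its copy
print(wu.status())           # pending: only one valid result so far
wu.send_copy()               # replacement copy joins the queue
wu.report(2, Outcome.VALID)
print(wu.status())           # validated: quorum of two reached
```

Because `report` may overwrite an earlier `ERROR`, the model also covers the case raised in the thread where a timed-out result finally uploads and still counts, as long as the workunit has not yet reached its quorum.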