Message boards : News : Server Intervention 10-Feb-2014

pete
Volunteer moderator
Project administrator
Project developer
Project tester

Joined: 4 Aug 11
Posts: 12
Credit: 1,982,950
RAC: 22
Message 27127 - Posted: 9 Feb 2015, 10:17:46 UTC
Last modified: 10 Feb 2015, 14:51:13 UTC

There will be a short server interruption on Tuesday 10-Feb-2015 from 14:00-15:00 CET for a hardware upgrade.


Update: The upgrade finished at 15:00 and the service is back up.
ID: 27127

Richard Haselgrove
Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27129 - Posted: 10 Feb 2015, 17:26:18 UTC - in response to Message 27127.  
Last modified: 10 Feb 2015, 17:41:59 UTC

You might have a problem (like access permissions?) with some download files:

10/02/2015 16:46:50 | LHC@home 1.0 | Giving up on download of w22_job_base_bb_np_nt_fset_240214_1k_2_x100_cc__47__s__62.31_60.32__10_12__6__49.5_1_sixvf_boinc16553.zip: permanent HTTP error
10/02/2015 16:46:50 | LHC@home 1.0 | Giving up on download of w22_job_base_bb_np_nt_fset_240214_1k_2_x100_cc__47__s__62.31_60.32__10_12__6__4.5_1_sixvf_boinc16523.zip: permanent HTTP error

It doesn't apply to all files: this one was allocated in the same request.

10/02/2015 16:46:52 | LHC@home 1.0 | Finished download of w8_job_tracking_bb_np_nt_dq-3_20kHz_2__18__s__62.31_60.32__4_6__6__51_1_sixvf_boinc6111.zip

Edit: or maybe the files are simply missing. Trying the download manually returned:

The requested URL /sixtrack/download/76/w22_job_base_bb_np_nt_fset_240214_1k_2_x100_cc__47__s__62.31_60.32__10_12__6__4.5_1_sixvf_boinc16523.zip was not found on this server.
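
The two explanations suggested above (an access-permissions problem versus files that are simply missing) can usually be told apart by the HTTP status code the server returns. Below is a minimal Python sketch, assuming the standard meanings of 403 and 404. The names `classify_status` and `probe` are hypothetical helpers for illustration, not part of BOINC, and the caller supplies the URL to check.

```python
import urllib.error
import urllib.request


def classify_status(code: int) -> str:
    """Map an HTTP status code to a likely cause of a failed download."""
    if 200 <= code < 300:
        return "ok"
    if code in (401, 403):
        return "access denied (permissions?)"
    if code == 404:
        return "file missing on server"
    if 500 <= code < 600:
        return "server error"
    return "other"


def probe(url: str, timeout: float = 10.0) -> str:
    """Send a HEAD request and classify the response (needs network access)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx; the code is on the exception.
        return classify_status(err.code)
```

A "was not found on this server" page like the one quoted above corresponds to the 404 case.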
ID: 27129

Uffe F
Joined: 9 Jan 08
Posts: 66
Credit: 727,923
RAC: 0
Message 27130 - Posted: 10 Feb 2015, 17:53:45 UTC - in response to Message 27129.  

Same problem here:

10-02-2015 18:32:09 | LHC@home 1.0 | Requesting new tasks for CPU
10-02-2015 18:32:11 | LHC@home 1.0 | Scheduler request completed: got 5 new tasks
10-02-2015 18:32:13 | LHC@home 1.0 | Started download of w25_job_ps2_corr_bb_np_nt_fset_240214_1k_2_x100_cc__42__s__62.31_60.32__6_8__6__42_1_sixvf_boinc14660.zip
10-02-2015 18:32:13 | LHC@home 1.0 | Started download of w20_job_tracking_bb_np_nt_dq-6_300Hz_2_cc__53__s__62.31_60.32__8_10__6__27_1_sixvf_boinc18603.zip
10-02-2015 18:32:14 | LHC@home 1.0 | Giving up on download of w20_job_tracking_bb_np_nt_dq-6_300Hz_2_cc__53__s__62.31_60.32__8_10__6__27_1_sixvf_boinc18603.zip: permanent HTTP error
10-02-2015 18:32:14 | LHC@home 1.0 | Started download of w20_job_tracking_bb_np_nt_dq-6_300Hz_2_cc__53__s__62.31_60.32__8_10__6__28.5_1_sixvf_boinc18604.zip
10-02-2015 18:32:15 | LHC@home 1.0 | Giving up on download of w25_job_ps2_corr_bb_np_nt_fset_240214_1k_2_x100_cc__42__s__62.31_60.32__6_8__6__42_1_sixvf_boinc14660.zip: permanent HTTP error
10-02-2015 18:32:15 | LHC@home 1.0 | Giving up on download of w20_job_tracking_bb_np_nt_dq-6_300Hz_2_cc__53__s__62.31_60.32__8_10__6__28.5_1_sixvf_boinc18604.zip: permanent HTTP error
10-02-2015 18:32:15 | LHC@home 1.0 | Started download of w20_job_tracking_bb_np_nt_dq-6_300Hz_2_cc__53__s__62.31_60.32__8_10__6__30_1_sixvf_boinc18605.zip
10-02-2015 18:32:15 | LHC@home 1.0 | Started download of w20_job_tracking_bb_np_nt_dq-6_300Hz_2_cc__53__s__62.31_60.32__8_10__6__31.5_1_sixvf_boinc18606.zip
10-02-2015 18:32:17 | LHC@home 1.0 | Giving up on download of w20_job_tracking_bb_np_nt_dq-6_300Hz_2_cc__53__s__62.31_60.32__8_10__6__30_1_sixvf_boinc18605.zip: permanent HTTP error
10-02-2015 18:32:17 | LHC@home 1.0 | Giving up on download of w20_job_tracking_bb_np_nt_dq-6_300Hz_2_cc__53__s__62.31_60.32__8_10__6__31.5_1_sixvf_boinc18606.zip: permanent HTTP error
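
Event log excerpts like this are easier to compare across hosts when the download outcomes are tallied. Here is a rough Python sketch that relies only on the message wording seen in the snippets in this thread. `tally_downloads` is a hypothetical helper, not a BOINC tool.

```python
from collections import Counter

# Message prefixes as they appear in the event log lines quoted in this thread.
PREFIXES = {
    "Started download of ": "started",
    "Finished download of ": "finished",
    "Giving up on download of ": "gave_up",
}


def tally_downloads(lines):
    """Count download events by outcome and collect the files that failed."""
    counts = Counter()
    failed = []
    for line in lines:
        for prefix, outcome in PREFIXES.items():
            if prefix in line:
                counts[outcome] += 1
                if outcome == "gave_up":
                    # Keep the file name; drop the trailing ': <reason>' part.
                    rest = line.split(prefix, 1)[1]
                    failed.append(rest.split(": ", 1)[0].strip())
                break
    return counts, failed
```

Applied to the excerpt above, it would report five downloads started and five given up, with none finished.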
ID: 27130

Antjest
Joined: 30 Sep 04
Posts: 21
Credit: 1,442,034
RAC: 0
Message 27131 - Posted: 10 Feb 2015, 18:36:55 UTC - in response to Message 27129.  

It looks like a similar problem affected already uploaded files, which couldn't validate and were ignored when an additional result was returned:

http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=27499389

and some that couldn't validate:

http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=27455940
http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=27455939
ID: 27131

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 27132 - Posted: 10 Feb 2015, 20:10:12 UTC

I have notified the admins that you do not seem to be able
to get new WUs. There are problems with result validation in any case:
too many results are waiting for confirmation.
I suspended a couple of clients with too many wrong results,
but maybe they are back. Eric.
ID: 27132

pete
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 4 Aug 11
Posts: 12
Credit: 1,982,950
RAC: 22
Message 27133 - Posted: 11 Feb 2015, 10:27:38 UTC

We see that jobs are now being submitted on the new server.

Can users confirm if they are receiving WUs?
ID: 27133

Senilix
Joined: 8 Oct 08
Posts: 3
Credit: 370,653
RAC: 0
Message 27134 - Posted: 11 Feb 2015, 10:59:06 UTC - in response to Message 27133.  

We see that jobs are now being submitted on the new server.

Can users confirm if they are receiving WUs?

Nope, can't get any.
11.02.2015 11:56:32 | LHC@home 1.0 | [sched_op] Starting scheduler request
11.02.2015 11:56:32 | LHC@home 1.0 | Sending scheduler request: To fetch work.
11.02.2015 11:56:32 | LHC@home 1.0 | Requesting new tasks for CPU
11.02.2015 11:56:32 | LHC@home 1.0 | [sched_op] CPU work request: 116507.69 seconds; 4.00 devices
11.02.2015 11:56:32 | LHC@home 1.0 | [sched_op] Intel GPU work request: 0.00 seconds; 0.00 devices
11.02.2015 11:56:33 | LHC@home 1.0 | Scheduler request completed: got 0 new tasks
11.02.2015 11:56:33 | LHC@home 1.0 | [sched_op] Server version 705
11.02.2015 11:56:33 | LHC@home 1.0 | Project has no tasks available
11.02.2015 11:56:33 | LHC@home 1.0 | Project requested delay of 6 seconds
11.02.2015 11:56:33 | LHC@home 1.0 | [sched_op] Deferring communication for 00:00:06
11.02.2015 11:56:33 | LHC@home 1.0 | [sched_op] Reason: requested by project
ID: 27134

Richard Haselgrove
Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27135 - Posted: 11 Feb 2015, 11:12:38 UTC

None available here either:

11/02/2015 11:07:30 | LHC@home 1.0 | [sched_op] CPU work request: 27112.20 seconds; 6.00 devices
11/02/2015 11:07:32 | LHC@home 1.0 | Scheduler request completed: got 0 new tasks
11/02/2015 11:07:32 | LHC@home 1.0 | Project has no tasks available

Meanwhile, you happened to update the server software while there was an untested word-wrap modification in the style sheet. David reverted that last night because it made these log snippets unreadable. Could you possibly apply the following commit?

http://boinc.berkeley.edu/gitweb/?p=boinc-v2.git;a=commit;h=8f2eb6a5ca95d20db83d85ae9e9fcb299c5decb6

Thanks.
ID: 27135

Richard Haselgrove
Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27136 - Posted: 11 Feb 2015, 11:41:12 UTC

That looks better, thanks.

Meanwhile, on another machine, I've just got one new task (task 58122173), but only one - and it was a resend because somebody else missed their deadline (WU 27029491). No sign of any tasks from the 'new work' pool:

11/02/2015 11:32:53 | LHC@home 1.0 | Requesting new tasks for CPU
11/02/2015 11:32:53 | LHC@home 1.0 | [sched_op] CPU work request: 17401.94 seconds; 3.00 devices
11/02/2015 11:32:55 | LHC@home 1.0 | Scheduler request completed: got 1 new tasks
...
11/02/2015 11:33:06 | LHC@home 1.0 | Requesting new tasks for CPU
11/02/2015 11:33:06 | LHC@home 1.0 | [sched_op] CPU work request: 4454.46 seconds; 2.00 devices
11/02/2015 11:33:08 | LHC@home 1.0 | Scheduler request completed: got 0 new tasks
11/02/2015 11:33:08 | LHC@home 1.0 | No tasks sent
ID: 27136

m
Joined: 6 Sep 08
Posts: 116
Credit: 10,927,002
RAC: 2,464
Message 27137 - Posted: 11 Feb 2015, 11:51:29 UTC

Not better for me, I'm afraid.

From this:-


11/02/2015 11:33:07 am Sending scheduler request: To report completed tasks.
11/02/2015 11:33:07 am Reporting 1 completed tasks
11/02/2015 11:33:07 am Requesting new tasks for CPU
11/02/2015 11:33:08 am Scheduler request completed: got 0 new tasks
11/02/2015 11:33:08 am Project has no tasks available


To this:-


11/02/2015 11:41:48 am Requesting new tasks for CPU
11/02/2015 11:41:49 am Scheduler request completed: got 1 new tasks
11/02/2015 11:41:51 am Started download of sixtrack_win32_4517_gen.exe
11/02/2015 11:41:51 am Started download of w1_job_tracking_bb_np_nt_dq-3_50Hz_2_cc__29__s__62.31_60.32__4_6__6__60_1_sixvf_boinc27738.zip
11/02/2015 11:41:52 am Giving up on download of sixtrack_win32_4517_gen.exe: permanent HTTP error
11/02/2015 11:41:52 am Finished download of w1_job_tracking_bb_np_nt_dq-3_50Hz_2_cc__29__s__62.31_60.32__4_6__6__60_1_sixvf_boinc27738.zip

All subsequent requests have failed similarly.

John.
ID: 27137

Richard Haselgrove
Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27138 - Posted: 11 Feb 2015, 12:04:35 UTC
Last modified: 11 Feb 2015, 12:07:03 UTC

Likewise,

11/02/2015 11:59:02 | LHC@home 1.0 | Giving up on download of sixtrack_win64_4517_gen.exe: permanent HTTP error

The data files for the six tasks allocated all downloaded OK, but without the program file they couldn't run.

Edit - note that this appears to be a generic (non-optimised) application. I normally get allocated the PNI version for SSE3 CPUs.
ID: 27138

Senilix
Joined: 8 Oct 08
Posts: 3
Credit: 370,653
RAC: 0
Message 27139 - Posted: 11 Feb 2015, 12:13:26 UTC - in response to Message 27133.  

Working for me now, I just received some WUs. Good job!
ID: 27139

Richard Haselgrove
Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27140 - Posted: 11 Feb 2015, 12:17:50 UTC

None of the Windows applications previously used by my hosts (win32 or win64, _gen, _sse2 or _pni) currently appear to be available for download from the server.
ID: 27140

m
Joined: 6 Sep 08
Posts: 116
Credit: 10,927,002
RAC: 2,464
Message 27141 - Posted: 11 Feb 2015, 12:27:04 UTC
Last modified: 11 Feb 2015, 12:35:13 UTC

I'm still seeing the exe downloads fail (I'm after the generic one)

11/02/2015 12:20:24 pm Giving up on download of sixtrack_win32_4517_gen.exe: permanent HTTP error

Other files seem OK.

Edit:- Senilix has the pni version so perhaps that one is OK.

John
ID: 27141

m
Joined: 6 Sep 08
Posts: 116
Credit: 10,927,002
RAC: 2,464
Message 27142 - Posted: 11 Feb 2015, 13:28:44 UTC

Seems OK now, thanks.

John.
ID: 27142

Richard Haselgrove
Joined: 27 Oct 07
Posts: 186
Credit: 3,297,640
RAC: 0
Message 27143 - Posted: 11 Feb 2015, 14:56:37 UTC

Yes, everything seems to be back in place from here, too. Thanks.
ID: 27143

pete
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 4 Aug 11
Posts: 12
Credit: 1,982,950
RAC: 22
Message 27144 - Posted: 11 Feb 2015, 15:45:29 UTC

We apologize for the inconvenience and appreciate the feedback.
Please report in this thread if there are any further issues since the upgrade.
ID: 27144

alvin
Joined: 12 Mar 12
Posts: 128
Credit: 20,013,377
RAC: 0
Message 27147 - Posted: 11 Feb 2015, 23:04:31 UTC - in response to Message 27144.  
Last modified: 11 Feb 2015, 23:14:39 UTC

I have more than 200 "error while downloading" errors, mostly dated the evening of 10 Feb and the night and morning of 11 Feb (UTC).

But I also have more than 300 WUs dated from the evening of 11 Feb up to the current time, so it seems WUs are in the pipeline.
ID: 27147

DaveSun
Joined: 3 May 07
Posts: 7
Credit: 5,048,604
RAC: 1
Message 27148 - Posted: 12 Feb 2015, 1:56:40 UTC

I have a few like this one: two results are validated with credit awarded, but the one my machine ran is still showing as inconclusive.
ID: 27148

Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist
Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 27150 - Posted: 12 Feb 2015, 17:18:16 UTC

My priority now is to try and reduce inconclusives
by improving the banning of hosts with too many wrong
results. Eric.
ID: 27150