Network and server problems Sunday night


Message boards : News : Network and server problems Sunday night

Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 15 Jul 05
Posts: 96
Credit: 694,171
RAC: 2,553
Message 30840 - Posted: 19 Jun 2017, 7:19:47 UTC

We had a network problem in the computer centre at CERN last night, leading to a number of issues for our servers. BOINC servers should be back in business now.

Tasks should normally upload correctly on the next attempt. If you see any issues, please try updating or resetting the project.

Sorry for the trouble, and happy crunching!

morgan
Joined: 1 Nov 12
Posts: 3
Credit: 220,812
RAC: 1,268
Message 30842 - Posted: 19 Jun 2017, 7:53:21 UTC - in response to Message 30840.

I did a reset, but I still get: Server error: feeder not running

computezrmle
Joined: 15 Jun 08
Posts: 302
Credit: 3,288,491
RAC: 2,636
Message 30843 - Posted: 19 Jun 2017, 7:57:32 UTC - in response to Message 30840.

Host 1 is currently running an ATLAS task.
Task finished while I was typing this message.
Result uploaded but can't be reported.

https://lhcathome.cern.ch/lhcathome/result.php?resultid=145793368

Mo 19 Jun 2017 09:21:22 CEST | LHC@home | Sending scheduler request: Requested by project.
Mo 19 Jun 2017 09:21:22 CEST | LHC@home | Not requesting tasks
Mo 19 Jun 2017 09:21:23 CEST | LHC@home | Scheduler request completed
Mo 19 Jun 2017 09:21:23 CEST | LHC@home | Server error: feeder not running
Mo 19 Jun 2017 09:47:19 CEST | LHC@home | Started upload of NHNMDm9TCfqnDDn7oo6G73TpABFKDmABFKDmZ0KKDme0FKDm109BWm_1_ATLAS_result
Mo 19 Jun 2017 09:48:43 CEST | LHC@home | Finished upload of NHNMDm9TCfqnDDn7oo6G73TpABFKDmABFKDmZ0KKDme0FKDm109BWm_1_ATLAS_result




Host 2 failed to fetch an LHCb task:

19-Jun-2017 09:41:03 [LHC@home] Sending scheduler request: To fetch work.
19-Jun-2017 09:41:03 [LHC@home] Requesting new tasks for CPU
19-Jun-2017 09:41:04 [LHC@home] Scheduler request completed: got 0 new tasks
19-Jun-2017 09:41:04 [LHC@home] Server error: feeder not running
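
For anyone who wants to check their own client logs for the same message, here is a minimal Python sketch. It assumes only the two log layouts quoted in this post (pipe-separated and bracketed); the function names `parse_line` and `feeder_errors` are made up for illustration and are not part of BOINC.

```python
import re

# The two client log layouts quoted in this thread.
# Pipe-separated:  "<timestamp> | <project> | <text>"
PIPE_FORMAT = re.compile(r"^(?P<when>.+?) \| (?P<project>[^|]+?) \| (?P<text>.*)$")
# Bracketed:       "<date> <time> [<project>] <text>"
BRACKET_FORMAT = re.compile(r"^(?P<when>\S+ \S+) \[(?P<project>[^\]]+)\] (?P<text>.*)$")


def parse_line(line):
    """Return (timestamp, project, text), or None if neither layout matches."""
    stripped = line.strip()
    for pattern in (PIPE_FORMAT, BRACKET_FORMAT):
        match = pattern.match(stripped)
        if match:
            return match.group("when"), match.group("project"), match.group("text")
    return None


def feeder_errors(lines):
    """Collect the parsed lines whose text reports 'feeder not running'."""
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed and "feeder not running" in parsed[2].lower():
            hits.append(parsed)
    return hits


if __name__ == "__main__":
    sample = [
        "Mo 19 Jun 2017 09:21:23 CEST | LHC@home | Server error: feeder not running",
        "19-Jun-2017 09:41:04 [LHC@home] Scheduler request completed: got 0 new tasks",
        "19-Jun-2017 09:41:04 [LHC@home] Server error: feeder not running",
    ]
    for when, project, text in feeder_errors(sample):
        print(when, project, text)
```

The match is case-insensitive because the message shows up with different capitalisation in this thread.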

morgan
Joined: 1 Nov 12
Posts: 3
Credit: 220,812
RAC: 1,268
Message 30848 - Posted: 19 Jun 2017, 9:24:02 UTC

Now WUs are flowing (SixTrack) here :)

Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 246
Credit: 39,990,177
RAC: 52,023
Message 30849 - Posted: 19 Jun 2017, 9:42:34 UTC - in response to Message 30848.

Now WUs are flowing (SixTrack) here :)

But I don't get any ATLAS tasks :-(
____________


Supporting BOINC, a great concept !

computezrmle
Joined: 15 Jun 08
Posts: 302
Credit: 3,288,491
RAC: 2,636
Message 30851 - Posted: 19 Jun 2017, 9:55:22 UTC - in response to Message 30843.

My ATLAS WU has now been validated but still no new vbox WUs.
Instead: "No tasks are available for xxxx Simulation".

At least "Server error: feeder not running" has disappeared.

Erich56
Joined: 18 Dec 15
Posts: 297
Credit: 3,154,251
RAC: 6,818
Message 30859 - Posted: 19 Jun 2017, 11:20:53 UTC - in response to Message 30851.

My ATLAS WU has now been validated but still no new vbox WUs.
Instead: "No tasks are available for xxxx Simulation".

Same here. Any idea when we will be back to normal operation?

Michael H.W. Weber
Joined: 18 Sep 04
Posts: 27
Credit: 4,594,527
RAC: 5,988
Message 30862 - Posted: 19 Jun 2017, 13:36:41 UTC

The feeder is not running.
No tasks for any of the projects at present.

Michael.
____________

Michael H.W. Weber
Joined: 18 Sep 04
Posts: 27
Credit: 4,594,527
RAC: 5,988
Message 30866 - Posted: 19 Jun 2017, 13:55:29 UTC

...now it works again.

Michael.
____________

computezrmle
Joined: 15 Jun 08
Posts: 302
Credit: 3,288,491
RAC: 2,636
Message 30872 - Posted: 19 Jun 2017, 14:27:16 UTC

Got a WU from LHCb on host 1 and another WU from Theory on host 2.
Both started normally.

I'll keep my fingers crossed.

Michael H.W. Weber
Joined: 18 Sep 04
Posts: 27
Credit: 4,594,527
RAC: 5,988
Message 30895 - Posted: 20 Jun 2017, 8:12:29 UTC

The feeder is down again...

Michael.
____________

Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 15 Jul 05
Posts: 96
Credit: 694,171
RAC: 2,553
Message 30896 - Posted: 20 Jun 2017, 8:36:46 UTC

We are experimenting with the feeder parameters to try to tune performance. With our default settings, it takes 5-10 minutes to fill the scheduler's shared-memory buffer from the current backlog of tasks.

computezrmle
Joined: 15 Jun 08
Posts: 302
Credit: 3,288,491
RAC: 2,636
Message 30898 - Posted: 20 Jun 2017, 11:26:47 UTC

Host 1:
20-Jun-2017 13:12:40 [LHC@home] No tasks are available for LHCb Simulation
No further error message.

Host 2:
20-Jun-2017 13:15:59 [LHC@home] No tasks are available for CMS Simulation
No further error message.


Yet according to the server status page there should be thousands of unsent WUs:
https://lhcathome.cern.ch/lhcathome/server_status.php

Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester
Joined: 15 Jul 05
Posts: 96
Credit: 694,171
RAC: 2,553
Message 30900 - Posted: 20 Jun 2017, 11:55:55 UTC - in response to Message 30898.

What the scheduler has in shared memory may not always match what is queued in the database. The tasks should arrive after a while, but the feeder is struggling with the backlog from Sunday.

As mentioned, we're trying to optimize this, and you might see occasional "feeder not running" messages on your client today.
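
To illustrate the shared-memory point, here is a toy Python sketch. It is not the actual BOINC feeder code: the function `simulate` and all of its parameters and numbers are hypothetical, chosen only to show how clients can be told that no tasks are available while many unsent tasks are still queued in the database.

```python
from collections import deque


def simulate(backlog, slots, refill_per_tick, requests_per_tick, ticks):
    """Toy model: a feeder copies tasks from the database into a small
    shared-memory buffer, and the scheduler can only hand out what is there.
    All names and numbers here are illustrative assumptions."""
    database = deque(range(backlog))  # unsent tasks queued in the database
    shared = deque()                  # tasks the scheduler can see right now
    misses = 0                        # requests answered with "no tasks available"

    for _ in range(ticks):
        # Feeder step: move a few tasks into free shared-memory slots.
        for _ in range(refill_per_tick):
            if database and len(shared) < slots:
                shared.append(database.popleft())

        # Scheduler step: serve client requests from shared memory only.
        for _ in range(requests_per_tick):
            if shared:
                shared.popleft()
            else:
                misses += 1

    return {
        "unsent_in_database": len(database),
        "in_shared_memory": len(shared),
        "requests_without_task": misses,
    }


if __name__ == "__main__":
    print(simulate(backlog=1000, slots=10, refill_per_tick=2,
                   requests_per_tick=5, ticks=10))
```

In this toy run the feeder adds fewer tasks per tick than clients request, so some requests come back empty even though the database still holds most of the backlog.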
