Message boards : News : Network and server problems Sunday night

Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester

Joined: 15 Jul 05
Posts: 242
Credit: 5,800,306
RAC: 0
Message 30840 - Posted: 19 Jun 2017, 7:19:47 UTC

We had a network problem in the computer centre at CERN last night, leading to a number of issues for our servers. BOINC servers should be back in business now.

Normally tasks should be correctly uploaded again on the next attempt. If you see any issues, please try an update or reset of the project.

Sorry for the trouble, and happy crunching!
ID: 30840
morgan

Joined: 1 Nov 12
Posts: 3
Credit: 562,848
RAC: 0
Message 30842 - Posted: 19 Jun 2017, 7:53:21 UTC - in response to Message 30840.  

I did a reset, but I still get: Server error: feeder not running
ID: 30842
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2375
Credit: 221,680,245
RAC: 142,961
Message 30843 - Posted: 19 Jun 2017, 7:57:32 UTC - in response to Message 30840.  

Host 1 is currently running an ATLAS task.
Task finished while I was typing this message.
Result uploaded but can't be reported.

https://lhcathome.cern.ch/lhcathome/result.php?resultid=145793368
Mo 19 Jun 2017 09:21:22 CEST | LHC@home | Sending scheduler request: Requested by project.
Mo 19 Jun 2017 09:21:22 CEST | LHC@home | Not requesting tasks
Mo 19 Jun 2017 09:21:23 CEST | LHC@home | Scheduler request completed
Mo 19 Jun 2017 09:21:23 CEST | LHC@home | Server error: feeder not running
Mo 19 Jun 2017 09:47:19 CEST | LHC@home | Started upload of NHNMDm9TCfqnDDn7oo6G73TpABFKDmABFKDmZ0KKDme0FKDm109BWm_1_ATLAS_result
Mo 19 Jun 2017 09:48:43 CEST | LHC@home | Finished upload of NHNMDm9TCfqnDDn7oo6G73TpABFKDmABFKDmZ0KKDme0FKDm109BWm_1_ATLAS_result




Host 2 failed to fetch an LHCb task:

19-Jun-2017 09:41:03 [LHC@home] Sending scheduler request: To fetch work.
19-Jun-2017 09:41:03 [LHC@home] Requesting new tasks for CPU
19-Jun-2017 09:41:04 [LHC@home] Scheduler request completed: got 0 new tasks
19-Jun-2017 09:41:04 [LHC@home] Server error: feeder not running
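
For anyone who wants to check their own client log for this message, here is a minimal Python sketch. It assumes only the two line layouts shown in the excerpts above; the function names (parse_line, feeder_errors) are made up for illustration and are not part of BOINC.

```python
import re

# Layout 1 (assumed from the Host 1 excerpt): "<timestamp> | <project> | <text>"
PIPE_FORMAT = re.compile(r"^(?P<when>.+?) \| (?P<project>[^|]+?) \| (?P<text>.*)$")
# Layout 2 (assumed from the Host 2 excerpt): "<date> <time> [<project>] <text>"
BRACKET_FORMAT = re.compile(r"^(?P<when>\S+ \S+) \[(?P<project>[^\]]+)\] (?P<text>.*)$")


def parse_line(line):
    """Return (timestamp, project, text), or None if the layout is unknown."""
    line = line.strip()
    for pattern in (PIPE_FORMAT, BRACKET_FORMAT):
        match = pattern.match(line)
        if match:
            return match.group("when"), match.group("project"), match.group("text")
    return None


def feeder_errors(lines):
    """Collect the parsed lines that report 'feeder not running'."""
    hits = []
    for line in lines:
        parsed = parse_line(line)
        if parsed and "feeder not running" in parsed[2].lower():
            hits.append(parsed)
    return hits


sample = [
    "Mo 19 Jun 2017 09:21:23 CEST | LHC@home | Scheduler request completed",
    "Mo 19 Jun 2017 09:21:23 CEST | LHC@home | Server error: feeder not running",
    "19-Jun-2017 09:41:04 [LHC@home] Scheduler request completed: got 0 new tasks",
    "19-Jun-2017 09:41:04 [LHC@home] Server error: feeder not running",
]

for when, project, text in feeder_errors(sample):
    print(when, project, text)
```

The timestamp is kept as a plain string on purpose, since the two hosts print it in different locale formats.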
ID: 30843
morgan

Joined: 1 Nov 12
Posts: 3
Credit: 562,848
RAC: 0
Message 30848 - Posted: 19 Jun 2017, 9:24:02 UTC

Now WUs are flowing (SixTrack) here :)
ID: 30848
Yeti
Volunteer moderator
Joined: 2 Sep 04
Posts: 453
Credit: 193,369,412
RAC: 27,111
Message 30849 - Posted: 19 Jun 2017, 9:42:34 UTC - in response to Message 30848.  

Now WUs are flowing (SixTrack) here :)

But I don't get any ATLAS tasks :-(


Supporting BOINC, a great concept !
ID: 30849
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2375
Credit: 221,680,245
RAC: 142,961
Message 30851 - Posted: 19 Jun 2017, 9:55:22 UTC - in response to Message 30843.  

My ATLAS WU has now been validated but still no new vbox WUs.
Instead: "No tasks are available for xxxx Simulation".

At least "Server error: feeder not running" has disappeared.
ID: 30851
Erich56

Joined: 18 Dec 15
Posts: 1681
Credit: 99,376,634
RAC: 111,250
Message 30859 - Posted: 19 Jun 2017, 11:20:53 UTC - in response to Message 30851.  

My ATLAS WU has now been validated but still no new vbox WUs.
Instead: "No tasks are available for xxxx Simulation".

Same here. Any idea when we will be back to normal operation?
ID: 30859
Michael H.W. Weber

Joined: 18 Sep 04
Posts: 30
Credit: 5,100,929
RAC: 0
Message 30862 - Posted: 19 Jun 2017, 13:36:41 UTC

The feeder is not running.
No tasks for any of the projects at present.

Michael.
ID: 30862
Michael H.W. Weber

Joined: 18 Sep 04
Posts: 30
Credit: 5,100,929
RAC: 0
Message 30866 - Posted: 19 Jun 2017, 13:55:29 UTC

...now it works again.

Michael.
ID: 30866
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2375
Credit: 221,680,245
RAC: 142,961
Message 30872 - Posted: 19 Jun 2017, 14:27:16 UTC

Got a WU from LHCb on host 1 and another WU from Theory on host 2.
Both started normally.

I'll keep my fingers crossed.
ID: 30872
Michael H.W. Weber

Joined: 18 Sep 04
Posts: 30
Credit: 5,100,929
RAC: 0
Message 30895 - Posted: 20 Jun 2017, 8:12:29 UTC

The feeder is down again...

Michael.
ID: 30895
Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester

Joined: 15 Jul 05
Posts: 242
Credit: 5,800,306
RAC: 0
Message 30896 - Posted: 20 Jun 2017, 8:36:46 UTC

We are experimenting with the feeder parameters to try to tune performance. With our default settings, it takes 5-10 minutes to fill the scheduler's shared-memory buffer from the current backlog of tasks.
ID: 30896
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert
Joined: 15 Jun 08
Posts: 2375
Credit: 221,680,245
RAC: 142,961
Message 30898 - Posted: 20 Jun 2017, 11:26:47 UTC

Host 1:
20-Jun-2017 13:12:40 [LHC@home] No tasks are available for LHCb Simulation
No further error message.

Host 2:
20-Jun-2017 13:15:59 [LHC@home] No tasks are available for CMS Simulation
No further error message.


Yet according to the server status page there should be thousands of unsent WUs:
https://lhcathome.cern.ch/lhcathome/server_status.php
ID: 30898
Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester

Joined: 15 Jul 05
Posts: 242
Credit: 5,800,306
RAC: 0
Message 30900 - Posted: 20 Jun 2017, 11:55:55 UTC - in response to Message 30898.  

What the scheduler has in shared memory may not always match what is queued in the database. The tasks should come after a while, but the feeder has difficulties with the backlog from Sunday.

As mentioned, we're trying to optimize this, and you might see occasional "feeder not running" messages on your client today.
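
To illustrate why the status page can show thousands of unsent tasks while a client is told that none are available, here is a toy Python model. It is only a sketch under simplifying assumptions, not BOINC's actual feeder or scheduler code: the class ToyFeeder, the slot count and the batch size are all invented for the example.

```python
from collections import deque


class ToyFeeder:
    """Toy model: a feeder copies unsent tasks from a database queue into a
    small fixed-size cache, and the scheduler hands out only what is cached."""

    def __init__(self, unsent_tasks, slots, batch_size):
        self.db_queue = deque(unsent_tasks)  # stands in for "unsent" rows in the database
        self.cache = [None] * slots          # stands in for the shared-memory buffer
        self.batch_size = batch_size         # hypothetical per-pass limit

    def feed_once(self):
        """One feeder pass: fill up to batch_size empty slots from the queue."""
        filled = 0
        for i, slot in enumerate(self.cache):
            if filled == self.batch_size or not self.db_queue:
                break
            if slot is None:
                self.cache[i] = self.db_queue.popleft()
                filled += 1
        return filled

    def request_task(self, app):
        """Scheduler side: look only in the cache, never in the database."""
        for i, task in enumerate(self.cache):
            if task is not None and task[0] == app:
                self.cache[i] = None
                return task
        return None


# A backlog where SixTrack tasks were queued ahead of LHCb tasks.
backlog = [("SixTrack", n) for n in range(6)] + [("LHCb", n) for n in range(2)]
feeder = ToyFeeder(backlog, slots=4, batch_size=2)

feeder.feed_once()
print(feeder.request_task("LHCb"))      # None: LHCb tasks exist but are not cached yet
print(feeder.request_task("SixTrack"))  # ('SixTrack', 0)
```

In this model the LHCb tasks only become available after the feeder has worked through the entries queued ahead of them, which is the kind of delay a large backlog would cause.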
ID: 30900
