Message boards : News : Database issues

Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester

Joined: 15 Jul 05
Posts: 248
Credit: 5,974,599
RAC: 0
Message 44599 - Posted: 29 Mar 2021, 16:43:57 UTC

Our database cluster is heavily loaded today, and LHC@home services time out from time to time. Our DBA is trying to fix this. Sorry for the trouble and happy crunching.
ID: 44599
Erich56

Joined: 18 Dec 15
Posts: 1811
Credit: 118,330,157
RAC: 26,152
Message 44611 - Posted: 30 Mar 2021, 11:24:18 UTC

Unfortunately, the problem now is that downloading new tasks is impossible: after several unsuccessful automatic attempts by the BOINC Manager to download tasks, communication with the download server is deferred for 24 hours :-(((
Something about this system should be changed.
ID: 44611
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Joined: 15 Jun 08
Posts: 2531
Credit: 253,722,201
RAC: 41,981
Message 44612 - Posted: 30 Mar 2021, 11:35:49 UTC

Although the server status shows "Workunits waiting for validation: 0",
my hosts have had 18 CMS WUs waiting for validation since 2021-03-11.

CMS has a quorum of 1, which is already fulfilled.
Hence, I guess they will never be sent out to another host to be checked.
ID: 44612
Erich56

Joined: 18 Dec 15
Posts: 1811
Credit: 118,330,157
RAC: 26,152
Message 44613 - Posted: 30 Mar 2021, 11:45:04 UTC - in response to Message 44611.  

> Unfortunately, the problem now is that downloading new tasks is impossible: after several unsuccessful automatic attempts by the BOINC Manager to download tasks, communication with the download server is deferred for 24 hours :-(((
> Something about this system should be changed.

Is there anything I can do to escape this strange trap, or do my machines really have to wait until tomorrow to get new tasks?
ID: 44613
Erich56

Joined: 18 Dec 15
Posts: 1811
Credit: 118,330,157
RAC: 26,152
Message 44616 - Posted: 30 Mar 2021, 12:35:29 UTC - in response to Message 44612.  

> Although the server status shows "Workunits waiting for validation: 0",
> my hosts have had 18 CMS WUs waiting for validation since 2021-03-11.
>
> CMS has a quorum of 1, which is already fulfilled.
> Hence, I guess they will never be sent out to another host to be checked.

What I just noticed:
Many tasks that were uploaded yesterday afternoon and evening are still waiting for validation (CMS, Theory, SixTrack), but 2 CMS tasks that were uploaded about 1 hour ago were validated immediately.

So I am wondering whether the tasks from yesterday will ever get validated.
ID: 44616
Eric Mcintosh
Volunteer moderator
Project administrator
Project developer
Project tester
Project scientist

Joined: 12 Jul 11
Posts: 857
Credit: 1,619,050
RAC: 0
Message 44618 - Posted: 30 Mar 2021, 12:51:59 UTC

Why don't you just put a second server in parallel!!! Eric
ID: 44618
[VENETO] boboviz

Joined: 7 May 08
Posts: 217
Credit: 1,575,053
RAC: 297
Message 44619 - Posted: 30 Mar 2021, 13:24:05 UTC - in response to Message 44612.  

> Although the server status shows "Workunits waiting for validation: 0",
> my hosts have had 18 CMS WUs waiting for validation since 2021-03-11.

Same here, I have 72 SixTrack tasks waiting for validation.
Maybe the status page is not being updated correctly.
ID: 44619
[VENETO] boboviz

Joined: 7 May 08
Posts: 217
Credit: 1,575,053
RAC: 297
Message 44627 - Posted: 31 Mar 2021, 6:54:05 UTC - in response to Message 44619.  

> Same here, I have 72 SixTrack tasks waiting for validation.

Now 86.
It seems the validator has a backlog to work through.
ID: 44627
Nils Høimyr
Volunteer moderator
Project administrator
Project developer
Project tester

Joined: 15 Jul 05
Posts: 248
Credit: 5,974,599
RAC: 0
Message 44628 - Posted: 31 Mar 2021, 10:47:52 UTC - in response to Message 44627.  

We are checking our result records and will trigger a re-validation of results that did not get validated properly during the DB outage.
ID: 44628
Mr P Hucker

Joined: 12 Aug 06
Posts: 429
Credit: 10,579,233
RAC: 2,563
Message 44657 - Posted: 6 Apr 2021, 10:23:07 UTC - in response to Message 44613.  

> Unfortunately, the problem now is that downloading new tasks is impossible: after several unsuccessful automatic attempts by the BOINC Manager to download tasks, communication with the download server is deferred for 24 hours :-(((
> Something about this system should be changed.
> Is there anything I can do to escape this strange trap, or do my machines really have to wait until tomorrow to get new tasks?

I thought BOINC backed off for a slightly longer time each time there's no work available. That makes sense: if there's no work, try again in 10 minutes; still no work, there must be a bigger problem, so wait an hour; nothing then, wait a few hours; still nothing, wait a day. Don't keep pestering. That seems sensible to me; it's what I'd do if I were trying to phone a company with busy phone lines. You can always just select the project and click Update.
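
For anyone curious what such a growing back-off looks like, here is a minimal Python sketch. It is not BOINC's actual client code: the function name, the 10-minute starting delay, the multiplier of 6 and the 24-hour cap are all assumptions, chosen only so that the result follows the "10 minutes, an hour, a few hours, a day" progression described above.

```python
def backoff_delays(base_seconds=600, factor=6, cap_seconds=24 * 3600, attempts=5):
    """Return the wait (in seconds) before each retry.

    Hypothetical illustration only: the defaults are assumptions,
    not values taken from the BOINC client.
    """
    delays = []
    delay = base_seconds
    for _ in range(attempts):
        # Never wait longer than the cap (here: one day).
        delays.append(min(delay, cap_seconds))
        # Each failed attempt multiplies the next wait.
        delay *= factor
    return delays


if __name__ == "__main__":
    for attempt, seconds in enumerate(backoff_delays(), start=1):
        print(f"retry {attempt}: wait {seconds / 60:.0f} minutes")
```

With these assumed defaults the waits are 10 minutes, 1 hour, 6 hours, and then 24 hours for every further retry, which is why a project that has been unreachable for a while ends up deferred for a full day until someone clicks Update.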
ID: 44657
maeax

Joined: 2 May 07
Posts: 2242
Credit: 173,902,375
RAC: 2,798
Message 44659 - Posted: 6 Apr 2021, 11:37:13 UTC - in response to Message 44658.  

Since 05:00 UTC, ATLAS tasks are back!
ID: 44659
