Message boards : Number crunching : Host corruption solved?

Chrulle
Joined: 27 Jul 04
Posts: 182
Credit: 1,880
RAC: 0
Message 11025 - Posted: 26 Oct 2005, 11:20:43 UTC

I have installed a new validator and a new assimilator. Let us see if this solves the problem.

Chrulle
Research Assistant & Ex-LHC@home developer
Niels Bohr Institute
ID: 11025
Mr.Pernod
Joined: 16 Jul 05
Posts: 65
Credit: 369,728
RAC: 0
Message 11027 - Posted: 26 Oct 2005, 11:35:02 UTC

I have two hosts with network access disabled and pending credits.
host: 47319 credits: 21967.37 pending: 3 results
host: 57272 credits: 7879.67 pending: 2 results
Both hosts show correct information (I am unsure about "Average turnaround time").
I will keep an eye on them this afternoon and see if any pending credits are granted.
ID: 11027
TPR_Mojo
Joined: 1 Sep 04
Posts: 8
Credit: 349,947
RAC: 0
Message 11030 - Posted: 26 Oct 2005, 11:46:35 UTC

So far so good. I have several corrupted host records on my account, but the hosts they relate to have created new records for themselves and (crossed fingers) the corruption doesn't appear to have affected these new records.
ID: 11030
JardaM
Joined: 14 Jul 05
Posts: 9
Credit: 28,299
RAC: 0
Message 11031 - Posted: 26 Oct 2005, 11:51:53 UTC - in response to Message 11025.  

I have installed a new validator and a new assimilator. Let us see if this solves the problem.

It won't solve it, I'm afraid.
My last contact (upload) was at approx. 1000 UTC, and my last validated result was at approx. 1100. I believe only the database of hosts is getting corrupted, and continuously.
ID: 11031
Chrulle
Joined: 27 Jul 04
Posts: 182
Credit: 1,880
RAC: 0
Message 11032 - Posted: 26 Oct 2005, 11:53:46 UTC

Well, since the new processes were installed after 1100 UTC, it could still work.

Chrulle
Research Assistant & Ex-LHC@home developer
Niels Bohr Institute
ID: 11032
JardaM
Joined: 14 Jul 05
Posts: 9
Credit: 28,299
RAC: 0
Message 11033 - Posted: 26 Oct 2005, 12:04:16 UTC

You were right. I reset the project and it started to download new WU immediately. Also values in my host record are OK now.
ID: 11033
littleBouncer
Joined: 23 Oct 04
Posts: 358
Credit: 1,439,205
RAC: 0
Message 11034 - Posted: 26 Oct 2005, 12:16:20 UTC - in response to Message 11033.  
Last modified: 26 Oct 2005, 12:20:24 UTC

You were right. I reset the project and it started to download new WU immediately. Also values in my host record are OK now.


Question @ JardaM
Did the reset also solve the daily quota 'failure'? (from the earlier miswriting of the quantity to 0 WU/day)
From another post I saw that you had the same problem as I did. Your quote:
It wouldn't bother me much, but the problem is that it also rewrites the no. of WUs allowed per day. This practically cuts me off from your source. I cannot get any fresh WU any longer.


greetz littleBouncer
BTW: I reported one successful WU, but the daily quota didn't change (host 41625)...
ID: 11034
Robert Nelson
Joined: 13 Jul 05
Posts: 4
Credit: 2,431,270
RAC: 405
Message 11036 - Posted: 26 Oct 2005, 12:44:04 UTC
Last modified: 26 Oct 2005, 12:44:27 UTC

Your fix corrected the host information on a host that had become corrupted overnight, prior to your fix. However, the host location did not update from the corrupted value of 0 (the other information was OK). I reset the host location to the location I wanted for that host and will now watch to see if it holds.
ID: 11036
Thierry Van Driessche
Joined: 1 Sep 04
Posts: 157
Credit: 82,604
RAC: 0
Message 11037 - Posted: 26 Oct 2005, 13:07:39 UTC
Last modified: 26 Oct 2005, 13:08:46 UTC

Result of 1 WU uploaded, host still OK.

Result of that WU reported 10 minutes after the upload and credit granted immediately, host still OK. Location still OK too.
ID: 11037
Gaspode the UnDressed
Joined: 1 Sep 04
Posts: 506
Credit: 118,619
RAC: 0
Message 11038 - Posted: 26 Oct 2005, 13:11:02 UTC

Not sure yet - host venues now seem OK, as do the operating system entries, but this host (23752) still has a 12 WU/day limit and won't retrieve new work because of it, even though it has only had 9 results delivered to it in the last 24 hours.


Gaspode the UnDressed
http://www.littlevale.co.uk
ID: 11038
timethief
Joined: 22 Jul 05
Posts: 7
Credit: 67,923
RAC: 0
Message 11040 - Posted: 26 Oct 2005, 13:27:26 UTC

Very powerful host 62217?

LHC@home - 2005-10-26 15:21:54 - Message from server: No work sent (reached daily quota of 200 results)

Wow, I never thought that this host could do so many WUs a day ;-)
... or is there something wrong with the scheduler?

Greetings!
ID: 11040
HStruik
Joined: 28 Sep 04
Posts: 1
Credit: 459,549
RAC: 0
Message 11042 - Posted: 26 Oct 2005, 14:31:31 UTC

26-Oct-05 16:34:24|LHC@home|Message from server: No work sent
26-Oct-05 16:34:24|LHC@home|Message from server: (reached daily quota of 100 results)

I get this message from the server although my host has no WUs left to crunch.
What's up?
ID: 11042
Mr.Pernod
Joined: 16 Jul 05
Posts: 65
Credit: 369,728
RAC: 0
Message 11043 - Posted: 26 Oct 2005, 14:53:21 UTC

All my active hosts have updated successfully; host information remained intact after credit was granted.

Only problem left is the
10/26/2005 4:48:54 PM|LHC@home|Message from server: No work sent
10/26/2005 4:48:54 PM|LHC@home|Message from server: (reached daily quota of 200 results)
message. So far I have only seen it on host 47319, which is a dual-CPU system;
my dual Xeon (4 logical CPUs) downloaded new results without a problem.
ID: 11043
Digitalis
Joined: 2 Sep 04
Posts: 19
Credit: 26,799
RAC: 0
Message 11044 - Posted: 26 Oct 2005, 15:07:02 UTC

Just manually updated the WUs from overnight crunching, and host info is now remaining intact here after a credit update.
Get BOINC WIKIed

ID: 11044
littleBouncer
Joined: 23 Oct 04
Posts: 358
Credit: 1,439,205
RAC: 0
Message 11047 - Posted: 26 Oct 2005, 16:59:06 UTC
Last modified: 26 Oct 2005, 16:59:17 UTC

The only 'thing' I noticed after a manual update (only sending 2 WUs back):
why is the bolded line there (see below)?

26.10.2005 18:49:53|LHC@home|Sending scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi
26.10.2005 18:49:53|LHC@home|Reason: Requested by user
26.10.2005 18:49:53|LHC@home|Requesting 0 seconds of work, returning 2 results
26.10.2005 18:50:02|LHC@home|Scheduler request to http://lhcathome-sched1.cern.ch/scheduler/cgi succeeded
26.10.2005 18:50:03|LHC@home|Deferring communication with project for 5 seconds

IMO: That is something unnecessary, or am I wrong?

greetz littleBouncer
otherwise no problems with that host 45610

ID: 11047
Ingleside
Joined: 1 Sep 04
Posts: 36
Credit: 78,199
RAC: 0
Message 11053 - Posted: 26 Oct 2005, 18:15:31 UTC - in response to Message 11047.  
Last modified: 26 Oct 2005, 18:17:44 UTC

26.10.2005 18:50:03|LHC@home|Deferring communication with project for 5 seconds

IMO: That is something unnecessary, or am I wrong?


To guard against clients flooding the scheduling server with requests, there has for a long time been a project-specific limit on how often the scheduling server accepts connections.

A recent change is that the scheduling server now always sends this limit in all replies. This stops clients from immediately asking again and getting hit by a "too recent" message.


Of course, with projects like LHC@home that use a 5?-second limit it's mostly superfluous to be told to wait 7 seconds... but since it's an informational message, meaning it doesn't show up as red, it shouldn't be a problem. ;)


BTW, the deferral is always a little bit longer than the limit, to guard against computers with a too-fast clock getting stuck in an infinite loop: ask, get "too recent", wait 1 minute, contact again after 59.99 seconds, and get "too recent" again.
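
The clock-skew reasoning in this post can be illustrated with a small Python sketch. It is not BOINC's actual scheduler code: the function names, the 2-second margin, and the assumption that a rejected request still counts as the "last contact" are all hypothetical, chosen only to reproduce the loop described here.

```python
def deferral_seconds(min_interval: float, margin: float = 2.0) -> float:
    """Delay reported to the client: the server's limit plus a small margin.

    The 2-second default is an assumption made for this illustration.
    """
    return min_interval + margin


def is_too_recent(last_request: float, now: float, min_interval: float) -> bool:
    """Server-side check, measured on the server's own clock."""
    return (now - last_request) < min_interval


def simulate(min_interval: float, told_to_wait: float,
             clock_rate: float, attempts: int = 5) -> int:
    """Count rejected requests for a client whose clock runs at
    `clock_rate` times real speed (1.0 is accurate, above 1.0 is too fast)."""
    rejected = 0
    last = 0.0  # server time of the previous request
    for _ in range(attempts):
        # The client waits `told_to_wait` seconds by its own clock, which is
        # fewer real seconds when that clock runs fast.
        now = last + told_to_wait / clock_rate
        if is_too_recent(last, now, min_interval):
            rejected += 1
        # Assumption: every contact, rejected or not, resets the timer.
        last = now
    return rejected


fast_clock = 60.0 / 59.99  # one client "minute" lasts 59.99 real seconds

# Told to wait exactly the limit: every retry arrives slightly too early.
print(simulate(60.0, 60.0, fast_clock))
# Told to wait the limit plus a margin: the retries are accepted.
print(simulate(60.0, deferral_seconds(60.0), fast_clock))
```

With a deferral equal to the limit, the fast-clock client is rejected on every attempt; with the added margin it is not, which is the reason given for making the deferral a little longer than the limit.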
ID: 11053
littleBouncer
Joined: 23 Oct 04
Posts: 358
Credit: 1,439,205
RAC: 0
Message 11055 - Posted: 26 Oct 2005, 19:02:36 UTC - in response to Message 11053.  
Last modified: 26 Oct 2005, 19:12:18 UTC

26.10.2005 18:50:03|LHC@home|Deferring communication with project for 5 seconds

IMO: That is something unnecessary, or am I wrong?


To guard against clients flooding the scheduling server with requests, there has for a long time been a project-specific limit on how often the scheduling server accepts connections.

A recent change is that the scheduling server now always sends this limit in all replies. This stops clients from immediately asking again and getting hit by a "too recent" message.


Of course, with projects like LHC@home that use a 5?-second limit it's mostly superfluous to be told to wait 7 seconds... but since it's an informational message, meaning it doesn't show up as red, it shouldn't be a problem. ;)


BTW, the deferral is always a little bit longer than the limit, to guard against computers with a too-fast clock getting stuck in an infinite loop: ask, get "too recent", wait 1 minute, contact again after 59.99 seconds, and get "too recent" again.


@ Ingleside
THX for your informative reply!
You wrote: "but since it's an informational message, meaning it doesn't show up as red, it shouldn't be a problem."
But this message:
26.10.2005 18:50:03|LHC@home|Deferring communication with project for 5 seconds
is in red!!! and appears after each request.
(edit): After the 5 seconds of deferral the client doesn't send a new request (which is right, but the line is unnecessary). That's why I was asking, because normally it sends a request after the announced deferral time...

greetz littleBouncer
ID: 11055
Ingleside
Joined: 1 Sep 04
Posts: 36
Credit: 78,199
RAC: 0
Message 11058 - Posted: 26 Oct 2005, 20:30:14 UTC - in response to Message 11055.  

But this message:
26.10.2005 18:50:03|LHC@home|Deferring communication with project for 5 seconds
is in red!!! and appears after each request.


It's only showing red if you're running an old client, not if you're running v5.2.x ;)
ID: 11058
[B^S] sTrey
Joined: 4 Aug 05
Posts: 11
Credit: 14,485
RAC: 0
Message 11059 - Posted: 26 Oct 2005, 21:18:28 UTC
Last modified: 26 Oct 2005, 21:22:12 UTC

Sorry, never mind -- I misread the timestamp on some log messages. Updating the venue manually eventually fixed the client's messages.
ID: 11059
littleBouncer
Joined: 23 Oct 04
Posts: 358
Credit: 1,439,205
RAC: 0
Message 11062 - Posted: 26 Oct 2005, 22:00:26 UTC - in response to Message 11058.  

But this message:
26.10.2005 18:50:03|LHC@home|Deferring communication with project for 5 seconds
is in red!!! and appears after each request.


It's only showing red if you're running an old client, not if you're running v5.2.x ;)


@ Ingleside
Sorry, it is a 4.72. I will upgrade once everything works fine with the server after its upgrade; that's why I'm not yet on 5.2.x. ;)

greetz littleBouncer
ID: 11062


©2024 CERN