Message boards : Number crunching : Database Errors

Purple Rabbit
Joined: 22 Oct 04
Posts: 8
Credit: 3,007,613
RAC: 1,178
Message 11968 - Posted: 14 Jan 2006, 19:18:43 UTC

I think LHC may want to do a database search for invalid and out-of-date entries. I have one host (33305) that has corrupted CPU type and operating system entries. This host now has a new identity, but I can't merge the hosts because of the database error.

These are from last August when the database went down. There are also some straggler WUs that either are pending or have been granted credit, but have not been deleted or acknowledged. Most are from before August.

Maybe a database scrub is in order? There seem to be some leftovers in here.

Travis DJ
Joined: 29 Sep 04
Posts: 196
Credit: 207,040
RAC: 0
Message 11970 - Posted: 14 Jan 2006, 19:32:37 UTC - in response to Message 11968.  
Last modified: 14 Jan 2006, 19:33:44 UTC

In another post, long, long ago....

The fix is to reset your clients. After that, you can check your computers on the LHC website and the information should show up correctly. If not, wait a minute, then wash, rinse, repeat.

Some people have WUs that will never receive credit. Last time I heard, it was a very low priority problem. I wouldn't worry about it.

Purple Rabbit
Joined: 22 Oct 04
Posts: 8
Credit: 3,007,613
RAC: 1,178
Message 11972 - Posted: 14 Jan 2006, 19:44:48 UTC - in response to Message 11970.  
Last modified: 14 Jan 2006, 19:58:31 UTC

In another post, long, long ago....


Travis, thank you, but this doesn't apply to what I was saying. My host (33305) was active in August. It is now host #68704 and doing well. Unfortunately I updated Tomato (33305) to a new Linux distribution and killed all the BOINC files just when the LHC database crashed. I started over, but the old, corrupted file remained in the database. It can't be changed because Tomato has a new identity.

It looks like some other pre-August files also remain. The database looks corrupted to me.

I say again, there are extraneous entries in the database. An LHC sysadmin should look at this because I assume the impact is larger than what I can see.

Travis DJ
Joined: 29 Sep 04
Posts: 196
Credit: 207,040
RAC: 0
Message 11973 - Posted: 14 Jan 2006, 19:58:09 UTC - in response to Message 11972.  
Last modified: 14 Jan 2006, 19:58:23 UTC

I had 9 computers which exhibited the same problem you mentioned back in the same timeframe. What I didn't "get" when you first posted your question was that you had reloaded with a new OS.

You'll never be able to merge those two hosts because of the difference between what you know is right and what the database thinks is right. The database entry may have invalid data in it (IIRC it was using your LTD in place of other fields as appropriate, hence the numbers instead of "Microsoft Windows...") but the entry is valid as far as the database is concerned. If your host were still available as it was back in August, then yes, the "reset your client" suggestion would have cleared that entry up and you would have been able to reload, give it the same name, then merge away to your heart's delight.

To clarify:
1) The database does have wrong host information in it
2) Those entries are valid as far as the database is concerned
3) No chance at merging ;(
4) They do need to expunge such entries; however, removing those hosts would alter the total amount of credit that has been issued (BadThing™).

Purple Rabbit
Joined: 22 Oct 04
Posts: 8
Credit: 3,007,613
RAC: 1,178
Message 11974 - Posted: 14 Jan 2006, 20:08:26 UTC - in response to Message 11973.  
Last modified: 14 Jan 2006, 20:21:04 UTC


4) They do need to expunge such entries; however, removing those hosts would alter the total amount of credit that has been issued (BadThing™).

Travis, again thank you. My database entry for CPU type for 33305 is:

1.270598
0
My entry for Operating system is:

0
1458205218.5383

These fields are clearly corrupted in the database, since you would expect ASCII text in them. Thank you for your response, but I think this is a sysadmin problem.

Travis DJ
Joined: 29 Sep 04
Posts: 196
Credit: 207,040
RAC: 0
Message 11975 - Posted: 14 Jan 2006, 20:21:06 UTC - in response to Message 11974.  
Last modified: 14 Jan 2006, 20:21:59 UTC

I did look at your hosts. :)

Unless you want your user account to be shorted 6,330.41 credits if the affected host were removed (because it can not be merged), then I wouldn't worry about it.

Ingleside
Joined: 1 Sep 04
Posts: 36
Credit: 78,199
RAC: 0
Message 11976 - Posted: 14 Jan 2006, 20:23:29 UTC - in response to Message 11975.  

I did look at your hosts. :)

Unless you want your user account to be shorted 6,330.41 credits if the affected host were removed (because it can not be merged), then I wouldn't worry about it.


User-credit is separate from host-credit, so it's no problem to delete any old hosts where all results have been purged.
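For anyone who wants to see why deleting a host would not short the user account, here is a minimal sketch of the idea that user credit and host credit are stored separately. It is not the actual BOINC schema or server code: the `User` and `Host` classes, their field names, and the `delete_host` rule are all hypothetical, and only the credit figures (3,007,613 and 6,330.41) and host ID 33305 come from this thread.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    host_id: int
    total_credit: float   # credit recorded on the host row
    result_count: int     # results still attached to this host


@dataclass
class User:
    # Assumption: the user total is its own stored value,
    # not a sum recomputed from the host rows.
    total_credit: float
    hosts: dict = field(default_factory=dict)

    def delete_host(self, host_id: int) -> None:
        """Delete a host only if all of its results have been purged."""
        host = self.hosts[host_id]
        if host.result_count > 0:
            raise ValueError("host still has results; cannot delete")
        del self.hosts[host_id]


user = User(total_credit=3007613.0)
user.hosts[33305] = Host(33305, total_credit=6330.41, result_count=1)

# One leftover result blocks deletion, as described in this thread.
try:
    user.delete_host(33305)
except ValueError as err:
    print(err)

# Once that result is purged, the host can go and the user total is untouched.
user.hosts[33305].result_count = 0
user.delete_host(33305)
print(user.total_credit)
```

In this model the only thing that disappears with the host is the host's own credit figure, which is why a single unpurged result matters more than the credit attached to the host.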

Purple Rabbit
Joined: 22 Oct 04
Posts: 8
Credit: 3,007,613
RAC: 1,178
Message 11978 - Posted: 14 Jan 2006, 20:44:59 UTC - in response to Message 11976.  
Last modified: 14 Jan 2006, 20:53:56 UTC

User-credit is separate from host-credit, so it's no problem to delete any old hosts where all results have been purged.


Yes, that's all I ask! Delete the darn thing and let the world go on :-) Again, I think LHC needs to look at the database.

To get my message back to what I intended: The database crash in August caused some problems. I think a rebuild might be in all of our best interests. There are still some artifacts left over from the crash.

I'm seeing some fallout in my accounts, but I suspect there's more. I had the unfortunate timing to change my Linux OS at the same time LHC was having problems...sigh.

Otherwise I'm doing fine as a cruncher. The world goes on.

Chrulle
Joined: 27 Jul 04
Posts: 182
Credit: 1,880
RAC: 0
Message 11980 - Posted: 14 Jan 2006, 21:04:12 UTC

Well, the spurious database entries are not affecting the functioning of the system so it is not a big priority to fix it. It is annoying for you to have a corrupted entry in your overview but that is the only impact on the system.

To fix this I would have to manually go through the database and delete or fix those entries. This would take up a lot of my time, and there is a good chance that I would mess something up, causing damage to the DB that would affect the system negatively.


cheers,
Chrulle
Research Assistant & Ex-LHC@home developer
Niels Bohr Institute

Jim Baize
Joined: 17 Sep 04
Posts: 103
Credit: 38,543
RAC: 0
Message 11981 - Posted: 14 Jan 2006, 21:07:48 UTC

Chrulle,

Thanks for the answer to the question that I didn't ask yet. :D

Actually, I have one of those spurious WUs on my account also. I was going to say something about it, but it wasn't a big enough deal to me to get right on it. Now that I see you are aware of the issue and I know where you stand on it, I can just ignore it.

Again, Thanks for the information.

Jim

Purple Rabbit
Joined: 22 Oct 04
Posts: 8
Credit: 3,007,613
RAC: 1,178
Message 11984 - Posted: 14 Jan 2006, 21:25:33 UTC - in response to Message 11980.  
Last modified: 14 Jan 2006, 21:56:09 UTC

Well, the spurious database entries are not affecting the functioning of the system so it is not a big priority to fix it.


OK, I now know your database philosophy. No big deal :-)

I wasn't sure if you knew about the errors. These things are annoying to the users (at least me) tho :-) At least please put it on your "to do" list. I was afraid this might be part of a larger problem.

Paul D. Buck
Joined: 2 Sep 04
Posts: 545
Credit: 148,912
RAC: 0
Message 12030 - Posted: 15 Jan 2006, 13:48:10 UTC

It is nearly impossible to come up with a "rule" that will clearly identify a "bad" entry from a system perspective.

If the computer has no results on it, you can delete it yourself. If it does have residual results, well, then we (the generic we) have to live with it. Heck WCG has 16 devices in my account and no way to merge or delete them. But, it is the same set of computers, but since they sent me a bad account key, which I had to get changed, then detach and reattach ... well ... now I have a full "mirror" of my systems which, as you say, is annoying ...

But I would rather that them guys was out hunting up more work for us ...

Travis DJ
Joined: 29 Sep 04
Posts: 196
Credit: 207,040
RAC: 0
Message 12061 - Posted: 15 Jan 2006, 21:00:36 UTC

I'm very glad to know that deleting a host doesn't affect user credit. :)

There are some old hosts I've wanted to do that to..


Purple Rabbit
Joined: 22 Oct 04
Posts: 8
Credit: 3,007,613
RAC: 1,178
Message 12065 - Posted: 15 Jan 2006, 21:57:20 UTC - in response to Message 12030.  
Last modified: 15 Jan 2006, 22:00:35 UTC

It is nearly impossible to come up with a "rule" that will clearly identify a "bad" entry from a system perspective.

That's true Paul, but I think a floating point value in a field where it should say "Linux" is a prime candidate :-)

If the computer has no results on it, you can delete it yourself.

Unfortunately the computer still has one result from August that will probably never go away. It's complete, but hasn't been deleted.

I didn't realize that this was really hard for the admin to fix. As Chrulle said it's not affecting the project. I agree what I have is just cosmetic. I reported the error and life goes on :-)
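The "floating point value where it should say Linux" rule mentioned above could be sketched roughly like this. It is only an illustration, not anything the project runs: the column names `cpu_type` and `os_name` are hypothetical, the record for host 68704 uses placeholder values, and the numeric strings for host 33305 are the ones quoted earlier in this thread.

```python
def looks_numeric(value: str) -> bool:
    """Return True if a supposedly textual field parses as a plain number."""
    try:
        float(value.strip())
    except ValueError:
        return False
    return True


def suspect_hosts(hosts):
    """Yield the ids of host records whose text fields hold numeric values."""
    text_fields = ("cpu_type", "os_name")  # hypothetical column names
    for host in hosts:
        if any(looks_numeric(host[name]) for name in text_fields):
            yield host["id"]


hosts = [
    # Values for 33305 as reported in this thread.
    {"id": 33305, "cpu_type": "1.270598", "os_name": "1458205218.5383"},
    # Placeholder values for a healthy record.
    {"id": 68704, "cpu_type": "Pentium 4", "os_name": "Linux"},
]

print(list(suspect_hosts(hosts)))  # prints [33305]
```

A check like this only flags candidates for a human to review. It would not catch every kind of bad entry, which is Paul's point about how hard a general rule is.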