Message boards : Number crunching : Discepency in Credits?
Lord Tedric
Joined: 20 Jan 06
Posts: 2
Credit: 1,137
RAC: 0
Message 14135 - Posted: 23 Jun 2006, 11:02:51 UTC
Last modified: 23 Jun 2006, 11:03:25 UTC

There appears to be an inherent bug in the way in which BOINC calculates credit!

I posted in the Seti@Home forums about a discrepancy between the credit granted to my account and the sum of the credit on my hosts: http://setiathome.berkeley.edu/forum_thread.php?id=31939

This discrepancy also appears to affect LHC.

I have a shortfall of approx. 1205 Credits!

Anyone else suffering from this error in calculation?
Nuadormrac

Joined: 26 Sep 05
Posts: 85
Credit: 421,130
RAC: 0
Message 14142 - Posted: 23 Jun 2006, 18:55:13 UTC
Last modified: 23 Jun 2006, 18:57:43 UTC

My host tends to have more. The last time this happened was also during server problems, and it was evened up when Churlle fixed it.

Best I can figure, something like this is going on. There's evidence of something seriously wrong in the dBase (for instance, click on certain sections like pending credit or teams, and you get an error message about the operation taking place).

So, the host reports a WU back, but without a database association to say which account owns the host, the credit gets applied to the host and not to the owning account. Ditto for anything getting applied to team accounts, as they aren't even coming up anymore. If the dBase can't find out which computer belongs where, all these orphaned records/computers can end up gaining credit without it being applied elsewhere.

I'm not 100% positive how they have it set up, but there might be a host_computer table in the dBase which is separate from the user table? If that's the case, the dBase would have to be able to link these tables together to apply the credit; and performing lookups on some tables (aka pending credit and teams) seems to be what triggers the error, with 0.00 given as the result.

Or perhaps I should modify that. The computers aren't completely orphaned. They still do show up under our account as being our computers. It just seems that an operation done to them credit-wise isn't reciprocally being done to the account that owns them during such a dBase crash...
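
As a rough illustration of the kind of cross-check being discussed here, the sketch below builds two small tables in an in-memory SQLite database and compares each account's credit with the sum of its hosts' credit. The table names, column names, and numbers are all made up for the example; they are assumptions, not LHC's actual schema or data.

```python
import sqlite3

# Hypothetical schema and sample values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, total_credit REAL);
    CREATE TABLE hosts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),  -- links a host to its owner
        total_credit REAL
    );
    INSERT INTO users VALUES (1, 'example_user', 1000.0);
    INSERT INTO hosts VALUES (10, 1, 700.0), (11, 1, 450.0);
""")

# Join each account to its hosts and put the two totals side by side.
rows = conn.execute("""
    SELECT u.name, u.total_credit, SUM(h.total_credit) AS host_sum
    FROM users u JOIN hosts h ON h.user_id = u.id
    GROUP BY u.id
""").fetchall()

for name, user_credit, host_sum in rows:
    print(name, "account:", user_credit, "hosts:", host_sum,
          "gap:", host_sum - user_credit)
```

If the link between the two tables works (the join returns the hosts) but the account total lags behind the host sum, the gap shows up directly in the last column.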
Toby

Joined: 1 Sep 04
Posts: 137
Credit: 1,691,526
RAC: 383
Message 14153 - Posted: 23 Jun 2006, 22:59:59 UTC
Last modified: 23 Jun 2006, 23:01:29 UTC

If the hosts are showing up under your account then the validator knows who the host belongs to. It uses the same foreign key for both lookups. It is more likely that this is the same bug that was just discovered over at seti and appears to be a bug in the basic BOINC validator code. It has to do with timing.

Say results 1 and 2 are reported at the same time by user "bob". There are 2 validator processes running. Each one of them grabs one of the results and validates it. Validator 1 does a lookup on the user table and sees that user bob has x credits. At the same time, validator 2 looks up the same information and also determines that bob has x credits. Then validator 1 adds the credit for result 1 (call it c1) to x and saves x+c1 back to the user table. Then validator 2 adds the credit for result 2 (c2) to the x it read earlier and saves x+c2 to the user table. This overwrites the x+c1 that validator 1 wrote right before, so in the end the user only gets x+c2 instead of x+c1+c2.
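
The interleaving above can be replayed step by step in a few lines. This is only a sketch of the read/modify/write pattern being described, not BOINC's validator code; the dictionary standing in for the user table, the function name, and the credit values are all invented for the example.

```python
# Deterministic replay of two validators doing read-modify-write on one row.
# Everything here is illustrative; none of it is taken from BOINC.

def lost_update(start_credit, credit_result_1, credit_result_2):
    user_table = {"bob": start_credit}

    # Both validators read bob's total before either one writes.
    seen_by_v1 = user_table["bob"]
    seen_by_v2 = user_table["bob"]

    # Validator 1 writes back its (now stale) value plus the credit for result 1.
    user_table["bob"] = seen_by_v1 + credit_result_1
    # Validator 2 does the same with its own stale value, overwriting validator 1.
    user_table["bob"] = seen_by_v2 + credit_result_2

    return user_table["bob"]

final = lost_update(100.0, 10.0, 20.0)
expected = 100.0 + 10.0 + 20.0
print("stored:", final, "should be:", expected, "lost:", expected - final)
```

With these sample numbers the credit for result 1 simply vanishes from the account total, while a per-host counter updated elsewhere could still include it.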

The solution is to make the operations atomic so that validator 2 can't read how much credit bob has until validator 1 has finished adding and saving the data.
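
One common way to get that atomicity (an assumption on my part, not necessarily how BOINC does or will do it) is to let the database do the addition in a single UPDATE statement instead of reading the value into the validator and writing it back. A minimal sketch with SQLite, using a hypothetical users table:

```python
import sqlite3

# Hypothetical table and values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (name TEXT PRIMARY KEY, total_credit REAL NOT NULL)"
)
conn.execute("INSERT INTO users VALUES ('bob', 100.0)")
conn.commit()

def grant_credit(conn, name, amount):
    # The read and the write happen inside one UPDATE statement, so the
    # validator never holds a stale copy of the total to write back.
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE users SET total_credit = total_credit + ? WHERE name = ?",
            (amount, name),
        )

grant_credit(conn, "bob", 10.0)  # credit for result 1
grant_credit(conn, "bob", 20.0)  # credit for result 2

total = conn.execute(
    "SELECT total_credit FROM users WHERE name = 'bob'"
).fetchone()[0]
print(total)
```

The other usual option is to keep the separate read and write but wrap them in a transaction that locks the row first, which costs more but works when the new value can't be expressed as a simple increment.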

Of course none of this matters right now since the results table seems to be totally screwed up :)
- A member of The Knights Who Say NI!
My BOINC stats site
Nuadormrac

Joined: 26 Sep 05
Posts: 85
Credit: 421,130
RAC: 0
Message 14155 - Posted: 23 Jun 2006, 23:53:05 UTC
Last modified: 23 Jun 2006, 23:55:16 UTC

Keep in mind however, that since the database crash, my computer's copy of the credit has been increasing, but the value in the user table has not. AKA one has been going up, and the other has stayed the same.

Also, things can be orphaned in 1 direction, aka if it is able to link the tables in 1 direction, but for whatever reason not the other. Yes, it is reasonable the same foreign key is used, and logically it should be able to go both ways. This of course is assuming that everything is working correctly.

What could be going on, is that a result is being reported by a given host, and so there's the record of the host being made through the connection with the dBase itself. However, if in the process of trying to update the user, the dBase is getting the same sorta error we are getting, such as

Warning: mysql_fetch_object(): supplied argument is not a valid MySQL result resource in /shift/lxfsrk4101/data01/projects/lhcathome/html/user/results.php on line 41

Warning: mysql_free_result(): supplied argument is not a valid MySQL result resource in /shift/lxfsrk4101/data01/projects/lhcathome/html/user/results.php on line 45


then it might be unable to proceed any further, so stops.

One other thing to keep in mind. The results table needs to be updated, along with some other tables potentially. The user table doesn't need an update on some fields unless an account is created, the cross-project ID is changed (aka after being created, BOINC contacts and supplies a different, earlier one), etc. It's the credits that need a reciprocal update. Computers is another one of those things. Which of course brings up the question of whether one of the validators is running into the same "not valid argument" we are, and if so, is simply unable to complete an operation.

I point this out because there might or might not even be an attempt to update the user table at this point, in which case we could be looking at old data rather than current (aka some of the things being presented to us have simply been cached). Not entirely positive, as I haven't looked at LHC's dBase setup itself; though as another sign of this whole dBase crash, LHC hasn't exported XML files for stat sites like BOINCstats to pick up in a while either.

http://www.boincstats.com/

LHC@Home 2006-06-21 18:45:06 GMT
2 days 03:59:00 old


All in all, it doesn't seem implausible that some parts of the dBase are still functioning (else we wouldn't even be getting to the forums, or be able to look at anything) while other parts have crashed. Nor does it seem implausible that, if we are getting errors such as the above, any internal operations that require those same tables or present similar arguments are getting the same error we get on our screen. And from there...?

Could some of these figures such as user credit be cached somewhere anyhow, or does it have to be a recent copy? At the very least I have noticed in the past a lag time between when user credits are incremented and when this increment shows up on their team page (which of course couldn't complete successfully after the teams disappeared).
Mchl
Joined: 18 Sep 04
Posts: 23
Credit: 3,304
RAC: 0
Message 14156 - Posted: 24 Jun 2006, 0:00:05 UTC
Last modified: 24 Jun 2006, 0:01:40 UTC

I told Willy at BOINCstats about LHC problems and he probably disabled importing stats from the project to avoid corrupted data.
See here

But indeed files in http://lhcathome.cern.ch/stats/ are dated Jun 21st 2006
Travis DJ

Joined: 29 Sep 04
Posts: 196
Credit: 207,040
RAC: 0
Message 14178 - Posted: 24 Jun 2006, 23:14:51 UTC

To throw in my two cents:

Could the problem LHC is experiencing with multiple host-ids be contributing to this symptom? It seems it might make sense if the servers are no longer able to track which hosts a user has due to the hosts issue.

