1) Message boards : Number crunching : Too low credits granted in LHC (Message 23303)
Posted 1 Oct 2011 by J Henry Rowehl
Post:
We can lose the credits, see how many crunchers stay around for science.


Deja vu...

Now, what about those who think the faster computer should get more points for completing the work faster? Or, what about those who think that the slower computer should get more points because it spent more time crunching? I'd be willing to bet that if all the BOINC projects stopped granting points altogether, we would find out relatively quickly who was crunching for the warm fuzzy feeling of contributing something worthwhile to science, and who was crunching solely to get the most points.


It does make you wonder. With cheating being the issue that it has become, why do people find it necessary to cheat at all? It sometimes looks like the cheaters have the attitude that 'whatever you do, I can do better'. If the credit system were abolished, would these people have a nervous breakdown, pack up their toys and leave, all the while screaming about how unfair it is that they can't be the best anymore?
2) Message boards : Number crunching : Too low credits granted in LHC (Message 23300)
Posted 1 Oct 2011 by J Henry Rowehl
Post:
I found something interesting... go to the QMC website at:

http://qah.uni-muenster.de/projectinfos.php

And you'll find the following at the bottom of the page:

Credits:

Because of several complaints about cheating-attempts, we STOPPED to give out credits according to the standard BOINC system (see here for details). We assign every workunit series a FIXED CREDIT value according to a calculation on our reference system.


That's a small part of the situation I was attempting to address with my question regarding credit grants. If you know how many credits you will grant for a particular WU, then cheating is very nearly eliminated. But that brings us right around in a complete circle to the start of the conversation - what is a fair standard for calculating credit?

Hoo boy.... I never ask the easy questions, do I? :-)
3) Message boards : Number crunching : Too low credits granted in LHC (Message 23285)
Posted 29 Sep 2011 by J Henry Rowehl
Post:
University Flashback!
Now I remember why I loathed Assembler Programming.


Assembly was only part of the fun. The architecture of the 'processor' made things interesting also. Back in my days on submarines, I worked on CDC-1604 mainframes with RD-281 disk drives. The CDC-1604 was a second generation Cray, and the RD-281 was the original Winchester disk drive, the one where the head carriage (9 pounds) was moved by a hydraulic system powered by a Cessna aircraft 120 PSI hydraulic pump.

The CDC-1604 didn't have a CPU. It didn't have a main processor as such. The little beastie had a 'subtractive adder' as its central processor. The subtractive adder, by itself, was about twice the size of 2 current day tower PC's. Programming it was not a task for the faint of heart. There were no add, multiply, or divide instructions available. You had to add by subtracting, hence the name subtractive adder.

Take a simple problem, like adding 2 plus 3. You start by getting the operands into the registers as usual. Then, negate the operand in the accumulator. Make sure you use the correct instruction, since one's complement is a direct bit-by-bit inversion, while two's complement handles the sign bit as well as adjusting the value in the LSB. You now have negative 2. Then, subtract the other operand (3), giving you negative 5. Now negate the accumulator again, giving you positive 5.
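For anyone who wants to see the trick without the hardware, here's a tiny Python sketch of that negate-subtract-negate sequence (the identity a + b = -((-a) - b)); it's an illustration of the idea, not CDC-1604 code:

```python
def subtractive_add(a, b):
    """Synthesize addition on a machine with no ADD instruction,
    using only negation and subtraction: a + b = -((-a) - b)."""
    acc = -a         # negate the operand in the accumulator: 2 -> -2
    acc = acc - b    # subtract the other operand: -2 - 3 -> -5
    return -acc      # negate the accumulator again: -5 -> 5

print(subtractive_add(2, 3))  # 5
```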

Multiply and divide operations were exponentially more complicated. After writing a few short programs in some of the training courses, it became very clear to me why programmers in the 'days of old' always had a 500 count bottle of aspirin on their desk and wore glasses thick enough to be bullet proof. They were also easily recognizable because they had either turned completely gray or gone bald sometime in their mid to late twenties.
4) Message boards : Number crunching : Too low credits granted in LHC (Message 23284)
Posted 28 Sep 2011 by J Henry Rowehl
Post:
In the properties there is another interesting number... Estimated app speed in GFLOPS/sec. I have no idea what that's used for but I bet it has (or will have in the future) something to do with credit calculations.


I think that's used by the server scheduler. I saw that also, and just happened to notice that although it's a different number for each computer, each individual computer has the same number for every WU from every project.

Stepping cautiously out on a limb, I think the scheduler uses your CPU benchmark to generate that number, and then uses that number as a gross check against the WU deadline before sending the WU.
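Still out on that limb: I don't know the actual scheduler code, so the names and the on-fraction knob below are my own invention, but the gross check I'm imagining would look something like this:

```python
def can_meet_deadline(task_size_gflop, est_speed_gflops, seconds_to_deadline,
                      on_fraction=1.0):
    """Hypothetical gross check: would this host's estimated app speed
    finish the task before its deadline? on_fraction approximates how
    much of the day the host actually crunches."""
    est_runtime_s = task_size_gflop / (est_speed_gflops * on_fraction)
    return est_runtime_s <= seconds_to_deadline

# A 30,000 GFLOP task, a 10 GFLOPS host, one week until the deadline:
print(can_meet_deadline(30_000, 10, 7 * 86_400))  # True
```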
5) Message boards : Number crunching : Too low credits granted in LHC (Message 23283)
Posted 28 Sep 2011 by J Henry Rowehl
Post:
After writing my message offline, then logging on and posting it, I discovered your reply. Just my luck... :-)

I'm also fairly confident that the method of estimating the task size and fraction done has been sufficiently refined over the years that it has reached the point where it 'works for SETI', therefore, it 'works for Dave'. This causes me to stand up and argue that we have a time tested and proven method that can be applied across all projects.

There may be a proven method that works well for SETI applications and tasks but that same method might fail miserably on other projects' apps and tasks. I wouldn't assume anything there.


After you read the message that I just posted, you'll see it pretty much summarizes the rough spot that I'm trying to get a handle on. You had mentioned concerns about the number of times a particular routine might get called, varying numbers of loop iterations, and that sort of thing. All of which are valid points. I'm trying to come up with a workable method of reporting that back to the projects in such a way that the information can be put to good use, while at the same time remaining 'cheat resistant'.

Maybe one or more fields that are coded against a randomly generated one-time-key from the project servers? Hmmmm...

Got any aspirin? :-)
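One concrete way to sketch that one-time-key idea (strictly my own sketch, not anything BOINC actually does) is an HMAC: the server sends a fresh random key with the workunit, and the client sends back its reported fields plus a keyed hash over them:

```python
import hashlib
import hmac
import secrets

def issue_key():
    # Server side: generate a fresh key per workunit and remember it.
    return secrets.token_bytes(32)

def tag_report(key: bytes, fields: str) -> str:
    # Client side: authenticate the reported fields with the key.
    return hmac.new(key, fields.encode(), hashlib.sha256).hexdigest()

def verify_report(key: bytes, fields: str, tag: str) -> bool:
    # Server side: recompute the tag and compare in constant time.
    return hmac.compare_digest(tag_report(key, fields), tag)

key = issue_key()
report = "flops=29873;iops=1204;frac_done=1.0"  # made-up field names
tag = tag_report(key, report)
print(verify_report(key, report, tag))                     # True
print(verify_report(key, report + ";flops=999999", tag))   # False
```

One caveat: since an open-source client necessarily holds the key while crunching, this only makes casual tampering harder - 'cheat resistant', as you say, not cheat proof.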
6) Message boards : Number crunching : Too low credits granted in LHC (Message 23282)
Posted 28 Sep 2011 by J Henry Rowehl
Post:
I think I'm back. :-) I did some more mental pretzelization during dinner, and kind of refined my thoughts. I think my stumbling blocks are based on what I view to be two basic issues. First issue being what I perceive to be a lack of a solid definition of 'work'. I know you feel that there is a definition, but I just don't get a warm fuzzy with the current definition. I'll give my thoughts and reasoning on this shortly. The second issue being task length calculations for the purpose of determining the amount of work done.

OK, first issue - definition of work. As I had mentioned earlier, it appears to me that the flop is the standard applied across BOINC. But, when the subject of credit grants is raised, suddenly there are several creeping unknowns thrown in to complicate and confuse - time, processor speed, memory access/latency, CPU vs GPU, who screams the loudest, rogue projects, cheaters, etc. Since credit is granted for performing work, then, in order to have a universal standard to determine how much credit to grant for performing that work, there must first be a universal standard to define the work that is being performed.

If you can bear with me for a moment while I pull some numbers out of the air, a 'policy determination' from BOINC might look like this:

[
In order to address concerns regarding the processing and granting of credit for Work Units, the staff of UC Berkeley Science Lab, BOINC Administration and BOINC Development teams ("BOINC Admin") have adopted the following standards:
1 - The basic measure of computer processing resources expended by any type of electronic computing device will be the Unit of Work (UW). One UW consists of 0.25 Gflops. The 'Estimated Task Size' as reported by the Work Unit Generation software authored, compiled, and issued by BOINC Admin is the sole measure of work required to complete a Work Unit.
2 - One credit will be granted for each UW that is successfully completed and that has passed validation by the project that generated and issued the original Work Unit.
3 - Any project administration team that has concerns regarding any aspect of Task Size estimates or Credit Granting will present those concerns to BOINC Admin for review. If BOINC Admin determines that the concern is valid, BOINC Admin will decide what, if any, corrective action will be taken.
]

That would give us a standard definition of work, a standard to use for granting credit, and a standard point of authority for resolving disputes. As far as the task length calculations, that's been sorta kinda halfway addressed. Maybe. Still thinking on that. The problem remaining is how to report the actual work done if the task size estimate was way off, and be 'cheat resistant' at the same time. That's the biggest contributing factor to my mental hernia right at the moment.

Let me think on it some more and try to post an idea or two tomorrow.

Chat at ya later!

7) Message boards : Number crunching : Too low credits granted in LHC (Message 23279)
Posted 27 Sep 2011 by J Henry Rowehl
Post:
Sorry I didn't get back to you last night... between a long day at work, the grand kids screaming in the background, and then my reply to the message being 'lost' somewhere, I decided to wait till today and try again.

What I was asking is why aren't we uniformly maintaining that standard of measurement for the credit grants?

I'm sure Dave Anderson would like to do just that. If that could be done the problem would be mostly solved as I see it. If every science app that runs under BOINC were coded in assembly then a very accurate count of the flops and iops could be determined prior to run time IF you could know beforehand how many times loops would be iterated and exactly where the code would go before it exits. How many times does it enter this/that block? How many times does this loop iterate? Those answers have to be known prior to run time but they usually aren't. The problem becomes even worse when apps are compiled/assembled from C, C++, FORTRAN, etc.


I think that's where I started yesterday, and yes, I agree. I'm sure Dave would like to put this all behind him also. The languages you mentioned aren't in my area of expertise, unfortunately. I can't put any code on the table for that reason. About the best I can do is to propose a semi-detailed concept, and hand it over to the people that have their hands on the code on a daily basis.

I'm happy with just counting/estimating/measuring the int and float ops but nobody has found a good way to do that so far.


That particular point has resulted in my brain turning into a large sized pretzel. By selecting any task in progress in my task queue, and displaying the properties (sorry - can't manage to copy it over to the message for reference... it's a SETI enhanced 6.03 WU) I see two particular items that jump off the screen and bite me in the knee. Those two items being 'Estimated task size' and 'Fraction done'. The estimate of the task size comes from... somewhere? Is it assigned by the WU generator? And, is the calculation for that hard coded?


What I meant when I said "Dave Anderson is going to that" is that his algorithm is going to count/estimate/measure int and float ops, logical ops and whatever other units of work he decides his algorithm will count/estimate/measure. I don't know exactly how he is going to do that but he's the lead programmer, he's no dummy, so let's give him a shot at it.


Emphasis added by yours truly. :-) I chose the SETI WU because I have a fair degree of confidence that Dave has been instrumental in most, if not all aspects of generating/distributing/processing and granting credits for completion. I'm also fairly confident that the method of estimating the task size and fraction done has been sufficiently refined over the years that it has reached the point where it 'works for SETI', therefore, it 'works for Dave'. This causes me to stand up and argue that we have a time tested and proven method that can be applied across all projects.

The question would then become how to ensure that all projects follow SETI's lead and use the same method? Remove any option(s) to use any method other than the hard coded algorithm. By doing that, all projects would have a uniform method of estimating task size, a uniform method of reporting the fraction done, and therefore have a uniform method of establishing credit grants. Since the project servers know set parameters (size, credits, etc) before assigning the WU, then the client claimed credit becomes irrelevant.

I'm going to have to stop now for dinner. I'll go ahead and post this now, and check back in a bit, hopefully before I forget the rest of my thoughts! :-)


8) Message boards : Number crunching : Too low credits granted in LHC (Message 23266)
Posted 26 Sep 2011 by J Henry Rowehl
Post:
I think you may have misunderstood what the question was that I was asking here:

I read the article on 'creditnew', and to be perfectly honest, about 90 percent of it shot right past me at about warp 7. I saw the parts about computer speed, processing time, etc. But I have to ask, why is the CPU benchmark given in flops? On the server status pages, why is the available computing power given in Gflops and Tflops? When a project page shows the 'user of the day', why does it say 'Person X is contributing Y Gflops'? And before I go too much further, yes, I realize that 'flops' is floating point operations per second, second being a measure of time.



Well, if you had any understanding of how computers work you wouldn't have to ask why benchmarks are given in flops. Nobody's going to give you a meaningful tutorial on that here. Read. Find out about machine cycles, instructions and operations.


The question was why are flops used as the standard measure everywhere except credit grants?

Back in my days of Z80 assembly programming, I routinely went through the instruction repertoire calculating the number of CPU cycles needed per instruction, just to make sure that a routine would meet the polling time requirement. How many cycles to perform the instruction fetch, and how many cycles to load either one or two bytes from RAM into the registers, and do I need a near or far load instruction, each with a different number of cycles? And, what is the memory latency in the system? Also, will I need to reprogram the PIO chip for a different data type?

Not to mention interrupt handling - an MI (Maskable Interrupt) took about 50 cycles to dismiss if I recall correctly, while an NMI (Non-Maskable Interrupt) shot the whole process out the window. An NMI was the usual response to the DMA controller tri-stating the CPU data and address busses and handing control over to the FDC. When this happened, the CPU executed NOPs continuously until a CPU reset occurred in response to the 20ms heartbeat interrupt, or the CPU entered a halt state until the DMA released the halt line. If you had a 50ms polling requirement and your routine clocked in at 38ms, but the FDC held the busses for 1.5 seconds, you were dead in the water with no chance of survival.
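That kind of bookkeeping is easy to mimic in a few lines. The T-state counts below are approximate, from memory, so treat them as placeholders rather than gospel:

```python
# Approximate Z80 T-state counts per instruction (T-states include the
# opcode fetch). Placeholder values for illustration only.
T_STATES = {"ld_r_n": 7, "ld_rr_nn": 10, "out_n_a": 11, "djnz_taken": 13}

def routine_time_ms(instr_counts, clock_hz=4_000_000):
    """Total execution time of a routine on a 4 MHz Z80, given a count
    of how many times each instruction executes."""
    total = sum(T_STATES[name] * n for name, n in instr_counts.items())
    return total / clock_hz * 1000

# A loop body of one immediate load and one DJNZ, iterated 1,000 times:
t = routine_time_ms({"ld_r_n": 1000, "djnz_taken": 1000})
print(f"{t:.1f} ms")  # 5.0 ms - comfortably inside a 50 ms polling budget
```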

As far as floating point operations, what accuracy are you looking for? How many positions will the data need to be shifted to meet the accuracy requirement? Each data shift requires a set number of cycles to perform, so it's important to know how many shifts are needed. Also, are we doing simple shifts, logical shifts, or rotates? In the Z80, not all registers were capable of 16 bit shifts and rotates, some were only capable of 8 bit operations. A 16 bit shift in a 16 bit register involved 6 prefetch cycles and 31 execution cycles. Meanwhile, a 16 bit shift in an 8 bit register pair required two separate 8 bit shifts, along with setting a bit in the flag register, and including that flag in the second 8 bit shift. Each 8 bit shift involved 6 prefetch cycles and 18 execution cycles, while the flag set/test/restore required 5 prefetch and 9 execution cycles.

So, yes, I know what a flop is. I have some knowledge about machine cycles, instructions and operations. What I was asking is why aren't we uniformly maintaining that standard of measurement for the credit grants?


9) Message boards : Number crunching : Too low credits granted in LHC (Message 23265)
Posted 26 Sep 2011 by J Henry Rowehl
Post:
>> Wow, I didn't realize that I'd be opening up such a can of worms with the credit issue!
> What makes you think you opened the can? It's been open for a long time.

Just standing up and taking my lumps. :-) One thing I do want to mention before continuing with our message exchange, is that the written word does not convey the human aspect of a face-to-face conversation. I ask questions because I do not know the answer, and not to be confrontational or adversarial. I make suggestions without knowing if the same suggestion has already been made. As you are reading this, you can't see that I'm sitting here scratching my head while staring off into space thinking about the conversation. I can assure you that I'm not sitting here red faced and fuming, with a large heavy object, ready to beat you soundly about the head and shoulders until you come around to my way of thinking.

My intention is to contribute to the team spirit by putting my thoughts and ideas on the table. Needless to say, just because a thought pops out of my head doesn't mean that anyone is obligated to put it to use. On the other hand, sometimes the solutions to seemingly complex problems come from the most unlikely sources. Such as my IT department at work, troubleshooting a server failure for 5 hours, and swapping out motherboards, power supplies, RAM, CPU's... Then, they were jolted back into reality by a fork truck operator that drove by singing the words to a Glade air freshener commercial - "Plug it in, plug it in!". They had checked at the beginning to verify that the computer was plugged in, but, they hadn't noticed that the battery charger and radio plugged into the same outlet were not working either. They spent several thousand dollars in manpower and hardware troubleshooting what turned out to be a tripped circuit breaker.

That was a poor example, but the basic point was supposed to be that you cannot solve the problem until you drill down to the root cause. So, putting on our collective thinking caps, what's the root cause? I think you came really close, possibly without realizing it:

> All that amounts to is equal pay for equal work, a principle long established and accepted by most.

> Cross project credit equality is definitely impossible when projects are allowed to give whatever credits they want.

It appears to me that cross project credit equality is impossible, because we have no way of determining what constitutes equal pay for equal work, because there is not yet any definition of what 'work' actually consists of. Now, take a deep breath, count to 10... am I close? Could it be as simple as that? We're on the third version of credit grants, so we've replaced the power supply, motherboard, and CPU. Could a lack of a definition for 'work' be the tripped circuit breaker causing the problem?

>> Yes, there should be some type of limit set

> Now you contradict what you just said about it being the admin's decision.

Umm... er... OK, I'll concede that point, and withdraw my suggestion that the project admins assign the credit grants. Once again, you've provided something that made me rethink things.

> Who's going to determine what a task is worth? Dave Anderson is going to do that

AHA! Who better to define 'work' than the BOINC founder? Once we have a definition of work, it can now be applied equally across all projects. Let me catch a number... hang on... got it! 10,000 Gflops. OK, Dave decides that 1 unit of work consists of 10,000 Gflops, and 1 credit is granted for each unit of work completed. Work has just been defined, and the credit grant has been assigned. You have been assimilated. CreditNewerThanNew is now in effect. Rogue projects have just been un-rogued. Cheaters have just been tripped up. The project administrators are out of the loop, but they can still control the number of tasks per CPU, the number of tasks in queue, the number of active tasks, all that good stuff.

> the amount of granted credit can be scaled based on the percentage of the work unit that was processed. In other words, if 20 percent of the WU was processed, then 1 point is granted, 50 percent gets 2.5 points, and so on.


> You're talking about a deterministic algorithm, an algorithm whose progress can be determined prior to running or counted while running. That's not always possible. There is no way to determine if 20 percent was processed, 50 percent or whatever.

> Computers don't compute pie in the sky and pipe dreams. They compute numbers.

OK, here's a chance for me to pick up some more education. :-) Checkpoints? When an application is interrupted, it needs to know where to pick up from when it gets CPU time again. Don't laugh too loud if I'm about to embarrass myself, but, if the application knows how long the WU is, knows what position it reached at the last checkpoint, and can successfully resume from the same point, shouldn't it be able to take those two numbers and compute the percentage of the WU that's been completed? If the WU is 561K in length, and the last checkpoint was at 178K, then the WU is 31.7 percent complete?
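As a sanity check on that arithmetic (the sizes are just the ones from my example):

```python
def fraction_done(checkpoint_pos, task_size):
    """Fraction complete = position of the last checkpoint / total task size."""
    return checkpoint_pos / task_size

# A 561K task whose last checkpoint landed at 178K:
pct = fraction_done(178, 561) * 100
print(f"{pct:.1f}%")  # 31.7%
```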

Time for me to put this down for a bit. I hope I was able to stimulate some productive thoughts without tromping aimlessly on raw nerves. :-)

I appreciate the conversation, thanks!

10) Message boards : Number crunching : Too low credits granted in LHC (Message 23247)
Posted 25 Sep 2011 by J Henry Rowehl
Post:
Wow, I didn't realize that I'd be opening up such a can of worms with the credit issue! Anyway, I thought I'd give a little more of an idea as to what I think might be a good method for figuring out the credit to grant for crunching. I know that the subject of credit comes up quite a lot on a lot of projects, and the question becomes how to fairly grant credit for work done.

My belief is that the project administrators should be able to determine how much they want to 'pay' for a volunteer's computer resources. Yes, there should be some type of limit set, but the actual credit grant should be based on the value of the returned result as it applies to the science being done. I read the article on 'creditnew', and to be perfectly honest, about 90 percent of it shot right past me at about warp 7. I saw the parts about computer speed, processing time, etc. But I have to ask, why is the CPU benchmark given in flops? On the server status pages, why is the available computing power given in Gflops and Tflops? When a project page shows the 'user of the day', why does it say 'Person X is contributing Y Gflops'? And before I go too much further, yes, I realize that 'flops' is floating point operations per second, second being a measure of time.

So... let me ramble on with my thoughts.

The project administrators decide that the science returned in a given type of work unit is worth 5 points to them. The work unit is sent out, and when returned with a successful result, 5 points are granted. In the event that a work unit can be returned with several valid end points, such as the LHC work units that can complete successfully even if the particle crashes into a wall, the amount of granted credit can be scaled based on the percentage of the work unit that was processed. In other words, if 20 percent of the WU was processed, then 1 point is granted, 50 percent gets 2.5 points, and so on.

I am going to use my previous example, but modify it so that computer A does 10 Gflops, computer B does 1 Gflops, and add a computer running on a GPU that is capable of 150 Gflops. One of my LHC work units shows an estimated task size of 30,000 Gflops. For this WU, Computer A should take 3,000 seconds (50 minutes) to run the job, Computer B should take 30,000 seconds, (about 8.3 hours), and computer C should take 200 seconds (about 3.3 minutes). When each computer completes 30,000 Gflops of work, that computer is 'paid' 5 points, regardless of how long it takes. Simple enough for me.

Now, what about those who think the faster computer should get more points for completing the work faster? Or, what about those who think that the slower computer should get more points because it spent more time crunching? I'd be willing to bet that if all the BOINC projects stopped granting points altogether, we would find out relatively quickly who was crunching for the warm fuzzy feeling of contributing something worthwhile to science, and who was crunching solely to get the most points.

Another example - I have several desks in an office that need to be moved. Each desk weighs 100 pounds, and needs to be moved 300 feet, or, I need 30,000 foot pounds of work done for each desk. And, I'm going to pay 5 dollars to move each desk. Amazingly enough, that's the same number of estimated Gflops for an LHC WU! And the same number of points I used in my example! I'm goooood! :-)

Three people agree to do the work, and head over to the office to start moving the desks. Now for the fun part... Person A came back in less than an hour, got paid, and was happy. Person B came back 8 and a half hours later and wanted more money because he spent more time moving the desk. Person C came back in less than 5 minutes and wanted more money because he moved the desk so quickly. Does this sound eerily familiar?

The people running the faster computers DO get more points. Using the three computers from my example above, computer A, at 10 Gflops, can crunch 28.8 WU's/Day, for 144 points. Computer B, at 1 Gflops, can crunch 2.88 WU's/Day, for 14.4 points. And computer C, at 150 Gflops, can crunch 432 WU's/Day, for 2160 points.
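The arithmetic above is easy to check with a short sketch, using the numbers from my example (a 30,000 GFLOP task worth 5 points):

```python
TASK_GFLOP = 30_000    # estimated task size from my example
CREDIT_PER_WU = 5      # flat 'pay' per completed task

def runtime_s(speed_gflops):
    """Seconds to crunch one task at a given sustained speed."""
    return TASK_GFLOP / speed_gflops

def wus_per_day(speed_gflops):
    return 86_400 / runtime_s(speed_gflops)

for name, speed in [("A", 10), ("B", 1), ("C", 150)]:
    print(name, runtime_s(speed), "s/WU,",
          wus_per_day(speed), "WU/day,",
          wus_per_day(speed) * CREDIT_PER_WU, "points/day")
# A:  3000 s/WU,  28.8 WU/day,  144 points/day
# B: 30000 s/WU,  2.88 WU/day,  14.4 points/day
# C:   200 s/WU, 432.0 WU/day, 2160 points/day
```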
11) Message boards : Number crunching : Too low credits granted in LHC (Message 23238)
Posted 24 Sep 2011 by J Henry Rowehl
Post:
After reading this thread, I agree that the amount of credit that LHC awards for work appears to be lower than other projects. I do not crunch for credit... I crunch because I have an interest in what the project is doing. Getting credit for donating my resources to a project is nice, and getting more credit is nicer, but, the amount of credit granted is NOT a consideration for me.

I do see something that I think may have a bearing on the credit issue. If I understand correctly, the credit system at LHC appears to be based on computing time. My personal feeling is that it might be better to award credit based on work performed. I make this suggestion based on what I have seen between the different computers that I have crunching for various projects, and the credit granted for WU's of approximately the same size.

Let me see if I can explain what I'm thinking without creating too much confusion. :-)

Computer 'A' performs 1000 operations per second. Computer 'B' performs 10 operations per second. A WU arrives requiring 5000 operations to complete, and the project will award 1 credit for every 10 seconds of work. If both computers get the same WU, when that WU is completed, computer A will get 0.5 credits, and computer B will get 50 credits.

If the project granted 1 credit for every 1000 operations performed, then both computers A and B will get 5 credits granted.

Did I get that right without embarrassing myself???

Now, I realize that things aren't quite that simple, and there are factors and circumstances that I haven't addressed in my example. But this example does point out what I feel is a weakness in a time based credit structure.
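Put as code, the difference between the two schemes in my example is essentially one line:

```python
OPS_PER_WU = 5000             # operations needed to complete the WU
A_SPEED, B_SPEED = 1000, 10   # operations per second for computers A and B

def time_based_credit(ops_per_sec):
    # 1 credit per 10 seconds of crunching: fast machines earn less.
    seconds = OPS_PER_WU / ops_per_sec
    return seconds / 10

def work_based_credit():
    # 1 credit per 1000 operations: the same WU pays the same everywhere.
    return OPS_PER_WU / 1000

print(time_based_credit(A_SPEED), time_based_credit(B_SPEED))  # 0.5 50.0
print(work_based_credit())  # 5.0
```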



©2024 CERN