41) Message boards : Number crunching : Faulty Computers or Modified BOINC ?? Huge Credits (Message 23837)
Posted 12 Jan 2012 by Profile Igor Zacharov
Post:
I have been correcting the database by hand, subtracting values from total_credit in the host and user tables. I believe all values should be fair now.

If anybody thinks more should be done, please let me know.

Thanks,
42) Message boards : Number crunching : Faulty Computers or Modified BOINC ?? Huge Credits (Message 23836)
Posted 12 Jan 2012 by Profile Igor Zacharov
Post:

Or does this no longer work for tasks that have already reached quorum and passed validation?


Exactly so.
43) Message boards : Number crunching : Faulty Computers or Modified BOINC ?? Huge Credits (Message 23830)
Posted 10 Jan 2012 by Profile Igor Zacharov
Post:
The new credit system is in place. Will monitor how it goes.

I have corrected the host and user tables in the sixtrack database. This is not a simple action, as it may affect the consistency of the database. I have recalculated the value of total_credit in these two tables. If you know of more places where values should be corrected, let me know.
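
A minimal sketch of what such a recalculation could look like, not the procedure actually used on the server. It assumes that per-result credit is still available in a result table (columns granted_credit, hostid, userid) and that implausibly large values can be excluded with a cap; the cap value, the helper name recalc_total_credit and the sample rows are hypothetical. SQLite stands in for the project database so that the example is self-contained.

```python
import sqlite3

def recalc_total_credit(conn, max_credit_per_result):
    """Rebuild total_credit in host and user from per-result credit,
    skipping results whose granted credit exceeds the given cap."""
    cur = conn.cursor()
    for table, key in (("host", "hostid"), ("user", "userid")):
        cur.execute(
            f'UPDATE "{table}" SET total_credit = ('
            f'  SELECT COALESCE(SUM(r.granted_credit), 0) FROM result r'
            f'  WHERE r.{key} = "{table}".id AND r.granted_credit <= ?)',
            (max_credit_per_result,),
        )
    conn.commit()

# Hypothetical sample data: host 1 returned one inflated result.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE host (id INTEGER PRIMARY KEY, total_credit REAL);
CREATE TABLE "user" (id INTEGER PRIMARY KEY, total_credit REAL);
CREATE TABLE result (id INTEGER PRIMARY KEY, hostid INTEGER,
                     userid INTEGER, granted_credit REAL);
INSERT INTO host VALUES (1, 1000050.0), (2, 30.0);
INSERT INTO "user" VALUES (10, 1000080.0);
INSERT INTO result VALUES (1, 1, 10, 50.0), (2, 1, 10, 1000000.0), (3, 2, 10, 30.0);
""")

recalc_total_credit(conn, max_credit_per_result=1000.0)
host_totals = dict(conn.execute("SELECT id, total_credit FROM host"))
user_totals = dict(conn.execute('SELECT id, total_credit FROM "user"'))
print(host_totals, user_totals)
```

If old results have already been purged from the database, a sum like this is no longer possible, and only a manual correction of the totals remains.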

The correction has been done for both cheating episodes, last November and in January. However, for November I have reset only the user who generated the huge claimed credit values. The reason for not updating the other host/user records affected by the huge credit claims is that it is not obvious how to recalculate the contributions. This is also why we did not do it immediately back then. This time, however, I felt that it must be done even at the risk of some inconsistencies.

I hope this will resolve the issue. Thank you.
44) Message boards : Number crunching : Faulty Computers or Modified BOINC ?? Huge Credits (Message 23825)
Posted 10 Jan 2012 by Profile Igor Zacharov
Post:
We do have a new credit system ready which depends only on the computational characteristics of the sixtrack program. This will defeat any cheating with the clock or any other way of distorting the statistics, whether intentional or unintentional.

It was ready for introduction after the cheating episode last year. What prevented its introduction was the wait for the update to the latest BOINC server version, which in turn was waiting for the verification of the executable, and so on.

I will decouple different dependencies and make the introduction of the new credit system the highest priority.

Thank you for making your opinion known. I appreciate the shake up.



45) Message boards : Number crunching : Tasks v530.09 crashing (Message 23431)
Posted 9 Oct 2011 by Profile Igor Zacharov
Post:
After consultation with Eric McIntosh, we decided to retract the
530.9 version completely. I have reinstalled the 530.8 version, now called 530.10.

We will come back to it after a better investigation. I have to admit a mistake.
46) Message boards : Number crunching : Long delays in jobs (Message 23417)
Posted 9 Oct 2011 by Profile Igor Zacharov
Post:
With Keith's help, I have now implemented the reliable/trusted host settings correctly.
I believe the execution turnaround will improve. I will monitor and see.

Thank you very much!

47) Message boards : Number crunching : Tasks v530.09 crashing (Message 23415)
Posted 9 Oct 2011 by Profile Igor Zacharov
Post:
We don't have many architectural choices when specifying which app version to run.

I have now retracted (deleted) 530.9 for all generic x86 Windows and Linux platforms,
leaving 530.9 only for platforms which report AMD_x86_64 and Intel EM64T processors back to the server.

Please check whether that works for you.


48) Message boards : Number crunching : Tasks v530.09 crashing (Message 23391)
Posted 7 Oct 2011 by Profile Igor Zacharov
Post:
Yes, 530.9 and 530.8 deliver identical results (within the model, where
we look for last-bit differences). 530.9 can be a factor of 2 faster, but
not always: it does not optimize away calculations; it (in fact, the compiler)
just organizes them better by using pipelining and special instructions.

Yes, it is no problem for us to keep multiple versions of the same program.
I just need to find out what to call the architecture to which the older
CPUs belong. This will allow for automatic selection of the executable.

Shortening the deadlines is only a discussion item at this time. I still need
to assess what will have the largest impact on the efficiency of the calculations.
49) Message boards : Number crunching : Tasks v530.09 crashing (Message 23387)
Posted 6 Oct 2011 by Profile Igor Zacharov
Post:
Apparently, we should have kept the version 530.8 for older processors.
It is still possible, I have not removed them.

What would be the architecture designation for distinguishing the old and the new?
50) Message boards : Number crunching : Can't set resource to zero (Message 23375)
Posted 6 Oct 2011 by Profile Igor Zacharov
Post:


I've just tried it, setting it to 0.001 on the web does the trick, Boinc manager reports the resource share as zero.

EDIT TO ADD THE FOLLOWING:
The reason I myself want a resource share of zero is that I only want to crunch LHC@Home at the weekend. Like zombie67 I don't want the cache to fill up with tasks. It would be OK except that I couldn't suspend the tasks to continue crunching the following weekend, as the deadlines are so short.


So it works, then, and the resource share (RS) can be set to 0 with this trick.

You don't have to explain why you need it. We appreciate your contribution.

This brings me to another thought, however. Physicists complain about the long
tail of outliers in a study. The bulk of the jobs comes back quickly, then the last jobs take very long, since they are sitting and waiting somewhere.
We may need to tune the system to squeeze the deadlines tighter.

I would like to collect opinions about this first.
51) Message boards : Number crunching : no more work? (Message 23374)
Posted 6 Oct 2011 by Profile Igor Zacharov
Post:
Two new studies will go in tomorrow. I have been holding them back to see
the result of the sixtrack version upgrade. So far it is good. Also, the
numerical comparison between the versions holds.


Sorry, I was accidentally posting from a different account where I was testing.
The message itself is correct.
52) Message boards : Number crunching : Can't set resource to zero (Message 23358)
Posted 5 Oct 2011 by Profile Igor Zacharov
Post:
Does setting the resource share to 0.001 work for you?

Just to say, we are not on the very latest BOINC server version, for
administrative reasons. It is nothing serious and will be sorted out.
(If you wonder: the standard CERN update program for software dependencies doesn't
include the latest versions needed for the BOINC server installation.)

I like to use some projects as back-up projects, by using the "0" resource share method. But when I set it to zero for this project, it keeps getting changed back to 100. Please fix.

53) Message boards : Number crunching : Linux vs. Windows app (Message 23345)
Posted 4 Oct 2011 by Profile Igor Zacharov
Post:
Why specifically the Linux version of sixtrack might be slower than the Windows
version needs to be investigated. Before we do that, I have loaded an inherently
faster version of sixtrack (version 503.9), which uses SSE3, into the system.

This has to be monitored for a while, since we have also changed the validator
to accommodate results from different sixtrack versions. The system should be stable.

With version 503.9 we are also interested in a comparison between Linux and Windows.
Before, numerical stability was the major concern. Sixtrack is very sensitive
to erroneous results. Particles circulating in the accelerator structure are very
close to chaotic behaviour. Single-bit differences somewhere lead to exponential
deviations over 1 million turns.
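
A toy illustration of this kind of sensitivity, and explicitly not SixTrack itself: the logistic map is a textbook chaotic system, used here only as an assumed stand-in for the tracking map. The helper name turns_until_divergence, the starting value and the threshold are all hypothetical. The sketch counts how many iterations it takes for two orbits that start a few floating-point bits apart to differ visibly.

```python
def turns_until_divergence(x0, eps, threshold=0.1, max_turns=1000):
    """Iterate the logistic map x -> 4x(1-x) from x0 and from x0 + eps.
    Return the first turn at which the two orbits differ by more than
    threshold, or None if that never happens within max_turns."""
    a, b = x0, x0 + eps
    for turn in range(1, max_turns + 1):
        a = 4.0 * a * (1.0 - a)
        b = 4.0 * b * (1.0 - b)
        if abs(a - b) > threshold:
            return turn
    return None

# Perturb the start by 2**-52, i.e. a change in the last few bits of a double.
turns = turns_until_divergence(0.3, 2.0 ** -52)
print(turns)
```

In this toy map the difference roughly doubles on average per iteration, so a last-bit perturbation becomes a macroscopic one after a few dozen iterations. This is why bit-identical results across platforms matter for a run of a million turns.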

We have found machines out there which produce wrong results 50-100 times more
frequently than the average. We would like to offer an explanation for this
artifact and will suggest a way to investigate this further.
54) Message boards : Number crunching : Too low credits granted in LHC (Message 23249)
Posted 25 Sep 2011 by Profile Igor Zacharov
Post:
We are on BOINC version 6.11.0, with the validator for sixtrack rewritten from
the sample_bitwise_validator. The credit is calculated by returning the value of
the function stddev_credit from compute_granted_credit in the validator.

I believe we don't calculate any credit in sixtrack itself (I will verify with Eric McIntosh). Therefore, the average is based on the credit claimed by the client software.
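
The idea behind a bitwise validator can be sketched as follows. This is an illustration under stated assumptions, not the project's validator code: results count as equivalent only if their output bytes are identical, and a result is valid if it belongs to a group of at least min_quorum identical outputs. The function name validate_bitwise and the quorum default are hypothetical.

```python
import hashlib
from collections import Counter

def validate_bitwise(outputs, min_quorum=2):
    """Return one flag per output: True if the output is byte-identical
    to the most common output and that group reaches min_quorum."""
    digests = [hashlib.sha256(data).hexdigest() for data in outputs]
    if not digests:
        return []
    winner, count = Counter(digests).most_common(1)[0]
    if count < min_quorum:
        return [False] * len(digests)
    return [d == winner for d in digests]

# Two hosts agree bit for bit; the third differs in a single byte.
flags = validate_bitwise([b"0.123456789", b"0.123456789", b"0.123456788"])
print(flags)
```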
55) Message boards : Number crunching : Too low credits granted in LHC (Message 23236)
Posted 24 Sep 2011 by Profile Igor Zacharov
Post:
It is kind of important. Who regulates what a second of compute should be worth?

I have changed the assignment algorithm to take the average of all the
valid contributions. For the rest, the default BOINC assignments are kept.
The actual valuation could also depend on the BOINC version.
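
A minimal sketch of the averaging rule described above, assuming that each result carries a claimed credit value and a validity flag. The function name granted_credit and the sample numbers are hypothetical; the real calculation lives in the validator.

```python
from statistics import mean

def granted_credit(claimed, valid):
    """Average the claimed credit over the valid results only.
    Returns 0.0 if no result is valid."""
    valid_claims = [c for c, ok in zip(claimed, valid) if ok]
    return mean(valid_claims) if valid_claims else 0.0

# Three results for one work unit; the third failed validation,
# so its claim does not enter the average.
credit = granted_credit([10.0, 14.0, 900.0], [True, True, False])
print(credit)
```

Note that a plain average still follows any inflated claim that comes from a result which itself passes validation.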

I don't mind multiplying the calculated figure by a factor. But if admins on
other projects do the same, we will end up in an escalating situation which
makes no sense.

Proper advice is needed.
56) Message boards : LHC@home Science : screensaver/graphics bug (Message 23234)
Posted 24 Sep 2011 by Profile Igor Zacharov
Post:
SixTrack is a computational program... Part of our problems in the past was related to bugs in the graphics part, which hampered the advance of the computation. Therefore, we have decided to cut the graphics off completely.

I do agree that visual attraction is important. We plan to resurrect the screensaver, but it should not be coupled to the program any more. The screensaver
could be fed the cumulative results of the study to make it informative.

Having said that, the priority on the graphics project is low, I must admit.
If there are volunteers to help with the screensaver it would be great.
57) Message boards : Number crunching : Daily quota (Message 23205)
Posted 23 Sep 2011 by Profile Igor Zacharov
Post:
Right now the limits are as follows:

<daily_result_quota> 80 </daily_result_quota>
<one_result_per_user_per_wu> 1 </one_result_per_user_per_wu>
<max_wus_to_send> 2 </max_wus_to_send>
<max_wus_in_progress> 1 </max_wus_in_progress>

Please, monitor and see how this works.
58) Message boards : Number crunching : Credit awarded calculation (Message 23177)
Posted 21 Sep 2011 by Profile Igor Zacharov
Post:
Based on this discussion, we have changed the algorithm for awarding credits.
It is now based on the average. Please monitor and tell us whether the change is
visible and whether it is indeed the right method.
59) Message boards : Number crunching : error -177 resource limit exceeded (Message 23132)
Posted 19 Sep 2011 by Profile Igor Zacharov
Post:
I have put the value of 90MB into the database for all work units and
restarted the BOINC server. Also, I have changed the distribution rate:
<max_wus_to_send> 5 </max_wus_to_send>
<max_wus_in_progress> 3 </max_wus_in_progress>

(was 10 before) so that not too many are sitting on a single machine.
The daily quota is still at 40, so nobody should be short of work.

I suggest that you abort the jobs which are waiting or have consumed little
time; you will then get new jobs with the corrected disk size.

Please, report any other problems you see. Thank you.
60) Message boards : Number crunching : error -177 resource limit exceeded (Message 23127)
Posted 19 Sep 2011 by Profile Igor Zacharov
Post:
The beam-beam interaction jobs which we are running now are new studies.
They simulate the influence of beam particles on each other and need a large number
of simulated turns to unveil the effects. Of course, we have tested on our
computers before submitting, but we may not have corrected everything.

The problem with resource limits is serious, and I will discuss it with Eric McIntosh when he comes to CERN in the morning.

We will review the settings asap.

