1) Questions and Answers : Windows : Less credit granted than others get for same wu (Message 4982)
Posted 7 Nov 2004 by joe
Post:
Yes, that's what I explained earlier: your result is flagged "Invalid" but still got credits:

http://lhcathome.cern.ch/result.php?resultid=2267683


The old validator would have granted 0.0 for that. But the result is not totally invalid; it is still close enough to the other results. The new validator "saw" that and gave you a percentage of the possible credits.


http://lhcathome.cern.ch/FAQ.html#2.5

describes how that works.


p.s.: I have a few of those results too, often when there are two Intel CPUs vs. one AMD or vice versa. No overclocking involved; the CPUs (or different compilers) just produce slightly different floating-point results. This could only be fixed if a project used BCD arithmetic instead of floating point.
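
A minimal sketch of the effect described above, not taken from the project's code: with binary floats, merely regrouping a sum (as a different compiler or CPU might) changes the last bit of the result, while decimal arithmetic gives the same answer either way. Python's `decimal` module is used here as a stand-in for BCD arithmetic; that substitution is an assumption made for illustration.

```python
from decimal import Decimal

# Same three numbers, summed with two different groupings.
left_first = (0.1 + 0.2) + 0.3
right_first = 0.1 + (0.2 + 0.3)

# Binary floats: the grouping changes the last bit of the result.
floats_agree = left_first == right_first

# Decimal arithmetic (stand-in for BCD): both groupings give exactly 0.6.
d = [Decimal("0.1"), Decimal("0.2"), Decimal("0.3")]
decimals_agree = (d[0] + d[1]) + d[2] == d[0] + (d[1] + d[2])

print(floats_agree, decimals_agree)
```

A validator comparing such results has to accept a small tolerance, which is why two hosts can both be "right" and still disagree slightly.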
2) Questions and Answers : Windows : Less credit granted than others get for same wu (Message 4967)
Posted 6 Nov 2004 by joe
Post:
> Look at your claimed credit. The granted credit is the middle figure or the
> lowest figure of the finished results......Your granted credit may be higher
> or lower than your claimed credit


The new validator can give credits for results that the old one would have declared "invalid" with zero credits.

The new one still flags such a result "invalid", but it still gives some credits if the result is fairly correct yet not close enough to be really good.
3) Questions and Answers : Windows : Server can't open database (Message 4935)
Posted 5 Nov 2004 by joe
Post:
I wasn't worried, just thought I'd post it, as this time it was not the usual one that says "too many connections". It sounded like a network problem rather than a database overload problem.

It sure can "confuse" a database client when the LAN between the application server and the database server gets lost. That's why it is a good idea to have this connection redundant, i.e. one primary p2p connection between the two machines and a backup one through the normal house-LAN.
4) Questions and Answers : Windows : Server can't open database (Message 4885)
Posted 4 Nov 2004 by joe
Post:
I just got the scheduler message :

[scheduler_reply]
[message priority="low"]Server can't open database[/message]
[request_delay]14400[/request_delay]
[project_is_down/]
[/scheduler_reply]
[scheduler_reply]
[project_name]LHC@home[/project_name]
[/scheduler_reply]

but a few seconds later the scheduler did reply. It seems that some overload condition is not recognized.

(the client didn't recognize the message, that's why it contacted the server again after a minute instead of 4 hours)
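
To illustrate the 4 hours mentioned above, here is a small sketch that pulls the back-off out of a reply like the one quoted. It assumes the real reply uses angle-bracket tags (the forum shows them as square brackets); `request_delay_seconds` is a hypothetical helper, not a BOINC function.

```python
import re

# Reply as quoted in the post, with the forum's [tag] rendering
# turned back into <tag> (an assumption about the wire format).
REPLY = """<scheduler_reply>
<message priority="low">Server can't open database</message>
<request_delay>14400</request_delay>
<project_is_down/>
</scheduler_reply>
"""

def request_delay_seconds(reply: str) -> float:
    """Return the requested back-off in seconds, or 0.0 if none is given."""
    match = re.search(r"<request_delay>\s*([0-9.]+)\s*</request_delay>", reply)
    return float(match.group(1)) if match else 0.0

delay = request_delay_seconds(REPLY)
print(f"retry in {delay / 3600:.0f} hours")
```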
5) Questions and Answers : Sixtrack : Changed credit computation (Message 4784)
Posted 1 Nov 2004 by joe
Post:
http://lhcathome.cern.ch/FAQ.html#2.5
6) Questions and Answers : Wish list : [server] Idea for reducing number of database requests (Message 4764)
Posted 1 Nov 2004 by joe
Post:
- in results.php set
$results_per_page = 50;

- make the first statement in result.php
ob_start("ob_gzhandler");


As more results are fetched with one database query, it should reduce the total load if someone browses all his results.

The GZ-compression of the page avoids longer load times for the users and reduces overall HTTP traffic.
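
A rough sketch of both effects, in Python rather than the site's PHP. The numbers are made up for illustration: 400 results per user and a previous page size of 20 are assumptions, as is the idea that each page load costs one query.

```python
import gzip
import math

def pages_needed(total_results: int, per_page: int) -> int:
    """Number of page loads (assumed one query each) to browse all results."""
    return math.ceil(total_results / per_page)

# Hypothetical account with 400 results; 20 per page is an assumed old setting.
print(pages_needed(400, 20), pages_needed(400, 50))

# A repetitive HTML table, similar in shape to a results listing, compresses well.
rows = "".join(
    f"<tr><td>{i}</td><td>Over</td><td>Success</td></tr>\n" for i in range(50)
)
page = f"<html><body><table>\n{rows}</table></body></html>".encode("utf-8")
compressed = gzip.compress(page)
print(len(page), len(compressed))
```

The larger page is what makes compression worthwhile: more rows per request, but fewer bytes on the wire than the uncompressed page would need.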
7) Questions and Answers : Preferences : 2 Computers but I have only 1 (Message 4714)
Posted 1 Nov 2004 by joe
Post:
> Hello,
>
> thanks again for your answer!!!
> I actually run the client and it works very good. That is now OK for me that I
> have a new Host ID. But must I do anything with the client_state.xml ?
> Actually the client works fine! And when I must do anything with the
> client_state.xml , please can you describe it very easy, because I don't know
> what I must do.
>
> Thanks a lot!!!
>
> Akio

It seems to always merge the lower ID into the higher one, keeping the newer one. I had to do this a few times and it did not affect the work units, those still have been accepted by the server. So you should not need to change anything in your client_state.
8) Questions and Answers : Preferences : how do I set up different cache for different machines (Message 4662)
Posted 30 Oct 2004 by joe
Post:
As the cache size is measured in days and not in work unit count, it will do this anyway. If a fast machine requests 3 days, it is more work than if a slow machine requests 3 days.

If you still need different settings, you can use the separate preferences of "school" and "work" to create 3 different groups of computers.
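
A quick back-of-the-envelope sketch of why a cache measured in days scales with machine speed. The hours per work unit are hypothetical, and `work_units_for_cache` is an illustrative helper, not how the BOINC scheduler actually computes a request.

```python
def work_units_for_cache(cache_days: float, hours_per_work_unit: float) -> float:
    """Work units needed to keep one CPU busy for cache_days."""
    return cache_days * 24 / hours_per_work_unit

# Hypothetical speeds: the fast host needs 2 h per work unit, the slow one 8 h.
fast = work_units_for_cache(3, 2)
slow = work_units_for_cache(3, 8)
print(fast, slow)
```

Both hosts ask for "3 days", but the fast one ends up with four times as many work units.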
9) Questions and Answers : Windows : What does "can't parse scheduler reply" mean? (Message 4574)
Posted 28 Oct 2004 by joe
Post:
There were too many requests on the database, and the server sent an XML error message about this to your client.

This error message contained some HTTP relics above the XML header, which confused the BOINC client.


The correct message would have been : Database server overload, try again in about an hour.


It will be fixed in the next BOINC client update, which will reduce the server overload times significantly.
10) Questions and Answers : Windows : error 113 when starting (Message 4497)
Posted 27 Oct 2004 by joe
Post:
If your BOINC version is older than 4.13, try a newer one - some download issues have been fixed :

http://setiweb.ssl.berkeley.edu/download.php


You can see your BOINC version in Help->About (from the menu) and there in the window title
11) Questions and Answers : Wish list : Deleting results instead of work units (Message 4273)
Posted 25 Oct 2004 by joe
Post:
Some work units are "sticky": they keep getting sent out over and over again. From what I have seen so far, the cause is unrecognized download errors with BOINC 4.09. Those results do not have the correct status and do not show an error, although stderr.txt does contain an error message.

Now and then, those workunits just disappear - I guess that's when the Admin cleans up.

It would be better to delete those results that have :

- exit_status = 0 AND
- CPU_time = 0.0 AND
- claimed_credit = 0.0 AND
- client_state = 4 AND (the numeric representation of "Done")
- outcome = 1 AND (the numeric representation of "Success")
- stderr MATCHES "*download error*" (does this work on BLOBs? *)

so next time the validator will see that the work units are ready for validation.

An example for such a result : http://lhcathome.cern.ch/result.php?resultid=1244946


* I guess, as stderr is a BLOB, it will need a PHP program that extracts the field and scans the extracted contents for the text. It could then set the correct exit status that it should have for download errors.

But maybe it would work to just combine the other fields in the SELECT to give a good UPDATE statement for this problem.
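
The criteria above could be sketched as a query like the following. This uses an in-memory SQLite table as a toy stand-in for the project database; the table name `result`, the column names (taken from the list above) and the sample rows are assumptions, and `LIKE` is swapped in for the "MATCHES" of the post. Whether the same pattern match works on a BLOB column in the real database is exactly the open question raised above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy stand-in for the project's result table; column names follow the post.
conn.execute(
    """CREATE TABLE result (
           id INTEGER PRIMARY KEY, exit_status INTEGER, cpu_time REAL,
           claimed_credit REAL, client_state INTEGER, outcome INTEGER,
           stderr TEXT)"""
)
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 0, 0.0, 0.0, 4, 1, "... download error ..."),  # hidden failure
        (2, 0, 3600.5, 12.3, 4, 1, ""),                     # genuine success
        (3, 0, 0.0, 0.0, 4, 1, ""),                         # no error text
    ],
)

# All six conditions from the list, combined with AND.
SUSPECT = """
    SELECT id FROM result
    WHERE exit_status = 0 AND cpu_time = 0.0 AND claimed_credit = 0.0
      AND client_state = 4 AND outcome = 1
      AND stderr LIKE '%download error%'
"""
suspect_ids = [row[0] for row in conn.execute(SUSPECT)]
print(suspect_ids)
```

Running the SELECT first and checking the hits by hand would be a sensible step before turning it into a DELETE or UPDATE.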
12) Questions and Answers : Windows : No longer recognised as a valid user (Message 4112)
Posted 22 Oct 2004 by joe
Post:
Did you maybe use quite an old BOINC version for the reinstall?

You should download one from a running project, the one at Predictor might not be a recent one at the moment as they are working on upgrades.

Working versions should be 4.09 or 4.13 (not sure about older ones)
13) Questions and Answers : Windows : SCHEDULER_REPLY::parse(): bad first tag Content-type: text/plain (Message 3944)
Posted 17 Oct 2004 by joe
Post:
> I also had this message come up but I suspect that it was due to LHC's server
> being down around that time.
>
> Just my opinion of course, could be wrong


Will be fixed with the next BOINC client. If the database is too busy, BOINC will give the correct error message.
14) Questions and Answers : Unix/Linux : boinc 4.12 (Message 3928)
Posted 17 Oct 2004 by joe
Post:
> looking at the Seti pages, I get the impression that 4.12/4.13 fixes a couple
> communications issues re. uploading results.
>
> I haven't actually tried 4.12/4.13 on my linux testbox.


If you haven't installed 4.12 yet - don't do it, use 4.13 instead.

Version 4.12 causes some computers to hang because there was a configuration problem with the task switching timer on venues other than home. This has been fixed in 4.13.
15) Questions and Answers : Sixtrack : Same WU sent out after multiple valid results received? (Message 3927)
Posted 17 Oct 2004 by joe
Post:
The validate state is "check skipped", it must be a bug in the server part of BOINC as it happens with Seti work units too. So for some reason it didn't even try to validate those results.

edit : It seems to happen quite often when at least one of the first three results has a client upload error which would be http://lhcathome.cern.ch/result.php?resultid=953217 in your example.


edit2: In the meantime the work unit and its results disappeared.
16) Questions and Answers : Sixtrack : Bad first tag???? (Message 3781)
Posted 14 Oct 2004 by joe
Post:
See http://lhcathome.cern.ch/forum_thread.php?id=745
17) Questions and Answers : Windows : abnormal server activity (Message 3770)
Posted 13 Oct 2004 by joe
Post:
Just some ideas :

- Have you started BOINC as a user different from the one that you used for downloading the work units?

- Did you copy BOINC stuff back from a CDROM without clearing the readonly flag?

- Has the owner of the directory been changed at some point?
18) Questions and Answers : Windows : server overflow or faliure? (Message 3757)
Posted 13 Oct 2004 by joe
Post:
It's nothing to be worried about, just a minor problem with the XML parser that happens with a few (or only one?) specific error messages. Usually it's a message about too much load on the database. It will fix itself after a few tries, when the database is less busy.


edit : When this happens, you can look at sched_reply.xml to see the correct error message, just copy it to sched_reply.txt and open it with notepad.
19) Questions and Answers : Windows : WU not finishing (Message 3756)
Posted 13 Oct 2004 by joe
Post:
Instead of resetting the whole thing, you could carefully try to remove the single damaged WU from your client_state.xml

- Make a backup copy of the XML files in the BOINC root directory or better of your complete BOINC directory

- edit client_state.xml and remove the one section

[file_info]
[name]??????????.zip[/name]

[/file_info]

where the name value equals the name of your stuck work unit. Be sure to remove the complete section, not more and not less.

If the work unit is gone there, the client should recognize that. It will still be in the project folder of course but you can remove that later.


Oh - and _before_ you do all this: check whether your global_prefs.xml has [max_cpus]2[/max_cpus] in the venue that you have set for this computer!
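
For those who prefer a script to hand-editing, the removal step could look roughly like this. It is only a sketch under assumptions: that the file uses angle-bracket tags (shown as square brackets above), that it parses as well-formed XML, and that the sample file names are made up. `remove_file_info` is a hypothetical helper. Make the backup first and stop the client before touching the real file.

```python
import xml.etree.ElementTree as ET

def remove_file_info(state_xml: str, file_name: str) -> str:
    """Return state_xml without the <file_info> whose <name> equals file_name."""
    root = ET.fromstring(state_xml)
    # Work on snapshots (list(...)) so the tree is not modified while iterating.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == "file_info" and child.findtext("name") == file_name:
                parent.remove(child)
    return ET.tostring(root, encoding="unicode")

# Made-up miniature of a client_state.xml with one stuck and one healthy entry.
SAMPLE = """<client_state>
<file_info><name>stuck_wu.zip</name></file_info>
<file_info><name>other_wu.zip</name></file_info>
</client_state>"""

cleaned = remove_file_info(SAMPLE, "stuck_wu.zip")
print(cleaned)
```

Only the matching section is dropped; every other entry is left alone, which is the "not more and not less" part.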
20) Questions and Answers : Windows : SCHEDULER_REPLY::parse(): bad first tag Content-type: text/plain (Message 3544)
Posted 11 Oct 2004 by joe
Post:
> This message seems to pop up when the LHC server is particularly busy. It
> indicates that the limit on the number of DB connections has been exceeded. It
> will generally clear at the next attempt since some DB connections will
> probably have been released.


That's not really the reason for this error message.

The server _does_ issue a correct error message about the database overload but precedes it with the MIME type. This confuses the BOINC function SCHEDULER_REPLY::parse(), which requires the first line to contain "[scheduler_reply]"

Although this is caused by the sender side, I think the parser on the client side should just ignore the MIME type entry "Content-type:" or even everything outside of the scheduler_reply tags.

I think CPDN had the same problem earlier. It would really be good to parse less strictly in this case. It would be a 5 minute task to fix the function.
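
The "ignore everything outside of the scheduler_reply tags" idea, sketched in Python for brevity. This is not the BOINC client code; `extract_scheduler_reply` is a hypothetical helper, and the sample reply assumes angle-bracket tags on the wire.

```python
import re
from typing import Optional

def extract_scheduler_reply(raw: str) -> Optional[str]:
    """Return the first <scheduler_reply>...</scheduler_reply> block, or None."""
    match = re.search(r"<scheduler_reply>.*?</scheduler_reply>", raw, re.DOTALL)
    return match.group(0) if match else None

# A reply preceded by a stray MIME header line, as described in the post.
RAW = (
    "Content-type: text/plain\n\n"
    "<scheduler_reply>\n"
    "<message priority=\"low\">Server can't open database</message>\n"
    "</scheduler_reply>\n"
)
reply = extract_scheduler_reply(RAW)
print(reply.splitlines()[0])
```

With the header skipped, the parser would reach the real error message instead of failing on the first line.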




©2024 CERN