Thread 'Can't parse scheduler reply'

Author	Message
joe Send message Joined: 2 Sep 04 Posts: 24 Credit: 12,288 RAC: 0	Message 3037 - Posted: 1 Oct 2004, 22:43:41 UTC Not sure if this is a BOINC problem or LHC related: From stderr.txt : [pre] 2004-10-02 00:30:10 [LHC@home] SCHEDULER_REPLY::parse(): bad first tag Content-type: text/plain 2004-10-02 00:30:10 [LHC@home] Can't parse scheduler reply [/pre] And from sched_reply.xml : [pre] Content-type: text/plain Server can't open database 3600 LHC@home [/pre] The problem seems to be this Content-type: text/plain line. ID: 3037 · Reply Quote

FalconFly Send message Joined: 2 Sep 04 Posts: 121 Credit: 592,214 RAC: 0	Message 3038 - Posted: 1 Oct 2004, 23:00:22 UTC - in response to Message 3037. Same here, I also get multiple errors from the Website (too many connections, unable to connect to Database). Seems they've hit their first real bottleneck on the Servers ;) ___________________________________________ Scientific Network : 36200 MHz «» 8204 MB «» 815.0 GB ID: 3038 · Reply Quote

STE\/E Send message Joined: 2 Sep 04 Posts: 352 Credit: 2,898,606 RAC: 0	Message 3043 - Posted: 2 Oct 2004, 0:14:34 UTC 2004-10-02 00:30:10 [LHC@home] SCHEDULER_REPLY::parse(): bad first tag Content-type: text/plain 2004-10-02 00:30:10 [LHC@home] Can't parse scheduler reply And from sched_reply.xml : ========= Well, at least now I know I'm not the only one getting that message ... Been getting a lot of 2 & 3 min tunescan WU's also, but just on one computer. It could be the computer, I don't know, but it was running OK up until a few hours ago & then I had to do a reset & finally uninstall BOINC & reinstall it to get it to run. Naturally I got a fresh load of WU's, so maybe it's the WU's also...??? ID: 3043 · Reply Quote

KingPin Send message Joined: 2 Sep 04 Posts: 16 Credit: 26,713 RAC: 0	Message 3045 - Posted: 2 Oct 2004, 1:46:36 UTC Last modified: 2 Oct 2004, 1:47:17 UTC I believe LHC better go back to beta. Seems it can't handle the bandwidth, ???? ID: 3045 · Reply Quote

LP Send message Joined: 1 Sep 04 Posts: 39 Credit: 54,460 RAC: 0	Message 3048 - Posted: 2 Oct 2004, 1:56:29 UTC Seems they underestimated how many wu's we can run and how much bandwidth we can gobble up. ;) ID: 3048 · Reply Quote

Guido Alexander Waldenmeier Send message Joined: 2 Sep 04 Posts: 321 Credit: 10,607 RAC: 0	Message 3059 - Posted: 2 Oct 2004, 6:16:09 UTC Have a look at this, it may be helpful: http://www.google.de/search?hl=de&ie=UTF-8&q=Can%27t+parse+scheduler+reply&btnG=Google-Suche&meta= feel free to visit www.guidowaldenmeier.de ID: 3059 · Reply Quote

joe Send message Joined: 2 Sep 04 Posts: 24 Credit: 12,288 RAC: 0	Message 3060 - Posted: 2 Oct 2004, 6:42:12 UTC - in response to Message 3048. Last modified: 2 Oct 2004, 7:16:00 UTC > Seems they underestimated how many wu's we can run and how much bandwidth we > can gobble up. ;) Well, many (or all?) 8-hour WUs (sixtrack 4.46) run for only 3 minutes - I doubt this is the normal behaviour. edit: This is the complete output, btw.; the forum ate the XML tags in my first post. So the scheduler reply was absolutely correct and contained a valid error message, it just shouldn't have had the MIME type header in the body. From stderr.txt : [pre] 2004-10-02 00:30:10 [LHC@home] SCHEDULER_REPLY::parse(): bad first tag Content-type: text/plain 2004-10-02 00:30:10 [LHC@home] Can't parse scheduler reply 2004-10-02 00:30:10 [LHC@home] Deferring communication with project for 1 minutes and 0 seconds [/pre] And from sched_reply.xml : [pre] Content-type: text/plain [scheduler_reply] [message priority="low"]Server can't open database[/message] [request_delay]3600[/request_delay] [project_is_down/] [/scheduler_reply] [scheduler_reply] [project_name]LHC@home[/project_name] [/scheduler_reply] [/pre] ID: 3060 · Reply Quote
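
To make joe's diagnosis concrete, here is a minimal Python sketch of a tolerant reader for the reply shown above. It is not the BOINC client's parser: the function name `parse_scheduler_reply` is hypothetical, and the sample text assumes the square brackets in the post stand for the angle brackets the forum swallowed. The idea is simply to skip everything before the first `<scheduler_reply>` tag, so a stray `Content-type:` line no longer counts as a "bad first tag".

```python
import re


def parse_scheduler_reply(raw: str) -> dict:
    """Extract a few fields from a scheduler reply (hypothetical helper)."""
    # Drop anything, such as a stray "Content-type:" line, before the first tag.
    start = raw.find("<scheduler_reply>")
    if start == -1:
        raise ValueError("no <scheduler_reply> tag found")
    body = raw[start:]

    def first(tag):
        # First occurrence of <tag ...>text</tag>, or None if absent.
        m = re.search(rf"<{tag}[^>]*>(.*?)</{tag}>", body, re.DOTALL)
        return m.group(1).strip() if m else None

    delay = first("request_delay")
    return {
        "message": first("message"),
        "request_delay": int(delay) if delay is not None else None,
        "project_is_down": "<project_is_down/>" in body,
        "project_name": first("project_name"),
    }


# Sample reply reconstructed from the post above (brackets restored: an assumption).
raw_reply = """Content-type: text/plain

<scheduler_reply>
<message priority="low">Server can't open database</message>
<request_delay>3600</request_delay>
<project_is_down/>
</scheduler_reply>
<scheduler_reply>
<project_name>LHC@home</project_name>
</scheduler_reply>
"""

reply = parse_scheduler_reply(raw_reply)
print(reply)
```

Regular expressions are used instead of a strict XML parser because the reply as posted contains two top-level `scheduler_reply` elements, which a strict parser would reject as a single document.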

Bermon.net Send message Joined: 28 Sep 04 Posts: 3 Credit: 8,995 RAC: 0	Message 3907 - Posted: 16 Oct 2004, 17:23:42 UTC - in response to Message 3037. Got the same problem: > 2004-10-02 00:30:10 [LHC@home] SCHEDULER_REPLY::parse(): bad first tag > Content-type: text/plain > 2004-10-02 00:30:10 [LHC@home] Can't parse scheduler reply I just re-installed BOINC by re-running the BOINC update 4.13. Now BOINC says: > No work from project > Deferring communication with project for... But it did upload my finished results now, and it didn't do that before I reinstalled. Maybe the re-install did help. ID: 3907 · Reply Quote

Gaspode the UnDressed Send message Joined: 1 Sep 04 Posts: 506 Credit: 118,619 RAC: 0	Message 3911 - Posted: 16 Oct 2004, 18:55:09 UTC Last modified: 16 Oct 2004, 18:56:58 UTC A definitive answer (at least, it's as definitive as anything is with BOINC): When the server is under heavy load it can exceed the number of connections allowed to the database, resulting in the message 'Server can't open database'. A problem in the server software causes it to return a malformed XML message in this circumstance, which the client can't understand. The result is a 'Can't parse scheduler reply' message displayed by the client. What it means is 'Server is a bit busy right now'. Reinstalling the BOINC software won't help directly since this is a server problem. Waiting a few minutes (while you re-install BOINC, perhaps) will allow some connections to clear, and things might then work. The connection limit is configurable by the admins. Raising it can improve connection performance, but raising it too far can result in much reduced throughput. Currently it's set at 100 connections unless Markku has changed it recently. It had been set at 400 connections prior, but this resulted in only 25% of the throughput! IMO the whole BOINC database setup needs a good overhaul, since these performance problems pop up at what seem to be quite low load levels. As always, I suspect that the database performance may only be improved by the expenditure of money - something which most BOINC projects are short of. I think that's it. Anyone got anything to add? Giskard - the first telepathic robot. ID: 3911 · Reply Quote

Ulrich Metzner Send message Joined: 27 Sep 04 Posts: 36 Credit: 29,315 RAC: 0	Message 3930 - Posted: 17 Oct 2004, 1:46:10 UTC About the connections to the database: @Developers: Have you ever heard of connection pooling? That's the solution to problems like the one you are getting here. No offense intended, but that's what I (as a web developer) have always seen when a bottleneck like this occurs... Just a thought. greetz, Uli ID: 3930 · Reply Quote
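
For readers unfamiliar with the term, connection pooling means opening a fixed number of database connections once and handing them out to requests, so the database never sees more than that number at a time. The Python sketch below is only an illustration of the idea: the class name `ConnectionPool` is hypothetical, and an in-memory SQLite database stands in for the project's real database, about which the thread gives no details.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: at most `size` connections are ever open."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())  # open all connections up front

    @contextmanager
    def connection(self, timeout=5.0):
        # Blocks until a connection is free; raises queue.Empty on timeout
        # instead of opening an extra connection.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always hand the connection back


# SQLite is a stand-in here (an assumption, not the project's setup).
pool = ConnectionPool(
    size=2,
    connect=lambda: sqlite3.connect(":memory:", check_same_thread=False),
)

with pool.connection() as conn:
    answer = conn.execute("SELECT 1 + 1").fetchone()[0]
print(answer)
```

Requests that arrive while every connection is busy wait in line rather than failing with a "too many connections" error, which is the behaviour the post is hinting at.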

Gaspode the UnDressed Send message Joined: 1 Sep 04 Posts: 506 Credit: 118,619 RAC: 0	Message 4857 - Posted: 4 Nov 2004, 11:36:45 UTC All you need to know about this error is in this thread. Just wait and your work will be reported. Giskard - the first telepathic robot. ID: 4857 · Reply Quote

Markku Degerholm Send message Joined: 3 Sep 04 Posts: 212 Credit: 4,545 RAC: 0	Message 4870 - Posted: 4 Nov 2004, 13:25:38 UTC - in response to Message 3930. > About the connections to the database: > > @Developers: > Have you ever heard of connection pooling? That's the solution to problems > like you are getting here. No offense intended, but that's what i've (as a web > developer) seen always if a bottleneck like this occurs... Maybe, maybe not. What do you actually mean by connection pooling? Markku Degerholm LHC@home Admin ID: 4870 · Reply Quote

Markku Degerholm Send message Joined: 3 Sep 04 Posts: 212 Credit: 4,545 RAC: 0	Message 4872 - Posted: 4 Nov 2004, 13:33:03 UTC MikeW got it right. System overload is actually pretty natural after longer periods of no work / no service, because most of the 7500 active hosts try to connect within one hour and download/upload dozens of workunits at once. And if connections start to fail, they will try to reconnect soon... And so there are yet more connection attempts. And then there are the forums and other web things that also generate more load on the database. Markku Degerholm LHC@home Admin ID: 4872 · Reply Quote
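
The retry pile-up described above is commonly softened with randomized exponential backoff, where each failed attempt waits a random time under a ceiling that doubles per attempt. The Python sketch below illustrates that general technique only; it is not how the BOINC client schedules retries, the name `backoff_delay` is hypothetical, and the 60 and 3600 second defaults are merely borrowed from the one-minute deferral and the `request_delay` value quoted earlier in the thread.

```python
import random


def backoff_delay(attempt, base=60.0, cap=3600.0, rng=random):
    """Seconds to wait before retry number `attempt` (0-based), with full jitter."""
    # The ceiling doubles with each failed attempt, up to `cap`.
    # The exponent is clamped so very large attempt counts cannot overflow.
    ceiling = min(cap, base * 2 ** min(attempt, 32))
    # A uniformly random wait spreads hosts out instead of retrying in lockstep.
    return rng.uniform(0, ceiling)


# Seeded generators keep this illustration reproducible.
delays = [backoff_delay(n, rng=random.Random(n)) for n in range(8)]
for n, d in enumerate(delays):
    print(f"attempt {n}: wait up to {min(3600.0, 60.0 * 2 ** n):.0f}s, drew {d:.1f}s")
```

With thousands of hosts drawing independent random waits, reconnect attempts arrive spread over the whole window rather than all at once after an outage.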