Message boards :
Number crunching :
Multiple Client Instances
Joined: 15 Jun 08 · Posts: 2411 · Credit: 226,361,961 · RAC: 132,068
Referring to a discussion about multiple client instances that started here: https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=4297&postid=30603

During the setup of multiple client instances I noticed a pitfall that is not mentioned in any of the tutorials I read. When an additional instance contacts the project server for the first time, its host parameters are compared against the host records in the server's DB. If the server decides that the host already exists, it merges the new instance into the existing one and both run under the same host ID. This leads to severe problems, e.g. WUs from instance 1 are cancelled when instance 2 connects to the server, and vice versa.

Workaround

To solve the problem above, the server has to be forced to generate a new host ID for every new instance. This can be achieved by changing a major host parameter on the client before the first server contact. My suggestion is to set <ncpus> in the file cc_config.xml to an unused value and then contact the server as often as necessary until it returns a new host ID. This should then be cross-checked on the project website. Once this new host ID exists, it can be used in the same way as a real host, e.g. own venue, own WUs.
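For reference, the <ncpus> option goes in the <options> section of cc_config.xml in the instance's data directory; a minimal sketch (the value 3 is just an example of an "unused" CPU count, pick one that differs from what the server already knows for this machine, and revert it after the new host ID exists):

```xml
<cc_config>
  <options>
    <!-- Temporarily report a CPU count the server has not yet seen
         for this machine, so it creates a fresh host ID instead of
         merging the instance into an existing host record. -->
    <ncpus>3</ncpus>
  </options>
</cc_config>
```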
Joined: 2 Sep 04 · Posts: 453 · Credit: 193,569,815 · RAC: 10,128
Hm, in case of races I'm using up to 20 instances per machine, but I never saw this behaviour. Maybe you copied the BOINC data directory to the new instance? Or is it simply a misreading? All instances will have the same machine name and this can be very irritating, but they all should have a different ID. Supporting BOINC, a great concept!
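Running many instances on one machine usually means starting each client with its own data directory and its own GUI RPC port. A minimal sketch, assuming the `boinc` client binary is on the PATH; the directory layout under ~/boinc-instances and the port base 31415 are our own choices, and the script only prints the launch commands rather than starting the daemons:

```shell
#!/bin/sh
# Build the launch command for client instance $1, each with its own
# data directory and GUI RPC port. (Hypothetical layout; adjust to taste.)
launch_cmd() {
    i=$1
    dir="$HOME/boinc-instances/instance$i"
    port=$((31415 + i))
    # --allow_multiple_clients lets more than one client run at once;
    # --dir sets the instance's data directory;
    # --gui_rpc_port keeps the manager RPC ports apart.
    echo "boinc --allow_multiple_clients --dir $dir --gui_rpc_port $port --daemon"
}

# Print the commands for three instances instead of executing them.
for i in 1 2 3; do
    launch_cmd "$i"
done
```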
Joined: 15 Jun 08 · Posts: 2411 · Credit: 226,361,961 · RAC: 132,068
...you copied the BOINC-Data directory to the new instance ?

Yes. And after I noticed the described behaviour I tried several measures:
- deleted the host ID from the new instance before I restarted it
- connected a previously detached (fresh) and restarted instance

In each case the ID was restored by the server's reply. The first measure that worked was a temporary change of <ncpus>.

...All instances will have the same machine-name and this can be very irritating. But they all should have a different ID

Yes, now they have the same machine name but individual IDs and individual venues to control which subproject they run. And they keep the new ID although <ncpus> is now the same on every instance.
Joined: 14 Jan 10 · Posts: 1280 · Credit: 8,491,652 · RAC: 2,067
My suggestion is to set <ncpus> in the file cc_config.xml to an unused value and then contact the server as often as necessary until it returns a new host ID.

For me, setting <suppress_net_info>1</suppress_net_info> in cc_config.xml was enough to get new host IDs for the new instances.
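The alternative above also belongs in the <options> section of cc_config.xml; <suppress_net_info> stops the client from reporting the host's domain name and IP address, which changes the parameters the server uses to match the host. A minimal sketch:

```xml
<cc_config>
  <options>
    <!-- Don't send the host's domain name and IP address to project
         servers, so the new instance is not matched against an
         existing host record. -->
    <suppress_net_info>1</suppress_net_info>
  </options>
</cc_config>
```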
Joined: 15 Jun 08 · Posts: 2411 · Credit: 226,361,961 · RAC: 132,068
I also tried it with a (nearly) clean copy of the original data directory:
- LHC and all other projects were detached
- I checked if there were remains in the client_state.xml -> none
- I deleted all files/dirs with references to LHC that were not automatically deleted

Anyway, thank you for scratching your head. In the end the setup ran perfectly through a few cycles before the server outage and is still patiently asking for new work.

What could be done server side is to increase the number of venues, to run more than 4 subprojects, e.g. sixtrack, in such a setup. Maybe the project admins find time to comment on this request after the server problems are solved.
©2024 CERN