Sixtrack apps - Bad "api_version" - BOINC 7.8.2 display grids failing
Author | Message |
---|---|
Joined: 26 Jul 13 Posts: 13 Credit: 2,043,458 RAC: 0 |
I'm seeing a problem where the BOINC 7.8.2 Advanced View fails to load the Project/Task grids correctly: every second they partially load, then reload, over and over. I've tracked it to this project, specifically to apps that have bad info in their "api_version" XML properties. This causes the 7.8.2 Manager to freak out while trying to parse that XML and read the strings! Once I removed this project, all was well. But when I re-added it, it broke again! Could you guys please investigate how your <api_version> XML properties are getting messed up, and fix them? Note: The first one has a binary character before the close tag, which I think is ultimately what hoses BOINC, though I don't know how. See below. Thanks, Jacob Klein Definitely caused a problem: <app_version> <app_name>sixtracktest</app_name> <version_num>4630</version_num> <platform>windows_x86_64</platform> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>4763423249.552478</flops> <plan_class>sse2</plan_class> <api_version>7.7.0 API_VERSI</api_version> <file_ref> <file_name>sixtracktest_win64_4630_sse2.exe</file_name> <main_program/> </file_ref> </app_version> Older set that may have caused a problem: <app_version> <app_name>sixtrack</app_name> <version_num>45107</version_num> <platform>windows_intelx86</platform> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>6757081731.981332</flops> <plan_class>pni</plan_class> <api_version>7.1.0</api_version> <file_ref> <file_name>sixtrack_win32_4517_pni.exe</file_name> <main_program/> </file_ref> </app_version> <app_version> <app_name>sixtrack</app_name> <version_num>4630</version_num> <platform>windows_intelx86</platform> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>14855145875.554390</flops> <plan_class>sse2</plan_class> <api_version>7.7.0 API_VERSI</api_version> <file_ref> <file_name>sixtrack_win32_4630_sse2.exe</file_name> <main_program/> </file_ref> </app_version> <app_version>
<app_name>sixtrack</app_name> <version_num>4630</version_num> <platform>windows_x86_64</platform> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>19207728646.985275</flops> <plan_class>sse2</plan_class> <api_version>7.7.0 API_VERSI</api_version> <file_ref> <file_name>sixtrack_win64_4630_sse2.exe</file_name> <main_program/> </file_ref> </app_version> <app_version> <app_name>sixtracktest</app_name> <version_num>4630</version_num> <platform>windows_intelx86</platform> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>4480961228.675860</flops> <plan_class>sse2</plan_class> <api_version>7.7.0 API_VERSI</api_version> <file_ref> <file_name>sixtracktest_win32_4630_sse2.exe</file_name> <main_program/> </file_ref> </app_version> |
Joined: 14 Jan 10 Posts: 1178 Credit: 7,524,459 RAC: 3,253 |
Hi Jacob, I too see those strange API_VERSI strings in my client_state.xml: <app_version> <app_name>sixtracktest</app_name> <version_num>4630</version_num> <platform>windows_x86_64</platform> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>4823445069.717739</flops> <plan_class>sse2</plan_class> <api_version>7.7.0 API_VERSI</api_version> <file_ref> <file_name>sixtracktest_win64_4630_sse2.exe</file_name> <main_program/> </file_ref> </app_version> <app_version> <app_name>sixtracktest</app_name> <version_num>4630</version_num> <platform>windows_intelx86</platform> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>6112078616.967116</flops> <plan_class>sse2</plan_class> <api_version>7.7.0 API_VERSI</api_version> <file_ref> <file_name>sixtracktest_win32_4630_sse2.exe</file_name> <main_program/> </file_ref> </app_version> However, I don't have an issue with BOINC's Advanced View, and sixtracktest tasks are running normally. 08-Sep-2017 13:53:15 CEST [---] Version change (7.7.2 -> 7.8.2) One probable difference from your setup is that I loaded the tasks before the version change. |
Joined: 26 Jul 13 Posts: 13 Credit: 2,043,458 RAC: 0 |
I'm still troubleshooting, and the bug (where it crashes the 7.8.2 BOINC Manager) is a little bit elusive, but I think sometimes it'll say: API_VERSI</api_version> and sometimes it'll say API_VERSI*</api_version> where * is some crazy binary character. If that binary character gets into a sched_reply or your client_state.xml file, and the client then tries to load it into memory, things get hosed. Either way, I believe that LHC@Home fixing <api_version> will be the answer. Can we get it fixed? PS: You can wrap a block in "pre" tags (for pre-formatted text) to keep the formatting and tab stops. |
Joined: 29 Feb 16 Posts: 157 Credit: 2,659,975 RAC: 0 |
Hello Jacob, thanks for pointing this out. I have asked our IT experts for better insight into this. Will keep you posted! |
Joined: 15 Jul 05 Posts: 234 Credit: 5,771,110 RAC: 152 |
Jacob, do you see this with applications other than Sixtracktest? Unlike other applications, Sixtrack and Sixtracktest do not have a version.xml with the application, so I wonder where this comes from. BOINC 7.8.2 is a recent BOINC client, and we do not see this with BOINC 7.6.22. If the error is due to garbled info generated on the server side, do you get the same errors on: https://lhcathomedev.cern.ch/lhcathome-dev/ ? Thanks for filling us in. |
Joined: 26 Jul 13 Posts: 13 Credit: 2,043,458 RAC: 0 |
I'm an active BOINC Alpha tester, attached to all of the projects, routinely getting work from about 11 ... and this is the first time I've ever seen this problem. The only applications I've ever seen this problem with, are Sixtrack and Sixtracktest, on BOINC 7.8.2. Note: This new version of BOINC (client AND server) does have many string sprintf changes. It's possible that something isn't terminating a string correctly, or a reader isn't reading correctly. I'm attached to your dev project too, but my workload doesn't really offer much of a chance to do work for your projects. I'll try to adjust that. Regarding "how" it might be garbled ... Please read the following, which might help you do some testing to see if you can recreate the problem or find the fault. Richard Haselgrove (another tester who knows much more about the code than I do), said this, to the BOINC Alpha list service:
Let us know what you find! Kind regards, Jacob Klein |
Joined: 15 Jul 05 Posts: 234 Credit: 5,771,110 RAC: 152 |
Thanks for this useful information! A quick search of the Sixtracktest binaries shows no odd characters, only API version 7.7: -bash-4.1$ strings */*|grep API_VERSION API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION LPAPI_VERSION API_VERSION_7.7.0 API_VERSION LPAPI_VERSION API_VERSION_7.7.0 API_VERSION LPAPI_VERSION API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 API_VERSION_7.7.0 I will ask the Sixtrack team for further details. |
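Editorial aside: the `strings | grep` output above mixes two kinds of hits, the versioned marker `API_VERSION_7.7.0` and the bare symbols `API_VERSION` and `LPAPI_VERSION`. A minimal Python sketch of pulling out only the version number might look like the following. It assumes the marker always has the form `API_VERSION_` followed by a dotted number, as in the output above; the function name and the sample bytes are made up for illustration and are not taken from BOINC.

```python
import re

# Assumed marker format, based only on the `strings` output quoted above:
# the literal text "API_VERSION_" followed by a dotted version number.
API_MARKER = re.compile(rb"API_VERSION_(\d+(?:\.\d+)+)")


def find_api_versions(data: bytes) -> set[str]:
    """Return every distinct dotted version that follows the marker."""
    return {m.group(1).decode("ascii") for m in API_MARKER.finditer(data)}


# Hypothetical stand-in for the bytes of an executable; a real check
# would read the file with open(path, "rb").read().
sample = b"\x00junk\x00API_VERSION_7.7.0\x00API_VERSION\x00LPAPI_VERSION\x00"
print(find_api_versions(sample))
```

Matching on the versioned marker, rather than on every line that contains `API_VERSION`, keeps the bare symbol names out of the result.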
Joined: 26 Jul 13 Posts: 13 Credit: 2,043,458 RAC: 0 |
Might be useful to search for: API_VERSI |
Joined: 22 Mar 17 Posts: 30 Credit: 360,676 RAC: 0 |
Could you check how it appears in sched_reply_lhcathome.cern.ch_lhcathome.xml? You need to have SixTrack tasks assigned in that reply for the app_version to be included. |
Joined: 15 Jul 05 Posts: 234 Credit: 5,771,110 RAC: 152 |
A search for API_VERSI gives the same result as for API_VERSION. I have Sixtrack tasks downloaded, but paused, and my sched_reply_lhcathome.cern.ch_lhcathome.xml does not contain any info related to app versions. Need to await the next round of Sixtrack tasks. Could this be an issue with the 7.8.2 client on Windows 10? |
Joined: 29 Feb 16 Posts: 157 Credit: 2,659,975 RAC: 0 |
I have installed the BOINC client 7.8.2 on my Windows 7 machine (actually a virtual machine in VB in an Ubuntu environment): https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10481870 And I have successfully crunched some brand-new tasks: https://lhcathome.cern.ch/lhcathome/results.php?hostid=10481870 What is the name of the app BLOB .xml file? |
Joined: 22 Mar 17 Posts: 30 Credit: 360,676 RAC: 0 |
I got some SixTrack tasks and this is in sched_reply: <app_version> <app_name>sixtrack</app_name> <version_num>4630</version_num> <api_version>7.7.0 API_VERSION LPAPI_VERSION</api_version> <file_ref> <file_name>sixtrack_win64_4630_sse2.exe</file_name> <main_program/> </file_ref> <platform>windows_x86_64</platform> <plan_class>sse2</plan_class> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>11772905691.719425</flops> </app_version> The <api_version> garbage in client_state.xml is truncated because the client stores the API version in a string of at most 16 bytes. IIRC, <app_version> is copied from the XML blob in the DB as is. Do you use update_versions or something else to add app versions? Not related to the api_version, but I'll mention it anyway because you are going to get questions about it. The previous SixTrack version was 451.07 and the new version is 46.30, which is lower than the previous one. The client keeps only the app version that has the highest version number. Ones with lower version numbers are deleted as soon as no task refers to them. Because of that, the client keeps re-downloading the app's files over and over again whenever it runs out of SixTrack tasks. The only ways out are to reset the project, so that the client forgets about the 451.07 version, or for you to re-release 46.30 with a higher version number. |
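Editorial aside: the truncation described above can be reproduced with a few lines of Python. This is only a sketch of the idea, not the client's actual code. It assumes a C-style 16-byte buffer in which one byte is reserved for the terminating NUL, so 15 characters survive; the function name is hypothetical.

```python
BUF_SIZE = 16  # size mentioned in the post above; one byte assumed for the NUL terminator


def store_in_fixed_buffer(value: str, size: int = BUF_SIZE) -> str:
    """Mimic copying a string into a C char[size] buffer: keep size - 1 bytes."""
    raw = value.encode("ascii")
    return raw[: size - 1].decode("ascii")


server_value = "7.7.0 API_VERSION LPAPI_VERSION"
print(store_in_fixed_buffer(server_value))  # 7.7.0 API_VERSI
print(store_in_fixed_buffer("7.7.0"))       # 7.7.0
```

Under that assumption the 31-character server value is cut to exactly the `7.7.0 API_VERSI` seen in client_state.xml, while a plain `7.7.0` fits with room to spare.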
Joined: 29 Feb 16 Posts: 157 Credit: 2,659,975 RAC: 0 |
Hello Juha, thanks for the detailed reply. This is what I get in my sched_reply_lhcathome.cern.ch_lhcathome.xml: <app_version> <app_name>sixtrack</app_name> <version_num>4630</version_num> <api_version>7.7.0 API_VERSION LPAPI_VERSION</api_version> <file_ref> <file_name>sixtrack_win32_4630_sse2.exe</file_name> <main_program/> </file_ref> <platform>windows_intelx86</platform> <plan_class>sse2</plan_class> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>9839391853.388186</flops> </app_version> Hence no corruption. This is for the 32-bit Windows exe, not for the 64-bit one as in your case. My sched_reply doesn't show any entry for the 64-bit exe; do you have anything for the 32-bit one? Thanks also for pointing out the problem with the version number of the exe. I think it explains nicely why, after we released the new exes, the very first tasks were still executed with the old ones. Though I guess that, since we declared the 45107 exes deprecated, only the 4630 should be distributed, and I see this is happening; so I guess that the issue that you raise is less of a concern, isn't it? |
Joined: 26 Jul 13 Posts: 13 Credit: 2,043,458 RAC: 0 |
Wow. He's not saying that your reply is corrupted. His wasn't. I think he's saying that BOINC is expecting 16 chars or less for that field, and will have problems if you put so much in there. Why can't that field just say "7.7.0" ? <api_version>7.7.0 API_VERSION LPAPI_VERSION</api_version> |
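Editorial aside: a quick way to spot the over-long field in a scheduler reply is to parse the `<app_version>` blocks and flag any `<api_version>` that is not a plain dotted number. The Python sketch below is a hypothetical helper, not a BOINC tool. The 15-character limit is an assumption derived from the 16-byte figure mentioned in the thread, and the sample XML is abridged from the replies posted above.

```python
import re
import xml.etree.ElementTree as ET

MAX_LEN = 15  # assumption: 16-byte client buffer minus one terminator byte
PLAIN_VERSION = re.compile(r"\d+(?:\.\d+)*")


def suspicious_api_versions(xml_text: str) -> list[tuple[str, str]]:
    """Return (file_name, api_version) pairs whose api_version is not a plain version."""
    # Wrap in a root element so several <app_version> blocks parse together.
    root = ET.fromstring(f"<root>{xml_text}</root>")
    bad = []
    for av in root.iter("app_version"):
        api = av.findtext("api_version")
        if api is None:
            continue  # field not present, nothing to check
        api = api.strip()
        name = av.findtext("file_ref/file_name") or "?"
        if len(api) > MAX_LEN or not PLAIN_VERSION.fullmatch(api):
            bad.append((name, api))
    return bad


SAMPLE = """
<app_version>
  <app_name>sixtrack</app_name>
  <api_version>7.7.0 API_VERSION LPAPI_VERSION</api_version>
  <file_ref><file_name>sixtrack_win64_4630_sse2.exe</file_name><main_program/></file_ref>
</app_version>
<app_version>
  <app_name>sixtrack</app_name>
  <api_version>7.1.0</api_version>
  <file_ref><file_name>sixtrack_win32_4517_pni.exe</file_name><main_program/></file_ref>
</app_version>
"""

for name, api in suspicious_api_versions(SAMPLE):
    print(f"{name}: unexpected api_version {api!r}")
```

Only the 4630 entry is reported; the older entry with `7.1.0` passes both the length and the format check.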
Joined: 14 Jan 10 Posts: 1178 Credit: 7,524,459 RAC: 3,253 |
"... do you have anything on the 32bit?" This is what was in my sched_reply_lhcathome.cern.ch_lhcathome.xml: <app_version> <app_name>sixtrack</app_name> <version_num>4630</version_num> <api_version>7.7.0 API_VERSION LPAPI_VERSION</api_version> <file_ref> <file_name>sixtrack_win32_4630_sse2.exe</file_name> <main_program/> </file_ref> <platform>windows_intelx86</platform> <plan_class>sse2</plan_class> <avg_ncpus>1.000000</avg_ncpus> <max_ncpus>1.000000</max_ncpus> <flops>10356896019.129839</flops> </app_version> |
Joined: 22 Mar 17 Posts: 30 Credit: 360,676 RAC: 0 |
Sorry to have kept you waiting for an answer. <api_version> is supposed to contain only the API version number: <api_version>7.7.0</api_version> So it's wrong in the 32-bit app version too. This is now documented in #2121 and fixed in #2122. "since we declared the 45107 exes deprecated, only the 4630 should be distributed, and I see this is happening; so I guess that the issue that you raise is less of a concern, isn't it?" It's not a critical issue. At some point someone is going to notice these in the log: 12-Sep-2017 02:53:45 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe 13-Sep-2017 03:15:26 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe 13-Sep-2017 14:15:32 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe 14-Sep-2017 02:47:09 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe 14-Sep-2017 15:40:46 [LHC@home] Started download of sixtrack_win64_4630_sse2.exe And then they will wonder what's going on. That's all. |
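Editorial aside: the repeated downloads follow from how the two version numbers compare as integers. The Python sketch below illustrates the rule as it is described in this thread (the highest `version_num` is the one the client keeps); it is not the client's code, and both function names are hypothetical.

```python
def display_version(version_num: int) -> str:
    """Format an integer version_num as major.minor (45107 -> '451.07')."""
    major, minor = divmod(version_num, 100)
    return f"{major}.{minor:02d}"


def version_kept(version_nums: list[int]) -> int:
    """Assumed client policy, as described in the thread: highest version_num wins."""
    return max(version_nums)


installed = [45107, 4630]
kept = version_kept(installed)
print(f"kept: {display_version(kept)}")
for v in installed:
    if v != kept:
        print(f"{display_version(v)} would be dropped once no task refers to it")
```

Because 4630 is smaller than 45107, the newer release is the one treated as outdated, which is consistent with the same executable being downloaded again each time new SixTrack tasks arrive.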
©2023 CERN