SixTrack on anonymous platform: getting error 500 from scheduler
Author | Message |
---|---|
Joined: 21 Mar 15 Posts: 2 Credit: 19,590,343 RAC: 5,528 |
Hi all! I have a specific hardware platform with a VLIW Elbrus CPU on board, and I managed to build SixTrack for it in both SSE2 and AVX modes (yes, Elbrus has its own support for both of these SIMD extensions). Neither of the built apps is runnable, though.

I did it the following way:
- Cloned the SixTrack git repo and switched to the version 5.4.3 commit
- In CMakeLists.txt, added a "${CMAKE_SYSTEM_PROCESSOR} MATCHES e2k" check in the places needed
- Initialized and updated the BOINC subproject, and built boinc_api_fortran.o and the libraries inside this subproject
- Built the SixTrack applications using "./cmake_six CR BOINC NATIVE" and "./cmake_six CR BOINC NATIVE AVX"
- Installed BOINC, connected to my BAM! account, attached this host to LHC@home, and got the (at this point expected) message "[LHC@home] This project doesn't support computers of type e2k-mcst-linux-gnu"
- Shut down BOINC and put one of the applications and the corresponding app_info.xml inside the "projects/lhcathome.cern.ch_lhcathome" directory in the BOINC data directory
- Ran BOINC again and tried to update the project
- Got "[LHC@home] Scheduler request failed: HTTP internal server error", which I did not expect

Resetting the project does not help. Neither does detaching, re-creating the project directory (with app_info.xml and the binaries), and then attaching to the project again. I tested it on two hosts, one behind a proxy and one directly connected to the internet, but to no avail. Milkyway@Home and Einstein@home work perfectly well on these hosts using the same approach, though.

My app_info.xml (I tried many variants, and not a single one worked):
<app_info>
  <app>
    <name>sixtrack</name>
    <user_friendly_name>SixTrack</user_friendly_name>
    <non_cpu_intensive>0</non_cpu_intensive>
  </app>
  <file>
    <name>sixtrack_avx</name>
    <status>1</status>
    <executable/>
  </file>
  <app_version>
    <app_name>sixtrack</app_name>
    <version_num>50205</version_num>
    <platform>anonymous</platform>
    <avg_ncpus>1.000000</avg_ncpus>
    <flops>100000000.000000</flops>
    <plan_class>avx</plan_class>
    <api_version>7.14.2</api_version>
    <file_ref>
      <file_name>sixtrack_avx</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
Most of the contents are guessed from client_state.xml on a host where everything is fine with SixTrack. I tried the avx and sse2 binaries and plan classes, both together and one at a time; I tried setting api_version to the actual BOINC version, or removing it; I tried removing and/or altering other fields, such as avg_ncpus, flops, platform (for which I tried anonymous, e2k-mcst-linux-gnu, e2k-linux-gnu, or no such field at all), version_num, status, non_cpu_intensive, etc.
My sched_request_lhcathome.cern.ch_lhcathome.xml (authenticator is removed for security): <scheduler_request> <authenticator>....</authenticator> <hostid>10674216</hostid> <rpc_seqno>6</rpc_seqno> <core_client_major_version>7</core_client_major_version> <core_client_minor_version>16</core_client_minor_version> <core_client_release>14</core_client_release> <resource_share_fraction>1.000000</resource_share_fraction> <rrs_fraction>1.000000</rrs_fraction> <prrs_fraction>1.000000</prrs_fraction> <duration_correction_factor>1.000000</duration_correction_factor> <allow_multiple_clients>0</allow_multiple_clients> <sandbox>0</sandbox> <dont_send_work>0</dont_send_work> <work_req_seconds>1658880.000000</work_req_seconds> <cpu_req_secs>1658880.000000</cpu_req_secs> <cpu_req_instances>32.000000</cpu_req_instances> <estimated_delay>0.000000</estimated_delay> <client_cap_plan_class>1</client_cap_plan_class> <platform_name>anonymous</platform_name> <working_global_preferences> <global_preferences> <source_project></source_project> <mod_time>0.000000</mod_time> <battery_charge_min_pct>90.000000</battery_charge_min_pct> <battery_max_temperature>40.000000</battery_max_temperature> <run_on_batteries>0</run_on_batteries> <run_if_user_active>1</run_if_user_active> <run_gpu_if_user_active>0</run_gpu_if_user_active> <suspend_if_no_recent_input>0.000000</suspend_if_no_recent_input> <suspend_cpu_usage>25.000000</suspend_cpu_usage> <start_hour>0.000000</start_hour> <end_hour>0.000000</end_hour> <net_start_hour>0.000000</net_start_hour> <net_end_hour>0.000000</net_end_hour> <leave_apps_in_memory>0</leave_apps_in_memory> <confirm_before_connecting>1</confirm_before_connecting> <hangup_if_dialed>0</hangup_if_dialed> <dont_verify_images>0</dont_verify_images> <work_buf_min_days>0.100000</work_buf_min_days> <work_buf_additional_days>0.500000</work_buf_additional_days> <max_ncpus_pct>0.000000</max_ncpus_pct> <cpu_scheduling_period_minutes>60.000000</cpu_scheduling_period_minutes> 
<disk_interval>60.000000</disk_interval> <disk_max_used_gb>0.000000</disk_max_used_gb> <disk_max_used_pct>90.000000</disk_max_used_pct> <disk_min_free_gb>0.100000</disk_min_free_gb> <vm_max_used_pct>75.000000</vm_max_used_pct> <ram_max_used_busy_pct>50.000000</ram_max_used_busy_pct> <ram_max_used_idle_pct>90.000000</ram_max_used_idle_pct> <idle_time_to_run>3.000000</idle_time_to_run> <max_bytes_sec_up>0.000000</max_bytes_sec_up> <max_bytes_sec_down>0.000000</max_bytes_sec_down> <cpu_usage_limit>100.000000</cpu_usage_limit> <daily_xfer_limit_mb>0.000000</daily_xfer_limit_mb> <daily_xfer_period_days>0</daily_xfer_period_days> <override_file_present>0</override_file_present> <network_wifi_only>0</network_wifi_only> </global_preferences> </working_global_preferences> <cross_project_id>a8f1beff410131bb325a4041577b4c90</cross_project_id> <time_stats> <on_frac>0.999549</on_frac> <connected_frac>-1.000000</connected_frac> <cpu_and_network_available_frac>1.000000</cpu_and_network_available_frac> <active_frac>1.000000</active_frac> <gpu_active_frac>1.000000</gpu_active_frac> <client_start_time>1607201147.328425</client_start_time> <total_start_time>1607137250.782878</total_start_time> <total_duration>859.837951</total_duration> <total_active_duration>859.837951</total_active_duration> <total_gpu_active_duration>859.837951</total_gpu_active_duration> <now>1607201150.834698</now> <previous_uptime>175.987093</previous_uptime> <session_active_duration>0.000000</session_active_duration> <session_gpu_active_duration>0.000000</session_gpu_active_duration> </time_stats> <net_stats> <bwup>18398.141947</bwup> <avg_up>470059541.941634</avg_up> <avg_time_up>1607138339.565820</avg_time_up> <bwdown>4639723.433021</bwdown> <avg_down>520727181988.380493</avg_down> <avg_time_down>1607138339.478808</avg_time_down> </net_stats> <host_info> <timezone>10800</timezone> <domain_name>mamizou</domain_name> <ip_addr>192.168.0.153</ip_addr> <host_cpid>dee8b86df3d72f6b3c1875514a723de7</host_cpid> 
<p_ncpus>32</p_ncpus> <p_vendor>E8C</p_vendor> <p_model>E8C [Family 4 Model 7 ]</p_model> <p_features></p_features> <p_fpops>1000000000.000000</p_fpops> <p_iops>1000000000.000000</p_iops> <p_membw>1000000000.000000</p_membw> <p_calculated>1607137089.123090</p_calculated> <p_vm_extensions_disabled>0</p_vm_extensions_disabled> <m_nbytes>236458024960.000000</m_nbytes> <m_cache>-1.000000</m_cache> <m_swap>0.000000</m_swap> <d_total>983312404480.000000</d_total> <d_free>796927619072.000000</d_free> <os_name>Linux Debian</os_name> <os_version>Debian GNU/Linux [5.4.0-1.9-e8c|libc 2.29 (GNU libc)]</os_version> <n_usable_coprocs>0</n_usable_coprocs> <wsl_available>0</wsl_available> </host_info> <disk_usage> <d_boinc_used_total>45367296.000000</d_boinc_used_total> <d_boinc_used_project>45162496.000000</d_boinc_used_project> <d_project_share>876552173404.160034</d_project_share> </disk_usage> <app_versions> <app_version> <app_name>sixtrack</app_name> <version_num>50205</version_num> <platform>anonymous</platform> <avg_ncpus>1.000000</avg_ncpus> <flops>100000000.000000</flops> <plan_class>avx</plan_class> <api_version>7.14.2</api_version> </app_version> </app_versions> <other_results> </other_results> <in_progress_results> </in_progress_results> </scheduler_request>

My sched_reply_lhcathome.cern.ch_lhcathome.xml:
[code]<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or misconfiguration and was unable to complete your request.</p>
<p>Please contact the server administrator at boinc-server-admin@cern.ch to inform them of the time this error occurred, and the actions you performed just before this error.</p>
<p>More information about this error may be available in the server error log.</p>
</body></html>[/code]

Could anyone please help me get SixTrack working on the anonymous platform? |
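When the sched_reply file is checked by a script rather than by eye, it helps to tell an HTML error page like the one above apart from a genuine `<scheduler_reply>` document. The following is only a minimal Python sketch under stated assumptions: the function name `classify_reply` is hypothetical (it is not part of BOINC), and it assumes a genuine reply is well-formed enough for a strict XML parser, which may not hold for every scheduler reply.

```python
import xml.etree.ElementTree as ET


def classify_reply(text):
    """Return (kind, messages) for the contents of a sched_reply file.

    kind is "scheduler_reply", "html_error" or "unknown".
    messages holds the text of any <message> elements in a scheduler reply.
    Hypothetical helper for illustration; not a BOINC API.
    """
    try:
        root = ET.fromstring(text.strip())
    except ET.ParseError:
        # An Apache-style error page is usually not well-formed XML,
        # so fall back to a plain substring check.
        kind = "html_error" if "<html" in text.lower() else "unknown"
        return kind, []
    if root.tag == "scheduler_reply":
        messages = [(m.text or "").strip() for m in root.iter("message")]
        return "scheduler_reply", messages
    if root.tag.lower() == "html":
        return "html_error", []
    return "unknown", []
```

Calling `classify_reply(open("sched_reply_lhcathome.cern.ch_lhcathome.xml").read())` on the file from this post would be expected to report an HTML error page, since the server answered with its generic 500 document instead of a scheduler reply.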
Joined: 15 Jun 08 Posts: 2613 Credit: 263,657,290 RAC: 145,688 |
A basic description of the anonymous platform can be found here: https://boinc.berkeley.edu/wiki/Anonymous_platform
This page states: "... the <platform> element should be removed."
The tag names are slightly different to your app_info.xml.
I would suggest to rename the executable to "sixtrack_lin64_50205_avx.linux" and use this app_info.xml:
<app_info>
  <app>
    <name>sixtrack</name>
  </app>
  <file_info>
    <name>sixtrack_lin64_50205_avx.linux</name>
    <executable/>
  </file_info>
  <app_version>
    <app_name>sixtrack</app_name>
    <version_num>50205</version_num>
    <api_version>7.14.2</api_version>
    <plan_class>avx</plan_class>
    <flops>100000000.000000</flops>
    <avg_ncpus>1.0</avg_ncpus>
    <file_ref>
      <file_name>sixtrack_lin64_50205_avx.linux</file_name>
      <main_program/>
    </file_ref>
  </app_version>
</app_info>
The BOINC client must be restarted to activate the changes.
In addition your scheduler request has empty processor features: <p_features></p_features>
I don't know if at least "avx" must be sent to the server. |
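The points raised in this reply (use `<file_info>` rather than `<file>`, drop `<platform>` from `<app_version>`, and make every `<file_ref>` name an existing file entry) can be checked mechanically before restarting the client. Below is a minimal Python sketch, not an official validator: `check_app_info` is a hypothetical helper, and the rules it encodes are only the ones mentioned in this thread, not the full set the BOINC client enforces.

```python
import xml.etree.ElementTree as ET


def check_app_info(xml_text):
    """Return a list of problems found in an app_info.xml string.

    Only checks the rules discussed in this thread (an assumption,
    not a complete description of what the BOINC client accepts).
    """
    problems = []
    root = ET.fromstring(xml_text)
    if root.tag != "app_info":
        problems.append("root element is <%s>, expected <app_info>" % root.tag)
    if root.findall("file"):
        problems.append("found <file>; the suggested layout uses <file_info>")
    # Names of all files declared through <file_info><name>...</name>.
    declared = {(fi.findtext("name") or "").strip()
                for fi in root.findall("file_info")}
    for av in root.findall("app_version"):
        if av.find("platform") is not None:
            problems.append("<app_version> contains <platform>, "
                            "which the wiki says to remove")
        for ref in av.findall("file_ref"):
            name = (ref.findtext("file_name") or "").strip()
            if name not in declared:
                problems.append("file_ref %r has no matching <file_info>" % name)
    return problems
```

An empty list means none of these particular issues were found; it does not guarantee that the scheduler will accept the request.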
Joined: 21 Mar 15 Posts: 2 Credit: 19,590,343 RAC: 5,528 |
Thanks for your suggestions! Unfortunately, none of them worked.

"... the <platform> element should be removed."
Yes, I tried removing platform too, as well as setting it to anonymous, e2k-mcst-linux-gnu, e2k-linux-gnu, and now even x86_64-pc-linux-gnu, to get exactly the same app_version block as on an x86_64 machine (I also played a bit with the number of zeros in avg_ncpus, which always ends up with six zeros in the request, and with setting flops to 1634109463.151072):
<app_version>
  <app_name>sixtrack</app_name>
  <version_num>50205</version_num>
  <platform>x86_64-pc-linux-gnu</platform>
  <avg_ncpus>1.000000</avg_ncpus>
  <flops>1634109463.151072</flops>
  <plan_class>sse2</plan_class>
  <api_version>7.14.2</api_version>
</app_version>
But on x86_64 everything works fine with such an app_version block, and here it does not.

"The tag names are slightly different to your app_info.xml."
Well, you're right, that was probably a mistake, but still nothing changed when I replaced <file>...</file> with <file_info>...</file_info>.

"I would suggest to rename the executable to 'sixtrack_lin64_50205_avx.linux' and use this app_info.xml"
I tried that and got exactly the same result. I tried the same with the sse2 build too, but still had no success.

"In addition your scheduler request has empty processor features"
Good idea, but it still didn't help. I tried patching BOINC in client/hostinfo_unix.cpp:709 like this:
#if defined(__e2k__)
    safe_strcpy(features, "sse sse2 ssse3 sse4_1 sse4_2 sse4a avx avx2");
#endif
and got the following p_features in host_info:
<p_features>sse sse2 ssse3 sse4_1 sse4_2 sse4a avx avx2</p_features>
but it still doesn't solve the problem. I even copied the entire p_features string from the x86_64 box ("fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid eagerfpu pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt hw_pstate vmmcall npt lbrv svm_lock nrip_save"), but guess what: it didn't change anything either. I have really run out of ideas for now.

By the way, does anyone know how to send request XML (such as a manually constructed sched_request) to the scheduler using curl? I'd like to do that and analyze the return codes. I tried "curl -d sched_request_lhcathome.cern.ch_lhcathome.xml https://lhcathome.cern.ch/lhcathome_cgi/cgi", but instead of error 500 I got result 200 and this:
<scheduler_reply>
  <scheduler_version>715</scheduler_version>
  <master_url>https://lhcathome.cern.ch/lhcathome/</master_url>
  <request_delay>6.000000</request_delay>
  <message priority="low">Error in request message: xp.get_tag() failed </message>
  <project_name>LHC@home</project_name>
  <send_full_workload/>
</scheduler_reply> |
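A note on the curl attempt above: `curl -d NAME` sends the literal string NAME as the request body, while `-d @NAME` (or `--data-binary @NAME`, which also preserves line breaks) sends the contents of the file. The "xp.get_tag() failed" message is therefore probably the scheduler complaining that it received a file name instead of XML, though that reading is an inference, not something confirmed in this thread. The same request can be sketched in Python with only the standard library. The helper names `build_scheduler_post` and `send` are hypothetical, and the `Content-Type` header is an assumption about what the scheduler accepts.

```python
import urllib.error
import urllib.request
from pathlib import Path

# Scheduler URL as quoted in the post above.
SCHEDULER_URL = "https://lhcathome.cern.ch/lhcathome_cgi/cgi"


def build_scheduler_post(request_file, url=SCHEDULER_URL):
    """Build a POST whose body is the file's contents, not its name."""
    body = Path(request_file).read_bytes()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/xml"},  # assumed, not verified
        method="POST",
    )


def send(req, timeout=60):
    """Send the request and return (HTTP status, response text).

    HTTP error statuses such as 500 are returned instead of raised,
    so the error page can be inspected as well.
    """
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")
```

Usage would look like `status, text = send(build_scheduler_post("sched_request_lhcathome.cern.ch_lhcathome.xml"))`. Note that the saved request file in this post has its authenticator removed, so a real one would be needed for the server to treat the request like the client's own.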