Status, 24th January, 2014
Author | Message |
---|---|
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Hope this will answer some of your messages. We still have some 34,000 WUs NOT being taken, and apparently almost 6000 in progress. We introduced SixTrack Version 4.5.03 on Wednesday 22nd January after extensive testing on boinctest and at CERN. Unluckily, Yuri flooded us with work at the same time and AFS blew up, leading to a huge backlog of over 16,000 results to be downloaded.

1. Results validation: seems to be OK. To summarise: counting from 0-59, we do NOT CHECK Words 51, 59? and 60 in fort.10. The validator log shows many, many "cannot open" errors for supposedly existing results needed for comparison. They were probably lost somehow.

2. Assimilation: the log shows "Herror too many total results" !!! There are about 2000 (1979) unique messages and cases/WUs. I suspect we may need to clean the database and remove results (with clients losing credit, I am afraid, but they will probably never get credit for these anyway). I could delete them from upload but that would probably be worse.

3. Scheduler log: there are about 2.4 million messages, of which 1.64M are unrecognised messages, with multiple messages per WU. This is perhaps significant! Previously these messages existed only for Macs as far as I can see. Here is one case:

2014-01-22 17:24:41.1073 [PID=51877] HOST::parse(): unrecognized: opencl_cpu_prop
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: platform_vendor
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: Advanced Micro Devices, Inc.
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: /platform_vendor
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: opencl_cpu_info
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: name
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: /name
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: vendor
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: GenuineIntel
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: /vendor
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: vendor_id
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: 4098
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: /vendor_id
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: available
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: 1
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: /available
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: half_fp_config
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: 0
2014-01-22 17:24:41.1076 [PID=51877] HOST::parse(): unrecognized: /half_fp_config
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: single_fp_config
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: 191
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: /single_fp_config
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: double_fp_config
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: 63
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: /double_fp_config
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: endian_little
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: 1
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: /endian_little
2014-01-22 17:24:41.1077 [PID=51877] HOST::parse(): unrecognized: execution_capabilities
2014-01-22 17:24:41.1078 [PID=51877] HOST::parse(): unrecognized: 3
2014-01-22 17:24:41.1078 [PID=51877] HOST::parse(): unrecognized: /execution_capabilities
2014-01-22 17:24:41.1078 [PID=51877] HOST::parse(): unrecognized: extensions
2014-01-22 17:24:41.1078 [PID=51877] HOST::parse(): unrecognized: cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_kh
2014-01-22 17:24:41.1078 [PID=51877] HOST::parse(): unrecognized: /extensions
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: global_mem_size
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: 17029206016
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: /global_mem_size
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: local_mem_size
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: 32768
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: /local_mem_size
2014-01-22 17:24:41.1153 [PID=51877] HOST::parse(): unrecognized: max_clock_frequency
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: 3500
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: /max_clock_frequency
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: max_compute_units
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: 8
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: /max_compute_units
2014-01-22 17:24:41.1154 [PID=51877] HOST::parse(): unrecognized: opencl_platform_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: OpenCL 1.2 AMD-APP (1348.5)
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: /opencl_platform_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: opencl_device_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: OpenCL 1.2 AMD-APP (1348.5)
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: /opencl_device_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: opencl_driver_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: 1348.5 (sse2,avx)
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: /opencl_driver_version
2014-01-22 17:24:41.1155 [PID=51877] HOST::parse(): unrecognized: /opencl_cpu_info
2014-01-22 17:24:41.1156 [PID=51877] HOST::parse(): unrecognized: /opencl_cpu_prop
2014-01-22 17:24:41.3583 [PID=51877] Request: [USER#221474] [HOST#10137513] [IP 69.35.195.242] client 7.2.33
2014-01-22 17:24:41.3880 [PID=51877] Sending reply to [HOST#10137513]: 0 results, delay req 6.00
2014-01-22 17:24:41.3880 [PID=51877] Scheduler ran 0.035 seconds

I am not an expert, but it seems to me this might explain work not being taken... (but I never saw this with boinctest!).

Other issue: one client reports "Cannot Create Process" on Windows 7. May or may not be significant. Are the executables 'signed' OK?

So all a bit complicated, but I hope to sort it (very) soon. Eric. |
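For anyone wanting to reproduce the "2.4 million messages, 1.64M unrecognised" kind of tally from a scheduler log, here is a minimal Python sketch. It assumes the line layout seen in the excerpt above (timestamp, `[PID=...]`, then the message); the function name `summarise_unrecognized` and the sample list are purely illustrative, and this is not a BOINC tool.

```python
import re
from collections import Counter

# Layout assumed from the excerpt in the post: timestamp, [PID=n], message.
UNRECOGNIZED_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+ "
    r"\[PID=(?P<pid>\d+)\s*\] "
    r"HOST::parse\(\): unrecognized: (?P<token>.*)$"
)


def summarise_unrecognized(lines):
    """Return (total lines, unrecognized lines, unrecognized count per PID)."""
    total = 0
    unrecognized = 0
    by_pid = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        total += 1
        match = UNRECOGNIZED_RE.match(line)
        if match:
            unrecognized += 1
            by_pid[match.group("pid")] += 1
    return total, unrecognized, by_pid


# Four lines copied from the excerpt, used here only as sample input.
sample = [
    "2014-01-22 17:24:41.1073 [PID=51877] HOST::parse(): unrecognized: opencl_cpu_prop",
    "2014-01-22 17:24:41.1075 [PID=51877] HOST::parse(): unrecognized: platform_vendor",
    "2014-01-22 17:24:41.3880 [PID=51877] Sending reply to [HOST#10137513]: 0 results, delay req 6.00",
    "2014-01-22 17:24:41.3880 [PID=51877] Scheduler ran 0.035 seconds",
]

total, unrecognized, by_pid = summarise_unrecognized(sample)
print(f"{unrecognized} of {total} lines are 'unrecognized' messages")
```

On a real log the same function could be fed an open file object instead of the sample list, since it only iterates over lines.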
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
P.S. The "unrecognized" messages were only for Mac systems in the past. |
Joined: 12 Mar 12 Posts: 128 Credit: 20,013,377 RAC: 0 |
Take note of the ratio valid:inconclusive:invalid:error. It has shifted significantly recently, from Validation inconclusive (614) · Invalid (63) · Error (197) to Validation inconclusive (1036) · Invalid (233) · Error (211), and now to In progress (42) · Validation pending (389) · Validation inconclusive (6842) · Valid (3265) · Invalid (1887) · Error (218) |
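To make the shift easier to see, the three snapshots can be turned into shares. This is a small Python sketch using only the counts quoted in the post; it restricts itself to the three states that appear in all three snapshots (inconclusive, invalid, error), which is a choice made here for comparability and not something the poster computed.

```python
# Counts copied from the post; only the states present in every snapshot.
SNAPSHOTS = [
    {"Validation inconclusive": 614, "Invalid": 63, "Error": 197},
    {"Validation inconclusive": 1036, "Invalid": 233, "Error": 211},
    {"Validation inconclusive": 6842, "Invalid": 1887, "Error": 218},
]


def shares(counts):
    """Return each state's fraction of the snapshot total."""
    total = sum(counts.values())
    return {state: n / total for state, n in counts.items()}


all_shares = [shares(snapshot) for snapshot in SNAPSHOTS]

for number, snapshot_shares in enumerate(all_shares, start=1):
    row = ", ".join(f"{state}: {frac:.1%}" for state, frac in snapshot_shares.items())
    print(f"snapshot {number}: {row}")
```

Within these three states, the invalid share rises from one snapshot to the next while the error share falls.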
Joined: 12 Sep 11 Posts: 38 Credit: 218,154 RAC: 0 |
Have fun. I am standing by. :) |
Joined: 27 Oct 07 Posts: 186 Credit: 3,297,640 RAC: 0 |
"3. Scheduler log: there are about 2.4 million messages of which ..."

That looks like the server is trying, but failing, to parse an <opencl_cpu_prop> block in the user's sched_request file. OpenCL on CPUs is a very recent addition to BOINC (previously OpenCL was only supported on GPUs), and the user's request comes from BOINC v7.2.33, which is indeed the newest 'recommended' client version, deployed 26 Nov 2013. Here's what a CPU OpenCL description looks like in context: <host_info>

If you haven't updated the server code since November, it might be wise to do so (though I haven't heard of the new section causing any problems at other projects). Otherwise, you might need to consult David Anderson. |
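As an illustration of why an older server would log one "unrecognized" line per tag, value and closing tag, here is a rough Python sketch. It is not BOINC's parser (the real server code is not shown in this thread): the `KNOWN_TAGS` set, the function name and the XML fragment are hypothetical, with the fragment assembled from tag names and values that appear in the scheduler log quoted earlier plus an assumed `<p_ncpus>` field standing in for fields the server already understands.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of tags an older server already understands.
KNOWN_TAGS = {"p_ncpus"}

# Hypothetical fragment; OpenCL tag names and values are taken from the quoted log.
FRAGMENT = """<host_info>
  <p_ncpus>8</p_ncpus>
  <opencl_cpu_prop>
    <platform_vendor>Advanced Micro Devices, Inc.</platform_vendor>
    <opencl_cpu_info>
      <name>Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz</name>
      <max_compute_units>8</max_compute_units>
    </opencl_cpu_info>
  </opencl_cpu_prop>
</host_info>"""


def unrecognized_tokens(xml_text, known):
    """List tag, text and closing-tag tokens for every element not in `known`."""
    root = ET.fromstring(xml_text)
    tokens = []

    def walk(elem):
        if elem.tag in known:
            return  # understood by the parser, nothing to report
        tokens.append(elem.tag)
        text = (elem.text or "").strip()
        if text:
            tokens.append(text)
        for child in elem:
            walk(child)
        tokens.append("/" + elem.tag)

    for child in root:
        walk(child)
    return tokens


tokens = unrecognized_tokens(FRAGMENT, KNOWN_TAGS)
for token in tokens:
    print("unrecognized:", token)
```

The printed sequence follows the same tag, value, closing-tag rhythm as the log excerpt, which is consistent with a parser that simply does not know the new block.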
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Thanks a lot Richard. This almost certainly explains the not-getting-work issue. I suppose my colleagues built the Windows executables using a very recent BOINC version, and our server BOINC is older. Eric |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
We have restored the old executables and seem to be back in business. We still need to clean up and clear up; the post-mortem news will follow. Eric. |
Joined: 21 Aug 12 Posts: 9 Credit: 2,941,516 RAC: 0 |
Eric, please also note this thread: http://lhcathomeclassic.cern.ch/sixtrack/forum_thread.php?id=3799 There are problems with WinXP and the new version. A solution is described in the thread. Regards, Franz |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Thanks for that.....many thanks. Don't know how I/we missed it. The inquest continues. Eric. |
©2025 CERN