No RESULTS accepted from Linux Kernel 4.8.*
Author | Message |
---|---|
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
As an emergency measure over the weekend, I have set max_results_day to -1 for all hosts running Linux (Ubuntu?) Kernel 4.8.*. SixTrack is consistently crashing with an IFORT run-time formatted I/O error. This will avoid wasting your valuable contributions. Eric. |
Send message Joined: 26 Sep 11 Posts: 37 Credit: 7,807,848 RAC: 12 |
I'm not sure how you arrived at that conclusion? I represent a sample size of 1, but my SixTrack tasks seem to be running OK and are being (slowly) validated. I run Xubuntu Linux 64-bit with kernel 4.8.0-58. P.S. Maybe my sample size is 2. I have 2 machines running SixTrack without problems under Xubuntu Linux with the 64-bit kernel 4.8. You can check the details of my CPUs. |
Send message Joined: 14 May 15 Posts: 17 Credit: 11,627,311 RAC: 0 |
I also have a host with 4.8 without a single errored wu. So I think you have to look harder |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Indeed, my apologies; I see a very good record for this host id 10398562. State: All (35) · In progress (3) · Validation pending (7) · Validation inconclusive (4) · Valid (19) · Invalid (0) · Error (2) (the 2 in error are "no heartbeat"). I am afraid I am "throwing out the baby with the bathwater". Nonetheless we are having frequent errors from 4.8.0 on Intel Family 6. I shall publish some numbers soonest. This combination of hardware and software is a source of errors, but not always. Eric. I'm not sure how you arrived at that conclusion? I represent a sample size of 1, but my SixTrack tasks seem to be running OK and are being (slowly) validated. |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Indeed, your results look impeccable, and a lot of them. Many thanks for your contribution. I have noted your hostids and cleared the flag. Eric. I also have a host with 4.8 without a single errored wu. |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
..... and I have cleared the flag. Eric Indeed my apologies; I see a very good record for this |
Send message Joined: 4 Jul 17 Posts: 1 Credit: 37,145 RAC: 0 |
Hello, Eric. I have checked my jobs. Just yesterday I first started BOINC on FreeBSD 10.3 on a Celeron 1037U. Three tasks are being validated and some are pending. There are 125 errors, and I can guess why: when I ran BOINC on FreeBSD, the package with the Linux libraries was not installed, which gave an ELF error. Judging by the tasks and logs, there is no other problem. Today I am going out of the city and will run on my laptop with Ubuntu Linux, where I have had no problems. |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Great, many thanks. Eric. Hello, Eric. |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
IMPORTANT! Apologies for the excessive ban. This is now corrected, but it took more time than I had available yesterday. The Project Management page at CERN is far too slow to be usable, so I had to use my own scripts to access the sixt_production database. I have now "unbanned" 770 hosts but maintained the ban for 73.

I have found:
98707 all_linux: Linux Hosts
4331 allnew48: Linux Hosts with Kernel 4.8.*
3161 allfamily6: Linux Hosts with Kernel 4.8.* and Intel Family 6 Processor(s)

Of the "banned" (max_results_day=-1) hosts, of which there are 1203, 843 are running Kernel 4.8.* on Intel Family 6. I then went to PCBE13978 (more disk space, and the validator logs), looked for all Invalids in the validator logs, and checked all 843 Hosts against the Invalids. (Had to use nohup a lot as they are digging up the roads and my Internet connection is being broken regularly, or is it lxplus@CERN???)

Anyway, to cut a long story short (and I can't remember how to italicise or emphasise with this interface :-( ): I have found that 73 hosts account for 204,184 Invalid Results out of a Total of 258,725, i.e. almost 79% of all Invalids.

No time to make a plot, but here are the Invalid counts for each of the 73 Hosts:
39852 21733 20055 19813 19601 18485 7587 5425 5360 4848 4651 4266 4196 3791 2325 2293 1953 1825 1802 1789 1620 1535 1103 880 731 730 729 696 598 383 369 369 367 355 338 308 308 305 240 104 96 63 54 34 33 33 31 19 18 10 9 8 7 7 5 5 4 4 3 3 3 3 2 2 2 1 1 1 1 1 1 1 1

....and the HostIds in the same order....
10452223 10480022 10486162 10487841 10485156 10484503 10480909 10454365 10484659 10484606 10486251 10483458 10477752 10481907 10484752 10487436 10484663 10487212 10453783 10485912 10485911 10485913 10405110 10485905 10485907 10485906 10456121 10487210 10485908 10482829 10453149 10452598 10453254 10453494 10452614 10476277 10453157 10453507 10454458 10488834 10481344 10481733 10485179 10487938 10487900 10487190 10480804 10482592 10475984 10480775 10475982 10453730 10475983 10455704 10488196 10478598 10487688 10476101 10452585 10451971 10421428 10408937 10486716 10479782 10417991 10489602 10489459 10484733 10451832 10449556 10416774 10415082 10396588

Needless to say I shall be looking VERY closely at at least the first few of these 73 Hosts! Hint: the 1st "englab" system has been banned for a considerable time already :-) However, we have an even more urgent problem with "Transient" errors and incorrect Validation. I MUST look at that and write a report for Monday latest. Eric. (Have to take a break. Too hot at the pool, and too sunny to read my screen, and my battery is flat!) |
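
The arithmetic behind the "almost 79%" figure in the post above can be checked with a short script. This is only a sketch: the counts and the total of 258,725 are copied from the post, while the variable and function names are made up for illustration and are not taken from Eric's own scripts.

```python
# Invalid-result counts for the 73 hosts, copied from the post above.
INVALID_COUNTS = """
39852 21733 20055 19813 19601 18485 7587 5425 5360 4848 4651 4266 4196 3791
2325 2293 1953 1825 1802 1789 1620 1535 1103 880 731 730 729 696 598 383 369
369 367 355 338 308 308 305 240 104 96 63 54 34 33 33 31 19 18 10 9 8 7 7 5 5
4 4 3 3 3 3 2 2 2 1 1 1 1 1 1 1 1
"""

counts = [int(c) for c in INVALID_COUNTS.split()]
TOTAL_INVALIDS = 258_725  # total number of Invalids quoted in the post


def share_of_invalids(values, total):
    """Return the percentage of `total` accounted for by `values`."""
    return 100.0 * sum(values) / total


print(f"hosts listed: {len(counts)}, invalid results: {sum(counts)}")
print(f"share of all Invalids: {share_of_invalids(counts, TOTAL_INVALIDS):.1f}%")
# The first six hosts dominate the list, so it is worth seeing their share alone.
print(f"first 6 hosts only: {share_of_invalids(counts[:6], TOTAL_INVALIDS):.1f}%")
```

Run as written, this should reproduce the 73 hosts and 204,184 Invalid Results quoted in the post, and it also shows how much of the total comes from the first handful of hosts.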
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
|
Send message Joined: 15 Nov 14 Posts: 602 Credit: 24,371,321 RAC: 0 |
Of the "banned" (max_results_day=-1) of which there are 1203, Maybe due to the hyper-threading problem? http://www.guru3d.com/news-story/debian-project-warns-turn-off-hyperthreading-with-skylake-and-kaby-lake.html |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Well, I never look a gift horse in the mouth, but I have been using Fortran since 1964... I am not yet up to speed on Fortran 90/95, and one day I shall talk about Fortran 2003, where I eagerly await a proper implementation. I don't have time right now, but I can provide some tests for input/output binary-to-decimal and decimal-to-binary conversion (AND I must watch Roger at Wimbledon tomorrow!) (Have to take a break. Too hot at the pool, and too sunny to read my screen, and my battery is flat!) |
Send message Joined: 24 Oct 04 Posts: 1176 Credit: 54,887,670 RAC: 5,761 |
(Have to take a break. Too hot at the pool, and too sunny to read my screen, and my battery is flat!) Here in the Great NW (PDT) we are in that typical month when it is sunny every day and at times too hot for me (but once Fall gets here it will rain until next June). I do have a pond and creek on the property, but I haven't had my laptop out there since T4T started. For some reason I have to have every computer running CERN tasks 24/7, so it has been sitting here in the house plugged into the AC for just over 5 YEARS, and the only break it gets is when the power goes out in a Winter storm (my only laptop ever, so I must have picked the right one). Have a fine weekend, Eric and Ivan (I'll email you some sunshine if you need it, Ivan). (Oh, and IBM Fortran is 15 months older than me.) The next revision of the language (Fortran 2015) is intended to be a minor revision and is planned for release in mid-2018. Volunteer Mad Scientist For Life |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
This looks very very interesting, I'll see if I can get some feedback from experts. Thanks a lot. Eric. Of the "banned" (max_results_day=-1) of which there are 1203, |
Send message Joined: 29 Aug 05 Posts: 1061 Credit: 7,737,455 RAC: 298 |
"Well I never look a gift horse in the mouth, but I have" OK, you beat me by 5 or 6 years on FORTRAN, but I've done a fair bit of 90/95. Not so much 2003 I guess, and I've never actually tried co-arrays yet. I presume you are aware of it, but the Usenet newsgroup comp.lang.fortran can be a great resource. "I don't have time right now, but I can provide some tests for input/output binary decimal" |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 1,266 |
Needless to say I shall be looking VERY closely at least the first
The first 7 hosts:
HostID, CPU type, Cores, Threads, Linux, HT
10452223, Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz [Family 6 Model 158 Stepping 9], 4, 4, 4.8.0-54-generic, x
10480022, Intel(R) Xeon(R) CPU E3-1505M v5 @ 2.80GHz [Family 6 Model 94 Stepping 3], 4, 8, 4.8.0-54-generic, ON
10486162, Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz [Family 6 Model 158 Stepping 9], 4, 4, 4.8.0-56-generic, x
10487841, Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz [Family 6 Model 94 Stepping 3], 4, 8, 4.8.0-58-generic, ON
10485156, Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz [Family 6 Model 158 Stepping 9], 4, 4, 4.8.0-54-generic, x
10484503, Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz [Family 6 Model 158 Stepping 9], 4, 8, 4.8.0-54-generic, ON
10480909, Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz [Family 6 Model 94 Stepping 3], 4, 8, 4.8.0-56-generic, ON |
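
The Family/Model/Stepping fields in the CPU strings in the post above can be pulled out mechanically when triaging hosts against the hyper-threading report linked earlier in the thread. The following is only a sketch under stated assumptions: the set of model numbers (78, 94, 142, 158) is what I recall the Debian advisory naming for Skylake and Kaby Lake and should be checked against the advisory itself, "threads greater than cores" is used as a stand-in for "HT ON", and all function names are hypothetical. A match is a reason to look more closely, not a diagnosis.

```python
import re

# ASSUMPTION: model numbers taken from memory of the Debian advisory
# (Skylake: 78, 94; Kaby Lake: 142, 158). Verify before relying on this set.
SUSPECT_MODELS = {78, 94, 142, 158}

# Matches the bracketed identifier that BOINC appends to the CPU type.
CPU_ID = re.compile(r"\[Family (\d+) Model (\d+) Stepping (\d+)\]")


def parse_cpu_id(cpu_type):
    """Return (family, model, stepping) as ints, or None if not present."""
    match = CPU_ID.search(cpu_type)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())


def worth_a_closer_look(cpu_type, cores, threads):
    """True for Intel Family 6 hosts with a suspect model and HT in use."""
    ident = parse_cpu_id(cpu_type)
    if ident is None:
        return False
    family, model, _stepping = ident
    # ASSUMPTION: more threads than cores means hyper-threading is enabled.
    return family == 6 and model in SUSPECT_MODELS and threads > cores


# Two rows copied from the post above: (HostID, CPU type, cores, threads).
hosts = [
    (10452223, "Intel(R) Core(TM) i5-7500 CPU @ 3.40GHz [Family 6 Model 158 Stepping 9]", 4, 4),
    (10480022, "Intel(R) Xeon(R) CPU E3-1505M v5 @ 2.80GHz [Family 6 Model 94 Stepping 3]", 4, 8),
]

for host_id, cpu, cores, threads in hosts:
    print(host_id, parse_cpu_id(cpu), worth_a_closer_look(cpu, cores, threads))
```

Note that three of the first seven hosts in the list are i5-7500 machines with 4 cores and 4 threads, so a hyper-threading filter like this one would not flag them; whatever is producing their Invalids would need a different explanation.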
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 1,266 |
HostID not on your list, but seems to me also not very reliable: 10489186 AMD Ryzen 7 1800X Eight-Core Processor [Family 23 Model 1 Stepping 1] Cores 8 Threads 16 OS Linux 4.8.0-58-generic HT ON |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Well thanks; I shall check it out. I see at least one result and I don't believe this is a SixTrack problem. Late though, maybe only tomorrow. It is another 4.8.* Linux. I only looked at Intel Family 6. I guess I shall also have to look at AMDs....now I know how to do it :-) (Latest news suggests a hyper-threading problem.) Eric. stderr out process exited with code 193 (0xc1, -63) SIGSEGV: segmentation violation Stack trace (5 frames): [0x8266bed] [0xf7727cb0] [0x825acd7] [0x8253b2f] [0x81fb4f8] Exiting... HostID not on your list, but seems to me also not very reliable: |
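
The stderr line quoted above reports the exit status three ways: 193, 0xc1 and -63. A minimal sketch of how the three relate, assuming the status is an 8-bit value and that -63 is simply its signed (two's-complement) reading; that is an assumption about the reporting format rather than something stated in the thread, and the function name is hypothetical.

```python
def describe_exit_code(code):
    """Return (hex string, signed 8-bit value) for a process exit code."""
    low_byte = code & 0xFF  # keep only the low 8 bits
    # Values of 128 and above wrap around to negative numbers in signed 8-bit.
    signed = low_byte - 256 if low_byte >= 128 else low_byte
    return hex(low_byte), signed


print(describe_exit_code(193))
```

For 193 this gives 0xc1 and -63, matching the triple in the log line.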
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
OK, maybe my list is incomplete. This host is already banned though. Eric. HostID not on your list, but seems to me also not very reliable: |
Send message Joined: 14 Jan 10 Posts: 1422 Credit: 9,484,585 RAC: 1,266 |
OK, maybe my list is incomplete. This host is already banned though. Eric. This one is not (yet) banned, but does not seem trustworthy: Host 9841071, an AMD Phenom II with OS Linux 4.4.0-83-generic?! Not 4.8. Results: State: All (1118) · In progress (0) · Validation pending (452) · Validation inconclusive (373) · Valid (1) · Invalid (125) · Error (167) |
©2024 CERN