All tasks end with error
Author | Message |
---|---|
Send message Joined: 27 Sep 04 Posts: 20 Credit: 23,880 RAC: 0 |
All tasks with the new SixTrack 446.03 are erroring out on my PC. I checked and saw that they are also crashing for some other users, but not for all: http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=9028119 http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=9032177 http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=9112361 The error is "-226 (0xffffffffffffff1e) ERR_TOO_MANY_EXITS". I already tried resetting the LHC project in BOINC. What's going on? (Other BOINC projects are working fine on my machine.) |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Thanks for the feedback. I am at home now, so it would be good if you could just tell me which Linux you are running. I found your AMD description. Thanks. Eric. |
Send message Joined: 17 Feb 06 Posts: 2 Credit: 1,686,368 RAC: 0 |
Same problem: currently 1009 tasks, 766 with errors. I tried resetting the project, but nothing changed. Debian jessie: Linux server 3.9-1-amd64 #1 SMP Debian 3.9.8-1 x86_64 GNU/Linux. Debian wheezy (older BOINC) seems to be OK. |
Send message Joined: 4 Jan 07 Posts: 3 Credit: 2,197,570 RAC: 0 |
I've been seeing this for a couple of days. The task error logs (stderr.txt) fill up with "... No heartbeat from client for 30 sec - exiting" and the tasks end with "Compute error" (-226 (0xffffffffffffff1e) ERR_TOO_MANY_EXITS). Resetting the project had no effect. I unpacked the .zip file of one failing task into a separate directory and ran sixtrack there: the .fort files are being updated and growing, and it hasn't crashed yet. Linux: Linux 3.9.6-gentoo-Intel-Core-i7 #1 SMP Thu Jun 20 21:48:50 EEST 2013 x86_64 Intel(R) Core(TM) i7 CPU 950 @ 3.07GHz GenuineIntel GNU/Linux. BOINC: 7.2.0 (x64). SixTrack: 446.03 (pni), sixtrack_lin64_4463_pni.exe (in the projects dir). 'file' reports it as "sixtrack_lin64_4463_pni.exe: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.6.9, not stripped", so I guess it isn't really a 64-bit app? I had a look at my tasks. Inconclusive (1): http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=9018280 Error (306), first one: http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=8972676 Also, some tasks seem to be OK: http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=9086803 All tasks in the "Error" list belong to SixTrack v446.03 (pni). |
Send message Joined: 26 Sep 11 Posts: 37 Credit: 7,719,477 RAC: 465 |
Yep. That is the same problem as discussed in this thread: http://lhcathomeclassic.cern.ch/sixtrack/forum_thread.php?id=3767 The errors and error messages described here are the same. |
Send message Joined: 4 Jan 07 Posts: 3 Credit: 2,197,570 RAC: 0 |
Update: in standalone mode, sixtrack completed in ~6h as expected. So is this a communication problem between sixtrack and boinc_client, or is something wrong in init_data.xml? |
Send message Joined: 27 Sep 04 Posts: 20 Credit: 23,880 RAC: 0 |
"Thanks for the feedback. I am at home now so it would..." I'm running Fedora 19 64-bit. Here are the starting lines of the BOINC output: sab 07 set 2013 15:08:15 CEST | | No config file found - using defaults sab 07 set 2013 15:08:15 CEST | | Starting BOINC client version 7.0.65 for x86_64-pc-linux-gnu sab 07 set 2013 15:08:15 CEST | | log flags: file_xfer, sched_ops, task sab 07 set 2013 15:08:15 CEST | | Libraries: libcurl/7.29.0 NSS/3.15.1 zlib/1.2.7 libidn/1.26 libssh2/1.4.3 sab 07 set 2013 15:08:15 CEST | | Data directory: /home/marvin/Boinc sab 07 set 2013 15:08:15 CEST | | Processor: 4 AuthenticAMD AMD Phenom(tm) II X4 965 Processor [Family 16 Model 4 Stepping 3] sab 07 set 2013 15:08:15 CEST | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt hw_pstate npt lbrv svm_lock nrip_save sab 07 set 2013 15:08:15 CEST | | OS: Linux: 3.10.10-200.fc19.x86_64 sab 07 set 2013 15:08:15 CEST | | Memory: 7.80 GB physical, 7.78 GB virtual sab 07 set 2013 15:08:15 CEST | | Disk: 86.89 GB total, 80.09 GB free sab 07 set 2013 15:08:15 CEST | | Local time is UTC +2 hours sab 07 set 2013 15:08:15 CEST | | No usable GPUs found sab 07 set 2013 15:08:15 CEST | SETI@home | URL http://setiathome.berkeley.edu/; Computer ID 6981088; resource share 100 sab 07 set 2013 15:08:15 CEST | Einstein@Home | URL http://einstein.phys.uwm.edu/; Computer ID 7130990; resource share 100 sab 07 set 2013 15:08:15 CEST | SETI@home Beta Test | URL http://setiweb.ssl.berkeley.edu/beta/; Computer ID 63147; resource share 100 sab 07 set 2013 15:08:15 CEST | Asteroids@home | URL http://asteroidsathome.net/boinc/; Computer ID 23986; resource share 100 sab 07 set 2013 15:08:15 CEST | LHC@home 1.0 | URL http://lhcathomeclassic.cern.ch/sixtrack/; Computer ID 10286524; resource share 100 sab 07 set 2013 15:08:15 CEST | malariacontrol.net | URL http://www.malariacontrol.net/; Computer ID 628816; resource share 100 sab 07 set 2013 15:08:15 CEST | Milkyway@Home | URL http://milkyway.cs.rpi.edu/milkyway/; Computer ID 515018; resource share 100 sab 07 set 2013 15:08:15 CEST | SimOne@home | URL http://mmgboinc.unimi.it/; Computer ID 3627; resource share 100 sab 07 set 2013 15:08:15 CEST | rosetta@home | URL http://boinc.bakerlab.org/rosetta/; Computer ID 1612749; resource share 100 sab 07 set 2013 15:08:15 CEST | SETI@home | General prefs: from SETI@home (last modified 19-Mar-2013 23:43:11) sab 07 set 2013 15:08:15 CEST | SETI@home | Computer location: home sab 07 set 2013 15:08:15 CEST | SETI@home | General prefs: no separate prefs for home; using your defaults sab 07 set 2013 15:08:15 CEST | | Reading preferences override file sab 07 set 2013 15:08:15 CEST | | Preferences: sab 07 set 2013 15:08:15 CEST | | max memory usage when active: 5589.97MB sab 07 set 2013 15:08:15 CEST | | max memory usage when idle: 7586.39MB sab 07 set 2013 15:08:15 CEST | | max disk usage: 5.00GB sab 07 set 2013 15:08:15 CEST | | max CPUs used: 2 sab 07 set 2013 15:08:15 CEST | | suspend work if non-BOINC CPU load exceeds 50 % sab 07 set 2013 15:08:15 CEST | | (to change preferences, visit a project web site or select Preferences in the Manager) sab 07 set 2013 15:08:15 CEST | | Not using a proxy |
Send message Joined: 27 Sep 04 Posts: 20 Credit: 23,880 RAC: 0 |
'file' reports it as "sixtrack_lin64_4463_pni.exe: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.6.9, not stripped", guess it isn't really a 64-bit app? I think that's a good point. The executable seems to be 32-bit, not 64-bit, and the "statically linked" part makes me wonder whether it could be a library problem: maybe SixTrack works for those who also have 32-bit libraries installed and crashes for those who have only 64-bit libraries. At the very least there's something wrong with the compiler, since it outputs 32-bit executables for 64-bit systems. |
Send message Joined: 17 Feb 06 Posts: 2 Credit: 1,686,368 RAC: 0 |
I downgraded boinc-client from 7.2.7 to 7.0.27 (the same version as on my second computer) and it seems to be OK. |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Just to be clear: all our executables are 32-bit. They run OK on 64-bit systems. (In theory a 64-bit executable should be numerically compatible. However, this is not likely to be the case in practice. I will look at this one day, but right now I need to sort out the problems with our Linux executable.) Thanks for all the feedback. Eric. |
Send message Joined: 4 May 07 Posts: 250 Credit: 826,541 RAC: 0 |
FYI... Running Windows 7 (x64), BOINC 7.0.64 (x64), SixTrack 446.03. No problems... yet. |
Send message Joined: 26 Oct 04 Posts: 6 Credit: 1,696,248 RAC: 0 |
I tried making sure that any 32-bit libraries that could help were installed on one of my Ubuntu boxes. It didn't help. On both of my Ubuntu 12.04 boxes the WUs fail right away. They are running BOINC 7.0.65. These boxes also have the most cores, so I'm just trying to run LHC on my Windows boxes for now... |
Send message Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Our executables are all statically linked, so there should be NO dependence on libraries. They do depend on system calls though, and on the BOINC lib, the API lib and the Fortran API. Working hard on fixing this. |
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
No problem running LHC on my Linux box, but my BOINC client is good old 6.10.58. Tullio |
Send message Joined: 24 Apr 11 Posts: 37 Credit: 1,295,012 RAC: 0 |
I get nothing but errors running Linux Mint Cinnamon 14. Processor AMD 1045T. 8-) |
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
I had a system crash while running SuSE Linux 12.3 and went back to SuSE Linux 12.1. My CPU is an Opteron 1210 at 1.8 GHz. I ran a full check of the CPU, RAM and disks using version 1.8 of the diagnostic tools supplied by SUN when I acquired the SUN workstation in 2008. All seem OK. Tullio |
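
Two of the checks made in this thread can be reproduced with a short script: how the exit code -226 maps to the hex value 0xffffffffffffff1e shown next to it, and how to tell whether an executable such as sixtrack_lin64_4463_pni.exe is a 32-bit or 64-bit ELF binary, as `file` reported. This is only a sketch and not project code. The function names `exit_code_as_hex64` and `elf_class` are made up for illustration. It assumes the hex value is simply the exit code viewed as an unsigned 64-bit integer, and it relies on the standard ELF header layout (4 magic bytes followed by the EI_CLASS byte, where 1 means 32-bit and 2 means 64-bit).

```python
import sys

ELF_MAGIC = b"\x7fELF"  # first four bytes of every ELF file


def exit_code_as_hex64(code: int) -> str:
    """Show a (possibly negative) exit code as an unsigned 64-bit hex value."""
    # Masking with 2**64 - 1 gives the two's-complement bit pattern.
    return hex(code & 0xFFFFFFFFFFFFFFFF)


def elf_class(header: bytes) -> str:
    """Return '32-bit' or '64-bit' from the first bytes of an ELF file."""
    if len(header) < 5 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    # Byte 4 is EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64.
    return {1: "32-bit", 2: "64-bit"}.get(header[4], "unknown")


if __name__ == "__main__":
    # The error reported in this thread.
    print(-226, "->", exit_code_as_hex64(-226))
    # Optionally inspect executables passed on the command line.
    for path in sys.argv[1:]:
        with open(path, "rb") as f:
            print(path, "->", elf_class(f.read(5)))
```

Run against the SixTrack executable in the BOINC projects directory, this should agree with what `file` prints. As noted above, a 32-bit result is expected and does not by itself explain the failures.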
©2024 CERN