Time limit errors????
Author | Message |
---|---|
Joined: 24 Apr 11 Posts: 37 Credit: 1,295,012 RAC: 0 |
Howdy! I've been getting way too many of these lately: http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=10356474&offset=0&show_names=0&state=6&appid= <core_client_version>7.4.42</core_client_version> <![CDATA[ <message> exceeded elapsed time limit 5738.95 (1800000.00G/313.65G) </message> <stderr_txt> Unhandled Exception Detected... - Unhandled Exception Record - Reason: Breakpoint Encountered (0x80000003) at address 0x76873226 Is it my setup, a bug, or what? A lot of wasted time there... Thanks! 9-) |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Right, apologies. These are my famous tests "wzero". Should all be out of the way soon I hope, Eric. |
Joined: 24 Apr 11 Posts: 37 Credit: 1,295,012 RAC: 0 |
Quoting Eric: "Right, apologies. These are my famous tests 'wzero'." I don't mind, as long as it's not the setup's fault... I was just worried the hardware might be glitching. I have a 4.3GHz 2600K setup that I just dedicated to CERN stuff, so I worried a little... 2 WUs each... Thanks! 9-) |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
That sounds great; I may have more news later. I am also very interested in results from these modern very fast machines. SixTrack is a great test apart from producing useful results. Eric. |
Joined: 26 Jul 05 Posts: 63 Credit: 4,083,755 RAC: 0 |
Quoting Eric: "Right, apologies. These are my famous tests 'wzero'." I'm rather disappointed with the wzero_jbbtcm1 tasks: my batch is all running perfectly! |
Joined: 2 Sep 04 Posts: 455 Credit: 200,267,081 RAC: 48,388 |
Quoting Eric: "I am also very interested in results from these modern very fast machines." My processor reports these features: Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss htt tm pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 vmx smx tm2 pbe fsgsbase bmi1 hle smep bmi2 short: sse sse2 pni ssse3. At the moment, all running SixTrack WUs on this box are sse2. In the history I see that this box has only crunched WUs with sse2 or "nothing". The applications page shows that there are versions for sse3 and pni. So, is it okay that only sse2 is sent to this box? Supporting BOINC, a great concept! |
Joined: 29 Nov 09 Posts: 42 Credit: 229,229 RAC: 0 |
I used to receive pni and sse3 (the proof is in those buggy tasks whose birthdays I celebrate), but indeed, recently there are only sse2. Maybe it's to hunt bugs better? Edit: I just realized you had already asked this in a dedicated topic. Sorry for posting here. |
Joined: 29 Sep 04 Posts: 281 Credit: 11,866,264 RAC: 0 |
All of today's wsuper_ sixtracktest WUs are showing a 197 (0xc5) EXIT_TIME_LIMIT_EXCEEDED error. All wingmen are showing the same error. The initial time estimate for most is 11 minutes, although a few have been 30 hours. They get to c. 50% in 4 minutes, then reset to 0% with an estimate of 90 hours. They proceed to 10% in about 2 hours, with the time estimate reducing in large chunks, but then exit with the time-exceeded error. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Thanks Ray; looking at that problem right now. Eric. (Testing a new version which should return all result files...... :-) |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
NO, this is NOT normal. Thanks for that valuable feedback. Eric. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
OK; a bug on my part. Jobs were running for one million turns but were supposed to stop after 10,000 turns. The limits were set for only 10,000 turns. That was the bug. More tests coming. Eric. |
Joined: 27 Oct 07 Posts: 186 Credit: 3,297,640 RAC: 0 |
I have 'inoculated' task 67569958 by increasing <rsc_fpops_bound> by several orders of magnitude - so you should get at least one result from the first set. On the other hand, it might be quicker to abort it and run a 10K turn task instead, or find the million-turn parameter and turn it down a bit. Up to you. |
Joined: 12 Jul 11 Posts: 857 Credit: 1,619,050 RAC: 0 |
Thanks Richard; don't waste time. That one result I can check anyway, but corrected tests are on their way. I don't know how to get rid of the bad tests, probably too late, but there are not too many. Eric. |
Joined: 27 Oct 07 Posts: 186 Credit: 3,297,640 RAC: 0 |
OK, it hadn't even started, so I've aborted it - no time wasted (the machine needed a reboot for an AV update anyway). |
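
For anyone reading the error text from the first post later: as far as I understand the BOINC client, the three numbers in `exceeded elapsed time limit 5738.95 (1800000.00G/313.65G)` are the limit in seconds, the task's `<rsc_fpops_bound>`, and the client's speed estimate for the application, the last two printed in units of 1e9. The limit is simply the bound divided by the speed. Below is a minimal sketch of that arithmetic, plus the effect of the turn-count mix-up Eric described. The function names are made up for illustration, and the assumption that work grows linearly with the number of turns is mine, not something stated in the thread.

```python
GIGA = 1e9


def elapsed_time_limit(fpops_bound, flops_estimate):
    """Seconds a task may run before the client aborts it (bound / speed)."""
    return fpops_bound / flops_estimate


def rescaled_bound(fpops_bound, planned_turns, actual_turns):
    """Bound needed if work scales linearly with turns (an assumption)."""
    return fpops_bound * (actual_turns / planned_turns)


# Numbers taken from the error message in the first post.
bound = 1_800_000.00 * GIGA   # <rsc_fpops_bound>, printed as 1800000.00G
speed = 313.65 * GIGA         # client's speed estimate, printed as 313.65G

limit = elapsed_time_limit(bound, speed)
print(f"time limit: about {limit:.0f} s ({limit / 3600:.2f} h)")

# Limits were sized for 10,000 turns, but the jobs ran 1,000,000 turns.
needed = rescaled_bound(bound, planned_turns=10_000, actual_turns=1_000_000)
print(f"bound would need to grow by a factor of {needed / bound:.0f}")
```

The result differs from the logged 5738.95 by a fraction of a second only because the log rounds the speed to two decimals. A factor of about 100 between the planned and actual turn counts is consistent with tasks reaching only a small fraction of progress before hitting the limit, and with Richard raising `<rsc_fpops_bound>` by orders of magnitude to let one task finish.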
©2024 CERN