1) Message boards : Number crunching : Time limit errors???? (Message 27304)
Posted 7 Apr 2015 by Tex1954
Post:
Right, apologies. These are my famous tests "wzero".
Should all be out of the way soon I hope, Eric.


I don't mind if it's not the setup's fault... just worried the hardware might be glitching...

I have a 4.3GHz 2600K setup I just dedicated to CERN stuff and I'm a little worried... 2 WUs each...




Thanks!

9-)
2) Message boards : Number crunching : Time limit errors???? (Message 27299)
Posted 7 Apr 2015 by Tex1954
Post:
Howdy!

Getting way too many of these lately:

http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=10356474&offset=0&show_names=0&state=6&appid=


<core_client_version>7.4.42</core_client_version>
<![CDATA[
<message>
exceeded elapsed time limit 5738.95 (1800000.00G/313.65G)
</message>
<stderr_txt>


Unhandled Exception Detected...

- Unhandled Exception Record -
Reason: Breakpoint Encountered (0x80000003) at address 0x76873226


Is it my setup or a bug or what? Lot of wasted time there...

Thanks!

9-)
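
A quick sanity check on the numbers in that message: dividing the first figure in the parentheses by the second gives almost exactly the reported limit. The sketch below only reproduces that arithmetic. It assumes the first figure is the task's operation bound and the second is the host's estimated speed, both in giga units; that is my reading of the message, not something confirmed in this thread, and the function name is made up for illustration.

```python
def elapsed_time_limit(fpops_bound_g: float, flops_g: float) -> float:
    """Seconds allowed = operation bound / estimated speed (both in giga units).

    Assumption: this is how the limit in the error message is derived.
    """
    return fpops_bound_g / flops_g


# Figures copied from the error message: (1800000.00G/313.65G)
limit = elapsed_time_limit(1800000.00, 313.65)
print(f"{limit:.2f} s, about {limit / 60:.1f} minutes")
```

The small gap between this result and the 5738.95 in the message is consistent with the speed figure being rounded to two decimals for display. If that reading is right, the limit on this host is only about an hour and a half, so any task that runs longer is killed regardless of whether the hardware is fine.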
3) Message boards : Number crunching : All tasks end with error (Message 25828)
Posted 14 Sep 2013 by Tex1954
Post:
I get nothing but errors running Linux Mint Cinnamon 14.

Processor AMD 1045T.

8-)
4) Message boards : Number crunching : Big problem: work units running with negative time. (Message 25806)
Posted 8 Sep 2013 by Tex1954
Post:
I am still running BOINC client 6.10.58 on my Linux box and have no problem.
Tullio


Would be interesting to note if it was a Linux kernel/build problem or a specific BOINC client problem or a combination of both...

I run Linux Mint Cinnamon (latest) with client 7.0.65...

8-)
5) Message boards : Number crunching : Big problem: work units running with negative time. (Message 25804)
Posted 7 Sep 2013 by Tex1954
Post:
I have all the correct and updated libraries... I think... pretty sure, because all my other 32-bit/64-bit applications run fine. ONLY LHC is giving me problems.

As I think back, it seems this isn't the first time I've run across this problem... seems to me another Beta project had the same problem... if I could just remember...

8-)
6) Message boards : Number crunching : Big problem: work units running with negative time. (Message 25802)
Posted 7 Sep 2013 by Tex1954
Post:
Well, if one runs OpenCL with Nvidia (like me) under Linux, then newer 7.x.x BOINC clients are required...

Obviously there is some difference in how the Linux tasks are built... the Windoz tasks run fine.

I would guess we may need a version switch in the tasks to account for 6.x.x vs. 7.x.x or something like that...

:D
7) Message boards : Number crunching : Big problem: work units running with negative time. (Message 25800)
Posted 7 Sep 2013 by Tex1954
Post:
I am still running BOINC client 6.10.58 on my Linux box and have no problem.
Tullio


Possibly a bug or library change in the newer BOINC clients? They all error out the same way with that heartbeat thing...

:)
8) Message boards : Number crunching : Big problem: work units running with negative time. (Message 25789)
Posted 7 Sep 2013 by Tex1954
Post:
I am getting a completely new experience with many of the new work units. They run to about 0.030% completion in 14 seconds on my computer, and then go back to zero %. The elapsed time in all those cases, as displayed in BOINC manager, also jumps back from around 14 to 4 seconds. Something I did not know was possible. In other words, elapsed time is jumping backwards.

I'm aborting those jobs now, because they just seem to be completely stuck without progress. Because it's my bed time I will suspend LHC for now. Please let me know if and when it is safe to resume LHC again.


I tried again on a 2600K Linux system and all 8 tasks jumped around then errored out again. Seems to run okay on my Windoz-7 systems, but not on Linux at all.

http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=10299948

:)
9) Message boards : Number crunching : Big problem: work units running with negative time. (Message 25760)
Posted 5 Sep 2013 by Tex1954
Post:
I tried a few new WUs again... same problem... I run 64-bit Linux Mint on my systems with BOINC client 7.0.65.

Oh well...

:D
10) Message boards : Number crunching : Big problem: work units running with negative time. (Message 25757)
Posted 4 Sep 2013 by Tex1954
Post:
I am getting a completely new experience with many of the new work units. They run to about 0.030% completion in 14 seconds on my computer, and then go back to zero %. The elapsed time in all those cases, as displayed in BOINC manager, also jumps back from around 14 to 4 seconds. Something I did not know was possible. In other words, elapsed time is jumping backwards.

I'm aborting those jobs now, because they just seem to be completely stuck without progress. Because it's my bed time I will suspend LHC for now. Please let me know if and when it is safe to resume LHC again.


I am getting the same problem. I looked in the log file and saw "No Heartbeat from client in 30 seconds, restarting..."

The major symptom is that the elapsed time keeps jumping backward...

As of an hour or so ago, all the new WUs are failing...

I aborted the bad WUs and kept the good ones... NNT (No New Tasks) for now until I see the uploads proceed properly.

:)

PS: This problem only shows up on my Linux machines so far... forgot to mention that...
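
For anyone who wants to check their own machine for the same two symptoms, here is a rough Python sketch. It counts log lines containing the heartbeat phrase quoted above and flags any point where a series of elapsed-time readings goes backward. The sample log lines and the sample readings are made up for illustration (only the heartbeat phrase and the 14-to-4-second jump come from this thread), and the function names are hypothetical.

```python
HEARTBEAT_TEXT = "No Heartbeat from client"  # phrase quoted from the log above


def count_heartbeat_restarts(log_lines):
    """Count log lines that mention the missing-heartbeat restart."""
    needle = HEARTBEAT_TEXT.lower()
    return sum(1 for line in log_lines if needle in line.lower())


def find_backward_jumps(elapsed_samples):
    """Return (previous, current) pairs where elapsed time decreased."""
    jumps = []
    for prev, cur in zip(elapsed_samples, elapsed_samples[1:]):
        if cur < prev:
            jumps.append((prev, cur))
    return jumps


# Hypothetical sample data, not taken from a real log.
sample_log = [
    "task started",
    "No Heartbeat from client in 30 seconds, restarting...",
    "task started",
]
sample_elapsed = [2, 6, 10, 14, 4, 8]  # seconds, read off at intervals

restarts = count_heartbeat_restarts(sample_log)
jumps = find_backward_jumps(sample_elapsed)
print(restarts)
print(jumps)
```

Feed it the lines of whatever log file shows the heartbeat message on your system, plus elapsed times noted from the manager, and it will tell you how often the task restarted and where the clock went backward.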
11) Message boards : Number crunching : Uploads failing: Server out of disk space (Message 25756)
Posted 4 Sep 2013 by Tex1954
Post:
Having the same problem... many upload retries...

:)
12) Message boards : Number crunching : Long WU's (Message 24496)
Posted 4 Aug 2012 by Tex1954
Post:
Wow, one of my 3.75GHz AMD boxes finished one that took 24 hours... and the wingman is a 1.6GHz Pentium-M that will probably take 3 times longer...

Does this mean the equations are turning out more and more relevant data???

Many longer tasks lately... but they seem to finish just fine...

I have one task at 29 hrs now with 14 more to go on the same box... I have another on the 950 box at 39hrs elapsed with 11 more to go as well... LOL!

:)

http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=2493244
13) Message boards : Number crunching : No Tasks ??? (Message 24494)
Posted 4 Aug 2012 by Tex1954
Post:
Wow, one of my 3.75GHz AMD boxes finished one that took 24 hours... and the wingman is a 1.6GHz Pentium-M that will probably take 3 times longer...

Does this mean the equations are turning out more and more relevant data???

:)

http://lhcathomeclassic.cern.ch/sixtrack/workunit.php?wuid=2493244
14) Message boards : Number crunching : How is the SSE3 thing coming along? (Message 24383)
Posted 16 Jul 2012 by Tex1954
Post:
I just got a load of those ZERO length tasks... and this time it crashed the system somehow. Just in case, I lowered the clock speed on the 1100T box so it runs cooler.

Also checked the data directory and it appears there are more ZERO length files in the wings. Last time this was a disk full error I think. Anyways, they only run a couple seconds so no biggy.

WOOPSY!!!

:)
15) Message boards : Number crunching : Damn wingman (Message 24361)
Posted 14 Jul 2012 by Tex1954
Post:
I can vouch for Overclockers having problems. Most of the problems I catch, but for a day or so, two boxes had problems.

One box was easily corrected with a voltage tweak that was neglected on a major BIOS update.

The other box had a bad motherboard and it was replaced.

Both these boxes would run Prime95 all day/night no problem but fail other tests and certain BOINC jobs.

I have since proven the corrections stable as before and all is okay now...

So, yes, overclockers can generate results that look good but are wrong from time to time, and one has to be painfully careful with that.

Even machines set up pure stock with BIOS defaults can fail if the BIOS doesn't set things up properly... and I've experienced that as well.

Also, it would be helpful if the server would send a NOTICE to offending machines to wake folks up, should this prove to be a problem in the future.

However, usually BOINC will start doing random stupid things when there is a problem... like shutting down unexpectedly without error... GPU drivers suddenly working more slowly... all kinds of hints that something isn't perfect.

:)
16) Message boards : Number crunching : No Tasks ??? (Message 24353)
Posted 14 Jul 2012 by Tex1954
Post:
Okay, have the new Linux installed on the 800D box with the Sabertooth Mobo... had to update the Realtek driver to get the LAN to work right and now it's humming fine.

It has also downloaded only PNI tasks and they are running fine.

I like running Linux since many projects run significantly faster under it. For instance, Correlizer@home tasks take 11:35 under Windows on this box, but the same tasks take only 9:54 under Linux. Same for many others...

Thanks for all your hard work! Looks like all my systems talking and working well now!

And the SSE3/PNI tasks run almost twice as fast... 7+ hrs for the old tasks vs. 4 hrs for SSE3/PNI on the 1055T box (3.25GHz), for instance... WOOHOO!

:)
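
To put rough numbers on those two comparisons, here is a small Python sketch of the speedup arithmetic. It assumes the Correlizer@home times are minutes:seconds (the ratio comes out the same if they are hours:minutes) and takes "7+ hrs" as exactly 7 hours, so the second figure is a lower bound. The helper names are made up for illustration.

```python
def to_seconds(mm_ss: str) -> int:
    """Convert a 'minutes:seconds' string to seconds (assumed format)."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(old: float, new: float) -> float:
    """How many times faster the new run is than the old one."""
    return old / new


# Correlizer@home on this box: Windows 11:35 vs. Linux 9:54
correlizer = speedup(to_seconds("11:35"), to_seconds("9:54"))

# LHC on the 1055T box: old tasks ~7 hrs vs. SSE3/PNI ~4 hrs
sse3 = speedup(7.0, 4.0)

print(f"Correlizer@home, Linux vs. Windows: {correlizer:.2f}x")
print(f"SSE3/PNI vs. old tasks: {sse3:.2f}x")
```

So the Linux gain on Correlizer@home is modest, while the SSE3/PNI gain is at least 1.75x, which fits "almost twice as fast".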
17) Message boards : Number crunching : No Tasks ??? (Message 24349)
Posted 14 Jul 2012 by Tex1954
Post:
Just got another load on the i7-950 system... so looks like things moving along well.

This load was SSE3 and PNI both...

On the Compaq box (running Linux at the moment), the tasks take 4 hours instead of the usual 7 or so... so much improved.

I LIKE what they are doing! The tweaks are improving things all around!

Hip Hip Hurray for smart developers! I bet this project gets tuned perfectly very soon!

:)
18) Message boards : Number crunching : No Tasks ??? (Message 24332)
Posted 13 Jul 2012 by Tex1954
Post:
Woopsy!

Just got a load of (PNI) work on the Compaq...

Running now!

:)
19) Message boards : Number crunching : No Tasks ??? (Message 24328)
Posted 13 Jul 2012 by Tex1954
Post:
Yup, me too. What was working no longer works... sigh... not even the i7-950 box gets any more tasks...

Have to smile about that! It's a sure sign folks are tweaking the system.

LOL!

:)
20) Message boards : Number crunching : How is the SSE3 thing coming along? (Message 24305)
Posted 12 Jul 2012 by Tex1954
Post:
Just got 18 of the slow tasks on the 1055T box... the ones that take > 7 hours?

LOL!

Seems random to me still...

:)



