Tasks v530.09 crashing
Author | Message |
---|---|
Send message Joined: 6 Sep 08 Posts: 117 Credit: 12,346,476 RAC: 18,332 |
My trusty Win2K laptop, which has been running v530.08 tasks without problems, has just started its first .09 task, which crashed, as did the next one. The stderr shows exit code 168, which means nothing to me. I don't really want to move from BOINC v5 to v6 unless I have to; disk space is at a premium on this host and newer versions of things often seem to take more space. There are no errors reported in the BOINC Manager messages log. Any suggestions gratefully received. John. |
Send message Joined: 2 Sep 04 Posts: 209 Credit: 1,482,496 RAC: 0 |
As I see your other computers are returning 530.09 OK, let's assume it is the laptop only. I also see the wingmen on those tasks completed OK and are awaiting validation, so that kind of eliminates bad work units. Also, to find out what error codes mean, you can look at the result detail once it is returned to the project. Then you can go to the unofficial BOINC wiki and search for the error code; there is a list of error codes that sometimes gives more detail on a certain error. -168 is ERR_FTOK. I don't know what that means; searching the net, I found: ERR_FTOK -168 BOINC cannot get file token (key) for semaphores. I still don't know this one. Have you restarted BOINC or Windows since the errors? Always try the simple things first. If so and those don't help, then try a project reset on the laptop; quite possibly the download of a new app went wrong. A reset will download it again, and that may have been the problem. |
Send message Joined: 18 Sep 04 Posts: 143 Credit: 27,645 RAC: 0 |
Hold on, "Exit code 168" and "Exit code -168" are two different beasts. BOINC errors are negative errors. Positive errors are science application errors. The error here is "exit code 168 (0xa8)", thus a science app error. The whole error is: <stderr_txt> I then looked up the "forrtl: severe (168)" part of the error and came to this thread on the Intel boards, where it says: "Turns out that the problem was caused by an older-generation processor not understanding newer instructions. The application had been compiled with the "Generate most optimized code" (/fast) setting, which implies /arch:host." Probably something similar has happened here. The "Genuine Intel x86 Family 6 Model 8 Stepping 3 746MHz" is a what? A Pentium III, a P2/3 model Celeron or a Pentium II Xeon? The 530.09 science application is probably compiled with an instruction set that these old CPUs do not understand. Hence the errors. Jord BOINC FAQ Service |
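The sign rule described in the post above can be sketched in a few lines of Python. This is only an illustration of that rule of thumb (negative means a BOINC error, positive means a science application error); the function name `classify_exit_code` is made up for this example and is not part of BOINC.

```python
def classify_exit_code(code: int) -> str:
    """Apply the rule of thumb from the post: the sign tells you who failed."""
    if code < 0:
        # Negative codes come from BOINC itself (e.g. -168, ERR_FTOK).
        return "BOINC error"
    if code > 0:
        # Positive codes come from the science application (e.g. 168).
        return "science application error"
    return "success"


# The stderr in this thread reported "exit code 168 (0xa8)".
print(classify_exit_code(168), hex(168))
print(classify_exit_code(-168))
```

So 168 and -168 point to different places to look, even though the number is the same.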
Send message Joined: 6 Sep 08 Posts: 117 Credit: 12,346,476 RAC: 18,332 |
Ageless. You got it in one, as they say. I'll bow out gracefully at this point. Thanks both. Goodbye. John. |
Send message Joined: 16 May 11 Posts: 79 Credit: 111,419 RAC: 0 |
Apparently, we should have kept the version 530.8 for older processors. It is still possible, I have not removed them. What would be the architecture designation for distinguishing the old and the new? skype id: igor-zacharov |
Send message Joined: 25 Jan 11 Posts: 179 Credit: 83,858 RAC: 0 |
Will results from the 530.8 app verify with results from 530.9? There aren't many CPUs that old. Are there enough to justify maintaining 2 versions of the app? You've proposed shortening the deadline, will CPUs that old be able to meet the new deadline? |
Send message Joined: 16 May 11 Posts: 79 Credit: 111,419 RAC: 0 |
Yes, the 530.9 and 530.8 deliver identical results (within the model, where we look for last bit differences). The 530.9 can be a factor of 2 faster, but not always. It does not optimize away calculations; it (the compiler, in fact) just organizes them better by using pipelining and special instructions. Yes, there is no problem for us to keep multiple versions of the same application. I just need to find out what to call the architecture to which the older CPUs belong. This will allow for automatic selection of the executable. Shortening the deadlines is only a discussion item at this time. I still need to assess what will have the largest impact on the efficiency of calculations. skype id: igor-zacharov |
Send message Joined: 6 Sep 08 Posts: 117 Credit: 12,346,476 RAC: 18,332 |
Having started all this, I feel somewhat guilty, so here goes:-
The relevant information is probably in client_state.xml. Oddly, the "features" line here shows the same flags for the "rogue" machine (Pentium 3?) as for a Pentium 4 which works. I have a vague recollection that BOINC was able to send different apps to different hosts.
For those volunteers who don't run "farms" or use their employers' machines, I think the largest factor in work throughput is probably the time that the machine is actually running rather than its speed. Energy use is the factor here. I'm sure that BOINC makes an effort to predict whether work will finish within the deadline, and will not download it if it won't complete. It certainly used to. This would automatically remove work from hosts that were just too slow, even if availability rather than processor speed was the cause. The server even sent a message to that effect. Having said all that, the purpose of BOINC is to get the work done rather than to keep the likes of me happy. Although a quick search through BoincStats finds thousands of "Family 6" CPUs, I fear that many are no longer active, and I am not sure that they justify much work to accommodate. Sadly. John. |
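For anyone wanting to compare the "features" line between hosts, here is a minimal Python sketch. It assumes the flags are stored in client_state.xml inside a `<p_features>` element as a space-separated list; that element name and the sample text are assumptions for illustration, so check them against your own file. The helper name `read_cpu_features` is hypothetical.

```python
import re
from typing import Set


def read_cpu_features(client_state_text: str) -> Set[str]:
    """Return the CPU feature flags found in the text, lower-cased.

    Assumes a <p_features>...</p_features> element holding
    space-separated flags; returns an empty set if it is absent.
    """
    match = re.search(r"<p_features>(.*?)</p_features>",
                      client_state_text, re.DOTALL)
    if match is None:
        return set()
    return set(match.group(1).lower().split())


# Made-up sample resembling what an older host might report.
sample = "<host_info>\n  <p_features>fpu tsc mmx sse</p_features>\n</host_info>"
features = read_cpu_features(sample)
print(sorted(features))
print("sse2" in features)
```

A regular expression is used instead of an XML parser so that the sketch does not depend on the rest of the file being well-formed.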
Send message Joined: 18 Sep 04 Posts: 143 Credit: 27,645 RAC: 0 |
"Apparently, we should have kept the version 530.8 for older processors." You always increment version numbers, so re-releasing 530.8 as 530.91 would be the next logical choice. If you want to designate these to specific CPUs only, you'll need HR type 1, or to set up application plan classes. Edit: But you can also ask yourself, is it worth it? Does this project have that many old CPUs attached? You can check that in the database. Or do you just not want to set a minimum CPU/minimum OS/minimum BOINC version as a requirement? All questions you may answer for yourself. :-) Jord BOINC FAQ Service |
Send message Joined: 29 Sep 04 Posts: 281 Credit: 11,866,264 RAC: 4 |
I trashed a few WUs with this error and did a full un- and re-install of Boinc just in case something had become corrupted on an Athlon XP 2400. Surely that's not too old and slow. If anything, I think it's faster than the lappy that does most of my WUs. |
Send message Joined: 18 Sep 04 Posts: 143 Credit: 27,645 RAC: 0 |
"Surely that's not too old and slow." Both the Pentium III and the Athlon XP 2400+ can do SSE, but not SSE2 or above. So, if the 530.9 science application was compiled to use the SSE2 instruction set, whereas the 530.8 version was compiled to only use the SSE instruction set, then that's what's causing these errors. Jord BOINC FAQ Service |
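The selection logic being discussed (send the optimized build only to CPUs that report the required instruction set) might look roughly like the Python sketch below. This is not how the BOINC server implements plan classes; the function name `pick_app_version`, the default `required` flag and the version labels are all assumptions chosen to match the SSE/SSE2 example in the post above.

```python
def pick_app_version(cpu_features, required=("sse2",),
                     optimized="530.09", fallback="530.08"):
    """Return the optimized build only if every required flag is present."""
    have = {flag.lower() for flag in cpu_features}
    if all(flag in have for flag in required):
        return optimized
    return fallback


# A Pentium III / Athlon XP style host: SSE but no SSE2.
print(pick_app_version(["mmx", "sse"]))
# A Pentium 4 style host: SSE and SSE2.
print(pick_app_version(["mmx", "sse", "sse2"]))
```

The `required` tuple is a parameter so the same check works if the optimized build turns out to need a different instruction set.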
Send message Joined: 18 Sep 04 Posts: 143 Credit: 27,645 RAC: 0 |
"You always increment version numbers, so re-releasing 530.8 as 530.91 would be the next logical choice." Ugh, it was 530.08 and 530.09, so the next logical choice is 530.10 ;-) Jord BOINC FAQ Service |
Send message Joined: 29 Sep 04 Posts: 281 Credit: 11,866,264 RAC: 4 |
Thanks, Jord. Been looking for an excuse to upgrade anyway; single core and the PSU fan's a bit noisy. I'll maybe hold off going out to buy in case they can find a way to distribute WUs according to CPU architecture. Awaiting developments. Lappy crunching away happily at LHC 1 and T4T work |
Send message Joined: 6 Sep 08 Posts: 117 Credit: 12,346,476 RAC: 18,332 |
|
Send message Joined: 18 Sep 04 Posts: 143 Credit: 27,645 RAC: 0 |
The Pentium 4 uses MMX, SSE and SSE2, as long as the operating system knows about these instruction sets as well. Windows 2000 does not support SSE2 or any instruction set thereafter. It only supports up to SSE. I gave the instruction sets of SSE and SSE2 as examples, I hadn't checked all of Igor's posts to see what they were actually using. But at least that explains things further. To be able to use SSE3, one needs a Pentium 4 Prescott CPU or better, or an Athlon 64 or better and Windows XP or better. SSE3 is also known as PNI (Prescott New Instructions). Jord BOINC FAQ Service |
Send message Joined: 17 Feb 09 Posts: 22 Credit: 311,184 RAC: 0 |
I recycled (took it to the recycling center) my last Pentium 3 earlier this year. It just wasn't worth the electricity. |
Send message Joined: 17 Feb 09 Posts: 22 Credit: 311,184 RAC: 0 |
"Thanks Ageless, I'm sure you're on the right lines, but this host reports sse not sse2 and is running 530.09 as I write. Igor wrote here that 530.09 uses sse3. Now I'm really confused." Different archs have different apps, same version. 64-bit processors support SSE3. I requested SSE3 support for Linux 64-bit. I don't know if the 32-bit apps were compiled that way. Applications: Microsoft Windows (98 or later) running on an Intel x86-compatible CPU: 530.09, 4 Oct 2011 15:31:31 UTC; Microsoft Windows running on an AMD x86_64 or Intel EM64T CPU: 530.09, 4 Oct 2011 15:31:31 UTC; Linux running on an Intel x86-compatible CPU: 530.09, 4 Oct 2011 15:31:31 UTC; Linux running on an AMD x86_64 or Intel EM64T CPU: 530.09, 4 Oct 2011 15:31:31 UTC |
Send message Joined: 6 Sep 08 Posts: 117 Credit: 12,346,476 RAC: 18,332 |
"The Pentium 4 uses MMX, SSE and SSE2, as long as the operating system knows about these instruction sets as well. Windows 2000 does not support SSE2 or any instruction set thereafter. It only supports up to SSE." Well, it confuses me. I've got a host running Windows 2000, so presumably no SSE2, never mind SSE3, and it's 10% through a 530.09 task, with no problems that I can see. "To be able to use SSE3, one needs a Pentium 4 Prescott CPU or better, or an Athlon 64 or better and Windows XP or better. SSE3 is also known as PNI (Prescott New Instructions)." John. |
Send message Joined: 25 Jan 11 Posts: 179 Credit: 83,858 RAC: 0 |
"I don't know if the 32-bit apps were compiled that way." The app sent to my Linux 64-bit machine is 32-bit. Run the file command against it and see. Don't know if the Windows app for 64-bit arch is 32 or 64. "Applications:" The above means they support those archs, but it doesn't necessarily mean they have a 64-bit app for 64-bit arch. |
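If the file command is not to hand, the same 32-bit versus 64-bit question can be answered for a Linux executable by looking at the fifth byte of its ELF header, which is 1 for 32-bit and 2 for 64-bit. The Python sketch below shows just that check; the function name `elf_bitness` is made up for this example, and it does not handle Windows executables, which use a different format.

```python
def elf_bitness(header: bytes):
    """Return 32 or 64 for an ELF header, or None if it is not ELF."""
    # ELF files start with the magic bytes 0x7f 'E' 'L' 'F'.
    if len(header) < 5 or header[:4] != b"\x7fELF":
        return None
    # Byte 4 is the ELF class: 1 = 32-bit, 2 = 64-bit.
    return {1: 32, 2: 64}.get(header[4])


print(elf_bitness(b"\x7fELF\x01"))  # header bytes of a 32-bit ELF binary
print(elf_bitness(b"\x7fELF\x02"))  # header bytes of a 64-bit ELF binary
```

To check a real binary, open it in binary mode, read the first five bytes and pass them to the function.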
Send message Joined: 2 Sep 04 Posts: 209 Credit: 1,482,496 RAC: 0 |
"I don't know if the 32-bit apps were compiled that way." In the Windows 7 x64 Task Manager, a lot of programs have *32 next to them, including SixTrack; I assume that means 32-bit.
|
©2024 CERN