Message boards : Number crunching : Two instantaneous crashes
Joined: 16 Oct 07 Posts: 8 Credit: 344 RAC: 0
Hi, I've just downloaded two SixTrack 4.67 tasks onto this single-core AMD with BOINC 5.10.20. Both appear to have started simultaneously (I've never seen this happen on the single-core before) and both crashed simultaneously after 1 second of computing time. Has anyone any idea how two tasks can start together on a single core? (Both still show in my account as In progress, as this has only just occurred.) When I looked at the tasks in my account I noticed a second problem. I reckon I must have crunched a total of about 20 tasks for LHC, all recently, yet only 4 are shown: the 2 that have just crashed and 2 previous successes. The credit of 176 is, I think, correct, but of course it isn't consistent with only 2 completed tasks. Any ideas? http://lhcathome.cern.ch/lhcathome/hosts_user.php?show_all=1&sort=rpc_time
Joined: 27 Oct 07 Posts: 186 Credit: 3,297,640 RAC: 0
Hi. Re fewer results shown than you reckon you've crunched: yes, I was surprised by that when I joined LHC at the weekend. What I reckon is that the process called 'purging' - removing interim results from the BOINC database once validated - is set even more aggressively here than on other BOINC projects. I'm used to Einstein results remaining visible for a week or more, and SETI results for at least a day, but my first LHC result was history in well under 12 hours. And the simultaneous start - are you sure that they didn't run consecutively, but each for less than a second and within the same 'quantized' 1-second reporting interval? PS Did you mean these results? hosts_user.php shows the reader's results, not the poster's.
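(For context, a minimal sketch of the purge rule being described: once a workunit is fully validated and older than a retention window, its results stop appearing on the web pages. The field names and the 12-hour window below are assumptions for illustration, not BOINC's actual db_purge configuration.)

```python
from datetime import datetime, timedelta

# Illustrative only: a toy version of the 'purging' idea described above,
# not BOINC's real db_purge daemon. Field names and the retention window
# are assumptions.
PURGE_AGE = timedelta(hours=12)  # hypothetical retention window

def can_purge(workunit, now=None):
    """True once the workunit is assimilated, every result is validated,
    and the retention window has passed - at that point it disappears
    from the visible result pages."""
    now = now or datetime.utcnow()
    return (workunit["assimilated"]
            and all(r["validated"] for r in workunit["results"])
            and now - workunit["completed_at"] > PURGE_AGE)

# Example: a workunit validated a day ago would already be gone.
wu = {"assimilated": True,
      "completed_at": datetime.utcnow() - timedelta(days=1),
      "results": [{"validated": True}, {"validated": True}]}
print(can_purge(wu))  # True
```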
Joined: 14 Jul 05 Posts: 275 Credit: 49,291 RAC: 0
PS Did you mean these results? hosts_user.php shows the reader's results, not the poster's. Unless it has the userid explicitly: http://lhcathome.cern.ch/lhcathome/hosts_user.php?userid=86197
Joined: 16 Oct 07 Posts: 8 Credit: 344 RAC: 0
Sorry about giving the generic link and not my own. Yes, that's the right computer. On investigation, the 2 tasks do seem to have run consecutively and not together - two models over in 3 seconds according to the messages. In the Tasks window both crashes appeared to happen simultaneously. You can see I'm accustomed to crunching in slow motion on CPDN; I've never previously seen anything happen in a flash of lightning like this. Anyway, at least it means BOINC didn't misbehave (in this respect!). Thanks for the help.
Joined: 16 Oct 07 Posts: 8 Credit: 344 RAC: 0
The LHC server seems to consider these two 1-second results a great success: Over - Success - Done, exit code 0. I can find no crash messages or error codes, and the tasks didn't go through the Ready to report stage. Maybe they were bewitched on Hallowe'en.
Joined: 3 Jan 07 Posts: 124 Credit: 7,065 RAC: 0
The LHC server seems to consider these two 1-second results a great success: Over - Success - Done, exit code 0. I can find no crash messages or error codes, and the tasks didn't go through the Ready to report stage. These are just results where the simulation determined that the beam could not make it around the track. Sometimes that happens in less than a second, other times it may take a minute or two... I had 4 of these out of my allotted 10 so far today... Now, if only they could get them to validate instead of sitting as pending...
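(A task whose beam is lost on an early turn simply finishes almost immediately, which is why some results show only a second of CPU time. The toy loop below only illustrates that early-exit behaviour; the 'amplitude' update and all the numbers are invented and are not SixTrack's real beam dynamics.)

```python
import random

# Toy illustration of an early-exit tracking loop, not SixTrack itself.
APERTURE = 1.0         # hypothetical normalized aperture limit
MAX_TURNS = 1_000_000  # a full run would track many turns

def track(seed):
    rng = random.Random(seed)
    amplitude = rng.uniform(0.5, 1.05)  # invented starting amplitude
    for turn in range(1, MAX_TURNS + 1):
        amplitude *= rng.uniform(0.999, 1.002)  # stand-in for the real physics
        if amplitude > APERTURE:
            return turn        # particle lost: the task is over almost at once
    return MAX_TURNS           # particle survived the whole run

# Prints the turn on which this particle was lost (or MAX_TURNS if it survived).
print(track(seed=42))
```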
Joined: 16 Oct 07 Posts: 8 Credit: 344 RAC: 0
Thanks for the info, Brian. I'm a newbie on LHC and had no idea what could cause this. I've looked at the results from other members who received the same workunit, and exactly the same thing happened to them. I now know that my computer didn't behave badly, which is a relief.
Joined: 12 Sep 06 Posts: 13 Credit: 47,187 RAC: 0
Hello, quite remarkable I should say... The only strange event I ever recall is that my 8-CPU Compaq server crunched 9 WUs successfully. :0) The WUs were all from different projects. My server somehow gained a phantom physical CPU rather than a phantom/ghost WU. How, I don't know... Kind regards, John Gray
Joined: 7 Oct 06 Posts: 114 Credit: 23,192 RAC: 0
:) LoL! mo.v, you should have kept a backup copy ;) like we do at CPDN. Regards, Masud.
Joined: 16 Oct 07 Posts: 8 Credit: 344 RAC: 0
Jon Boy, if you mean an extra record of your computer on your web pages for the project, this extra record is generated when you restore a backup. To get rid of the superfluous record, go to the detailed page for one of the computer records; at the bottom you'll see the Merge button. This merges the different records of the same computer into one. Merge is better than Delete. When you've done that you need to update all your projects on all your computers to help the project servers work out again who you are. Superfluous records of the same computers were also generated by one of the new BOINC versions. These can't always be merged: only records with identical descriptions will merge. I have 2 descriptions of this same old computer; I expect they were generated by different versions of BOINC. Because the descriptions aren't identical, it's a waste of time trying to merge them.
Joined: 16 Oct 07 Posts: 8 Credit: 344 RAC: 0
KAMasud, I've never before seen anything so different from CPDN! Next time I'll know not to panic. My first 20 or so tasks from LHC ran perfectly.
Joined: 12 Sep 06 Posts: 13 Credit: 47,187 RAC: 0
Hello, no, I don't mean that at all... (or anything remotely near to what you're on about), even though I understand what you're on about. My server has 8 physical CPUs installed, so my server can run/crunch/work on 8 WUs at a time (1 CPU = 1 running instance). But for some reason unknown to me it was working on 9 WUs, all at the same time. Technically this is impossible - that's why it's strange. I could have 20 or 30 or 100 WUs in a list waiting, *** BUT it can only run 8 at a time *** as the server is 8-way. Kind regards, John
Joined: 16 Oct 07 Posts: 8 Credit: 344 RAC: 0
At first I thought both tasks were active together on the single core here, but when I looked in the BOINC Manager messages at the exact second when each event took place, it turned out that the first task crashed in one second, then the second task got the CPU to itself for the following second. Did you actually see 9 tasks listed as Running in the Tasks tab, all at the same time? Did any of the tasks crash or terminate in a flash? If so, I wonder whether your messages might reveal something similar to what happened here. But if you saw 9 x Running, that's most bizarre. (Maybe the manufacturer added a 9th core as a freebie!)
Joined: 12 Sep 06 Posts: 13 Credit: 47,187 RAC: 0
At first I thought both tasks were active together on the single core here, but when I looked in the BOINC Manager messages at the exact second when each event took place, it turned out that the first task crashed in one second, then the second task got the CPU to itself for the following second. Hello, yes indeed, 9 WUs were running in the Tasks tab all at the same time. They all completed successfully, uploaded without problems and received the customary credit. The 9 WUs concerned were all from different projects (LHC, SETI, Rosetta and so on). The only reason I even noticed was that I was waiting for them all to finish so I could do my annual maintenance on them and shut them down for a bit of an economy drive (electric) :0) I have:
2 x 8-CPU Compaq data centre servers
7 x dual-core Compaq workstations
1 x quad-core (2 physical + 2 virtual Hyper-Threading) Compaq workstation
34 CPUs altogether.
Joined: 24 Nov 06 Posts: 76 Credit: 7,953,478 RAC: 10
The 9 WUs concerned were all from different projects (LHC, SETI, Rosetta and so on). Was one of the projects DepSpid? Dublin, California Team: SETI.USA
Joined: 12 Sep 06 Posts: 13 Credit: 47,187 RAC: 0
The 9 WUs concerned were all from different projects (LHC, SETI, Rosetta and so on). Hello, no, sorry, I don't recall exactly which 9 they were, but it was around the time I started crunching for/added Superlink. Kind regards, John
Joined: 24 Nov 06 Posts: 76 Credit: 7,953,478 RAC: 10
The 9 WUs concerned were all from different projects (LHC, SETI, Rosetta and so on). "No, it wasn't DepSpid", or "No, you don't remember"? I ask because DepSpid is a non-CPU-intensive application, which means you can run up to 10 DepSpid tasks at the same time while other projects run one task per core. So with an 8-core machine you can be running up to 18 tasks at the same time (8 normal + 10 DepSpid). Dublin, California Team: SETI.USA
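(A back-of-the-envelope check of that arithmetic, assuming only what the post states: one running task per core for CPU-intensive projects, plus a cap of 10 concurrent non-CPU-intensive DepSpid tasks. This is not a model of the real BOINC scheduler.)

```python
# Rough arithmetic only; the per-core rule and the 10-task DepSpid cap
# are taken from the post above, not from BOINC scheduler code.
def max_concurrent_tasks(cores, non_cpu_intensive_cap=10):
    # CPU-intensive projects run one task per core; non-CPU-intensive
    # tasks (such as DepSpid) run alongside them.
    return cores + non_cpu_intensive_cap

print(max_concurrent_tasks(8))  # 18 = 8 normal tasks + up to 10 DepSpid tasks
print(max_concurrent_tasks(1))  # 11 on a single-core machine
```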