Message boards : Number crunching : Host messing up tons of results
Joined: 2 Sep 04 · Posts: 455 · Credit: 201,305,146 · RAC: 4,378
"If units failed due to timeout, you put other users' jobs in the waste bin, as they lost their already executed tasks because of your timeout, I think. Admins, correct me if that's not the case."
No, users don't lose their tasks. The WU will be sent out again to a different user, but the project needs it back urgently and has to wait for it, and wait, and wait. Not good for the project, because it needs the results quickly.
Supporting BOINC, a great concept !
Joined: 7 Feb 14 · Posts: 99 · Credit: 5,180,005 · RAC: 0
32 tasks is the limit for my host. 32 tasks lasting ~80 s each (like this one) would finish in about 320 s. A larger number reduces the probability of getting only flash-tasks. Is there a way to know how long a task will take before it runs? Another reason is that I've often seen there are not many tasks available. Although I do a little "bunkering", I finish my work before the deadline (except that one time). P.S. The errors on the other machine (ID: 10356455) are due to a Windows 8 failure after an update, so there was no chance to cancel them. ;) [/OT]
Joined: 29 Nov 13 · Posts: 59 · Credit: 4,012,100 · RAC: 0
Setting a week's worth of work is really too long, as you discovered when you got more 8-hour WUs; I usually set a cache of 3-4 days. The only way I know to see how long a WU is going to take is to look at the 'remaining' time in the tasks list, so no, you can't 'cherry-pick' the longer WUs. And yeah, LHC often runs out of WUs, that's normal ;), which is why I have it running alongside other projects.
Team AnandTech - WCG, Uni@H, F@H, MW@H, Ast@H, LHC@H, R@H, CPDN, E@H. Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RTX 3060 Ti 8GB, Win10 64bit 2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64
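If you want to check those estimates without opening the manager, here is a minimal sketch that reads the "estimated CPU time remaining" field from the local BOINC client. It assumes boinccmd is on the PATH and the client accepts local RPC; field labels can vary slightly between BOINC versions.

```python
# Sketch: print each task's estimated remaining run time as reported by the
# local BOINC client. Assumes boinccmd is on the PATH and local RPC works.
import subprocess

out = subprocess.run(["boinccmd", "--get_tasks"],
                     capture_output=True, text=True, check=True).stdout

task = None
for raw in out.splitlines():
    line = raw.strip()
    if line.startswith("name:"):
        task = line.split(":", 1)[1].strip()
    elif line.startswith("estimated CPU time remaining:"):
        seconds = float(line.split(":", 1)[1])
        print(f"{task}: ~{seconds / 3600:.1f} h remaining")
```

Note that these are only the client's estimates; they say nothing about which WUs will actually turn out to be long runners.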
Joined: 12 Jul 11 · Posts: 857 · Credit: 1,619,050 · RAC: 0
"32 tasks is the limit for my host. 32 tasks lasting ~80 s each (like this one) would finish in about 320 s. A larger number reduces the probability of getting only flash-tasks. Is there a way to know how long a task will take before it runs?"
Don't worry too much; it is just that, as already discussed, cancelling WUs slows down the study because the rerun tasks go to the end of the queue.
Eric.
Joined: 12 Mar 12 · Posts: 128 · Credit: 20,013,377 · RAC: 0
A new host is generating inconclusives, mostly 2-10 seconds: http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=9934863
Grubix, and this one is yours: http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=9874961&offset=0&show_names=0&state=3&appid=
Let's see what we can do about it.
Joined: 29 Sep 04 · Posts: 281 · Credit: 11,866,264 · RAC: 0
14 errors for me on this host, apparently caused by CMS-dev. SixTrack WUs fail with "LHC@home 1.0 | Aborting task ... exceeded disk limit: 6860.42MB > 572.20MB" before they get a chance to start. CMS appears to run fine, and I'm not seeing any debris left behind after it finishes.
Joined: 3 Jul 08 · Posts: 20 · Credit: 8,281,604 · RAC: 0
Hi Costa, thanks for the hint. I also saw it last night and was very surprised. Yesterday I could not do anything, because the computer is not at my home. This morning I turned the computer off for a few minutes, but the error remained. Almost none of the invalid tasks have "pni" or "sse"; the tasks with "pni" or "sse" are valid. Eric, if the host is interesting for your diagnostics, feel free to give me appropriate instructions, but I only have access to the host on weekdays. Bye, Grubix.
Joined: 12 Mar 12 · Posts: 128 · Credit: 20,013,377 · RAC: 0
Check that you have enough disk space, and check that there are no other CPU-consuming tasks. After Eric has replied to you, you might also reset the project, reinstall BOINC, etc.
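For the reset step, a minimal sketch using the standard BOINC command-line tool. It assumes boinccmd is on the PATH and that the example URL below (taken from the result links in this thread) matches the project URL shown in your own client.

```python
# Sketch: reset a BOINC project from the command line.
# Assumes boinccmd is on the PATH; check your own BOINC manager for the
# exact project URL, the one below is only an example from this thread.
import subprocess

PROJECT_URL = "http://lhcathomeclassic.cern.ch/sixtrack/"

# "reset" re-downloads all project files and discards tasks in progress,
# so report any finished work first if you can.
subprocess.run(["boinccmd", "--project", PROJECT_URL, "reset"], check=True)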
Joined: 29 Nov 13 · Posts: 59 · Credit: 4,012,100 · RAC: 0
Other CPU-intensive tasks shouldn't cause any errors.
Team AnandTech - WCG, Uni@H, F@H, MW@H, Ast@H, LHC@H, R@H, CPDN, E@H. Main rig - Ryzen 3600, MSI B450 Gm Pro C AC, 32GB DDR4 3200, RTX 3060 Ti 8GB, Win10 64bit 2nd rig - i7 4930k @4.1 GHz, 16 GB DDR3 1866, HD 7870XT 3GB(DS), Win7 64
Joined: 12 Jul 11 · Posts: 857 · Credit: 1,619,050 · RAC: 0
Hi Ray; it seems more than a coincidence that at least one, maybe two, other clients have reported the same error. It seems to be cross-talk with CMS or CMS-dev...
Eric.
Joined: 12 Mar 12 · Posts: 128 · Credit: 20,013,377 · RAC: 0
Half inconclusive, a quarter invalid, only 10% valid: http://lhcathomeclassic.cern.ch/sixtrack/results.php?hostid=10137504
Joined: 29 Sep 04 · Posts: 281 · Credit: 11,866,264 · RAC: 0
I have noticed that when CMS finishes, it doesn't quite do a full cleanup and leaves a disk image (.vdi) in the slot. I guess that SixTrack tries to use that slot, thinking it is empty, but finds the .vdi, which pushes it over the permitted size and causes the error. I have reset LHC 1.0 just in case anything else may have become corrupted, but I'll need to wait for the next batch of work to see if that fixes it. Until a fix can be found, I'll only let CMS and SixTrack run when the other isn't running, and will delete the relevant slot to clear the debris. Where I did this on one machine this morning after a CMS WU finished, SixTrack is running fine again. For now, the two projects don't seem to play well together, so I will keep them separated.
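A quick way to spot that kind of leftover image, as a rough sketch: list any .vdi files still sitting in the BOINC slot directories. The data-directory path below is an assumption (the Windows default); adjust it for your installation and review the output by hand before deleting anything.

```python
# Sketch: list leftover VirtualBox disk images (.vdi) in BOINC slot folders.
# The data directory below is the Windows default; on many Linux installs it
# is /var/lib/boinc-client. Review the list by hand before deleting anything.
from pathlib import Path

BOINC_DATA = Path(r"C:\ProgramData\BOINC")      # assumed default location
for vdi in (BOINC_DATA / "slots").glob("*/*.vdi"):
    size_mb = vdi.stat().st_size / (1024 * 1024)
    print(f"{vdi}  ({size_mb:.0f} MB)")
```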
Joined: 12 Jul 11 · Posts: 857 · Credit: 1,619,050 · RAC: 0
Thanks Ray; I have forwarded your message to admins.
Eric.
Joined: 12 Jul 11 · Posts: 857 · Credit: 1,619,050 · RAC: 0
See Message 27392 from Ray and my reply.
Eric.
Joined: 12 Jul 11 · Posts: 857 · Credit: 1,619,050 · RAC: 0
Thanks Magic; somehow I missed the significance of your message. I had rather a busy time when it arrived. Ray confirms.
Eric.
Joined: 24 Oct 04 · Posts: 1182 · Credit: 55,459,516 · RAC: 44,763
"Thanks Magic; somehow I missed the significance of your message."
Any time, Eric. I did wonder if anyone would catch that, and just now remembered to check back here. As for those .vdi's hanging around in VirtualBox, that also happens often with ATLAS tasks, but since I am used to that I clean it up every day, because I run ATLAS/VBox and vLHC/VBox 24/7. I did run some CMS, and the first 3 tasks worked but started to mess around with all my other tasks (except the vLHC ones, for some reason). Then, with a new wrapper, the next 2 CMS tasks failed when I started them with my other tasks running, so I decided to stop testing CMS for now, but it does appear that CMS will run if you aren't running other tasks like I do. When I stopped running CMS I was still getting other ATLAS tasks and even GPU tasks to crash; what I ended up doing to fix that was a complete removal of all files connected to CMS on this PC, and after that everything went back to normal. In fact I added some LHC tasks (since this is an 8-core) and ran those at the same time, and it was all back to normal. (I guess you have to email me to get me to stay up to date sometimes.)
Volunteer Mad Scientist For Life
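For the VirtualBox side of that cleanup, a hedged sketch that only lists what is registered (VMs and hard disks) so orphaned images can be reviewed by hand; it assumes VBoxManage is on the PATH and deliberately deletes nothing.

```python
# Sketch: list registered VirtualBox VMs and hard disks so leftover images
# from VM-based tasks (CMS, ATLAS, vLHC) can be checked by hand.
# Assumes VBoxManage is on the PATH; nothing is deleted here.
import subprocess

def vbox(*args):
    return subprocess.run(["VBoxManage", *args],
                          capture_output=True, text=True, check=True).stdout

print("Registered VMs:\n" + vbox("list", "vms"))
print("Registered hard disks:\n" + vbox("list", "hdds"))

# Once you are certain a disk is orphaned, it can be removed manually with:
#   VBoxManage closemedium disk <uuid-or-path> --delete
```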
Joined: 12 Jul 11 · Posts: 857 · Credit: 1,619,050 · RAC: 0
Thanks AGAIN, and especially for the additional info. I was just too lazy/busy to mail you personally... :-) Will do so in the future.
Eric.
Joined: 11 Sep 08 · Posts: 25 · Credit: 384,225 · RAC: 0
Thanks for the hint. I will be trying this in a few days, as I am overloaded with work right now. Strange, the CPU BIOS settings are the same, but I will try. The machine is now all 64-bit, including the apps. I will try compatibility settings. Hope it works; I will get back to you with the results.
William C Wilson São Paulo Brazil