Thread 'most unpolite host of the day'

Author	Message
bronco Send message Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0	Message 36085 - Posted: 27 Jul 2018, 14:00:59 UTC Check this one. Close to 600 bad tasks in 7 days. Each one of those blown tasks is a ~350MB download. Why does LHC allow this to go on when they can cut off such users? Maybe LHC has too much bandwidth and needs to burn some off? Meanwhile don't drain your ATLAS task cache because it will take 2 hours of constant hammering on the server to get more tasks. ID: 36085 · Reply Quote
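
For scale, here is a rough sketch of the bandwidth those figures imply. It takes the post's numbers at face value (about 600 failed tasks in 7 days, roughly 350 MB downloaded per task) and assumes decimal units (1 GB = 1000 MB); the function name is made up for illustration.

```python
def wasted_download_gb(failed_tasks: int, mb_per_task: float) -> float:
    """Total download volume, in GB, spent on tasks that later failed.

    Assumes every failed task triggered a full download of mb_per_task.
    """
    return failed_tasks * mb_per_task / 1000.0


# Figures quoted in the post: ~600 bad tasks in 7 days, ~350 MB each.
weekly_gb = wasted_download_gb(600, 350)
daily_gb = weekly_gb / 7

print(f"about {weekly_gb:.0f} GB per week, {daily_gb:.0f} GB per day for one host")
```

On those assumptions a single misbehaving host accounts for roughly 210 GB of downloads a week.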

Yeti Volunteer moderator Send message Joined: 2 Sep 04 Posts: 468 Credit: 224,935,712 RAC: 887	Message 36174 - Posted: 1 Aug 2018, 14:47:21 UTC - in response to Message 36085. Check this one. Close to 600 bad tasks in 7 days. Each one of those blown tasks is a ~350MB download. Why does LHC allow this to go on when they can cut off such users? Maybe LHC has too much bandwidth and needs to burn some off? Meanwhile don't drain your ATLAS task cache because it will take 2 hours of constant hammering on the server to get more tasks. Something seems to be broken, because the server should limit the number of WUs for hosts like this to 1 per day Supporting BOINC, a great concept ! ID: 36174 · Reply Quote

Magic Quantum Mechanic Send message Joined: 24 Oct 04 Posts: 1320 Credit: 98,906,977 RAC: 117,488	Message 36192 - Posted: 2 Aug 2018, 1:01:40 UTC Yeah, in the past if you got that many Invalids in a row you got that 24-hour delay before getting new tasks. Looking at those Invalids and the computer info, I would say he tried running too many ATLAS tasks at the same time with not enough RAM (15.9 GB). It also could be OC'd, since I noticed that K (i7-5820K). As we know, when people have hosts with 12 threads they like to run them all at the same time at 100% CPU and memory (I always check the Task Manager when running all the threads to make sure memory isn't maxed out). But it seems to be taking a break now: https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10458080 ID: 36192 · Reply Quote

bronco Send message Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0	Message 36194 - Posted: 2 Aug 2018, 1:40:35 UTC And that seems to be just the tip of the iceberg. Poking around the other day I ran across a user with a name like "gridcoin" who has ~400 hosts. Looking through his results for just 10 minutes I saw ~2,000 failed ATLAS tasks on ~5 hosts. I bet there's minimum 1,000 hosts doing that. Anybody else see now why dl speed is down to a trickle? We've heard from one admin who claims their throughput is the same... well, yeah.... but a significant portion of that seems to be tasks going to hosts that blow the task in 10 minutes and then get another task, rinse and repeat 100 times a day. ID: 36194 · Reply Quote

vseven Send message Joined: 22 Jan 18 Posts: 32 Credit: 2,756,359 RAC: 0	Message 36201 - Posted: 2 Aug 2018, 13:08:22 UTC I hate to admit it but I have a bad host myself that I cannot figure out. I haven't said anything because it does report valid tasks, but it's like 60% valid to 40% invalid. It's a 24-thread machine with 64 GB of RAM running three 8-core WUs at a time. The weird thing is that in the stderr output for the invalids it says everything was successful. Going to try reducing the number of cores allowed to 22 so there are some spares for the machine itself and see if that makes a difference. My other two hosts have 1 or 2 invalids to hundreds of valid tasks, which is what doesn't make sense to me. Same version of BOINC, VirtualBox, OS, etc. ID: 36201 · Reply Quote
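
A small sketch of what the planned change would do to concurrency. It assumes the client only starts an 8-core task when all 8 cores fit inside the allowed thread budget, and that no other project picks up the leftover threads; both are simplifying assumptions, and the function name is hypothetical.

```python
from typing import Tuple


def concurrent_tasks(total_threads: int, allowed_threads: int,
                     cores_per_task: int) -> Tuple[int, int]:
    """Return (tasks running at once, host threads left free).

    Assumption: a task starts only if all of its cores fit in the budget.
    """
    tasks = allowed_threads // cores_per_task
    free_threads = total_threads - tasks * cores_per_task
    return tasks, free_threads


# Current setup from the post: all 24 threads allowed, 8-core work units.
print(concurrent_tasks(24, 24, 8))
# Planned setup: only 22 of the 24 threads allowed.
print(concurrent_tasks(24, 22, 8))
```

Under those assumptions, limiting the host to 22 threads does not leave just 2 spare threads: only two 8-core tasks fit, so 8 threads would sit free unless the core count per task is changed as well.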

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Send message Joined: 15 Jun 08 Posts: 2760 Credit: 305,624,080 RAC: 128,443	Message 36203 - Posted: 2 Aug 2018, 13:35:54 UTC - in response to Message 36201. As mentioned in another thread, you may want to consider making your logs visible to other volunteers. This would make it easier to give a qualified answer. To do so, navigate to https://lhcathome.cern.ch/lhcathome/prefs.php?subset=project and check the box near "Should LHC@home show your computers on its web site?". ID: 36203 · Reply Quote

Toby Broom Volunteer moderator Send message Joined: 27 Sep 08 Posts: 958 Credit: 785,252,087 RAC: 106,912	Message 36205 - Posted: 2 Aug 2018, 15:57:19 UTC - in response to Message 36194. I think for Gridcoin it's because the users are automatically added to this project without really knowing, so they don't realize they need VirtualBox. ID: 36205 · Reply Quote

bronco Send message Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0	Message 36207 - Posted: 2 Aug 2018, 17:17:48 UTC - in response to Message 36205. I think for Gridcoin it's because the users are automatically added to this project without really knowing, so they don't realize they need VirtualBox. OIC... gullible users auto added to difficult project by some misguided, uncaring group admin. What could possibly go wrong with that? He needs a PM from bronco. ID: 36207 · Reply Quote

vseven Send message Joined: 22 Jan 18 Posts: 32 Credit: 2,756,359 RAC: 0	Message 36208 - Posted: 2 Aug 2018, 17:45:52 UTC - in response to Message 36203. Last modified: 2 Aug 2018, 17:46:26 UTC As mentioned in another thread, you may want to consider making your logs visible to other volunteers. This would make it easier to give a qualified answer. To do so, navigate to https://lhcathome.cern.ch/lhcathome/prefs.php?subset=project and check the box near "Should LHC@home show your computers on its web site?". I know how to but I'd rather not expose all of my machines. But I can post examples. Marked valid: https://lhcathome.cern.ch/lhcathome/result.php?resultid=203431393 Marked invalid: https://lhcathome.cern.ch/lhcathome/result.php?resultid=203456578 I think for Gridcoin it's because the users are automatically added to this project without really knowing, so they don't realize they need VirtualBox. OIC... gullible users auto added to difficult project by some misguided, uncaring group admin. What could possibly go wrong with that? He needs a PM from bronco. As someone that is on team Gridcoin I can tell you nothing is automatically added. Now if they use "Charity Engine" then yes, projects are added without the user's knowledge. It's a nice scam the Charity Engine people have going. ID: 36208 · Reply Quote

PDW Send message Joined: 7 Aug 14 Posts: 27 Credit: 10,001,506 RAC: 0	Message 36209 - Posted: 2 Aug 2018, 18:31:22 UTC - in response to Message 36208. I know how to but I'd rather not expose all of my machines. But I can post examples. Marked valid: https://lhcathome.cern.ch/lhcathome/result.php?resultid=203431393 Marked invalid: https://lhcathome.cern.ch/lhcathome/result.php?resultid=203456578 How long has ATLAS been running valid tasks with less than 3 minutes CPU time for 293.50 credits ? I thought they took many hours for about twice that much credit ! ID: 36209 · Reply Quote

computezrmle Volunteer moderator Volunteer developer Volunteer tester Help desk expert Send message Joined: 15 Jun 08 Posts: 2760 Credit: 305,624,080 RAC: 128,443	Message 36210 - Posted: 2 Aug 2018, 18:39:58 UTC - in response to Message 36208. OK. Now I see what happened - I would call it cheating. Sorry, in this case there's no support from my side. ID: 36210 · Reply Quote

Toby Broom Volunteer moderator Send message Joined: 27 Sep 08 Posts: 958 Credit: 785,252,087 RAC: 106,912	Message 36211 - Posted: 2 Aug 2018, 18:46:50 UTC - in response to Message 36208. Sorry, I read something over on the main BOINC forums about issues with Gridcoin and the use of account managers in a non-standard manner. Looks like I made a bad assumption. ID: 36211 · Reply Quote

PDW Send message Joined: 7 Aug 14 Posts: 27 Credit: 10,001,506 RAC: 0	Message 36212 - Posted: 2 Aug 2018, 18:58:59 UTC - in response to Message 36211. So that host https://lhcathome.cern.ch/lhcathome/results.php?hostid=10522400&offset=0&show_names=0&state=4&appid= at the moment has 785 VALID tasks that used about 3 minutes of CPU time for a credit of about 300 each. Would an Admin like to confirm that they are valid results please ? I see the VMs have 8 CPUs assigned but the run time is barely enough to spin up the VM let alone do any valid work. ID: 36212 · Reply Quote

vseven Send message Joined: 22 Jan 18 Posts: 32 Credit: 2,756,359 RAC: 0	Message 36214 - Posted: 2 Aug 2018, 19:52:58 UTC - in response to Message 36212. Last modified: 2 Aug 2018, 19:55:33 UTC I have a second host with 40 threads that runs five 8-CPU WUs at a time, also with plenty of RAM and disk space. It has 380 valid and 3 invalid. WUs take anywhere from 400 seconds up to one I see at 32000 seconds. I even ran a "reset project" on both hosts to make sure everything was correct and both re-downloaded the VDI file fresh. Same results on both. OK. Now I see what happened - I would call it cheating. Sorry, in this case there's no support from my side. So attaching to a project, letting it download its own files, and letting it run with zero modifications to anything is cheating? How so? ID: 36214 · Reply Quote

PDW Send message Joined: 7 Aug 14 Posts: 27 Credit: 10,001,506 RAC: 0	Message 36215 - Posted: 2 Aug 2018, 20:15:15 UTC - in response to Message 36214. So this one isn't yours then ? https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10522402 It only has 136 valid and 3 invalid and 77 error. That is also returning valid results with low run times although it has done at least one I would expect to be a real valid result... https://lhcathome.cern.ch/lhcathome/show_host_detail.php?hostid=10522402 Its run time was 10,495.46 seconds and CPU time of 72,408.16 seconds for a credit of 295.61. At least its credit return is much smaller for the short runs. For the record I am not accusing you of cheating, I do not know either way if you are or not. My concern is that the project is happily marking results as valid with a high credit score when I don't believe they should be valid. I will wait for an Admin to respond, they may not appear until working hours though. It is a project oversight problem, I don't know if: 1) They don't care 2) They don't know how to check 3) They don't have time to check No idea what science your 'valid' results might be contributing, if an Admin comes back and says they are okay I won't argue any further but I don't see why everyone else is taking so many hours for so little credit. ID: 36215 · Reply Quote
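
To put numbers on that mismatch, here is a rough comparison of credit earned per CPU-hour, using only figures quoted in this thread: 293.50 credits for a task with under 3 minutes of CPU time, and 295.61 credits for a task with 72,408.16 seconds of CPU time. Treating the short task as exactly 180 seconds is an assumption (the posts say "less than 3 minutes"), so its rate is a lower bound; the function name is invented for the example.

```python
def credit_per_cpu_hour(credit: float, cpu_seconds: float) -> float:
    """Credits granted per hour of CPU time actually consumed."""
    return credit / (cpu_seconds / 3600.0)


# Short "valid" task: 293.50 credits, assumed 180 s of CPU time (upper limit).
short_rate = credit_per_cpu_hour(293.50, 180.0)
# Long task from this post: 295.61 credits for 72,408.16 s of CPU time.
long_rate = credit_per_cpu_hour(295.61, 72408.16)

print(f"short task: {short_rate:.0f} credits per CPU-hour")
print(f"long task:  {long_rate:.1f} credits per CPU-hour")
print(f"ratio:      {short_rate / long_rate:.0f}x")
```

On these figures the short tasks earn several hundred times more credit per CPU-hour than the one long task, which is the gap being questioned here.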

vseven Send message Joined: 22 Jan 18 Posts: 32 Credit: 2,756,359 RAC: 0	Message 36216 - Posted: 2 Aug 2018, 20:18:36 UTC - in response to Message 36215. I don't know either which is why I posted when I saw this thread. Lol....if I was somehow cheating why would I draw attention to myself? I was more hoping to figure out why I had so many invalid results so I could fix it....if it can be fixed. Like I said I've done nothing to modify anything and I've tried resetting the project which didn't change anything. Only difference is the one returning a lot of invalid is an older server (5 years old) which to me would mean it should be overall slower. The one with almost all valid results is newer (2 years old). ID: 36216 · Reply Quote

PDW Send message Joined: 7 Aug 14 Posts: 27 Credit: 10,001,506 RAC: 0	Message 36217 - Posted: 2 Aug 2018, 20:27:11 UTC - in response to Message 36216. I understand what you are saying but if you look at other users who don't hide their computers and see their Atlas results you will quickly realise that your valid results do not look normal. Don't bother looking at my computers, they are hidden but I haven't run any tasks for a while, I just came back to see what state the project was in and if it was worth running tasks again. ID: 36217 · Reply Quote

bronco Send message Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0	Message 36218 - Posted: 2 Aug 2018, 23:05:47 UTC - in response to Message 36208. I know how to but I'd rather not expose all of my machines. But I can post examples. Marked valid: https://lhcathome.cern.ch/lhcathome/result.php?resultid=203431393 It's marked valid, but if you use your web browser's search function (ctrl-f) to search the stderr output for "HITS" you will notice it finds nothing. That means 2 things: 1) it did not record an EVTtoHITS error (error 165), which would have been a sure indication that the task failed to return any useful work 2) it did not record the usual confirmation of a successful EVTtoHITS conversion So the stderr output is inconclusive regarding HITS. Fortunately there is another way to learn whether or not a HITS file was generated/returned. Search for "pandaid" in the stderr and for the task above you'll find this: Starting ATLAS job. (PandaID=4008322627 taskID=14661314) You can use the PandaID code in that line, in this case 4008322627, to search the panda database for a definitive answer to whether or not your task succeeded. Combine the code with the base URL to get the full URL like this: https://bigpanda.cern.ch/job?pandaid=4008322627 The first time you do it you'll be asked to register a username and password. It's free and easy. Then the above URL will take you to the panda record for your result and you will notice it actually failed despite LHC@home saying it validated. Marked invalid: https://lhcathome.cern.ch/lhcathome/result.php?resultid=203456578 That one has no mention of HITS nor does it have the "Starting ATLAS job. (PandaID=" line, so naturally it failed and was appropriately marked invalid. As someone that is on team Gridcoin I can tell you nothing is automatically added. Now if they use "Charity Engine" then yes, projects are added without the user's knowledge. It's a nice scam the Charity Engine people have going. Yes, I saw warnings about it on the BOINC forums.
Seems people take great pride in being deplorable these days. Whatever. Now that I know your userID I'll plug it into my script and see if you have more fubar hosts. Not accusing you of cheating. If you are then I think maybe I know how you're doing it and I really couldn't care less. The credits are meaningless. I care only about the science being done and how much bandwidth fubar hosts might be wasting. ID: 36218 · Reply Quote
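
The manual check described in this post (search the stderr for "HITS", find the PandaID, build the bigpanda URL) could be scripted roughly as follows. This is only a sketch: it assumes you have already saved a task's stderr output as text, it relies on the "PandaID=<digits>" line format shown above, and the function names are made up for illustration. It does not fetch anything from bigpanda, which asks for a login.

```python
import re
from typing import Optional

# Line format taken from the stderr excerpt quoted in the post.
PANDA_ID_PATTERN = re.compile(r"PandaID=(\d+)", re.IGNORECASE)
# Base URL taken from the post.
BIGPANDA_JOB_URL = "https://bigpanda.cern.ch/job?pandaid={}"


def mentions_hits(stderr_text: str) -> bool:
    """True if the stderr output mentions HITS at all (error or success)."""
    return "HITS" in stderr_text


def find_panda_id(stderr_text: str) -> Optional[str]:
    """Return the first PandaID found in the stderr text, or None."""
    match = PANDA_ID_PATTERN.search(stderr_text)
    return match.group(1) if match else None


def bigpanda_url(stderr_text: str) -> Optional[str]:
    """Build the bigpanda job URL for the task, or None if no PandaID."""
    panda_id = find_panda_id(stderr_text)
    return BIGPANDA_JOB_URL.format(panda_id) if panda_id else None


# The line quoted in the post for the task that was marked valid.
sample = "Starting ATLAS job. (PandaID=4008322627 taskID=14661314)"
print(mentions_hits(sample))
print(bigpanda_url(sample))
```

A stderr with no PandaID line at all yields None, which matches the second example in the post: no "Starting ATLAS job" line, so there is nothing to look up.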

bronco Send message Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0	Message 36219 - Posted: 2 Aug 2018, 23:16:28 UTC - in response to Message 36216. Lol....if I was somehow cheating why would I draw attention to myself? Not accusing you of cheating, just answering the question you asked. Some cheaters need to do more than just cheat. Their sick little egos drive them to make sure everybody else knows they are cheating. It makes them feel powerful and in control when they know that everybody knows they're cheating and cannot be stopped. ID: 36219 · Reply Quote

bronco Send message Joined: 13 Apr 18 Posts: 443 Credit: 8,438,885 RAC: 0	Message 36220 - Posted: 3 Aug 2018, 1:27:08 UTC - in response to Message 36218. I care only about the science being done and how much bandwidth fubar hosts might be wasting. Actually I also care very much that users come here, volunteer their computers and pay for the electricity thinking that the money they're spending is accomplishing something worthwhile and trusting that LHC@home will do what other projects do and tell them when they are wasting their time and money. Instead they smile and lie and say "Valid" when nothing useful has been done at all. That's a breach of trust, plain and simple. Like I said in another thread, no wonder nobody trusts scientists anymore. ID: 36220 · Reply Quote