Microsoft KB3206632 from 16/12/15
Author | Message |
---|---|
Send message Joined: 2 May 07 Posts: 2240 Credit: 173,894,884 RAC: 3,757 |
Microsoft installed KB3206632 on 16/12/15. For it there is now a Cumulative Update for Windows 10 Version 1607 for x64-based systems from 16/12/20. I ran a test with Theory_2016_11_02.vdi and it is working now! https://support.microsoft.com/de-de/help/4004227/windows-10-update-kb3206632 |
Send message Joined: 14 Jan 10 Posts: 1417 Credit: 9,440,106 RAC: 1,109 |
Was that causing the problems in the "Missing heartbeat file errors" thread? |
Send message Joined: 12 Feb 14 Posts: 72 Credit: 4,639,155 RAC: 0 |
That update is already installed on my system, and Theory Simulation is not working on my machine. I am wondering if this update broke Theory Simulation, because my computer was able to handle Theory Simulation before that update. It fixed a DHCP bug, but might have broken VirtualBox's ability to perform twice NAT on DNS requests originating from VirtualBox guests when it is set to use what it calls its DNS proxy mode. Are you saying that uninstalling this update allowed Theory Simulation to work? |
Send message Joined: 14 Jan 10 Posts: 1417 Credit: 9,440,106 RAC: 1,109 |
Did you also install this update: https://support.microsoft.com/en-us/kb/3213522 |
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
Looking at KB3206632 it states:
That update is KB3213522 and it states: This update fixes an issue that was introduced in the December 13, 2016 release (KB3206632) in which virtualization-based security (VBS) does not start, and features that rely on VBS, such as Credential Guard and shielded virtual machines (VMs), stop functioning. It seems that these are the cause and solution. Please let us know if you have applied the KB3206632 update and are still getting errors. |
Send message Joined: 12 Feb 14 Posts: 72 Credit: 4,639,155 RAC: 0 |
KB3206632 was applied as a routine Windows Update patch after I reformatted my computer. I had errors after that patch. Applying KB3213522 fixed the fault and allowed the work unit to start processing normally. I think that this fix should be put up on the news section so that others with Windows 10 faults like this one will be able to find the fix. |
Send message Joined: 24 Oct 04 Posts: 1172 Credit: 54,685,889 RAC: 15,649 |
That made no difference on my 4 Windows 10 crunchers here......Valid tasks before and after. |
Send message Joined: 28 Dec 08 Posts: 339 Credit: 4,863,195 RAC: 602 |
I am running the 64bit version of Win10 with KB3206632 and there is no difference in my system either. The project seems to be running fine. |
Send message Joined: 2 May 07 Posts: 2240 Credit: 173,894,884 RAC: 3,757 |
I have three AMD PCs, and from 16/12/15 until yesterday they ended LHC tasks (CMS or LHC) after 10 or 11 minutes. Since yesterday's KB update they run LHC tasks correctly. Of course, this is with https://lhcathome.cern.ch/lhcathome and not with the old vLHCathome address. I don't know why this KB update helps with this issue. I hope this update will help the LHC project. |
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
I made the update on my Windows 10 PC after having suspended all LHCb tasks and I lost one of them. They are now running again, together with a SETI@home Beta GPU task on my GTX 1050 board with its Pascal microprocessor, very fast. Tullio |
Send message Joined: 27 Sep 08 Posts: 844 Credit: 690,810,297 RAC: 110,512 |
I have had the update since 14th Dec and never saw any issue on my computer(s). |
Send message Joined: 24 Oct 04 Posts: 1172 Credit: 54,685,889 RAC: 15,649 |
Since I have never used Linux or Mac, maybe I should give tips on those OSes. In my 6 years doing these VB tasks I have seen Windows tips coming from people who only run Linux and never look at the facts that we have in the stats page here for the hosts that do most of the work... but I have only been here for 13 years, so what do I know? This info on how Windows 10 updates work obviously is not from anyone who actually tested it on a Windows 10 OS. I officially quit. Volunteer Mad Scientist For Life |
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
I get automatic updates on my Windows 10 PC, but this time I went looking for it using Cortana, then downloaded and installed it. I am a Linux user, and SuSE asks me if I want to install updates. Last time an update installed a Tumbleweed Beta OS on my Leap 42.1, which also erased the contents of a USB cartridge containing data that I wanted to save. So Windows is not the only culprit. Tullio |
Send message Joined: 21 Aug 07 Posts: 46 Credit: 1,503,835 RAC: 0 |
Computer 9926211 has been getting heartbeat errors for quite some time. I saw the notice about KB3206632 earlier this week and installed the KB3213522 update on 1/9/2017. But, unfortunately, the update seemed to have had no effect on the problems with Computer 9926211 (an AMD processor). OTOH, Computer 10342612 (an Intel processor) had also been getting heartbeat errors (but only occasionally) prior to the update. And since then, all VBOX tasks have finished OK. |
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
What do you see in the console? |
Send message Joined: 21 Aug 07 Posts: 46 Credit: 1,503,835 RAC: 0 |
What do you see in the console? I haven't looked at the console while failed tasks have been in process. All I can say is that all VBOX tasks are failing on Computer 9926211 and they all seem to process OK for 10 to 15 minutes before getting the heartbeat error. I'll be happy to download one and watch it on the console if you will tell me specifically what you are interested in. |
Send message Joined: 27 Sep 08 Posts: 844 Credit: 690,810,297 RAC: 110,512 |
I looked into my computer a little and I see the following. Overall error rate of 16%: CMS = 16%, LHCb = 25%, SixTrack = 1% and Theory = 15%. I took the last 20 errors: Theory = 20% (condor exit after n sec), 45% (couldn't connect on 9618), 35% (no ping, typically with DC_NOP failed!); LHCb = 30% (condor exit after n sec), 45% (couldn't connect on 9618), 15% (no ping, typically with DC_NOP failed!), 10% (VM Heartbeat file); CMS = 10% (condor exit after n sec), 45% (couldn't connect on 9618), 20% (no ping, typically with DC_NOP failed!), 10% (VM Heartbeat file). I don't know what error rate is acceptable for the project. For the 9618 error, which is the most common, the waste is ~2 min, and the task that is started next seems to work, as there aren't blocks of bad tasks. |
Send message Joined: 19 Feb 08 Posts: 708 Credit: 4,336,250 RAC: 0 |
I get 3 errors in 57 tasks on 2 64-bit PCs, one Windows 10 and one Linux, and a 32-bit Linux box. Most of my tasks are LHCb and SixTrack; I am waiting for any ATLAS task. Tullio |
Send message Joined: 20 Jun 14 Posts: 380 Credit: 238,712 RAC: 0 |
Thanks for this feedback; it is very useful.
What error rate is acceptable depends on your perspective and the metric. Take your observed Theory failures as an example: 20% fail after ~10 mins, 45% after 2 mins and 35% after 3 mins. With a 15% failure rate, out of 100 tasks this would be 3, 6.75 and 5.25 tasks respectively, hence 59.25 mins lost. As successful tasks take on average 14 hours and 85% are successful, 1190 hours would be successfully delivered. Therefore the wall time efficiency is 1190/1191 = 99.92%. By contrast, if only one of the tasks out of 100 idled for 18 hours, the efficiency would be 1386/1404 = 98.7%. The mantra "fail fast, fail often" applies here. It is better to do checks and fail a task at the start than to risk it failing at the end. Having said that, we would still like to follow up on those failed tasks, as they may indicate other issues even if we have protected ourselves against them. |
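The arithmetic in the post above can be reproduced with a short script. This is only a sketch of that back-of-the-envelope estimate, not a project tool: the function name `wall_time_efficiency` and its parameters are made up for illustration, and the inputs (100 tasks, 14-hour successful tasks, the failure shares and minutes lost) are the figures quoted in the post.

```python
def wall_time_efficiency(n_tasks, success_rate, success_hours, failure_modes):
    """Estimate wall-time efficiency for a batch of tasks.

    failure_modes is a list of (share_of_failures, minutes_lost) pairs;
    the shares are fractions of the failed tasks and should sum to 1.
    Returns (useful_hours, wasted_hours, efficiency).
    """
    n_failed = n_tasks * (1.0 - success_rate)
    useful_hours = n_tasks * success_rate * success_hours
    wasted_minutes = sum(n_failed * share * minutes
                         for share, minutes in failure_modes)
    wasted_hours = wasted_minutes / 60.0
    efficiency = useful_hours / (useful_hours + wasted_hours)
    return useful_hours, wasted_hours, efficiency


# Case 1: 15% of tasks fail early (figures taken from the post).
fail_fast = wall_time_efficiency(
    100, 0.85, 14.0, [(0.20, 10.0), (0.45, 2.0), (0.35, 3.0)]
)

# Case 2: a single task out of 100 idles for 18 hours before failing.
fail_late = wall_time_efficiency(100, 0.99, 14.0, [(1.0, 18.0 * 60.0)])

for label, (useful, wasted, eff) in (("fail fast", fail_fast),
                                     ("fail late", fail_late)):
    print(f"{label}: useful={useful:.1f} h, wasted={wasted:.2f} h, "
          f"efficiency={eff:.2%}")
```

Run as is, the first case loses just under one hour against 1190 useful hours (efficiency a little above 99.9%), while the single 18-hour idle task pulls the second case down to roughly 98.7%, which is the point of the "fail fast" argument.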
Send message Joined: 27 Sep 08 Posts: 844 Credit: 690,810,297 RAC: 110,512 |
Let me know if you need more info. Overall, things work well from my point of view; with at least 8 cores, the small delays seem to have minimal effect. I would say ATLAS is more disruptive: there are tasks that hang at 100% until I notice that they're stuck. |
©2024 CERN