Message boards : Number crunching : Microsoft KB3206632 from 16/12/15

maeax

Joined: 2 May 07
Posts: 510
Credit: 16,206,578
RAC: 21,731
Message 28391 - Posted: 8 Jan 2017, 14:23:10 UTC

Microsoft installed KB3206632 on 16/12/15. For this there is a Cumulative Update for Windows 10 Version 1607 for x64-based systems, dated 16/12/20.

I ran a test with Theory_2016_11_02.vdi and it... IS WORKING now!!

https://support.microsoft.com/de-de/help/4004227/windows-10-update-kb3206632
ID: 28391
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 516
Credit: 3,553,640
RAC: 1,638
Message 28392 - Posted: 8 Jan 2017, 16:26:36 UTC
Last modified: 8 Jan 2017, 16:28:19 UTC

Was that causing the problems in the "Missing heartbeat file errors" thread?
ID: 28392
Jesse Viviano

Joined: 12 Feb 14
Posts: 69
Credit: 1,288,736
RAC: 509
Message 28393 - Posted: 8 Jan 2017, 17:45:52 UTC - in response to Message 28391.  

That update is already installed on my system, and Theory Simulation is not working on my machine. I wonder whether this update broke Theory Simulation, because my computer handled Theory Simulation fine before it. It fixed a DHCP bug, but it might have broken VirtualBox's ability to perform twice NAT on DNS requests originating from VirtualBox guests when it is set to use what it calls its DNS proxy mode. Are you saying that uninstalling this update allowed Theory Simulation to work?
ID: 28393
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 516
Credit: 3,553,640
RAC: 1,638
Message 28394 - Posted: 8 Jan 2017, 18:44:42 UTC

Did you also install this update: https://support.microsoft.com/en-us/kb/3213522
ID: 28394
Laurence
Project administrator
Project developer

Joined: 20 Jun 14
Posts: 210
Credit: 212,653
RAC: 34
Message 28395 - Posted: 8 Jan 2017, 18:55:26 UTC - in response to Message 28393.  
Last modified: 8 Jan 2017, 19:00:24 UTC

Looking at KB3206632, it states:

"This update contains an issue that affects virtualization-based security (VBS). The issue is fixed in the following update:"

That update is KB3213522, and it states:

"This update fixes an issue that was introduced in the December 13, 2016 release (KB3206632) in which virtualization-based security (VBS) does not start, and features that rely on VBS, such as Credential Guard and shielded virtual machines (VMs), stop functioning."

It seems that these are the cause and solution. Please let us know if you have applied the KB3206632 update and are still getting errors.
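
The check implied by the two KB notes can be sketched in a few lines of Python. This is only an illustration, not a project tool: the function name `vbs_fix_missing` is made up here, and the list of installed update IDs is assumed to come from somewhere else (for example, the HotFixID column printed by the PowerShell `Get-HotFix` cmdlet). It also assumes the fix is only ever listed under the ID KB3213522; a later cumulative update may carry the same fix under a different number, which this sketch does not know about.

```python
# Update IDs taken from the two Microsoft notes quoted in this thread.
BROKEN_UPDATE = "KB3206632"  # December 13, 2016 release with the VBS issue
FIX_UPDATE = "KB3213522"     # follow-up update that fixes the VBS issue


def vbs_fix_missing(installed_updates):
    """Return True if the update with the VBS issue is present but the fix is not.

    installed_updates: iterable of update ID strings such as "KB3206632".
    (Hypothetical helper; how the list is collected is up to the caller.)
    """
    ids = {update.strip().upper() for update in installed_updates}
    return BROKEN_UPDATE in ids and FIX_UPDATE not in ids


if __name__ == "__main__":
    # Example input typed in by hand, not read from a real machine.
    print(vbs_fix_missing(["KB3206632"]))
    print(vbs_fix_missing(["KB3206632", "KB3213522"]))
```

A host for which this returns True matches the situation Jesse Viviano described: KB3206632 applied, KB3213522 not yet applied.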
ID: 28395
Jesse Viviano

Joined: 12 Feb 14
Posts: 69
Credit: 1,288,736
RAC: 509
Message 28396 - Posted: 8 Jan 2017, 20:30:55 UTC - in response to Message 28395.  

KB3206632 was applied as a routine Windows Update patch after I reformatted my computer. I had errors after that patch. Applying KB3213522 fixed the fault and allowed the work unit to start processing normally. I think that this fix should be put up on the news section so that others with Windows 10 faults like this one will be able to find the fix.
ID: 28396
MAGIC Quantum Mechanic

Joined: 24 Oct 04
Posts: 713
Credit: 21,079,515
RAC: 21,558
Message 28398 - Posted: 8 Jan 2017, 20:38:56 UTC

That made no difference on my 4 Windows 10 crunchers here... Valid tasks before and after.
ID: 28398
greg_be

Joined: 28 Dec 08
Posts: 76
Credit: 705,545
RAC: 1,321
Message 28406 - Posted: 9 Jan 2017, 8:04:53 UTC

I am running the 64-bit version of Win10 with KB3206632 and there is no difference on my system either. The project seems to be running fine.
ID: 28406
maeax

Joined: 2 May 07
Posts: 510
Credit: 16,206,578
RAC: 21,731
Message 28413 - Posted: 9 Jan 2017, 19:21:20 UTC

I have three AMD PCs, and from 16/12/15 until yesterday they ended LHC tasks (CMS or LHC) after 10 or 11 minutes.

Since this KB update yesterday, they run LHC tasks correctly.

Of course with https://lhcathome.cern.ch/lhcathome
and not with the old vLHCathome address.

I don't know why this KB update helps with this issue.

I hope this update will help the LHC project.
ID: 28413
tullio

Joined: 19 Feb 08
Posts: 530
Credit: 2,219,362
RAC: 446
Message 28418 - Posted: 10 Jan 2017, 13:32:18 UTC

I made the update on my Windows 10 PC after having suspended all LHCb tasks and I lost one of them. They are now running again, together with a SETI@home Beta GPU task on my GTX 1050 board with its Pascal microprocessor, very fast.
Tullio
ID: 28418
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 466
Credit: 141,238,942
RAC: 213,267
Message 28428 - Posted: 10 Jan 2017, 22:47:52 UTC

I have had the update since 14th Dec and never saw any issue on my computer(s).
ID: 28428
MAGIC Quantum Mechanic

Joined: 24 Oct 04
Posts: 713
Credit: 21,079,515
RAC: 21,558
Message 28429 - Posted: 11 Jan 2017, 2:06:32 UTC

Since I have never used Linux or Mac, maybe I should give tips on those OSes.

In my 6 years doing these VB tasks I have seen Windows tips coming from people who only run Linux and never look at the facts that we have in the stats page here for the hosts that do most of the work... but I have only been here for 13 years, so what do I know?
This info on how Windows 10 Updates work obviously is not from anyone who actually tested it on a Windows 10 OS.

I officially quit.
Volunteer Mad Scientist For Life
ID: 28429
tullio

Joined: 19 Feb 08
Posts: 530
Credit: 2,219,362
RAC: 446
Message 28434 - Posted: 11 Jan 2017, 10:25:55 UTC

I get automatic updates on my Windows 10 PC, but this time I went looking for the update using Cortana, downloaded it and installed it. I am a Linux user, and SuSE asks me whether I want to install updates. Last time, an update installed a Tumbleweed beta OS on my Leap 42.1, which also erased the contents of a USB cartridge containing data that I wanted to save. So Windows is not the only culprit.
Tullio
ID: 28434
Stick

Joined: 21 Aug 07
Posts: 42
Credit: 769,330
RAC: 378
Message 28456 - Posted: 12 Jan 2017, 15:16:42 UTC

Computer 9926211 has been getting heartbeat errors for quite some time. I saw the notice about KB3206632 earlier this week and installed the KB3213522 update on 1/9/2017. But, unfortunately, the update seemed to have had no effect on the problems with Computer 9926211 (an AMD processor).

OTOH, Computer 10342612 (an Intel processor) had also been getting heartbeat errors (but only occasionally) prior to the update. And since then, all VBOX tasks have finished OK.
ID: 28456
Laurence
Project administrator
Project developer

Joined: 20 Jun 14
Posts: 210
Credit: 212,653
RAC: 34
Message 28458 - Posted: 12 Jan 2017, 15:58:30 UTC - in response to Message 28456.  

What do you see in the console?
ID: 28458
Stick

Joined: 21 Aug 07
Posts: 42
Credit: 769,330
RAC: 378
Message 28463 - Posted: 12 Jan 2017, 18:18:24 UTC - in response to Message 28458.  

What do you see in the console?

I haven't looked at the console while failed tasks have been in process. All I can say is that all VBOX tasks are failing on Computer 9926211 and they all seem to process OK for 10 to 15 minutes before getting the heartbeat error. I'll be happy to download one and watch it on the console if you will tell me specifically what you are interested in.
ID: 28463
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 466
Credit: 141,238,942
RAC: 213,267
Message 28486 - Posted: 13 Jan 2017, 22:38:23 UTC

I looked into my computer a little, and I see the following:

Overall error rate of 16%

CMS = 16%, LHCb = 25%, SixTrack = 1% & Theory = 15%

I took the last 20 errors:

Theory = 20% (condor exit after n sec), 45% (couldn't connect on 9618), 35% (no ping typically with DC_NOP failed!)

LHCb = 30% (condor exit after n sec), 45% (couldn't connect on 9618), 15% (no ping typically with DC_NOP failed!), 10% (VM Heartbeat file)

CMS = 10% (condor exit after n sec), 45% (couldn't connect on 9618), 20% (no ping typically with DC_NOP failed!), 10% (VM Heartbeat file)


I don't know what error rate is acceptable for the project?

For the 9618 error, which is the most common, the waste is ~2 min, and the task that starts next seems to work, as there aren't blocks of bad tasks.
ID: 28486
tullio

Joined: 19 Feb 08
Posts: 530
Credit: 2,219,362
RAC: 446
Message 28490 - Posted: 14 Jan 2017, 13:35:56 UTC

I get 3 errors in 57 tasks on 2 64-bit PCs, one Windows 10 and one Linux, and a 32-bit Linux box. Most of my tasks are LHCb and SixTrack; I am waiting for any ATLAS task.
Tullio
ID: 28490
Laurence
Project administrator
Project developer

Joined: 20 Jun 14
Posts: 210
Credit: 212,653
RAC: 34
Message 28497 - Posted: 14 Jan 2017, 23:31:44 UTC - in response to Message 28486.  

Thanks for this feedback; it is very useful.


CMS = 16%, LHCb = 25% SixTrack = 1% & Theory = 15%

I took the last 20 errors:

Theory = 20% (condor exit after n sec), 45% (couldn't connect on 9618), 35% (no ping typically with DC_NOP failed!)

I don't know what error rate is acceptable for the project?



What error rate is acceptable depends on your perspective and the metric. Take your observed Theory failures as an example: 20% fail after ~10 mins, 45% after 2 mins and 35% after 3 mins. With a 15% error rate, 15 out of 100 tasks fail, so this would be 3, 6.75 and 5.25 tasks respectively, hence 3 × 10 + 6.75 × 2 + 5.25 × 3 = 59.25 mins wasted, or about 1 hour. As successful tasks take on average 14 hours and 85% are successful, 85 × 14 = 1190 hours would be successfully delivered. Therefore the wall time efficiency is 1190/1191 ≈ 99.92%. By contrast, if only one of the tasks out of 100 idled for 18 hours, the efficiency would be 1386/1404 ≈ 98.7%.
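
The arithmetic in the paragraph above can be reproduced with a short Python sketch. The numbers are the ones quoted in this thread (100 tasks, 15 failures, 14 hours per successful task, failure times of 10, 2 and 3 minutes); the function and variable names are made up for illustration. Unlike the paragraph, the sketch does not round the 59.25 wasted minutes up to a full hour, which does not change the result at two decimal places.

```python
def wall_time_efficiency(useful_hours, wasted_hours):
    """Fraction of total wall time that ended up in successful tasks."""
    return useful_hours / (useful_hours + wasted_hours)


TASKS = 100          # tasks considered
SUCCESS_HOURS = 14   # average run time of a successful Theory task
FAILED = 15          # 15% Theory error rate
# (share of failures, minutes lost per failed task)
FAILURE_MIX = [(0.20, 10), (0.45, 2), (0.35, 3)]

# Wall time lost to the quickly failing tasks, in minutes.
wasted_minutes = sum(share * FAILED * minutes for share, minutes in FAILURE_MIX)

# Scenario 1: 15 tasks fail within minutes, 85 succeed.
fail_fast = wall_time_efficiency((TASKS - FAILED) * SUCCESS_HOURS, wasted_minutes / 60)

# Scenario 2: 99 tasks succeed, one task idles for 18 hours.
one_idle = wall_time_efficiency((TASKS - 1) * SUCCESS_HOURS, 18)

print(f"wasted: {wasted_minutes:.2f} min")   # wasted: 59.25 min
print(f"fail fast: {fail_fast:.2%}")         # fail fast: 99.92%
print(f"one idle task: {one_idle:.2%}")      # one idle task: 98.72%
```

Fifteen quick failures cost less wall time than a single task idling for 18 hours, which is the point of the comparison.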

The mantra fail fast, fail often applies here. It is better to do checks and fail a task at the start than to risk it failing at the end.

Having said that we would still like to follow up on those failed tasks as they may indicate other issues even if we have protected ourselves against them.
ID: 28497
Toby Broom
Volunteer moderator

Joined: 27 Sep 08
Posts: 466
Credit: 141,238,942
RAC: 213,267
Message 28500 - Posted: 15 Jan 2017, 9:50:50 UTC - in response to Message 28497.  

Let me know if you need more info. Overall, things work well from my point of view; with at least 8 cores, the small delays seem to have minimal effect.

I would say ATLAS is more disruptive: there are tasks that hang at 100% until I notice that they're stuck.
ID: 28500


©2018 CERN