Message boards : Number crunching : After windows update BOINC does not see all the CPU's

greg_be

Joined: 28 Dec 08
Posts: 223
Credit: 1,361,634
RAC: 87
Message 41562 - Posted: 14 Feb 2020, 9:12:12 UTC

I am down to both GPUs and 2 CPU cores after a Windows update.

I have tried the Ncpu command; that was ignored.
In app_config, where I have a restriction because of my participation in other projects, I tried to force BOINC up to 4 tasks; that was ignored too.
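
For reference, a minimal sketch of what a CPU-count override could look like, assuming the "Ncpu command" above refers to the <ncpus> option under <options> in cc_config.xml. The helper name build_cc_config is hypothetical; the script only builds the XML text, which would then be saved as cc_config.xml in the BOINC data directory and picked up when the client re-reads its config files or restarts.

```python
import xml.etree.ElementTree as ET


def build_cc_config(ncpus: int) -> str:
    """Return cc_config.xml text containing an <ncpus> override.

    Hypothetical helper for illustration; it assumes the override
    discussed in the thread is the <ncpus> option under <options>.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "ncpus").text = str(ncpus)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 16 is the processor count Windows reports for this host.
    print(build_cc_config(16))
```

This only shows the shape of the file; it does not explain why the client ignored the setting here.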

I checked the Windows system processors tab; 16 cores are seen there, so it doesn't seem to be Windows.

I uninstalled and reinstalled BOINC 64-bit and the latest VirtualBox.
No effect.

CPU settings in Boinc are 100%

So I am at a loss as to what is causing this problem.
Meanwhile I have 12 cores sitting idle.
ID: 41562
Gunde

Joined: 9 Jan 15
Posts: 84
Credit: 333,757,796
RAC: 293,526
Message 41568 - Posted: 14 Feb 2020, 16:25:56 UTC - in response to Message 41562.  
Last modified: 14 Feb 2020, 16:44:56 UTC

You could suspend the LHC tasks to see whether they are what is holding it up. The host details list 16 threads/processors, which should be correct, since it matches what boinc-client reports in the event log. Reinstalling the client will not help if it already reports the correct count.
With the setup as it is now, your host can run 2 GPU tasks (if default) and 1 or 2 ATLAS tasks.

2020-02-12 21:57:21 (11156): Setting Memory Size for VM. (14000MB)
2020-02-12 21:57:21 (11156): Setting CPU Count for VM. (4)

So if you have set it to use 100% of the CPUs and 90-100% of the RAM, it should only run 2 GPU tasks and 1 ATLAS task. But if it runs CMS, it could run 2 GPU tasks and 11 CMS tasks.
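
The ATLAS limit can be checked with simple arithmetic. The sketch below assumes memory is the only constraint and ignores what the GPU tasks and the OS need; the function name is made up for illustration. The 14000 MB figure is the VM size from the log above, and 23300.62 MB is the "max memory usage when active" value from the host's event log posted later in this thread.

```python
import math


def vm_tasks_that_fit(usable_ram_mb: float, ram_per_task_mb: float) -> int:
    """Number of tasks of a given memory size that fit in the usable RAM."""
    if ram_per_task_mb <= 0:
        raise ValueError("ram_per_task_mb must be positive")
    return math.floor(usable_ram_mb / ram_per_task_mb)


if __name__ == "__main__":
    # 23300.62 / 14000 is about 1.66, so only one ATLAS VM fits.
    print(vm_tasks_that_fit(23300.62, 14000))  # prints 1
```

A second 14000 MB VM would need 28000 MB, more than the host is allowed to use, which matches the "1 ATLAS" estimate.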

The good news is that all of your host's ATLAS tasks validated, except the cancelled ones. From what I could see on the host, a pattern shows up, caused by high RAM use and the reduced number of tasks running concurrently.

Boinc-client fetches a lot of ATLAS tasks because it calculates that they can be finished before the deadline. But when the host tries to start an ATLAS task, it is held up by a lack of RAM. This could be solved by starting tasks from another application or project, but since the host has plenty of ATLAS tasks, it focuses on running those first. So BOINC sits there like a zombie, waiting for RAM and giving high priority to ATLAS.

You can see that the host holds tasks for 3-4 days, and tasks that have not started get cancelled once the workunit has been validated by another host.

I suggest running app_config with default settings, or deleting app_config completely. If you choose to keep app_config, it would help to reduce the RAM and then use <max_concurrent>N</max_concurrent> so that boinc-client focuses on other applications instead of waiting. Then reduce the maximum amount of work fetched on the host: a minimum of 0.1 days and up to 0.2 days of stored work. It is better for the host to get a fresh task when it needs one than to hold tasks for 3-4 days without starting them.
Even if a task's deadline is far away, other hosts can finish it, and 1-4 days is probably the time frame for these ATLAS tasks. So why does the server send the same task out to others instead of waiting until the deadline is reached?
It is still unknown to me why the server sends out a task that already has a "wingman" too early, even if a host is actively running it. This could be brought up in another thread for discussion. As it is now, ATLAS only needs 1 valid task to validate a workunit, and the current setup keeps production up. Some projects require 2 valid tasks, which cuts production in half.
If your tasks get cancelled, just ignore it or reduce the amount of work stored on the host. Also, before planned maintenance on the computer, set the project to No new tasks.
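
A minimal sketch of the <max_concurrent> suggestion above, assuming an app_config.xml with one <app> entry placed in the project's folder under the BOINC data directory. The helper name is hypothetical, and the app name "ATLAS" is only a placeholder; the real short name has to be taken from the project's application list.

```python
import xml.etree.ElementTree as ET


def build_app_config(app_name: str, max_concurrent: int) -> str:
    """Return app_config.xml text limiting one app to max_concurrent tasks.

    Hypothetical helper; app_name must match the project's own short
    name for the application, which is not given in this thread.
    """
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Limit the (assumed) ATLAS app to one running task at a time.
    print(build_app_config("ATLAS", 1))
```

With N set to 1, the client should stop waiting for RAM for a second ATLAS VM and fill the remaining cores with other work.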
ID: 41568
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 942
Credit: 6,295,842
RAC: 1,120
Message 41570 - Posted: 14 Feb 2020, 17:23:36 UTC - in response to Message 41562.  

Set Max # CPUs to 4 in your preferences.
ID: 41570
greg_be

Joined: 28 Dec 08
Posts: 223
Credit: 1,361,634
RAC: 87
Message 41571 - Posted: 14 Feb 2020, 17:42:20 UTC - in response to Message 41570.  
Last modified: 14 Feb 2020, 17:44:32 UTC

Set Max # CPUs to 4 in your preferences.



That was never a problem before... odd that it is now.
I see Windows installed two new C++ updates. I am thinking I will uninstall one and try, and if that is not it, uninstall the other.

It all happened after the windows updates. Before that everything was running fine.
But how does Windows C++ affect Boinc?


CPU = 4 has no effect either.
16 cores. 2 CMS jobs, no ATLAS
No other projects running on CPU
Just 1 task on each GPU like normal.
CPU is messed up.
ID: 41571
greg_be

Joined: 28 Dec 08
Posts: 223
Credit: 1,361,634
RAC: 87
Message 41573 - Posted: 14 Feb 2020, 17:45:43 UTC
Last modified: 14 Feb 2020, 17:47:23 UTC

2/14/2020 6:42:40 PM | | Starting BOINC client version 7.14.2 for windows_x86_64
2/14/2020 6:42:40 PM | | log flags: file_xfer, sched_ops, task
2/14/2020 6:42:40 PM | | Libraries: libcurl/7.47.1 OpenSSL/1.0.2g zlib/1.2.8
2/14/2020 6:42:40 PM | | Data directory: C:\boinc data
2/14/2020 6:42:40 PM | | Running under account Greg
2/14/2020 6:42:41 PM | | CUDA: NVIDIA GPU 0: GeForce GTX 1080 (driver version 442.19, CUDA version 10.2, compute capability 6.1, 4096MB, 3553MB available, 9070 GFLOPS peak)
2/14/2020 6:42:41 PM | | CUDA: NVIDIA GPU 1: GeForce GTX 1050 Ti (driver version 442.19, CUDA version 10.2, compute capability 6.1, 4096MB, 3376MB available, 2274 GFLOPS peak)
2/14/2020 6:42:41 PM | | OpenCL: NVIDIA GPU 0: GeForce GTX 1080 (driver version 442.19, device version OpenCL 1.2 CUDA, 8192MB, 3553MB available, 9070 GFLOPS peak)
2/14/2020 6:42:41 PM | | OpenCL: NVIDIA GPU 1: GeForce GTX 1050 Ti (driver version 442.19, device version OpenCL 1.2 CUDA, 4096MB, 3376MB available, 2274 GFLOPS peak)
2/14/2020 6:42:41 PM | | Host name: DESKTOP-LFM92VN
2/14/2020 6:42:41 PM | | Processor: 16 AuthenticAMD AMD Ryzen 7 2700 Eight-Core Processor [Family 23 Model 8 Stepping 2]
2/14/2020 6:42:41 PM | | Processor features: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 htt pni ssse3 fma cx16 sse4_1 sse4_2 movebe popcnt aes f16c rdrandsyscall nx lm avx avx2 svm sse4a osvw skinit wdt tce topx page1gb rdtscp fsgsbase bmi1 smep
2/14/2020 6:42:41 PM | | OS: Microsoft Windows 10: Professional x64 Edition, (10.00.18363.00)
2/14/2020 6:42:41 PM | | Memory: 23.95 GB physical, 33.45 GB virtual
2/14/2020 6:42:41 PM | | Disk: 209.00 GB total, 123.29 GB free
2/14/2020 6:42:41 PM | | Local time is UTC +1 hours
2/14/2020 6:42:41 PM | | No WSL found.
2/14/2020 6:42:41 PM | | VirtualBox version: 6.1.2

-------------------

2/14/2020 6:42:41 PM | World Community Grid | Computer location: home
2/14/2020 6:42:41 PM | | General prefs: using separate prefs for home
2/14/2020 6:42:41 PM | | Reading preferences override file
2/14/2020 6:42:41 PM | | Preferences:
2/14/2020 6:42:41 PM | | max memory usage when active: 23300.62 MB
2/14/2020 6:42:41 PM | | max memory usage when idle: 24526.97 MB
2/14/2020 6:42:41 PM | | max disk usage: 144.12 GB
2/14/2020 6:42:41 PM | | (to change preferences, visit a project web site or select Preferences in the Manager)
2/14/2020 6:42:41 PM | | Setting up project and slot directories
2/14/2020 6:42:41 PM | | Checking active tasks
2/14/2020 6:42:41 PM | | Setting up GUI RPC socket
2/14/2020 6:42:41 PM | | Checking presence of 771 project files
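
To confirm what the client itself detected, the processor count can be pulled out of startup lines like the ones above. This is a sketch that assumes the "Processor: N ..." line format shown in this log; the function name and the two-line excerpt are for illustration only.

```python
import re
from typing import Optional

# Two lines copied from the startup log above.
STARTUP_LOG = """\
2/14/2020 6:42:41 PM | | Processor: 16 AuthenticAMD AMD Ryzen 7 2700 Eight-Core Processor [Family 23 Model 8 Stepping 2]
2/14/2020 6:42:41 PM | | Memory: 23.95 GB physical, 33.45 GB virtual
"""


def detected_cpus(log_text: str) -> Optional[int]:
    """Return the CPU count from the 'Processor: N ...' line, or None."""
    # 'Processor features:' lines do not match because the colon must
    # directly follow the word 'Processor'.
    match = re.search(r"Processor:\s+(\d+)\b", log_text)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    print(detected_cpus(STARTUP_LOG))  # prints 16
```

Here the client reports 16, so the missing cores are a scheduling or settings question, not a detection problem.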
ID: 41573
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 942
Credit: 6,295,842
RAC: 1,120
Message 41574 - Posted: 14 Feb 2020, 17:54:31 UTC - in response to Message 41571.  

Set Max # CPUs to 4 in your preferences.



That was never a problem before..odd that it is now.
.
.
CPU = 4 has no effect either.

It only takes effect for newly loaded tasks, and it also depends on what you have set in your app_config.xml.
ID: 41574
greg_be

Joined: 28 Dec 08
Posts: 223
Credit: 1,361,634
RAC: 87
Message 41584 - Posted: 14 Feb 2020, 21:28:12 UTC - in response to Message 41574.  
Last modified: 14 Feb 2020, 22:04:18 UTC

OK...never mind...whatever happened has disappeared without me doing a thing.
BOINC is now running all cores with all my projects.
Thanks anyway...Just another weird bug like my telephone company's TV system. There is a weird bug they can't figure out yet.
I guess I am just buggy!
ID: 41584
Gunde

Joined: 9 Jan 15
Posts: 84
Credit: 333,757,796
RAC: 293,526
Message 41590 - Posted: 14 Feb 2020, 23:48:47 UTC

I would expect the cause was a switch from one running application to another, or a change in your settings.
A Windows update to the C++ runtime or .NET Framework would not affect boinc-client or make it glitch. If there were an issue with the network or with the VM-based setup, it would show up in the stderr log.

It is simplest, and recommended, to run on defaults and only make changes if the host requires them. My bet is that what you experienced with your boinc-client came from a change in settings or in which tasks were running at high priority, and that this allowed it to run more threads than before.

To keep it from happening again, either stay with the defaults or learn how the different projects and applications behave when you combine them. The event log provides a lot of information; use it to track the problem down, or you will soon experience the same issue again. Similar issues appear in several threads: changing app_config takes knowledge and understanding if you want to use it.
With a combination of several other projects, and a GPU in use as well, it can be a complex system to follow. The amount of fetched work and the settings have a big effect, and the rules in the boinc-client code on deadlines and how it prioritizes tasks are not easy.

A good practice is to leave it to the client to handle, but if the host's setup has an abnormal value, it will behave accordingly and need some adjustment.
ID: 41590
greg_be

Joined: 28 Dec 08
Posts: 223
Credit: 1,361,634
RAC: 87
Message 41592 - Posted: 15 Feb 2020, 7:40:45 UTC - in response to Message 41590.  



Nothing was running in priority, and again it is very odd that no messages showed in the event log. NOTHING at all to say there was a problem. The event log I copied was everything for startup; after that it is all the various project messages (well, just LHC and 2 GPUs). Everything appeared normal. The amount of work in the queue was normal for a 16-core machine with 2 GPUs and a 1.5-day holding period. It works now without any extras other than where I assign certain tasks to certain GPUs and throttle back ATLAS, because if it tries to run two tasks, one dies a bad death from lack of memory.

But even more interesting was that BOINC manager ignored all forced CPU usage commands.
It ignored any and all commands for cc_config even with reboots. So it is really bizarre what caused this, because there are absolutely no traces of what the problem was and also weird that it ignored all forced commands.

But whatever it was, it's gone, so I will just move on and call it a weird unexplained glitch.
ID: 41592



©2020 CERN