Message boards : CMS Application : Had ~100 failures on CMS 50

far

Joined: 27 May 11
Posts: 5
Credit: 8,275,928
RAC: 9,123
Message 43262 - Posted: 24 Aug 2020, 4:00:32 UTC

Hi Team,
Noticed that a machine wasn't using all of its CPU power and tracked it back to something with the CMS tasks.
They have been failing for a while but also preventing other tasks from utilising the PC's resources properly:
https://lhcathome.cern.ch/lhcathome/results.php?hostid=10516481
I've disabled CMS so other projects can ramp the PC up to 100% CPU again, but would be great if you can spot anything up with it so I can re-enable it.
The machine had lots of resources free, but for some reason this project was preventing them from being used.
E.g. it has 32 threads, but BOINC put other projects in a "Waiting for memory" state when there was heaps free, and I was only seeing ~32% of CPU being used.

If there are logs or any assistance I can provide please let me know,
Thanks, Far
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Joined: 15 Jun 08
Posts: 1608
Credit: 94,645,541
RAC: 98,637
Message 43264 - Posted: 24 Aug 2020, 7:00:21 UTC - in response to Message 43262.  

Each CMS VM allocates 1 CPU core and 2 GB RAM.
Your computer has 32 cores and 32 GB RAM.
This would allow you to run up to 16*) CMS VMs concurrently and would leave 16 cores idle.
In addition, each CMS task makes heavy use of disk I/O and network, neither of which needs much CPU.


*) Less in reality - even if the BOINC client is configured to use 100% RAM - since the OS and other processes also require RAM.
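
As a rough sketch of the arithmetic above (Python; the function name and the `reserved_ram_gb` parameter are made up for illustration, and the real BOINC scheduler considers more than this):

```python
def max_concurrent_vms(cores, ram_gb, cores_per_vm=1, ram_per_vm_gb=2.0,
                       reserved_ram_gb=0.0):
    """Return (vm_count, idle_cores) for a host running only CMS VMs.

    reserved_ram_gb is a hypothetical allowance for the OS and other
    processes; it is not a BOINC setting.
    """
    by_cpu = cores // cores_per_vm
    by_ram = int((ram_gb - reserved_ram_gb) // ram_per_vm_gb)
    vms = max(0, min(by_cpu, by_ram))  # the scarcer resource sets the limit
    return vms, cores - vms * cores_per_vm


# 32 cores and 32 GB RAM, nothing reserved: RAM is the bottleneck.
print(max_concurrent_vms(32, 32))                     # (16, 16)
# Same host with an assumed 4 GB kept back for the OS.
print(max_concurrent_vms(32, 32, reserved_ram_gb=4))  # (14, 18)
```

With 2 GB per VM, RAM runs out long before the cores do, which is why half of the threads can sit idle if nothing else fits into memory.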
far

Joined: 27 May 11
Posts: 5
Credit: 8,275,928
RAC: 9,123
Message 43275 - Posted: 24 Aug 2020, 23:18:34 UTC - in response to Message 43264.  

Thanks, that helps understand the restricted usage of available resources. Wish I could have afforded 64GB of RAM.

However all the CMS50 tasks were failing anyway :-(

If there are logs or anything else needed to check why, please let me know. In case it's a factor, the installed VirtualBox is more recent than the one distributed with BOINC: 6.1.12r139181 (Qt5.6.2)
Ictineu Franclort

Joined: 23 Nov 15
Posts: 4
Credit: 120,285
RAC: 21
Message 43910 - Posted: 15 Dec 2020, 2:08:56 UTC

Hello,
same problem. In theory the computer has resources, but it waits for memory with CMS, and that blocks other projects from loading. The task properties show this:
Application: CMS Simulation 50.00 (vbox64)
Name: CMS_1538738_1607610862.401591
State: Waiting for memory
Received: Thursday, December 10, 2020, 5:31:55 PM
Report deadline: Saturday, January 9, 2021, 5:31:54 PM
Estimated computation size: 1,000,000 GFLOPs
CPU time: 10:13:26
CPU time since checkpoint: ---
Elapsed time: 10:24:46
Estimated time remaining: 1d 04:47:11
Fraction done: 58.399%
Virtual memory size: 281.63 MB
Working set size: 2.79 GB
Directory: slots/1
Progress rate: 5.760% per hour
Executable: vboxwrapper_26196_x86_64-pc-linux-gnu
I'm out of ideas. I think I'll try disabling CMS to see what happens. I'll let you know.
Ictineu Franclort

Joined: 23 Nov 15
Posts: 4
Credit: 120,285
RAC: 21
Message 43911 - Posted: 15 Dec 2020, 2:34:46 UTC

Hello again. After aborting the CMS task, BOINC has started to accept and run new work.
Ictineu Franclort

Joined: 23 Nov 15
Posts: 4
Credit: 120,285
RAC: 21
Message 43912 - Posted: 15 Dec 2020, 2:35:29 UTC - in response to Message 43911.  

Hi again. After I aborted the CMS task, BOINC began to accept and run new work.
Gunde

Joined: 9 Jan 15
Posts: 141
Credit: 412,029,326
RAC: 329,161
Message 43913 - Posted: 15 Dec 2020, 3:00:53 UTC - in response to Message 43910.  
Last modified: 15 Dec 2020, 3:11:47 UTC

With 5.8 GB of memory on both of your 8-core computers, you have the bare minimum to handle a CMS task plus a few SixTrack tasks.
Your CMS tasks ran and were validated, but you aborted the last one. It probably put the other tasks into the "Waiting for memory" state.

Please uncheck the boxes for native tasks and test applications. Your client has many failed tasks because CVMFS is not installed. If you want to run VirtualBox tasks, you will only get them by unchecking native, but I would suggest running SixTrack and maybe Theory until you have added more memory.
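
To illustrate why a single CMS task can leave everything else waiting for memory, here is a rough Python sketch. It assumes BOINC may use only a fixed percentage of RAM (the 50% and 90% values are example settings, not read from your client) and takes the 5.8 GB total together with the 2.79 GB memory footprint reported for the CMS task earlier in this thread. The function name is hypothetical and the real client logic is more involved.

```python
def free_boinc_memory_gb(total_ram_gb, limit_percent, task_memory_gb):
    """RAM (in GB) still available to BOINC after the listed tasks.

    limit_percent stands for a "use at most X% of memory" preference;
    the values passed below are assumptions for illustration only.
    """
    allowed = total_ram_gb * limit_percent / 100.0
    return allowed - sum(task_memory_gb)


# One CMS task with a 2.79 GB footprint on a 5.8 GB host.
print(round(free_boinc_memory_gb(5.8, 50, [2.79]), 2))  # 0.11
print(round(free_boinc_memory_gb(5.8, 90, [2.79]), 2))  # 2.43
```

Under the 50% assumption only about 0.1 GB would be left, so nearly any additional task would have to wait.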
Ictineu Franclort

Joined: 23 Nov 15
Posts: 4
Credit: 120,285
RAC: 21
Message 43926 - Posted: 15 Dec 2020, 18:11:34 UTC - in response to Message 43913.  

Hello Gunde.
I have disabled CMS, ATLAS and native tasks, as you told me, and BOINC has started running Theory. I'm also running the GPUGRID project in BOINC; maybe that's too much? With Kubuntu 18.04 I had no such issues. They started after I switched to Kubuntu 20.04, although the change is well worth it. I plan to upgrade the RAM to 24 GB, but it won't be this year. I'll go to the Linux section to ask about CVMFS installation issues. Thanks.
