Message boards : Number crunching : "Giving up catch-up attempt.."
Profile ritterm
Joined: 30 May 08
Posts: 93
Credit: 5,160,246
RAC: 0
Message 28299 - Posted: 30 Dec 2016, 21:15:06 UTC

My 8-core, 16GB RAM host is running 2 LHCb and 2 CMS tasks right now (alongside no other BOINC tasks requiring significant RAM) and I'm seeing a lot of "Giving up catch-up attempt.." log entries in each one. In some cases there are only a few at a time, but in others the messages come repeatedly for several hours. So far, each task seems to be moving along, with log entries indicating jobs starting and finishing.

Are these messages indicative of anything that I should be worried about? Are they specific to LHCb or CMS? I'm not seeing them on my other 16GB RAM host that's running 2 ATLAS and 2 Theory jobs.
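
For anyone wanting to compare hosts or task mixes, here is a minimal sketch of how the entries could be tallied. It assumes only that the task log has been saved as plain text with one entry per line; the function name and the file name in the usage note are hypothetical, not anything provided by BOINC or the project.

```python
# Assumption: the log is plain text, one entry per line.
MARKER = "Giving up catch-up attempt"


def count_marker(lines, marker=MARKER):
    """Return how many of the given log lines contain the marker text."""
    return sum(1 for line in lines if marker in line)
```

Usage would be something like `with open("task.log") as f: print(count_marker(f))`, where `task.log` is a placeholder for wherever the log was saved.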
ID: 28299
Profile ritterm
Joined: 30 May 08
Posts: 93
Credit: 5,160,246
RAC: 0
Message 28307 - Posted: 2 Jan 2017, 14:43:17 UTC
Last modified: 2 Jan 2017, 14:47:30 UTC

I'm wondering if these messages are an indication that the host is overloaded with VM tasks. I've found that (1) there are far fewer "giving up" messages when running only 3 VM tasks and none at all when running only 2, and (2) there's a big difference between the run time and CPU time of completed tasks (often 10K-12K seconds). The mix of tasks doesn't seem to matter (although I didn't try running 3-4 Theory tasks).
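
To put a number on point (2), a rough sketch of the arithmetic follows. The function names are hypothetical, and the figures in the usage note are made up for illustration rather than taken from any of my tasks.

```python
def idle_gap(run_time_s, cpu_time_s):
    """Seconds of wall-clock run time not accounted for by CPU time."""
    return run_time_s - cpu_time_s


def cpu_efficiency(run_time_s, cpu_time_s, cores=1):
    """Fraction of the available core time the task actually used.

    Assumption: cpu_time_s is summed over all cores assigned to the task.
    """
    if run_time_s <= 0 or cores <= 0:
        raise ValueError("run_time_s and cores must be positive")
    return cpu_time_s / (run_time_s * cores)
```

With illustrative values of 40,000 s run time and 30,000 s CPU time on a single-core task, `idle_gap(40000, 30000)` gives 10000 and `cpu_efficiency(40000, 30000)` gives 0.75. A ratio well below 1 would be consistent with the VM spending a lot of time waiting, though on its own it doesn't say what the VM was waiting for.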
ID: 28307
Profile Magic Quantum Mechanic
Joined: 24 Oct 04
Posts: 1117
Credit: 49,725,007
RAC: 13,926
Message 28308 - Posted: 2 Jan 2017, 17:40:15 UTC

I just noticed you also have a Theory task that did the same thing.

Are you suspending and trying to restart the tasks or doing reboots?

I see it is Linux so I don't know how or if you have to do any updates.

But what you have been getting usually means the VB will not restart.

You have plenty of RAM.

I have an 8-core with only 8GB RAM and it runs 4 tasks here and two of the 2-core tasks at vLHC-dev, so all 8 cores are running and I never have any problems.

I do only run Theory tasks.
Volunteer Mad Scientist For Life
ID: 28308
Profile ritterm
Joined: 30 May 08
Posts: 93
Credit: 5,160,246
RAC: 0
Message 28343 - Posted: 4 Jan 2017, 18:02:50 UTC - in response to Message 28308.  

Are you suspending and trying to restart the tasks or doing reboots?

No. My BOINC hosts pretty much run 24/7.

I see it is Linux so I don't know how or if you have to do any updates.

I install updates pretty much as soon as they become available. I'm running VM 5.1.10 and haven't upgraded, but maybe I'll try that.

It's a shame that I don't seem to be able to run 4 VMs concurrently and efficiently on this host. However, it is a home-built system that has acted quirky on some projects. I may have to resign myself to running only two VMs at a time... :-(
ID: 28343
