Message boards : CMS Application : "Waiting to run,Suspended: VM job unmanageable, restarting later."

Jim1348

Joined: 15 Nov 14
Posts: 454
Credit: 12,357,562
RAC: 6,944
Message 32923 - Posted: 29 Oct 2017, 14:54:54 UTC

I get the subject message in BoincTasks for a job that ran about 1 hour and was only 5.43% complete.

It does not seem to be the same type of error as the others relating to WMAgent or Condor, so I thought I should post it separately. I will try rebooting in an hour or so if it does not restart.
https://lhcathome.cern.ch/lhcathome/result.php?resultid=163023518

This is on a Win7 64-bit machine with BOINC 7.8.3 and VirtualBox 5.2.0. It is limited to running four LHC jobs at once (CMS, LHCb and Theory), and there is plenty of memory (6.4 GB available out of 24 GB).
ID: 32923
Crystal Pellet
Volunteer moderator
Volunteer tester

Joined: 14 Jan 10
Posts: 968
Credit: 6,362,338
RAC: 288
Message 32924 - Posted: 29 Oct 2017, 15:52:10 UTC - in response to Message 32923.  
Last modified: 29 Oct 2017, 15:54:02 UTC

The CMS application is still using the older vboxwrapper_26196.
Because you upgraded to VirtualBox 5.2, there seems to be an issue with it in combination with the older wrapper.
You probably don't see that problem with Theory and LHCb tasks, because they are using a newer wrapper.
I think the newer wrapper was redesigned by Laurence Field so that the API calls are replaced by VBoxManage commands, as is already done in the Linux and Mac wrappers.

Simple advice: Don't request CMS tasks until the wrapper is updated, or downgrade your VirtualBox version.

For your current waiting task: Downgrade VirtualBox and restart BOINC (no reboot needed) or abort the task.
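To see which side of the 5.2 line a machine is on, the version check can be scripted. The following is only a minimal sketch: it assumes `VBoxManage` is on the PATH and prints a version string of the form `5.2.0r118431`, and the function names and the 5.2.0 cut-off are illustrative assumptions taken from this thread, not a confirmed compatibility rule.

```python
import re
import subprocess


def parse_vbox_version(raw):
    """Turn a string such as '5.2.0r118431' into the tuple (5, 2, 0)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", raw.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {raw!r}")
    return tuple(int(part) for part in match.groups())


def may_clash_with_old_wrapper(version, first_suspect=(5, 2, 0)):
    """Return True for versions at or above the suspected cut-off.

    Assumption: the cut-off (5.2.0) is only what this thread suspects.
    """
    return version >= first_suspect


def installed_vbox_version(executable="VBoxManage"):
    """Ask the local VirtualBox installation for its version.

    Assumption: the executable is reachable under this name; on Windows
    the full path to VBoxManage.exe may be needed instead.
    """
    result = subprocess.run(
        [executable, "--version"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_vbox_version(result.stdout)
```

Calling `may_clash_with_old_wrapper(installed_vbox_version())` on the affected host would then flag 5.2.0 but not 5.1.30.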
ID: 32924
Jim1348

Joined: 15 Nov 14
Posts: 454
Credit: 12,357,562
RAC: 6,944
Message 32925 - Posted: 29 Oct 2017, 16:07:42 UTC - in response to Message 32924.  

The CMS application is still using the older vboxwrapper_26196.
Because you upgraded to VirtualBox 5.2, there seems to be an issue with it in combination with the older wrapper.

Yes, I was thinking the same thing. I will go back to 5.1.30. Thanks.
ID: 32925
Erich56

Joined: 18 Dec 15
Posts: 1291
Credit: 23,324,631
RAC: 4,365
Message 32927 - Posted: 29 Oct 2017, 16:59:49 UTC - in response to Message 32924.  

The CMS application is still using the older vboxwrapper_26196.
Because you upgraded to VirtualBox 5.2, there seems to be an issue with it in combination with the older wrapper.

Hi guys, for whatever it's worth: I upgraded to VBox 5.2 last week, and CMS tasks are being processed properly.
ID: 32927
Jim1348

Joined: 15 Nov 14
Posts: 454
Credit: 12,357,562
RAC: 6,944
Message 32928 - Posted: 29 Oct 2017, 19:24:36 UTC - in response to Message 32927.  
Last modified: 29 Oct 2017, 19:37:15 UTC

Hi guys, for whatever it's worth: I upgraded to VBox 5.2 last week, and CMS tasks are being processed properly.

Most of mine worked too, but that last one did not. After going back to 5.1.30, I still got the same errors on the two remaining CMS work units, probably because they started out on 5.2.0, which is also the version recorded for them in the stderr.txt file. So I would need to download new ones for a real test.

Unfortunately, I am now out of disk space and will have to stop LHC on this machine. It was just a test anyway, and I normally use my Linux machines for it.

EDIT: It is possible that the lack of disk space triggered the original error, but I can't determine that at this point.
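Since disk space is a possible culprit, a quick free-space check before fetching more VM tasks can rule it in or out. This is a minimal sketch using Python's standard `shutil.disk_usage`; the function names and the threshold passed to `has_room` are hypothetical, and the thread does not say how much space a CMS task actually needs.

```python
import shutil


def free_gib(path):
    """Return the free space, in GiB, on the volume that holds `path`."""
    return shutil.disk_usage(path).free / 2**30


def has_room(path, needed_gib):
    """Return True if at least `needed_gib` GiB are free at `path`.

    Assumption: `needed_gib` is whatever margin you choose; it is not a
    documented requirement of the CMS application.
    """
    return free_gib(path) >= needed_gib
```

Pointing `has_room` at the BOINC data directory with a margin of your choice gives a simple yes/no answer before new work is requested.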
ID: 32928
