Questions and Answers : Windows : vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND

maeax

Joined: 2 May 07
Posts: 2071
Credit: 156,150,187
RAC: 105,782
Message 45734 - Posted: 22 Nov 2021, 12:37:11 UTC - in response to Message 45733.  

CMS is running mostly 11-13 hours. So, one step is solved, but...
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Joined: 15 Jun 08
Posts: 2386
Credit: 222,957,310
RAC: 136,899
Message 45735 - Posted: 22 Nov 2021, 13:59:04 UTC - in response to Message 45733.  

It looks like the task completed 1 subtask and didn't get a 2nd subtask.
That's not a classical failure but it makes a task less efficient.

The reason might be the long delay of nearly 7 h shown here:
2021-11-21 10:32:41 (10380): VM state change detected. (old = 'running', new = 'paused')
2021-11-21 17:23:43 (10380): VM state change detected. (old = 'paused', new = 'running')
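
As a quick check of the "nearly 7 h" figure, the gap between the two log lines above can be computed from their timestamps. This is only a minimal sketch; the helper names `parse_timestamp` and `pause_duration` are made up for illustration and are not part of BOINC or the vboxwrapper.

```python
from datetime import datetime

# The two log lines quoted above, copied verbatim from the task log.
LOG_LINES = [
    "2021-11-21 10:32:41 (10380): VM state change detected. (old = 'running', new = 'paused')",
    "2021-11-21 17:23:43 (10380): VM state change detected. (old = 'paused', new = 'running')",
]


def parse_timestamp(line):
    """Read the leading 'YYYY-MM-DD HH:MM:SS' stamp (first 19 characters)."""
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")


def pause_duration(lines):
    """Return the time between the 'paused' line and the 'running' line."""
    paused_at = parse_timestamp(lines[0])
    resumed_at = parse_timestamp(lines[1])
    return resumed_at - paused_at


print(pause_duration(LOG_LINES))  # prints 6:51:02
```

So the VM sat paused for 6 h 51 min, far longer than the roughly 10 min a task waits for a new subtask.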


Tasks that do not get a new subtask from WMagent shut down after 10 min, but the decision matrix and the timeout settings are hidden deep in the server-side processes and are not shown in the task logs here.



If you want to reduce the pause/resume cycles, you may increase the project's basic priority (its resource share compared to other projects), e.g. 500/100, and then limit the number of running tasks via <project_max_concurrent> in an app_config.xml.
Finding the right values this way takes a while and some testing.
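
A minimal sketch of what such an app_config.xml could contain, generated here with Python's standard library so the structure is explicit. The limit of 2 is only an illustrative value, and the helper name `build_app_config` is made up for this example; the right number has to be found by testing on the host in question.

```python
import xml.etree.ElementTree as ET


def build_app_config(max_concurrent):
    """Return the text of an app_config.xml that caps the project's running tasks."""
    root = ET.Element("app_config")
    limit = ET.SubElement(root, "project_max_concurrent")
    limit.text = str(max_concurrent)  # illustrative value, tune per host
    return ET.tostring(root, encoding="unicode")


print(build_app_config(2))
```

The resulting text is saved as app_config.xml in the project's folder inside the BOINC data directory, after which the client has to re-read its config files (or be restarted) for the limit to take effect.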
skydivingnerd

Joined: 8 Apr 21
Posts: 23
Credit: 31,852,519
RAC: 56,106
Message 45736 - Posted: 22 Nov 2021, 14:19:40 UTC - in response to Message 45735.  

It looks like the task completed 1 subtask and didn't get a 2nd subtask.
That's not a classical failure but it makes a task less efficient.

The reason might be the long delay of nearly 7 h shown here:

2021-11-21 10:32:41 (10380): VM state change detected. (old = 'running', new = 'paused')
2021-11-21 17:23:43 (10380): VM state change detected. (old = 'paused', new = 'running')

The task delay was my doing. I run BOINCTasks at home to keep track of my hosts and discovered that it can execute actions based on project or task criteria. So as not to miss a CMS work-unit timing out and failing, I created a BOINCTasks event to suspend a CMS task after a few minutes. The next two CMS tasks the Win10 host is working on will not have that happen to them. I'd also removed the <project_max_concurrent> setting after my BOINC client's work-unit fetch went haywire and downloaded several hundred tasks. I'll add that back for LHC@Home and limit CMS VBox to two concurrent work-units.
computezrmle
Volunteer moderator
Volunteer developer
Volunteer tester
Help desk expert

Joined: 15 Jun 08
Posts: 2386
Credit: 222,957,310
RAC: 136,899
Message 45737 - Posted: 22 Nov 2021, 15:19:21 UTC - in response to Message 45736.  

The log from this task looks fine:
https://lhcathome.cern.ch/lhcathome/result.php?resultid=334126568

Just a very minor issue (in fact, not really an issue):
2021-11-21 21:32:38 (2764): Guest Log: [INFO] Testing connection to cern.ch
2021-11-21 21:32:53 (2764): Guest Log: [DEBUG] Status run 1 of up to 3: 1

The first line reports that a small packet was sent to www.cern.ch port 80, just to test whether port 80 is blocked by a firewall along the route.
The timeout for a reply is set to 15 s, plenty of time for the packet to travel around the world and back several times, but somewhere it got lost.
In that case the VM repeats the test (up to 3 times). If all attempts fail, it writes a more verbose debug message and shuts down the VM (because port 80 is essential).
As this situation happens very rarely, it may indicate that the LAN or internet connection of that computer was temporarily under very high load.
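
The retry logic described above can be sketched roughly as follows. This is not the VM's actual test script, which is not shown in the logs; it is an assumed equivalent that opens a plain TCP connection. The function name `port_reachable` and the injectable `connect` parameter are made up for illustration, while the 15 s timeout and the limit of 3 attempts come from the description above.

```python
import socket


def port_reachable(host, port=80, timeout=15.0, attempts=3,
                   connect=socket.create_connection):
    """Try up to `attempts` times to open a TCP connection to host:port.

    Returns True on the first success, False if every attempt fails.
    `connect` is injectable only so the logic can be exercised offline.
    """
    for attempt in range(1, attempts + 1):
        try:
            with connect((host, port), timeout=timeout):
                return True
        except OSError:
            # Covers timeouts, refused connections and DNS failures.
            print(f"Status run {attempt} of up to {attempts}: failed")
    return False
```

Calling `port_reachable("www.cern.ch")` from the affected host gives a rough idea of whether port 80 is open along the route. A single failed run followed by a success, as in the log above, points to a lost packet rather than a firewall block.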


©2024 CERN