vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
Author | Message |
---|---|
Joined: 2 May 07 Posts: 2243 Credit: 173,902,375 RAC: 2,454 |
CMS tasks now mostly run for 11-13 hours. So, one step is solved, but... |
Joined: 15 Jun 08 Posts: 2532 Credit: 253,776,449 RAC: 34,524 |
It looks like the task completed 1 subtask and didn't get a 2nd subtask. That's not a classical failure, but it makes the task less efficient. The reason might be the long pause of nearly 7 h shown here: 2021-11-21 10:32:41 (10380): VM state change detected. (old = 'running', new = 'paused') 2021-11-21 17:23:43 (10380): VM state change detected. (old = 'paused', new = 'running') Tasks that don't get a new subtask from WMAgent shut down after 10 min, but the decision matrix and the timeout settings are hidden deep in the server-side processes and are not shown in the task logs here. If you want to reduce the pause/resume cycles, you may increase the project's basic priority (compared to other projects), e.g. 500/100, and then limit the number of running tasks via <project_max_concurrent> in an app_config.xml. This method takes a while and some testing to find the required values. |
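
The length of the pause quoted in the post above can be checked directly from the two log lines. The following is a minimal Python sketch, assuming the state-change lines always follow the format quoted in the post; the function name `paused_intervals` and the regular expression are illustrative and not part of BOINC or vboxwrapper.

```python
import re
from datetime import datetime, timedelta

# Matches the state-change lines as quoted in the post (format assumed stable).
STATE_CHANGE = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \(\d+\): VM state change detected\. "
    r"\(old = '(\w+)', new = '(\w+)'\)"
)

def paused_intervals(log_text):
    """Return (start, end, duration) for each paused -> running cycle."""
    intervals = []
    pause_start = None
    for match in STATE_CHANGE.finditer(log_text):
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        new_state = match.group(3)
        if new_state == "paused":
            pause_start = stamp
        elif new_state == "running" and pause_start is not None:
            intervals.append((pause_start, stamp, stamp - pause_start))
            pause_start = None
    return intervals

# The two lines quoted in the post.
log = """\
2021-11-21 10:32:41 (10380): VM state change detected. (old = 'running', new = 'paused')
2021-11-21 17:23:43 (10380): VM state change detected. (old = 'paused', new = 'running')
"""

for start, end, duration in paused_intervals(log):
    print(f"paused {start} -> {end}: {duration}")
```

For the quoted lines this reports a pause of 6 h 51 min 2 s, which matches the "nearly 7 h" mentioned in the post.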
Joined: 8 Apr 21 Posts: 23 Credit: 45,263,576 RAC: 42,463 |
"It looks like the task completed 1 subtask and didn't get a 2nd subtask." The task delay was my doing. I run BOINCTasks at home to keep track of my hosts and discovered that it can execute actions based on project or task criteria. So as not to miss a CMS work-unit timing out and failing, I created a BOINCTasks event to suspend a CMS task after a few minutes. The next two CMS tasks the Win10 host is working on will not have that happen to them. I'd also removed the <project_max_concurrent> setting after my BOINC client work-unit fetch went haywire and downloaded several hundred tasks. I'll add that back into LHC@Home and limit CMS VBox to two concurrent work-units. |
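
For reference, the limit described in the post above could be written out as follows. This is a minimal Python sketch that generates an app_config.xml containing only the <project_max_concurrent> element named in the thread; the helper name `build_app_config` is illustrative. Note that <project_max_concurrent> caps all tasks of the project, not only CMS. A CMS-only cap would need a per-app entry whose app name is not given in this thread, so it is left out here.

```python
import xml.etree.ElementTree as ET

def build_app_config(max_concurrent):
    """Build app_config.xml text that caps the project's running tasks."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    root = ET.Element("app_config")
    limit = ET.SubElement(root, "project_max_concurrent")
    limit.text = str(max_concurrent)
    ET.indent(root)  # pretty-print; requires Python 3.9 or newer
    return ET.tostring(root, encoding="unicode")

# Two concurrent tasks, as mentioned in the post.
config_xml = build_app_config(2)
print(config_xml)
```

The resulting text would be saved as app_config.xml in the project's folder inside the BOINC data directory, after which the client has to re-read its config files.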
Joined: 15 Jun 08 Posts: 2532 Credit: 253,776,449 RAC: 34,524 |
The log from this task looks fine: https://lhcathome.cern.ch/lhcathome/result.php?resultid=334126568 Just a very minor issue (in fact, not really an issue): 2021-11-21 21:32:38 (2764): Guest Log: [INFO] Testing connection to cern.ch 2021-11-21 21:32:53 (2764): Guest Log: [DEBUG] Status run 1 of up to 3: 1 The first line reports that a small packet was sent to www.cern.ch port 80, just to test whether port 80 is blocked by a firewall along the route. The timeout to wait for a reply is set to 15 s, plenty of time for the packet to travel around the world and back several times, but somewhere it got lost. In that case the VM repeats the test (up to 3 times) before it writes a more verbose debug message and shuts down the VM (because port 80 is essential). As this situation happens very rarely, it may indicate that the LAN or internet connection of that computer was temporarily under very high load. |
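
The retry behaviour described in the last post (a 15 s timeout, up to 3 attempts, then give up) can be illustrated with a short Python sketch. The VM's actual test script is not shown in this thread, so this is only an approximation: it assumes a plain TCP connection attempt stands in for the "small packet", and the function name `port_reachable` and its `connect` parameter are hypothetical.

```python
import socket

def port_reachable(host, port=80, timeout=15.0, attempts=3,
                   connect=socket.create_connection):
    """Return True as soon as one TCP connection attempt succeeds.

    `connect` is injectable so the retry logic can be exercised offline.
    """
    for run in range(1, attempts + 1):
        try:
            # Open and immediately close a TCP connection to the port.
            with connect((host, port), timeout=timeout):
                return True
        except OSError as exc:  # covers timeouts and refused connections
            print(f"run {run} of up to {attempts} failed: {exc}")
    return False  # the caller would treat this as "port 80 is blocked"
```

Calling `port_reachable("cern.ch")` performs the real check from the local machine. A single failed run followed by a success, as in the log quoted above, makes the function return True.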
©2024 CERN