1)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45736)
Posted 22 Nov 2021 by skydivingnerd Post: It looks like the task completed 1 subtask and didn't get a 2nd subtask. The task delay was my doing. I run BOINCTasks at home to keep track of my hosts and discovered that it can execute actions based on project or task criteria. So that I would not miss a CMS work-unit timing out and failing, I created a BOINCTasks event to suspend a CMS task after a few minutes. The next two CMS tasks the Win10 host is working on will not have that happen to them. I'd also removed the <project_max_concurrent> setting after my BOINC client work-unit fetch went haywire and downloaded several hundred tasks. I'll add that back into LHC@Home and limit CMS VBox to two concurrent work-units. |
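For anyone wanting to reproduce the concurrency limit mentioned above, here is a minimal sketch that generates an app_config.xml body. The element names <max_concurrent> and <project_max_concurrent> are BOINC app_config options; the function name build_app_config and the value 2 are only illustrative, and this is not the exact file used on the host in this thread.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_app_config(app_name: str, max_concurrent: int,
                     project_max_concurrent: Optional[int] = None) -> str:
    """Return app_config XML limiting how many tasks of one app run at once."""
    root = ET.Element("app_config")
    app = ET.SubElement(root, "app")
    ET.SubElement(app, "name").text = app_name
    ET.SubElement(app, "max_concurrent").text = str(max_concurrent)
    if project_max_concurrent is not None:
        # Optional project-wide cap across all applications of the project.
        ET.SubElement(root, "project_max_concurrent").text = str(project_max_concurrent)
    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


# Limit the CMS application to two concurrent tasks.
print(build_app_config("CMS", 2))
```

The output would be saved as app_config.xml in the project's folder under the BOINC data directory and re-read by the client. Note that a later post in this listing reports work-fetch problems when these options are set, so treat it as something to test rather than a recommendation.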
2)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45733)
Posted 22 Nov 2021 by skydivingnerd Post: I have a completed work-unit and more in progress! https://lhcathome.cern.ch/lhcathome/result.php?resultid=334074794 |
3)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45730)
Posted 21 Nov 2021 by skydivingnerd Post: I went and looked that over and see what you've found. I believe that is when I updated VBox to 6.1.28. I did see the glidein error a few days ago and began looking into that, wondering how to troubleshoot port connectivity from the VM perspective. I also suspect this to be the reason why glidein fails. I've had port 8000 open since I started the project, but TCP/1094 was removed because the LHC@Home FAQ port list told me it was no longer needed https://lhcathome.web.cern.ch/test4theory/my-firewall-complaining-which-ports-does-project-use I've added both TCP/9094 and TCP/1094 back into my BOINC aliases list and now have a running CMS VBox task on my Win10 host. The task has been running for about an hour now and is using ~86% of a CPU. That is much different from before, when the task would end at <21 minutes and consume no more than 3-4% CPU. Can the LHC@Home port listing be re-validated to ensure all used ports are on the FAQ? |
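As a rough way to check outbound port connectivity, here is a minimal sketch of a TCP reachability test. It runs on the host rather than inside the VM, so it only shows whether the firewall lets the host out on a given port. The function name check_port is made up for illustration, and the target host must be supplied by the reader since the post does not name the servers behind ports 8000, 1094 and 9094.

```python
import socket


def check_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


def report(host: str, ports=(8000, 1094, 9094)) -> None:
    """Print one line per port; the port list comes from the post above."""
    for port in ports:
        state = "reachable" if check_port(host, port) else "blocked or unreachable"
        print(f"{host}:{port} -> {state}")
```

Calling report("server.name.here") with a real project server prints the state of each port. A "blocked or unreachable" result does not by itself prove a firewall problem, since the remote service may simply not listen on that port.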
4)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45727)
Posted 19 Nov 2021 by skydivingnerd Post: Maybe there is a lot of junk media left over from crashed VMs, since it looks like the hard disk was lost. I experienced issues that left junk in the .\slots\ folder, and cleaning it out was one of the first troubleshooting steps I did when I started digging into this on my Win10 client. The VM crash persists across multiple VirtualBox versions (6.1.12, 6.1.16, 6.1.28), even after ensuring the BOINC data directory .\slots\ folders and the VBox manager are clean. So far, I've not gotten far on the VirtualBox forums. There are some suggestions on the underlying configuration of the LHC@Home VM, but I'm not in any way qualified to raise these to the project. As far as I can search on the LHC@Home forums, I can't find anything like what I'm experiencing. If it goes on much longer with no appreciable suggestions on troubleshooting the contents of the log files, I may have to just remove the project from my Win10 host. I don't want to do that. But I won't have a choice if I can't troubleshoot the issue and just keep sending failing work-units back. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10687301&offset=0&show_names=0&state=6&appid=11 |
5)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45705)
Posted 15 Nov 2021 by skydivingnerd Post: I made a guesstimate and was waiting on my host when it received a CMS task. I've captured the VBox, VBoxHardening, VBoxUI, and vbox_trace logs from the ./slots/ folder the task was running in. I'm posting to the VirtualBox forums for help in troubleshooting the logs. https://forums.virtualbox.org/viewtopic.php?f=3&t=104465 If anyone knows how to review and troubleshoot VBox logs, I've uploaded them to a workdrive space https://workdrive.zoho.com/folder/pgoec33ffeb6d9e36461c9a953c076976b93c R/S Scott |
6)
Questions and Answers :
Windows :
Vbox logs from failed LHC tasks - capturing vbox.log and vboxhardening.log
(Message 45704)
Posted 15 Nov 2021 by skydivingnerd Post: I made a guesstimate and was waiting on my host when it received a CMS task. I've captured the VBox, VBoxHardening, VBoxUI, and vbox_trace logs from the ./slots/ folder the task was running in. I'm posting to the VirtualBox forums for help in troubleshooting the logs. https://forums.virtualbox.org/viewtopic.php?f=3&t=104465 |
7)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45688)
Posted 13 Nov 2021 by skydivingnerd Post: So removing Anti-virus did not solve the issue. This task just failed at 1349 EST. https://lhcathome.cern.ch/lhcathome/result.php?resultid=332726303 |
8)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45679)
Posted 13 Nov 2021 by skydivingnerd Post: That is from your CMS-task with no success: I'm not understanding how that can help. The heartbeat time frame is a variable of the project/task. It's not a user configurable setting that would need to be "corrected" to ensure tasks complete. |
9)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45675)
Posted 13 Nov 2021 by skydivingnerd Post: That's just it, no I do not see it. I'm going down the path of troubleshooting the vbox.log and vboxhardening.log of the individual VMs when they are running. I've also uninstalled my AntiVirus, rebooted, and ensured it's cleared out. The last CMS task my host got yesterday was during a time when Rosetta@home was not sending tasks and my host only had a couple of them running when the CMS task started. https://lhcathome.cern.ch/lhcathome/result.php?resultid=332568799 The CMS task stderr log shows that there was 26374 MB of memory available when it started. Memory size: 32673 MByte Memory available: 26374 MByte This task still failed in the same way as the past several days despite a much lower CPU and memory load. Note: I have a companion thread to this one on vbox.log and vboxhardening.log capture. https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5754 |
10)
Questions and Answers :
Windows :
Vbox logs from failed LHC tasks - capturing vbox.log and vboxhardening.log
(Message 45674)
Posted 13 Nov 2021 by skydivingnerd Post: I thought I'd create a new thread here as I'm not finding any related posts on the LHC boards. In attempting to troubleshoot the issues my Win10 host is having in thread https://lhcathome.cern.ch/lhcathome/forum_thread.php?id=5751, I'm looking more into how to troubleshoot vbox guest instances and their interaction with host machines. The individual VM vbox.log and vboxhardening.log are created inside the "<BOINC directory>/slots/n/..." and removed/deleted when a task fails. I know that part of the VBoxSVC.log is copied into the LHC task stderr output under the "Hypervisor System Log:" and I believe a part of the individual VM vbox.log is under the "VM Trace Log:" section. (I'm not sure of the VM Trace Log info since I have not looked into the vbox.log of a running CMS task.) It does not look like any of the vboxhardening.log info is captured in the task stderr output. For my host issue above, I need to get the whole log files for troubleshooting with the VirtualBox forums. The only option right now is to hover over my client, waiting for a CMS task to be assigned, and watching the VM log files in the ~20 minutes it's running. - Is there a way to capture/copy the whole log file to another directory when a task fails? - Can this functionality be added to the LHC features or capabilities backlog for development? - Could it be a debug option enabled through one of the BOINC client config files? (no idea if that is feasible or not) |
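Until something like that exists in the project, one workaround is to poll the slots folder and copy the logs out while the task is still running. The following is a minimal sketch under stated assumptions: the log files are assumed to end in .log and contain "vbox" in their name, the functions snapshot_logs and watch are made up for this example, and the destination folder must live outside the slots folder.

```python
import shutil
import time
from pathlib import Path


def snapshot_logs(slots_dir: Path, dest_dir: Path) -> list[Path]:
    """Copy every *.log file with 'vbox' in its name from slots_dir into dest_dir.

    The sub-folder layout under slots_dir is preserved, so logs from different
    slots do not overwrite each other. Returns the paths of the copies.
    """
    copied = []
    for log in slots_dir.rglob("*.log"):
        if "vbox" not in log.name.lower():
            continue
        target = dest_dir / log.relative_to(slots_dir)
        target.parent.mkdir(parents=True, exist_ok=True)
        try:
            shutil.copy2(log, target)
            copied.append(target)
        except OSError:
            # The task may have removed or locked the file mid-copy; skip it.
            pass
    return copied


def watch(slots_dir: Path, dest_dir: Path, interval: float = 30.0) -> None:
    """Repeat the snapshot forever; each pass overwrites the previous copy."""
    while True:
        snapshot_logs(slots_dir, dest_dir)
        time.sleep(interval)
```

Running watch(Path(r"C:\ProgramData\BOINC\slots"), Path(r"C:\vbox_log_capture")) in a console would keep the latest copy of each log, so the last snapshot before a failure survives the slot cleanup. The capture folder name is arbitrary.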
11)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45669)
Posted 12 Nov 2021 by skydivingnerd Post: Thanks. I've set Rosetta to not fetch new tasks so only LHC CMS tasks will run. I'll see if that does it. If it does, is this an issue of needing more system memory so everything can run in memory without drive swapping? Or is this a limitation on floating point calculations of the processor? R/S Scott |
12)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45665)
Posted 11 Nov 2021 by skydivingnerd Post: I just reset the project and will know what that does tomorrow afternoon. On the permissions front, my user account runs the BOINC process and has Full Control rights to the C:\ProgramData\BOINC and C:\ProgramData\BOINC\slots directories. I can see my user account owns the processes for BOINC Mgr and the running tasks via Process Explorer. My host has a Samsung 860 EVO 1TB SSD and resource monitor shows disk I/O from 1-4% with BOINC running 16 threads of Rosetta@home tasks. In Use memory is at ~17GB out of 32GB in the system. When I was attempting to run more than two CMS tasks a while back, I would get notices in BOINC that a CMS task was paused, waiting for memory. That said, I've had 14 Rosetta and 2 CMS tasks running concurrently on this host. The past few days LHC@home has only assigned a single CMS task to this host due to the number of failures. R/S Scott |
13)
Questions and Answers :
Windows :
vBox could not find machine - ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND
(Message 45663)
Posted 11 Nov 2021 by skydivingnerd Post: My Win10 host is running BOINC 7.16.20 with vBox 6.1.16 plus the 6.1.16 Extension pack. I've been encountering this issue where my CMS tasks are failing because vBox is unable to find the VM to run the task. I've uninstalled and reinstalled BOINC and vBox several times with the issue persisting. I'm unsure how to continue troubleshooting this.
Most recent failed task https://lhcathome.cern.ch/lhcathome/result.php?resultid=332187349
Tasks page for my host https://lhcathome.cern.ch/lhcathome/results.php?hostid=10687301&offset=0&show_names=0&state=6&appid=11
I can see in the BOINC logs that the VM is started in a project slot
2021-11-10 14:25:35 (15216): forwarding host port 52026 to guest port 80
2021-11-10 14:25:35 (15216): Enabling remote desktop for VM.
2021-11-10 14:25:36 (15216): Enabling shared directory for VM.
2021-11-10 14:25:36 (15216): Starting VM using VBoxManage interface. (boinc_62fd0aaede4ddcf6, slot#16)
2021-11-10 14:25:41 (15216): Successfully started VM. (PID = '2380')
2021-11-10 14:25:41 (15216): Reporting VM Process ID to BOINC.
2021-11-10 14:25:41 (15216): Guest Log: BIOS: VirtualBox 6.1.16
2021-11-10 14:25:41 (15216): Guest Log: CPUID EDX: 0x178bfbff
2021-11-10 14:25:41 (15216): Guest Log: BIOS: ata0-0: PCHS=16383/16/63 LCHS=1024/255/63
2021-11-10 14:25:41 (15216): VM state change detected. (old = 'poweredoff', new = 'running')
But then, a short time later, the vBox hypervisor logs show that the registered VM cannot be found.
17:35:01.292412 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={d0a0163f-e254-4e5b-a1f2-011cf991c38d} aComponent={VirtualBoxWrap} aText={Could not find a registered machine named 'boinc_62fd0aaede4ddcf6'}, preserve=false aResultDetail=0
The VM trace logs show that it takes about 20 minutes for it to error out and initiate a shutdown of the VM even though it logs it starting.
2021-11-10 14:25:36 (15216): Command: VBoxManage -q sharedfolder add "boinc_62fd0aaede4ddcf6" --name "shared" --hostpath "C:\ProgramData\BOINC\slots\16/shared"
Exit Code: 0
Output:
2021-11-10 14:25:40 (15216): Command: VBoxManage -q startvm "boinc_62fd0aaede4ddcf6" --type headless
Exit Code: 0
Output: Waiting for VM "boinc_62fd0aaede4ddcf6" to power on... VM "boinc_62fd0aaede4ddcf6" has been successfully started.
2021-11-10 14:25:42 (15216): Command: VBoxManage -q controlvm "boinc_62fd0aaede4ddcf6" cpuexecutioncap 100
Exit Code: 0
Output:
2021-11-10 14:45:43 (15216): Command: VBoxManage -q controlvm "boinc_62fd0aaede4ddcf6" poweroff
Exit Code: 0
Output: 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%
2021-11-10 14:45:43 (15216): Command: VBoxManage -q snapshot "boinc_62fd0aaede4ddcf6" list
Exit Code: -108
Output: This machine does not have any snapshots
2021-11-10 14:45:44 (15216): Command: VBoxManage -q bandwidthctl "boinc_62fd0aaede4ddcf6" remove "boinc_62fd0aaede4ddcf6_net"
Exit Code: 0
Output:
What can be looked at next to nail this error down?
R/S Scott |
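The "about 20 minutes" figure can be checked directly from the two timestamps in the trace log above (VM started at 14:25:41, poweroff issued at 14:45:43). A minimal sketch of that arithmetic follows; the helper name elapsed_seconds is only illustrative.

```python
from datetime import datetime

# Timestamp layout used by the vboxwrapper lines quoted in the post.
FMT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()


vm_started = "2021-11-10 14:25:41"   # "Successfully started VM."
vm_poweroff = "2021-11-10 14:45:43"  # "controlvm ... poweroff"

print(elapsed_seconds(vm_started, vm_poweroff))  # 1202.0 seconds, i.e. 20 min 2 s
```

A gap this close to exactly 20 minutes suggests a fixed timeout somewhere rather than a random crash, though the log lines alone do not say which component is timing out.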
14)
Message boards :
CMS Application :
getting 'Error while computing' for CMS tasks
(Message 45629)
Posted 7 Nov 2021 by skydivingnerd Post: I removed vBox 6.1.12, rebooted, reinstalled 6.1.12 and the issue persists. Since the vBox logs still show notices for a version difference between 6.1.12 and 6.1.16, I uninstalled 6.1.12 and installed 6.1.16, rebooting and ensuring all the system files were gone along the way. I do not have any extension packs configured for use. Do I need to configure that? I've previously read that the vBox extension pack was not needed. Is that incorrect? I'm waiting for a few tasks to download to see if the issue persists in the vBox logs. If so, I'll open a new thread under the Windows board. |
15)
Message boards :
CMS Application :
getting 'Error while computing' for CMS tasks
(Message 45625)
Posted 6 Nov 2021 by skydivingnerd Post: The main BOINC directory is at C:\Program Files\BOINC, with the data being at C:\ProgramData\BOINC. VirtualBox is at C:\Users\Scott\.VirtualBox. Looking into the vBox logs, I do see that it's pointing out the differing versions and lack of permissions to the VM storage. Could it not have liked the software downgrade? I'll download the current versions again and reinstall them. 03:59:21.738008 Saving settings file "C:\ProgramData\BOINC\slots\9\boinc_00ca13701b3d268d\boinc_00ca13701b3d268d.vbox" with version "1.16-windows" 03:59:21.886440 ERROR [COM]: aRC=E_FAIL (0x80004005) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={This machine does not have any snapshots}, preserve=false aResultDetail=0 03:59:21.926647 Saving settings file "C:\ProgramData\BOINC\slots\9\boinc_00ca13701b3d268d\boinc_00ca13701b3d268d.vbox" with version "1.16-windows" 03:59:22.180939 Saving settings file "C:\Users\Scott\.VirtualBox\VirtualBox.xml" with version "1.12-windows" 03:59:28.134972 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={d0a0163f-e254-4e5b-a1f2-011cf991c38d} aComponent={VirtualBoxWrap} aText={Could not find a registered machine named 'boinc_c84b3e5bbd6d461e'}, preserve=false aResultDetail=0 03:59:28.389054 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={ad47ad09-787b-44ab-b343-a082a3f2dfb1} aComponent={MediumWrap} aText={The object is not ready}, preserve=false aResultDetail=0 03:59:28.725563 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={ad47ad09-787b-44ab-b343-a082a3f2dfb1} aComponent={MediumWrap} aText={The object is not ready}, preserve=false aResultDetail=0 03:59:28.893063 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={ad47ad09-787b-44ab-b343-a082a3f2dfb1} aComponent={MediumWrap} aText={The object is not ready}, preserve=false aResultDetail=0 03:59:29.647620 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={ad47ad09-787b-44ab-b343-a082a3f2dfb1} aComponent={MediumWrap} aText={The 
object is not ready}, preserve=false aResultDetail=0 03:59:30.054998 Saving settings file "C:\Users\Scott\.VirtualBox\VirtualBox.xml" with version "1.12-windows" 03:59:30.161742 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:30.172140 Saving settings file "C:\Users\Scott\.VirtualBox\VirtualBox.xml" with version "1.12-windows" 03:59:30.418205 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:30.703352 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:30.962257 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:31.220827 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:31.493533 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:31.769940 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:32.026155 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:32.284644 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:32.544079 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:32.800604 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:33.059791 Saving settings file 
"C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:33.316373 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:33.577000 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:33.833723 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:34.091140 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={No storage device attached to device slot 0 on port 0 of controller 'Hard Disk Controller'}, preserve=false aResultDetail=0 03:59:34.091339 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={ad47ad09-787b-44ab-b343-a082a3f2dfb1} aComponent={MediumWrap} aText={The object is not ready}, preserve=false aResultDetail=0 03:59:34.995533 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={ad47ad09-787b-44ab-b343-a082a3f2dfb1} aComponent={MediumWrap} aText={The object is not ready}, preserve=false aResultDetail=0 03:59:35.100846 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={No storage device attached to device slot 0 on port 0 of controller 'Hard Disk Controller'}, preserve=false aResultDetail=0 03:59:35.101101 Saving settings file "C:\Users\Scott\.VirtualBox\VirtualBox.xml" with version "1.12-windows" 03:59:35.117080 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:35.349780 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={No storage device attached to device slot 0 on port 1 of controller 'Hard Disk Controller'}, preserve=false 
aResultDetail=0 03:59:35.350066 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={No storage device attached to device slot 0 on port 1 of controller 'Hard Disk Controller'}, preserve=false aResultDetail=0 03:59:35.353458 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:35.605351 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:35.866400 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:36.375597 Saving settings file "C:\ProgramData\BOINC\slots\8\boinc_c84b3e5bbd6d461e\boinc_c84b3e5bbd6d461e.vbox" with version "1.16-windows" 03:59:36.630528 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={No storage device attached to device slot 1 on port 0 of controller 'Hard Disk Controller'}, preserve=false aResultDetail=0 03:59:36.630634 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={No storage device attached to device slot 1 on port 0 of controller 'Hard Disk Controller'}, preserve=false aResultDetail=0 03:59:36.631122 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={No storage device attached to device slot 1 on port 1 of controller 'Hard Disk Controller'}, preserve=false aResultDetail=0 03:59:36.631188 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={No storage device attached to device slot 1 on port 1 of controller 'Hard Disk Controller'}, preserve=false aResultDetail=0 03:59:36.635096 ERROR [COM]: 
aRC=E_FAIL (0x80004005) aIID={85632c68-b5bb-4316-a900-5eb28d3413df} aComponent={SessionMachine} aText={This machine does not have any snapshots}, preserve=false aResultDetail=0 03:59:36.886577 Launched VM: 79687136 pid: 10988 (0x2aec) frontend: headless name: boinc_c84b3e5bbd6d461e 03:59:37.339774 ERROR [COM]: aRC=VBOX_E_OBJECT_NOT_FOUND (0x80bb0001) aIID={d0a0163f-e254-4e5b-a1f2-011cf991c38d} aComponent={VirtualBoxWrap} aText={Could not find a registered machine named 'boinc_0665d7a71cb3cdc3'}, preserve=false aResultDetail=0 03:59:37.591751 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={ad47ad09-787b-44ab-b343-a082a3f2dfb1} aComponent={MediumWrap} aText={The object is not ready}, preserve=false aResultDetail=0 03:59:37.929053 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={ad47ad09-787b-44ab-b343-a082a3f2dfb1} aComponent={MediumWrap} aText={The object is not ready}, preserve=false aResultDetail=0 03:59:38.100938 ERROR [COM]: aRC=E_ACCESSDENIED (0x80070005) aIID={ad47ad09-787b-44ab-b343-a082a3f2dfb1} aComponent={MediumWrap} aText={The object is not ready}, preserve=false aResultDetail=0 |
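With this many repeated entries it can help to collapse the hypervisor log into counts per error type. Below is a minimal sketch that extracts the ERROR [COM] entries with a regular expression and tallies them. The pattern is inferred only from the lines quoted in this post, not from any VirtualBox specification, and the name summarize_com_errors is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the quoted VBoxSVC lines; other log variants may differ.
COM_ERROR = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+ERROR \[COM\]:\s+"
    r"aRC=(?P<rc>\w+) \((?P<code>0x[0-9a-fA-F]+)\)\s+"
    r"aIID=\{[^}]*\}\s+"
    r"aComponent=\{(?P<component>[^}]*)\}\s+"
    r"aText=\{(?P<text>.*?)\},\s+preserve=",
    re.DOTALL,
)


def summarize_com_errors(log_text: str) -> Counter:
    """Count ERROR [COM] entries by (result code, component, message)."""
    counts = Counter()
    for match in COM_ERROR.finditer(log_text):
        # Collapse whitespace so entries wrapped across lines compare equal.
        message = " ".join(match["text"].split())
        counts[(match["rc"], match["component"], message)] += 1
    return counts


def print_summary(log_text: str) -> None:
    """Print the tally, most frequent error first."""
    for (rc, component, message), n in summarize_com_errors(log_text).most_common():
        print(f"{n:4d}  {rc}  {component}  {message}")
```

Pasting the log text above into print_summary would show at a glance how the E_ACCESSDENIED "The object is not ready" entries compare in number with the VBOX_E_OBJECT_NOT_FOUND ones.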
16)
Message boards :
CMS Application :
getting 'Error while computing' for CMS tasks
(Message 45624)
Posted 6 Nov 2021 by skydivingnerd Post: I did downgrade the BOINC and vBox versions while attempting to troubleshoot the runaway task downloads I was encountering. I found, through the Rosetta@home forum, that BOINC has issues with calculating task queue depth with the <max_concurrent> or <project_max_concurrent> flags in project app_config files. I thought it could have been the upgrade of BOINC and vBox I did a while back, so I downgraded them. I've also removed those settings from my Win10 host app_config. I have rebooted since then and just now had my client upload more failed results to LHC@home. This task is one of the ones that failed just in the last 15-20 minutes. https://lhcathome.cern.ch/lhcathome/result.php?result_name=CMS_3108593_1635827713.939210_0 I've also restricted the project from getting new tasks for the time being until I can get this fixed. R/S Scott |
17)
Message boards :
CMS Application :
getting 'Error while computing' for CMS tasks
(Message 45619)
Posted 6 Nov 2021 by skydivingnerd Post: I'm getting a lot of computation errors for CMS vBox64 tasks on my Win10 host. https://lhcathome.cern.ch/lhcathome/results.php?hostid=10687301&offset=0&show_names=0&state=6&appid= I'm not sure where to start in troubleshooting this. My other Linux based hosts are doing ok (aside from sporadic comms issues with my FW and SNORT). I'm running BOINC client 7.16.11 with vBox 6.1.12. Below is the app_config file from my Win10 host and the output of one of the failed tasks below that.
<app_config>
<!--
  <app>
    <name>ATLAS</name>
    <fraction_done_exact/>
  </app>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>4.0</avg_ncpus>
    <cmdline>>--nthreads 4 --memory_size_mb 3800</cmdline>
  </app_version>
  <app_version>
    <app_name>ATLAS</app_name>
    <plan_class>vbox64_mt_mcore_atlas</plan_class>
    <avg_ncpus>8.0</avg_ncpus>
    <cmdline>>--nthreads 8 --memory_size_mb 5000</cmdline>
  </app_version>
-->
<!--
  <app>
    <name>Theory</name>
    <fraction_done_exact/>
  </app>
  <app_version>
    <app_name>Theory</app_name>
    <plan_class>vbox64_theory</plan_class>
    <avg_ncpus>1.0</avg_ncpus>
    <cmdline>--nthreads 1</cmdline>
  </app_version>
-->
  <app>
    <name>CMS</name>
    <fraction_done_exact/>
  </app>
  <app_version>
    <app_name>CMS</app_name>
    <plan_class>vbox64</plan_class>
    <avg_ncpus>1.0</avg_ncpus>
    <cmdline>--nthreads 1 --memory_size_mb 2048</cmdline>
  </app_version>
</app_config>
Here is the task output from one of the failed tasks from my Win10 host.
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
The global filename characters, * or ?, are entered incorrectly or too many global filename characters are specified.
(0xd0) - exit code 208 (0xd0)</message> <stderr_txt> 2021-11-06 11:31:28 (16296): Detected: vboxwrapper 26202 2021-11-06 11:31:28 (16296): Detected: BOINC client v7.16.11 2021-11-06 11:31:28 (16296): Detected: VirtualBox VboxManage Interface (Version: 6.1.12) 2021-11-06 11:31:29 (16296): Detected: Heartbeat check (file: 'heartbeat' every 1200.000000 seconds) 2021-11-06 11:31:29 (16296): Successfully copied 'init_data.xml' to the shared directory. 2021-11-06 11:31:30 (16296): Create VM. (boinc_05d65e3058a799c6, slot#9) 2021-11-06 11:31:31 (16296): Setting Memory Size for VM. (2048MB) 2021-11-06 11:31:31 (16296): Setting CPU Count for VM. (1) 2021-11-06 11:31:31 (16296): Setting Chipset Options for VM. 2021-11-06 11:31:31 (16296): Setting Boot Options for VM. 2021-11-06 11:31:32 (16296): Setting Network Configuration for NAT. 2021-11-06 11:31:32 (16296): Enabling VM Network Access. 2021-11-06 11:31:32 (16296): Disabling USB Support for VM. 2021-11-06 11:31:32 (16296): Disabling COM Port Support for VM. 2021-11-06 11:31:33 (16296): Disabling LPT Port Support for VM. 2021-11-06 11:31:33 (16296): Disabling Audio Support for VM. 2021-11-06 11:31:33 (16296): Disabling Clipboard Support for VM. 2021-11-06 11:31:33 (16296): Disabling Drag and Drop Support for VM. 2021-11-06 11:31:34 (16296): Adding storage controller(s) to VM. 2021-11-06 11:31:34 (16296): Adding virtual disk drive to VM. (vm_image.vdi) 2021-11-06 11:31:34 (16296): Adding VirtualBox Guest Additions to VM. 2021-11-06 11:31:34 (16296): Adding network bandwidth throttle group to VM. (Defaulting to 1024GB) 2021-11-06 11:31:35 (16296): forwarding host port 49866 to guest port 80 2021-11-06 11:31:35 (16296): Enabling remote desktop for VM. 2021-11-06 11:31:35 (16296): Required extension pack not installed, remote desktop not enabled. 2021-11-06 11:31:35 (16296): Enabling shared directory for VM. 2021-11-06 11:31:36 (16296): Starting VM using VBoxManage interface. 
(boinc_05d65e3058a799c6, slot#9) 2021-11-06 11:31:39 (16296): Successfully started VM. (PID = '15392') 2021-11-06 11:31:39 (16296): Reporting VM Process ID to BOINC. 2021-11-06 11:31:39 (16296): Guest Log: BIOS: VirtualBox 6.1.12 2021-11-06 11:31:39 (16296): Guest Log: CPUID EDX: 0x178bfbff 2021-11-06 11:31:39 (16296): Guest Log: BIOS: ata0-0: PCHS=16383/16/63 LCHS=1024/255/63 2021-11-06 11:31:39 (16296): VM state change detected. (old = 'poweredoff', new = 'running') 2021-11-06 11:31:39 (16296): Detected: Web Application Enabled (http://localhost:49866) 2021-11-06 11:31:39 (16296): Preference change detected 2021-11-06 11:31:39 (16296): Setting CPU throttle for VM. (100%) 2021-11-06 11:31:40 (16296): Setting checkpoint interval to 600 seconds. (Higher value of (Preference: 600 seconds) or (Vbox_job.xml: 600 seconds)) 2021-11-06 11:31:41 (16296): Guest Log: BIOS: Boot : bseqnr=1, bootseq=0032 2021-11-06 11:31:41 (16296): Guest Log: BIOS: Booting from Hard Disk... 2021-11-06 11:31:43 (16296): Guest Log: BIOS: KBD: unsupported int 16h function 03 2021-11-06 11:31:43 (16296): Guest Log: BIOS: AX=0305 BX=0000 CX=0000 DX=0000 2021-11-06 11:32:03 (16296): Guest Log: vgdrvHeartbeatInit: Setting up heartbeat to trigger every 2000 milliseconds 2021-11-06 11:32:03 (16296): Guest Log: vboxguest: misc device minor 56, IRQ 20, I/O port d020, MMIO at 00000000f0400000 (size 0x400000) 2021-11-06 11:32:04 (16296): Guest Log: VBoxService 5.2.6 r120293 (verbosity: 0) linux.amd64 (Jan 15 2018 14:51:00) release log 2021-11-06 11:32:04 (16296): Guest Log: 00:00:00.000085 main Log opened 2021-11-06T15:32:03.999422000Z 2021-11-06 11:32:04 (16296): Guest Log: 00:00:00.000168 main OS Product: Linux 2021-11-06 11:32:04 (16296): Guest Log: 00:00:00.000193 main OS Release: 4.14.232-19.cernvm.x86_64 2021-11-06 11:32:04 (16296): Guest Log: 00:00:00.000215 main OS Version: #1 SMP Fri Apr 30 17:12:25 CEST 2021 2021-11-06 11:32:04 (16296): Guest Log: 00:00:00.000247 main Executable: 
/usr/sbin/VBoxService 2021-11-06 11:32:04 (16296): Guest Log: 00:00:00.000247 main Process ID: 2153 2021-11-06 11:32:04 (16296): Guest Log: 00:00:00.000248 main Package type: LINUX_64BITS_GENERIC 2021-11-06 11:32:04 (16296): Guest Log: 00:00:00.001532 main 5.2.6 r120293 started. Verbose level = 0 2021-11-06 11:32:13 (16296): Guest Log: [INFO] Mounting the shared directory 2021-11-06 11:32:13 (16296): Guest Log: [INFO] Shared directory mounted, enabling vboxmonitor 2021-11-06 11:32:13 (16296): Guest Log: [INFO] Sourcing essential functions from /cvmfs/grid.cern.ch 2021-11-06 11:32:13 (16296): Guest Log: [INFO] Testing connection to cern.ch 2021-11-06 11:32:13 (16296): Guest Log: [INFO] Testing connection to VCCS 2021-11-06 11:32:13 (16296): Guest Log: [INFO] Testing connection to HTCondor 2021-11-06 11:32:13 (16296): Guest Log: [INFO] Testing connection to WMAgent 2021-11-06 11:32:14 (16296): Guest Log: [INFO] Testing connection to Frontier 2021-11-06 11:32:14 (16296): Guest Log: [INFO] Got a proxy from the local BOINC client 2021-11-06 11:32:14 (16296): Guest Log: [INFO] Will use it for CVMFS and Frontier 2021-11-06 11:32:14 (16296): Guest Log: [INFO] Reloading and probing the CVMFS configuration 2021-11-06 11:32:18 (16296): Guest Log: [INFO] Probing /cvmfs/cvmfs-config.cern.ch... OK 2021-11-06 11:32:18 (16296): Guest Log: [INFO] Probing /cvmfs/grid.cern.ch... OK 2021-11-06 11:32:20 (16296): Guest Log: [INFO] Probing /cvmfs/oasis.opensciencegrid.org... OK 2021-11-06 11:32:20 (16296): Guest Log: [INFO] Probing /cvmfs/singularity.opensciencegrid.org... OK 2021-11-06 11:32:20 (16296): Guest Log: [INFO] Probing /cvmfs/cms-ib.cern.ch... OK 2021-11-06 11:32:20 (16296): Guest Log: [INFO] Probing /cvmfs/cms.cern.ch... 
OK 2021-11-06 11:32:20 (16296): Guest Log: [INFO] Excerpt from "cvmfs_config stat": VERSION HOST PROXY 2021-11-06 11:32:20 (16296): Guest Log: [INFO] 2.7.2.0 http://s1bnl-cvmfs.openhtc.io http://192.168.150.1:3128 2021-11-06 11:32:20 (16296): Guest Log: [INFO] Reading volunteer information 2021-11-06 11:32:21 (16296): Guest Log: [INFO] Requesting an X509 credential from LHC@home 2021-11-06 11:32:22 (16296): Guest Log: [INFO] CMS application starting. Check log files. 2021-11-06 11:52:22 (16296): Guest Log: [ERROR] glidein exited with return value 1. 2021-11-06 11:52:22 (16296): Guest Log: [DEBUG] Volunteer: scotth (787857) 2021-11-06 11:52:22 (16296): Guest Log: [INFO] Shutting Down. 2021-11-06 11:52:52 (16296): VM Completion File Detected. 2021-11-06 11:52:52 (16296): VM Completion Message: glidein exited with return value 1. . 2021-11-06 11:52:52 (16296): Powering off VM. 2021-11-06 11:52:52 (16296): Successfully stopped VM. 2021-11-06 11:52:52 (16296): Deregistering VM. (boinc_05d65e3058a799c6, slot#9) 2021-11-06 11:52:52 (16296): Removing network bandwidth throttle group from VM. 2021-11-06 11:52:53 (16296): Removing VM from VirtualBox. 11:52:58 (16296): called boinc_finish(208) </stderr_txt> ]]> |
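To double-check which settings in an app_config are actually in effect, the active part can be parsed and listed. The sketch below uses only the uncommented CMS section of the file posted above; the commented-out ATLAS and Theory blocks are left out on purpose. The function name active_app_versions is hypothetical, and Python's XML parser is stricter than whatever BOINC itself uses, so this is a sanity check rather than a statement about how the client reads the file.

```python
import xml.etree.ElementTree as ET

# Active (uncommented) portion of the app_config posted above.
APP_CONFIG = """<app_config>
  <app>
    <name>CMS</name>
    <fraction_done_exact/>
  </app>
  <app_version>
    <app_name>CMS</app_name>
    <plan_class>vbox64</plan_class>
    <avg_ncpus>1.0</avg_ncpus>
    <cmdline>--nthreads 1 --memory_size_mb 2048</cmdline>
  </app_version>
</app_config>"""


def active_app_versions(xml_text: str) -> list:
    """Return one dict of tag -> text for each <app_version> element."""
    root = ET.fromstring(xml_text)
    return [
        {child.tag: (child.text or "").strip() for child in version}
        for version in root.findall("app_version")
    ]


for version in active_app_versions(APP_CONFIG):
    print(version)
```

For this file the listing contains a single CMS entry requesting 1 CPU and 2048 MB, which matches the "Setting Memory Size for VM. (2048MB)" and "Setting CPU Count for VM. (1)" lines in the stderr output.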
18)
Questions and Answers :
Wish list :
Set max. WU per day to 1 for hosts who do not deliver valid results.
(Message 45614)
Posted 5 Nov 2021 by skydivingnerd Post: All, My bad. It's been a while since I've been monitoring my machines. Other life issues have taken priority. I've detached that client from LHC for the time being until I can devote time to rebuilding it. R/S Scott H |
19)
Questions and Answers :
Windows :
Windows vbox64 CMS Simulation tasks failing - VM unable to validate X509 credential from LHC@home
(Message 44945)
Posted 13 May 2021 by skydivingnerd Post: Great news on identifying that the firewall port page needs updating. I now have three completed CMS Simulation tasks for my Win10 host!
https://lhcathome.cern.ch/lhcathome/result.php?resultid=316425963
https://lhcathome.cern.ch/lhcathome/result.php?resultid=316423089
https://lhcathome.cern.ch/lhcathome/result.php?resultid=316428800
The only modification I've made since your previous post was adding port 3126 to my outbound allow rule. I saw that port in the error log of one of my failed work units when you quoted it back in my post. I have not made any additional changes. Was the CMS Simulation VM updated? Additionally, from your info on the port usage
I'll remove those from my allowed outbound ports for the LHC@Home traffic. Speaking to your comment here:
I've always had my FW rule configured to allow the identified ports out to any IP. I specifically added the CVMFS IPs I found to my Snort PASS list to ensure none of them got blocked by a signature hit. Now that the host is working, yes, I will be looking up the Squid configuration and setting it up in pfSense to keep my clients from reaching all the way out. Thank you! R/S Scott |
20)
Questions and Answers :
Windows :
Windows vbox64 CMS Simulation tasks failing - VM unable to validate X509 credential from LHC@home
(Message 44943)
Posted 13 May 2021 by skydivingnerd Post:
I didn't think it really would... but gave it a shot anyway just to be sure for myself.
I don't have a local Squid proxy configured on, or for, my hosts. All my other hosts (except one, which will be getting an OS rebuild soon) running native work units are reaching out for their images. My ISP connection handles the traffic easily. I'm just working on getting all my hosts running correctly, then will be configuring Squid proxy on my firewall and then making config changes on each host. Then working out any issues on that...
Below is one of the many links I found when I was initially setting up LHC@Home and getting native work units to run correctly. I've configured a port alias in pfSense to handle it all, with the exception of my existing rules for port 80 and 443. The FW rule allowing all the traffic is configured for TCP only vice TCP/UDP.
https://lhcathome.web.cern.ch/test4theory/my-firewall-complaining-which-ports-does-project-use
Here is my port list:
3125 Common - CVMFS
8000 ATLAS - HTTP
8080 ATLAS - HTTP
23128 ATLAS - HTTP
3127:3128 ATLAS - HTTP Proxy
5222 ATLAS - XMPP
9094 ATLAS - TCP
9618 Theory, CMS, LHCb - Condor
4080 CMS - WMAgent
8080 CMS - Frontier
8443 LHCb - DIRAC
9133:9149 LHCb - DIRAC
9166 LHCb - DIRAC
9196:9199 LHCb - DIRAC
I've also been chewing through my Snort logs the past several weeks, identifying and suppressing signature alerts for LHC@Home traffic. I've got a nice list of IP addresses LHC@home communicates with. I've added a few of what I believe are the more critical CVMFS IP addresses to an "External Server" alias list and configured that on the Snort Pass list to prevent any alerting on those. Here are the CVMFS entries I have in the alias:
104.21.88.130 LHC@Home - s1f'nal/bnl/unl/cern/ral'-cvmfs.openhtc.io
172.67.179.99 LHC@Home - s1f'nal/bnl/unl/cern/ral'-cvmfs.openhtc.io
158.39.48.38 LHC@Home - atlas-db-squid1.grid.uiocloud.no
I'm still stuck on the response I saw in the packet capture from the LHC@Home CMS Simulation VM. It actively rejected the server-side Certificate Authority as invalid. I still believe this is an LHC server-side issue unless someone can validate that I'm the only one with this issue. R/S Scott |
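A port alias like the one listed above can be checked against ports that turn up in failed-task logs with a few lines of Python. This is a minimal sketch, not a pfSense tool: `ALIAS` is the port list from the post typed in by hand, the "low:high" range syntax follows the way the list is written there, and the ports passed to `missing_ports` are only examples, not a statement of what the project requires.

```python
def expand_ports(alias_entries):
    """Expand entries such as '8000' or '3127:3128' into a set of port numbers."""
    ports = set()
    for entry in alias_entries:
        low, _, high = entry.partition(":")
        start = int(low)
        end = int(high) if high else start  # single port when there is no range
        ports.update(range(start, end + 1))
    return ports


def missing_ports(wanted, alias_entries):
    """Return the wanted ports that the alias does not cover, sorted."""
    return sorted(set(wanted) - expand_ports(alias_entries))


# The alias as posted above (8080 appears twice there: ATLAS and CMS Frontier).
ALIAS = [
    "3125", "8000", "8080", "23128", "3127:3128", "5222", "9094",
    "9618", "4080", "8080", "8443", "9133:9149", "9166", "9196:9199",
]

if __name__ == "__main__":
    # Hypothetical check: ports seen in a task log versus the alias.
    print(missing_ports([3126, 3128, 9618], ALIAS))
```

Run against the list as posted, the check flags 3126 as not covered, while 3128 and 9618 are already in the alias.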
©2024 CERN